Monday, January 5, 2009

Just how small is the smallest illegal number?

I've seen a number of posts recently relating to "illegal numbers", such as these ones:

Copyrighting a number, and
Converting pi to binary, which points to
Converting Pi to binary: Don't do it!

The first of these relates to the fact that files are essentially just numbers, and that replicating certain numbers can be illegal (even leading to DMCA issues). The third (and the second, which points to it) remarks on the fact that π is thought to be a normal number, so it is believed that all possible finite digit sequences occur in its expansion (and so it contains all such illegal numbers).
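To make the "files are just numbers" point concrete, here is a minimal Python sketch. The function names are hypothetical, and big-endian byte order is an arbitrary assumption; any fixed convention would serve equally well.

```python
def bytes_to_int(data: bytes) -> int:
    """Read a byte string as a single non-negative integer (big-endian by assumption)."""
    return int.from_bytes(data, "big")


def int_to_bytes(number: int, length: int) -> bytes:
    """Recover the bytes; the length is needed because leading zero bytes vanish in the integer."""
    return number.to_bytes(length, "big")


snippet = b"Hi"                                    # stand-in for the contents of any file
as_number = bytes_to_int(snippet)                  # 0x4869 read as one integer
restored = int_to_bytes(as_number, len(snippet))   # the same two bytes again
```

The mapping runs both ways, which is the whole point: publishing the integer (plus the length) is publishing the file.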

Let's consider the first issue: some integers are definitely illegal - it appears quite possible to get yourself into trouble posting particular integers. The post at Copyrighting a number suggests that the smallest such number is incomprehensibly large.

But I suspect that this isn't necessarily the case.

Certainly it's true that the smallest integer representing a song would be huge. The smallest integer representing a book would also be very large**. But it's not only entire songs or books that are subject to copyright. Indeed, it only takes a few bars before you get into copyright trouble. And it doesn't need to be a few bars of a performance. In fact, I could represent in a compact ASCII form just enough of a simple tune to get myself into trouble in only a small number of characters (I think with careful choice of tune, the extracted bars and the representation, I could do it in less than one line of text here). I could represent it in some compressed form in much less space (posting a zip file of a copyright work is just as illegal as posting an uncompressed version, and so it would be with any other compression scheme). Maybe I could get into trouble posting slightly less of a book - I can quote a tiny bit of a book in certain circumstances, but if those circumstances don't hold, how little of a book can I quote before I am in copyright violation?
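For a rough sense of scale, here is a small Python sketch of how many decimal digits such a number could need. It assumes one byte per ASCII character and takes "one line of text" to mean 80 characters; both figures are assumptions for illustration, and the function name is hypothetical.

```python
def max_decimal_digits(num_bytes: int) -> int:
    """Count the decimal digits of the largest integer that fits in num_bytes bytes."""
    largest = 2 ** (8 * num_bytes) - 1
    return len(str(largest))


# Assumed: a single line of 80 ASCII characters, stored one byte per character.
digits_for_one_line = max_decimal_digits(80)
```

Under those assumptions a one-line snippet corresponds to an integer of at most 193 decimal digits, which is large but hardly incomprehensible, and a compressed form could be shorter still.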

** [That "far, far larger" (now elided) looks like a dumb error if you compare the size of a high fidelity musical recording with a book recorded as, say, an ASCII file. But that's not comparing like with like. A book is like a score (and lyrics), not like the performance of it. If we compare a performance of a book with a performance of a song, then again, the book is larger (performed books typically take several CDs). But arguing about this is a distraction from the point here, so I have changed the text above to avoid the issue.]

So, then, we are left to wonder, what is the smallest possible illegal digit sequence?

What is the smallest possible representation of it?

(Might the πth root of 237 have that digit sequence somewhere in its first ten thousand digits, thus giving me an incredibly compact way of specifying an illegal digit sequence? I doubt it.)

Anyway, I'm inclined to think that it might be possible to get yourself into trouble with say the DMCA with some surprisingly small digit sequences.

And wait a minute - does this post constitute an encouragement to breaking the law?
I've asked for people to tell me the smallest illegal number. Certainly, if someone does post such a number in the comments, it would be easy to argue that I incited it. But it seems the law on that doesn't actually require the other person to break the law. And even if it's not enough for me just to ask what the number is, I don't want any of my readers going to jail if they do figure out the number.

One less

Perhaps it's better to ask what number is one less than the smallest illegal number. Maybe that's safer.

But consider the RIAA - I don't expect that the RIAA would have the slightest hesitation in trying to smack me down for publishing one less than the smallest illegal number because that description of it implies an algorithm for producing the smallest illegal number from it... and so this very paragraph contains an algorithm for producing an illegal number, were I to later post one less than such a number. (In fact, is it illegal to describe such an algorithm already, without having a number to apply it to? The DMCA seems to be pretty broad. Have I in fact already broken the law merely by describing the algorithm - that "the way to produce the smallest illegal integer from the next smallest number is simply to add 1"?)

So asking for one less than an illegal number is no good. Certainly publishing that number with the description ("just add one") is illegal. But if I publish a number and an algorithm for producing an illegal number from it in two different places, I believe I have still broken the law (they don't have to be in the same location, surely, or the law would be trivially circumvented). Given I've already posted an algorithm for producing an illegal number from one less than it, it seems that posting one less than the smallest illegal number anywhere must in turn be illegal.

No doubt you've noticed the problem already.

By that reasoning, one less than that number must also be illegal.

Assuming we restrict our search to the non-negative integers, I believe I can therefore assert that the smallest illegal number I can publish to my blog is zero.

Oops - and I just made the mistake of publishing it to my blog.

Numbers. Every digit of every number. They're all illegal.

If this blog disappears suddenly, I've probably been got by the goons at the RIAA.


  1. I think the induction argument applies to the algorithm as well. That is, you'd have to post 0 somewhere and then also post an algorithm for reaching the smallest illegal number from 0. So far you've only posted "keep adding one until the RIAA knocks down your door". Could take a while.

  2. Actually, your proof draws a somewhat incorrect conclusion. The conclusion *isn't* that all numbers are illegal, merely that illegal numbers are incomputable. The proof is a direct analogue of Berry's paradox, or the proof of the incomputability of Kolmogorov complexity (KC).

    This of course leads to the following definition of illegal number:

    An illegal number is a number of sufficient length which encodes proprietary information and whose KC to size ratio is sufficiently small. In other words, compressible numbers which "mean" something.

    Even so, one could always take a One-Time Pad as an encryption key, either on the industry side, or the pirating side - thereby essentially transferring random bits; then the original number must be under copyright, and all of its encryptions under the OTP-scheme. Therefore, all sufficiently large numbers are illegal.

    I suppose the way to reason around this is as follows: A number is associatively illegal if, under a known and sufficiently widespread, *bijective* compression and/or encryption function, it is the image of an illegal number.

    It follows that all numbers are either illegal or associatively illegal; however to prove wrongdoing, one would have to prove that the person used an associatively illegal number in conjunction with a specific compressor/encryptor to represent a specific illegal number; which I think is what we want.

  3. Some of what is being said is a little above my head, but it's good for me. Amazing coincidence: Just today I told my intro to algebra students that I had read that all finite sequences of digits appear in pi. We had fun looking up a fake phone number on a pi website. They were amazed to hear that this all implies that the complete works of Shakespeare (using a=1, b=2,...) must appear.

  4. You didn't put any profile information, but I surmise from the time stamp that you are in Eastern Australia. Am I right?
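
The one-time-pad point in comment 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the message is a harmless stand-in for an "illegal number", the function name is hypothetical, and XOR with a random pad of equal length is used as the bijective encryption function the comment describes.

```python
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad of the same length; applying the same pad twice restores the data."""
    if len(data) != len(pad):
        raise ValueError("a one-time pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"some protected work"          # stand-in for an "illegal number"
pad = secrets.token_bytes(len(message))   # random pad, meant to be used once
ciphertext = xor_bytes(message, pad)      # on its own, indistinguishable from random bits
recovered = xor_bytes(ciphertext, pad)    # the same pad maps it back to the message
```

Since a suitable pad exists for every byte string of the same length, any such string could be the encryption of the message, which is why the comment ties wrongdoing to using a number together with a specific encryptor.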