A new mystery about Satoshi hidden in the Bitcoin block-chain

Some time ago, I received an e-mail from my friend Timo Hanke. If you don’t know Timo, then you should, because he is, apart from a respected mathematician and Bitcoin enthusiast, an excellent person. The e-mail suggested that I look into the nonce field to see if I could find out the endianness of Satoshi’s original mining machine. He was talking about the nonce in each block header, not the ExtraNonce I talked about in my first post on Satoshi. My first thought was “nonces increase too quickly to leave any recognizable fingerprint”, but then I recalled that back in 2009 there were no GPUs and no ASICs, and even if the nonce wrapped around zero a couple of times, there would be a perceptible statistical imbalance towards zero in the most significant byte. So, armed with my own block-chain parser library, I prepared to begin a new fight against time and go back to the early days of Bitcoin. It took me half an hour to get my first surprise. One image is worth a thousand words (or at least it stops a thousand false verbal arguments), so here is the image:


This image shows the least significant byte of the nonce, as interpreted on a little-endian machine, from the genesis block up to block 36288 (year 2010). This is neither a uniform distribution (which one would expect from a totally random byte) nor the decreasing exponential one would expect for the most significant byte on a big-endian machine.

I’ll show you the most significant byte of the nonce, so you can compare.

(Figure: Byte3-All)

This last graph clearly proves that almost all machines mining from 2009 to 2010 were little-endian.
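For readers who want to reproduce the two histograms, here is a minimal sketch in Python. It is not my parser library. It only assumes you already have the raw 80-byte block headers from somewhere, and the function names `nonce_bytes` and `byte_histogram` are made up for this example. The nonce is the last 4 bytes of the header and is serialized in little-endian order, so byte 0 is the least significant byte and byte 3 the most significant one.

```python
import struct
from collections import Counter


def nonce_bytes(header: bytes) -> tuple:
    """Return the four nonce bytes of an 80-byte block header, byte 0 (LSB) first."""
    if len(header) != 80:
        raise ValueError("a Bitcoin block header is exactly 80 bytes")
    # The nonce occupies offsets 76..79 and is stored little-endian.
    (nonce,) = struct.unpack_from("<I", header, 76)
    return tuple((nonce >> (8 * i)) & 0xFF for i in range(4))


def byte_histogram(headers, index):
    """Count how often each value 0..255 appears in nonce byte `index`."""
    counts = Counter(nonce_bytes(h)[index] for h in headers)
    return [counts.get(value, 0) for value in range(256)]


# Usage with a synthetic header that carries the genesis block nonce (2083236893).
# The first 76 bytes are zeroed here because only the nonce matters for the example.
fake_header = bytes(76) + struct.pack("<I", 2083236893)
print(nonce_bytes(fake_header))
```

Calling `byte_histogram(headers, 0)` gives the data behind the first graph and `byte_histogram(headers, 3)` the data behind the second one.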

The next thing I did was to try to find the reason for such an awkward probability distribution in the LSB of the nonce. I divided the graph into two: one for “Satoshi” coinbases and one for the remaining coinbases. To identify Satoshi coinbases I used a coarser method than the original: I just separated spent coinbases from unspent coinbases, which identifies Satoshi coins with good accuracy.

These are the corresponding graphs:


Now it was completely clear that the pattern belongs only to Satoshi’s mining machine. If you want to compare with a middle byte of the nonce in Satoshi’s blocks, this is how it looks:

(Figure: Byte1-UnSpent)

That is how byte 0 should have looked if it were truly random.

So at this point I thought about four possible reasons for the imbalance:

A. My block chain parser is completely broken. Or my PC has been hacked and someone is playing a joke on me.

B. Satoshi was mining with hardware very different from a PC. The imbalance was due to an optimization in the hardware, such as using Gray codes for counting. In fact, the missing nonce values seem roughly compatible with a kind of broken decimal Gray code. But this is too extravagant to be true, and nobody would use a decimal Gray code instead of a binary Gray code on a binary machine. But if this is true, then it has far-reaching consequences: Satoshi foresaw the advantage of FPGAs/ASICs much sooner than everybody else.

C. Satoshi discovered a flaw in SHA-2, so he didn’t need to go through all 32-bit nonces, just one fourth of them. Or he just solved some equations to find the nonce, and those equations gave only these restricted solutions. This is highly improbable.

D. Satoshi left a message fingerprinted in the nonces. A message for us to see in a distant future.

The histogram, that is, the number of nonces whose least significant byte takes each value (up to block 20000), begins with the following values:

0: 312
1: 286
2: 303
3: 276
4: 297
5: 301
6: 324
7: 276
8: 305
9: 247
10: 2 (the gap begins here)
11: 2
12: 2
13: 2
14: 3
15: 6
16: 5
17: 2
18: 5
19: 138
(high values from 20 to 58)
58: 201
59: 7 (low values continue from here up to 255)

The selected set of bytes (0 to 9, 19 to 58) could have been selected to map a somewhat extended alphabet into this set. This must be explored further.
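To make the “selected set” concrete, here is a small sketch that uses only the counts printed above (values 20 to 57 and 60 to 255 are not listed in this post, so they are left out). The threshold of 100 is my own arbitrary cut between “high” and “low” counts, and the helper names are made up for the example.

```python
# Counts copied from the list above (LSB of the nonce, up to block 20000).
listed_counts = {
    0: 312, 1: 286, 2: 303, 3: 276, 4: 297, 5: 301, 6: 324, 7: 276,
    8: 305, 9: 247, 10: 2, 11: 2, 12: 2, 13: 2, 14: 3, 15: 6, 16: 5,
    17: 2, 18: 5, 19: 138, 58: 201, 59: 7,
}


def dense_values(counts, threshold):
    """Byte values whose count reaches the (assumed) threshold."""
    return sorted(v for v, n in counts.items() if n >= threshold)


def runs(values):
    """Group a sorted list of integers into (first, last) runs of consecutive values."""
    out = []
    for v in values:
        if out and v == out[-1][1] + 1:
            out[-1] = (out[-1][0], v)
        else:
            out.append((v, v))
    return out


print(runs(dense_values(listed_counts, 100)))

# The full set described in the text: 0 to 9 plus 19 to 58.
selected = set(range(0, 10)) | set(range(19, 59))
print(len(selected), len(selected) / 256)
```

The set has 50 of the 256 possible byte values, so a miner restricted to it would cover a bit less than one fifth of the nonce space.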

Another idea that crossed my mind is that the set itself carries a kind of hidden message. It could refer to the date 18-10-1960 (taking the numbers at which the sequence changes, except that the year comes out as 1958 rather than 1960). 18-10-1960 is the date the article “Socialism, Inflation, and the Thrifty Householder” by Ludwig von Mises was published (see the article online). But this is something I made up in my head to justify the nonsense in the numbers, don’t you think? Am I seeing order where there is no order at all, as in the film Pi?

We’re living in a LOST movie: each time it looks like a mystery is solved, another one appears. I hope my friend Timo (the mathematician) now finds the hidden message and brings me some peace.

As always, I ask Bitcoin programmers to check my findings because errare humanum est.

Best regards, Sergio.

EDIT: It looks like Eyal0 got it right. His explanation of the probability distribution is the most convincing: it’s not a probability distribution at all! He suggested that Satoshi had access to 58 machines for mining, so to avoid checking the same nonce twice he gave each machine a different id, which was stamped in the LSB of the nonce. I think machines 10-18 are missing because they belonged to the next computer lab in Satoshi’s faculty, and at the last moment he was denied access to that lab.
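Here is a sketch of how such a scheme could work. This is only an illustration of Eyal0’s idea, not code from the original client: the bit layout (machine id in the low 8 bits, a per-machine counter in the upper 24 bits) and the name `nonces_for_machine` are my assumptions.

```python
from itertools import islice


def nonces_for_machine(machine_id, start=0):
    """Yield 32-bit nonces whose least significant byte is fixed to machine_id."""
    if not 0 <= machine_id <= 0xFF:
        raise ValueError("machine_id must fit in one byte")
    # Each machine walks its own 24-bit counter, so no two machines
    # with different ids ever try the same nonce.
    for counter in range(start, 1 << 24):
        yield (counter << 8) | machine_id


# Usage: the first few nonces tried by (hypothetical) machines 19 and 20.
first_19 = list(islice(nonces_for_machine(19), 3))
first_20 = list(islice(nonces_for_machine(20), 3))
print(first_19, first_20)
```

With a scheme like this, the LSB of a winning nonce simply tells you which machine found the block, so the histogram counts blocks per machine rather than sampling a random variable.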

This explanation could be proved or disproved by checking how often ExtraNonces go back in time. If many computers are mining together (started at the same time), then one would expect some to be slightly faster than others, so their ExtraNonces are not synchronized. Then a machine with a lower ExtraNonce can solve a block just after a machine with a higher ExtraNonce, and time seems to go back.
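The check itself is simple once the ExtraNonce of each coinbase has been extracted. A minimal sketch, assuming a list `extra_nonces` with one value per block in block-height order (the function name and the sample numbers are invented for the example, they are not real chain data):

```python
def count_backward_steps(extra_nonces):
    """Count blocks whose ExtraNonce is lower than that of the previous block."""
    return sum(
        1
        for previous, current in zip(extra_nonces, extra_nonces[1:])
        if current < previous
    )


# Made-up sequence: the ExtraNonce drops twice (5 -> 4 and 6 -> 3).
sample = [1, 2, 5, 4, 6, 3]
print(count_backward_steps(sample))
```

A single machine should almost never produce a backward step within one run, while many unsynchronized machines should produce them frequently.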

EDIT2: Still another theory is that there were only 6 computers, each running a limited range of 10 LSB nonces. One of them broke, and was not used at the last moment. But I don’t buy it, since 58 is not divisible by 6.

Don’t forget to donate a little bit to my Bitcoin address: 17mcFB7Xyymd9hxp2bgNPz1ruWsdoPoCnZ



  1. #1 by Marc on September 3, 2013 - 11:06 am

    Wonderful to read your articles about Satoshi as usual. What you uncover is however not only a mystery, but if true, too good to be true. How one person can perceive the number of things that he has is the greatest mystery of them all. If true one person did this, it is a real Leonardo Da Vinci of the modern era.

  2. #3 by eyal0 on September 3, 2013 - 1:34 pm

    I wrote a little more about my theory behind this LSB oddity. Any chance of you making the scatter plot mentioned here:


    Or at least publish the data in Google docs?

    BTW, the difficulty in the year that you measured is just 1.0. That’s around 4 million hashes/second. If he had an FPGA, it wasn’t a very fast one!

    • #4 by SDLerner on September 3, 2013 - 1:43 pm

      Eyal0, I’d like to add your explanation to the post. I think you got it right.
      May I?

      • #5 by eyal0 on September 3, 2013 - 2:38 pm

        Sure. Just don’t publish my email address. Thanks!

  3. #6 by bear on September 3, 2013 - 3:54 pm

    I recognize this pattern, or at least I recognize it as a pattern that a combination of things I know could have created.

    The pattern you’re looking at in the low six bits is the output of a normally distributed hardware randomness source, except that bit 4 (the least significant bit of the top nybble) is masked off to zero and then the output is run through a BCD-to-binary conversion. Normally distributed hardware “random” number generator chips were not uncommon in the 1990s, and the manufacturers did cover their asses by telling people to use them with heavy conditioning to improve the quality of the randomness.

    So this looks like a hardware hashing circuit, probably implemented in an FPGA, using a (somewhat flaky) hardware random number chip for the bottom 6 bits, and a BCD-to-Binary conversion on the output of that random number chip for “conditioning” — except that the trace between the RNG chip and the BCD-to-binary chip was subject to a soldering mistake. The top 2 bits of the bottom byte would have been supplied by a (serial) counter feeding a common value to the (parallel) hashing circuits.

    Anyway, running this hardware multiple times for each increment of the counter would cover all the possible values it could take on (except the 10 values blocked by the soldering error) pretty rapidly.

    So, Satoshi probably had a smokin fast hand-built FPGA mining rig when he started bitcoin, which gave him approximately the same advantage as a giant million-coin premine. If he’d gotten the soldering right on bit 4 between the RNG and the BCD converter, it could have been a million-and-a-half.

  4. #7 by gwern on September 3, 2013 - 4:21 pm

    > This explanation could be proved/disproved by checking the frequency of ExtraNonces going back in time. If too many computers are mining together (started at the same time) then one would expect one to be slightly faster than the other, so ExtraNonces are not synchronized. Then a machine with a lower ExtraNonce can solve a block just after a machine with a higher ExtraNonce, and time seems to go back.

    Well gosh, don’t leave us hanging!

  5. #8 by Stuart on September 4, 2013 - 10:29 am

    Great work and what an incredible finding, would love to know the answer! Do you think it’s a bit odd that he used a proprietary Base58 for the address encoding and just happens that you’re seeing the spike between values 0-57? Interesting the 10-19 are blank too as perhaps they’re just the numeric characters 0-9 so not used in a message although I guess that gap should have been at 0-9 to get really excited about this. There are however at least two distinct heights which may indicate more lowercase letters are used than upper? I know it’s probably just coincidence but it would be amazing if there was a secret message in the LSB to explain this, anyone tried printing the text of these LSB nonce values as base58 string… guessing it’s abUHGnbaedIUhjkasdfJkKJHDF but got to ask.

  6. #9 by Tamás Blummer on September 7, 2013 - 4:04 pm

    Fascinating discovery Sergio, congratulations! I am inclined to think that he intentionally left a fingerprint on those blocks. I hope this is his promise that those coins won’t ever be spent. Spending them could expose his identity and crash the value of the coins.

    • #10 by gwern on September 9, 2013 - 5:47 pm

      If so, that was a rare bit of stupidity on Satoshi’s part: if he wanted to reassure the world that he won’t crash the value of bitcoins, he could simply have provably destroyed the coins (there are several ways to do this and Satoshi would have known them), not done something as bizarre as statistically encoded some sort of bitpattern into the early blocks without any other hint or message and waited 4 years for someone to find a leak he didn’t notice for a year, while all the while people wondered whether Satoshi would spend his coins…

  7. #11 by Svc on December 15, 2013 - 3:43 am

    Could bitcoin be a federal government program? What evidence supports that or makes it unlikely?

  8. #12 by Franko on December 13, 2014 - 9:13 pm

    pretty cool find.
