Computer forensics is the science of finding evidence in computers and digital documents, and when a hacker performs forensics, you had better be prepared for the unknown. Satoshi did many things to try to stay anonymous: he used Tor, he used anonymous e-mail servers, he did not disclose personal information in posts, and he probably didn’t use the name Satoshi as his computer username. And here is where forensics comes into play. In the original Bitcoin PDF paper (either the 2008 or the 2009 revision) you will find PDF metadata. The metadata holds no information in the “author” field, of course. All we know from the metadata is that the paper was created with OpenOffice.org 2.4 Writer (which has a PDF export function), the date it was created, and the time zone. If you look at the PDF with a text editor, you will see a field that the document properties dialog does not show:
/ID [ <CA1B0A44BD542453BEF918FFCD46DC04> <CA1B0A44BD542453BEF918FFCD46DC04> ]
This is the ID of the PDF document. It turns out that the ID of the document was (in version 2.4 of OpenOffice.org) built as an MD5 digest of some fields, as shown by this code:
OStringBuffer aID( 1024 );
if( m_aDocInfo.Title.Len() )
    appendUnicodeTextString( m_aDocInfo.Title, aID );
if( m_aDocInfo.Author.Len() )
    appendUnicodeTextString( m_aDocInfo.Author, aID );
if( m_aDocInfo.Subject.Len() )
    appendUnicodeTextString( m_aDocInfo.Subject, aID );
if( m_aDocInfo.Keywords.Len() )
    appendUnicodeTextString( m_aDocInfo.Keywords, aID );
if( m_aDocInfo.Creator.Len() )
    appendUnicodeTextString( m_aDocInfo.Creator, aID );
if( m_aDocInfo.Producer.Len() )
    appendUnicodeTextString( m_aDocInfo.Producer, aID );
...
aID.append( m_aCreationDateString.getStr(), m_aCreationDateString.getLength() );
aInfoValuesOut = aID.makeStringAndClear();
osl_getSystemTime( &aGMT );
rtlDigestError nError = rtl_digest_updateMD5( m_aDigest, &aGMT, sizeof( aGMT ) );
if( nError == rtl_Digest_E_None )
    nError = rtl_digest_updateMD5( m_aDigest, m_aContext.URL.getStr(), m_aContext.URL.getLength()*sizeof(sal_Unicode) ); /// NOTE: THIS IS UNICODE
if( nError == rtl_Digest_E_None )
    nError = rtl_digest_updateMD5( m_aDigest, aInfoValuesOut.getStr(), aInfoValuesOut.getLength() );
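To make the order of the digest inputs concrete, here is a minimal sketch of how a candidate ID could be recomputed outside OpenOffice.org. It is not the OpenOffice.org implementation: `md5Hex` and `candidateId` are hypothetical helper names, MD5 is written out inline (following RFC 1321) only to avoid a library dependency, and I am assuming that aGMT is a struct of two little-endian 32-bit fields (seconds and nanoseconds) and that the URL is hashed as raw little-endian UTF-16 code units. The quoted code also elides some lines, so a real attempt would have to be checked against the full source.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Compact MD5 (RFC 1321); returns the digest as 32 uppercase hex characters.
std::string md5Hex(Bytes msg) {
    static const int shifts[16] = {7, 12, 17, 22, 5, 9, 14, 20,
                                   4, 11, 16, 23, 6, 10, 15, 21};
    std::uint32_t h[4] = {0x67452301u, 0xefcdab89u, 0x98badcfeu, 0x10325476u};
    const std::uint64_t bitLen = static_cast<std::uint64_t>(msg.size()) * 8;
    // Padding: a 0x80 byte, zeros up to 56 mod 64, then the 64-bit bit length.
    msg.push_back(0x80);
    while (msg.size() % 64 != 56) msg.push_back(0);
    for (int i = 0; i < 8; ++i)
        msg.push_back(static_cast<std::uint8_t>(bitLen >> (8 * i)));

    for (std::size_t off = 0; off < msg.size(); off += 64) {
        std::uint32_t m[16];
        for (int j = 0; j < 16; ++j) {          // little-endian 32-bit words
            m[j] = 0;
            for (int k = 3; k >= 0; --k) m[j] = (m[j] << 8) | msg[off + 4 * j + k];
        }
        std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3];
        for (int i = 0; i < 64; ++i) {
            std::uint32_t f;
            int g;
            if (i < 16)      { f = (b & c) | (~b & d); g = i; }
            else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
            else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
            else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
            // Round constant K[i] = floor(2^32 * |sin(i + 1)|).
            const std::uint32_t k = static_cast<std::uint32_t>(
                std::floor(std::fabs(std::sin(i + 1.0)) * 4294967296.0));
            f += a + k + m[g];
            const int s = shifts[(i / 16) * 4 + i % 4];
            a = d; d = c; c = b;
            b += (f << s) | (f >> (32 - s));
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d;
    }
    static const char* hex = "0123456789ABCDEF";
    std::string out;
    for (std::uint32_t word : h)
        for (int i = 0; i < 4; ++i) {
            const std::uint8_t byte = static_cast<std::uint8_t>(word >> (8 * i));
            out += hex[byte >> 4];
            out += hex[byte & 15];
        }
    return out;
}

// Feeds the three inputs in the order of the quoted code: the time struct,
// the UTF-16 URL, then the concatenated info values.
// ASSUMPTION: aGMT is {uint32 seconds, uint32 nanoseconds}, little-endian.
std::string candidateId(std::uint32_t seconds, std::uint32_t nanosec,
                        const std::u16string& url, const std::string& infoValues) {
    assert(nanosec < 1000000000u);  // a nanosecond field stays below one second
    Bytes buf;
    auto putLE32 = [&buf](std::uint32_t v) {
        for (int i = 0; i < 4; ++i) buf.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
    };
    putLE32(seconds);
    putLE32(nanosec);
    for (char16_t unit : url) {     // two bytes per UTF-16 code unit
        buf.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        buf.push_back(static_cast<std::uint8_t>(unit >> 8));
    }
    buf.insert(buf.end(), infoValues.begin(), infoValues.end());
    return md5Hex(buf);
}
```

A guess is then confirmed only if `candidateId(seconds, nanosec, url, infoValues)` equals the 32 hexadecimal characters found in the /ID entry of the PDF.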
The aGMT field can be re-created from the m_aCreationDateString field, which is stored in the PDF as:
Since we know Satoshi’s PC ran Windows XP, we also know that aGMT has only millisecond precision. The CreationDate field does not store the milliseconds, but an attacker can easily brute-force the millisecond field, which has only 1,000 possible values, and find the correct one.
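The step from a CreationDate string to the list of aGMT candidates can be sketched as follows. This is only an illustration under my own assumptions: the date is taken to follow the usual PDF form D:YYYYMMDDHHmmSS followed by Z or an offset such as +HH'mm', aGMT is assumed to hold UTC seconds since 1970 plus a nanosecond field, and millisecond precision is assumed to mean that the nanosecond field is a multiple of 1,000,000. The names `daysFromCivil`, `creationDateToUtc`, `TimeCandidate` and `millisecondCandidates` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Days since 1970-01-01 for a Gregorian date (Howard Hinnant's
// days_from_civil algorithm).
std::int64_t daysFromCivil(std::int64_t y, unsigned m, unsigned d) {
    y -= m <= 2;
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Parses "D:YYYYMMDDHHmmSS" with an optional "Z" or "+HH'mm'" / "-HH'mm'"
// suffix and returns the instant as UTC seconds since 1970-01-01.
std::int64_t creationDateToUtc(const std::string& s) {
    assert(s.size() >= 16 && s.compare(0, 2, "D:") == 0);
    auto num = [&s](std::size_t pos, std::size_t len) {
        return std::stoi(s.substr(pos, len));
    };
    std::int64_t t = daysFromCivil(num(2, 4), static_cast<unsigned>(num(6, 2)),
                                   static_cast<unsigned>(num(8, 2))) * 86400
                     + num(10, 2) * 3600 + num(12, 2) * 60 + num(14, 2);
    if (s.size() >= 19 && (s[16] == '+' || s[16] == '-')) {
        std::int64_t offset = num(17, 2) * 3600;
        if (s.size() >= 22) offset += num(20, 2) * 60;
        // Local time = UTC + offset, so subtract a positive offset.
        t += (s[16] == '+') ? -offset : offset;
    }
    return t;
}

// ASSUMPTION: mirrors a {seconds, nanoseconds} time struct.
struct TimeCandidate {
    std::uint32_t seconds;
    std::uint32_t nanosec;
};

// One candidate per possible millisecond within the recorded second.
std::vector<TimeCandidate> millisecondCandidates(std::int64_t utcSeconds) {
    std::vector<TimeCandidate> out;
    out.reserve(1000);
    for (std::uint32_t ms = 0; ms < 1000; ++ms)
        out.push_back({static_cast<std::uint32_t>(utcSeconds), ms * 1000000u});
    return out;
}
```

Each of the 1,000 candidates would then be hashed together with a guessed URL and the info values, so the time field multiplies the total work by only a factor of 1,000.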
The aInfoValuesOut field would contain Creator, Producer and CreationDate fields already present in the PDF, in their hexadecimal format:
So aInfoValuesOut would be:
The m_aContext.URL is the interesting part: it holds the path and name of the PDF where it was originally located, in Unicode format. I’m sure Satoshi knew how products leak data, and he probably placed the file in some innocent path like “c:\bitcoin\src\docs”. But while he was publishing one of the most extraordinary papers ever seen, he may have been too excited to remember it. In that case the URL could have been something like this:
Here JLewis is just an example of a possible username.
Then somebody could try to uncover Satoshi by brute-forcing each possible 8-letter username, each possible “My Documents” folder name (in the most common languages), each possible subdirectory (such as /Desktop or /Bitcoin), and each possible millisecond value.
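To get a feeling for the size of that search, here is a rough count of the candidates. The function names are hypothetical, and the inputs in the note below (a 26-letter lowercase alphabet, 10 folder names, 5 subdirectories) are illustrative assumptions of mine, not measurements. The count also ignores the unknown file name, which would multiply it further.

```cpp
#include <cassert>
#include <cstdint>

// Number of strings of exactly `length` symbols over `alphabet` symbols.
std::uint64_t countStrings(std::uint64_t alphabet, unsigned length) {
    assert(alphabet > 0);
    std::uint64_t n = 1;
    for (unsigned i = 0; i < length; ++i) {
        assert(n <= UINT64_MAX / alphabet);  // refuse to overflow silently
        n *= alphabet;
    }
    return n;
}

// Total MD5 evaluations: every username, folder name and subdirectory
// combined with each of the 1,000 possible millisecond values.
std::uint64_t searchSpace(std::uint64_t usernames, std::uint64_t folders,
                          std::uint64_t subdirs) {
    const std::uint64_t milliseconds = 1000;
    return usernames * folders * subdirs * milliseconds;
}
```

With those assumed inputs, `countStrings(26, 8)` gives 208,827,064,576 usernames and `searchSpace(countStrings(26, 8), 10, 5)` gives 10,441,353,228,800,000 hashes, roughly 10^16. That is a large but finite number of MD5 evaluations.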
But of course it is highly improbable that Satoshi overlooked this, very improbable that he stored the PDF under his user directory, quite improbable that the username can be found, and even more improbable that somebody finds the exact path and name of the original PDF file. And still, I’m uneasy.