Bitcoin 101 – Merkle Roots and Merkle Trees – Bitcoin Coding and Software – The Block Header

Hello, this is James D'Angelo, and welcome to the Bitcoin 101 Blackboard Series. Today we're taking a look at this doohickey right here, and how it ends up making this doohickey right here, and how in every block in Bitcoin you'll see this doohickey right here. This thing is called the Merkle root. So we're going kind of technical, a little bit low level, and we're even going to look at some programming examples, some quick Python code, to try to understand what exactly a Merkle root is, a little bit about why it's used, and even how to generate the exact Merkle root for a specific block.

You might have already understood that this right here is a picture of a Merkle tree. These are considered the leaves, all the way down here are the branches, and this right here is the Merkle root. This number is one of the things that protects the integrity of Bitcoin transactions.

So let's rewind for a second and understand better what we're looking at. We're on blockexplorer.com, which is a great site for seeing all the details inside every block on the blockchain. Roughly every ten minutes, a new block with all the transactions from the previous ten minutes gets packed together and stuffed on top of the blockchain. Each block has a number. This particular block is 286,819, and what that means is there have been 286,819 blocks before this, back to when Satoshi Nakamoto first released Bitcoin in January of 2009 and started mining, probably all by himself, with block one.

So we're looking at the block header on Block Explorer. You can type in the number for any block and actually see all the transactions that happened inside it. If you went to Block Explorer and just typed the number one, it would show you the first transactions ever made with Bitcoin, which is really just a coinbase transaction to whoever was
mining, most likely Satoshi himself.

Now let's run through the rest of this header so we can understand what we're dealing with, and then we'll go further into this whole idea of Merkle trees and Merkle roots. Here we have the hash. This is the hash of the block, which expresses all the difficulty it took to mine it. It's going to be something like sixteen or seventeen zeros and then some random-looking digits afterwards. This hash is like a signature for this particular block, and the next block, 286,820, will have to include this number to prove that it actually came after the previous block. This is how a timestamp server works: it uses unique numbers that have to be included in the following block.

Here we've actually got the previous block, so this is the hash of block 286,818. And though this number wasn't known at the time, here is the hash of the following block. If you look inside the raw block, which you can see just by clicking this, you'll see the JSON interpretation of the block, and it won't have the next block included in it. So this is the hash of block 286,820. This color is not great, so let's move to another one.

This is the time, approximately when this block was made: February 20th, 2014, at 4:57 in the morning. The next number is the difficulty. This number tells you how much computing power you're going to need to generate this particular block in approximately ten minutes, and right now you need something like thirty petahashes per second. We'll actually compute the difficulty from this number in other videos; we're going to look at all these items in different videos. Here's the number of transactions in this particular block, so in
these ten minutes, in this approximately ten-minute window, there were 99 transactions, and we're going to look at the whole list of transactions in a second. During those ten minutes, 3,925 bitcoins were transferred. At the current rate of around six hundred dollars, that's a pretty good amount of money flying around.

Next is the size, and this is very important to people who are storing the entire blockchain. The size of this particular block is 152 kilobytes. It sounds really small, and it's actually smaller than most photo files on Facebook, but adding this kind of size every ten minutes accumulates. Right now, after five years of generating blocks, we've got a blockchain that is 17 gigabytes of information, so it's getting pretty beefy. There's a lot of talk about how to prune the blockchain, and they'll eventually succeed: most computers will be able to just hold the unspent outputs, and that will drop the amount really low. And if you only keep the unspent outputs that you can actually spend, so you're not including all the one-satoshi transactions, you could probably get a very useful blockchain down to the order of megabytes, maybe 50 megabytes or less. But that's something for the future. Right now, if you want to store and look at the blockchain, you're talking 17 gigs of information.

Then we get to our friend the Merkle root. To most people this is the most confusing one of all, because it's got this really weird name, but we're going to jump past it quickly and understand the rest of this block first. The next number is the nonce. The nonce will also get a lot of attention when we go into the details of mining. This is a particular number that the miners are generating so that when they calculate the SHA-256 of this block header, they end up with this SHA-256 hash right here. They're just cranking through nonces until they get the right key to solve this block, and so
the nonce is basically that lucky combination that allowed this miner to end up with the 25 bitcoins associated with this block. What this particular miner did was probably start at some other number, I don't know, 85 million or something like that, and just count up until reaching this number. If you do it faster than anyone else, you win the bitcoins.

Raw block is not part of the block header. It's just a thing you can click on in Block Explorer, so let's take a look at that. Here we are at the live page for this particular block on blockexplorer.com. It's just a great site for looking at the really gritty details of Bitcoin. Here are all the transactions, all 99 of them, and each of these transactions, it turns out, has a transaction ID, and that transaction ID is right here. If I click on this particular transaction ID, for example, I get the information on this specific transaction: what address it came from and what address it went to. Here's a to address, here's another to address, so one of these is the change address and one of them is the real recipient, and here's the from address. That's one transaction, and remember, this particular block had 99. It shows the date and time of this transaction, everything you could possibly want to know except for the people who actually received and sent the money. Remember, Bitcoin provides a level of anonymity.

Just for fun, let's jump back and take a look at another transaction. Here we go. This transaction has this transaction hash, which is very important when we're talking about Merkle roots, and inside the transaction we've got the from address, the to address, and the amounts being spent. Here we're talking about something under one bitcoin, so likely what was sent was this amount here, and the person who had the bitcoins sent this amount back to themselves in the form of change. But again, it's really important to realize that every
transaction has a hash. They take all the transaction info in its raw form, actually in its binary form, but it looks something like this, a lot of code. They hash all of that in binary form, obviously without the hash at the top, and end up with this. It's a SHA-256 hash, which we'll talk about in another video. Every single transaction in Bitcoin will have a unique transaction ID hash. This particular one is 4668, and every other one will have some other bizarre transaction ID. Here's another transaction. It's got the hash of the transaction right here, BADC, and it's taking inputs from many different places that are all signable by the person who's signing off on this transaction, and sending them as 42 bitcoins to this one particular address at this specific time. So everything can be tracked.

Now let's get a little more busy with the idea of this particular Merkle root. We're going to try to actually calculate this particular one using the information from all these transactions, so let's get a little background on exactly what we're doing by drawing this up on the blackboard. We're talking about how to generate the Merkle root and how to build a Merkle tree. As we saw, every single transaction had a particular ID, a transaction hash, and a Merkle tree is created out of those transaction IDs, those hashes.

Say you have a very simple eight-transaction block. Each of those transactions is going to have its transaction ID, and Bitcoin references just those IDs as it's making the Merkle tree. So here's transaction ID one, and here's transaction ID two, three, four, five, six, seven, eight. Now, what a Merkle tree does is take the hashes of transaction ID 1 and transaction ID 2 and hash them together. It just concatenates the two long 64-character hex transaction IDs, right, that we saw right here, okay,
so it takes this 64-character hex thing, adds to it the next hash right over here, and then takes a hash of both of them together. It ends up with, let's just call this, hash one. Then it does the same thing with the next two, so let's call this hash two. It does it with five and six, and you're getting the idea, hash three, and with seven and eight, hash four. These top transactions are referred to as Merkle tree leaves, because the leaves are at the very end, then the branches, and the root is at the very base.

Then you go down one level. You take the first two from the next layer of branches and hash those together, let's call this B hash one, and hash the next two together, let's call this B hash two, and you get another level of branches. Here you have two branches, and here you're getting down to one. All you have to do is hash those two together, and you end up with what's known as the Merkle root. Really, that's it. That's the whole idea behind Merkle trees. Of course there are some big-endian and little-endian issues, and when you actually go to program it, it gets kind of fun and complex. We're going to take a look at some of that in a second.

What a lot of people are probably wondering is where this freaky name came from. Well, that's pretty easy. Here we've got a guy named Ralph Merkle, born February 2nd, 1952, and he invented this way of storing data that's provable and secure. Here he is on his own home page. Ralph came up with this idea in 1987 in a paper called "A Digital Signature Based on a Conventional Encryption Function", and that's where the idea of a Merkle tree comes from. Now, because he's introducing the idea in that paper, you won't find any reference to a "Merkle tree" anywhere in it. You'll just find his name, because he wrote the paper. Well, it became such an important thing, and they didn't have a
very convenient name for it, so they decided to call it a Merkle tree and call these things Merkle roots. Merkle trees and Merkle roots are used in a variety of applications, and again, they're used in every block in Bitcoin. They actually save computational power and reduce the amount of data that needs to be sent out by mining pools, because instead of sending all the transactions, or rehashing all the transactions, every time you hash a particular block, you can just hash the block header. You're just using that one tiny little Merkle string, so it turns out to be a huge computational savings for Bitcoin. There are other ways in which it saves computational power, especially as we look to decrease the size of the blockchain and as we look for individual transactions.

A lot of this will get clearer if we look at how a Merkle tree is actually coded. Here we've got a simple piece of software that was written and posted by a guy named Ken Shirriff, who's got this amazing blog. This is a guy who has just recently jumped into Bitcoin in a big way and has some of the most beautiful posts for understanding Bitcoin, so please take a look at his blog. Go to the very bottom of his most recent post, this one from February 23rd. At the very bottom, in the notes and references section, he includes some code for generating a Merkle tree. You can just click "view raw", take his wonderful little piece of code, and dump it into a simple Python interpreter, and that's exactly what we did.

So here's our code over here, by Ken Shirriff, and these down here are all 99 transaction IDs. These are all the individual hashes that we were seeing on the previous Block Explorer page. Just to hammer that home, let's go back to Block Explorer, and we see that the first transaction ID is 00baf6626, and if we look at Ken's code over here we see
00baf6626. The next one is 91c and so on. So this is taking all the transaction IDs and just putting them in one little file, and then with just these few lines of Python code he's going to generate the Merkle root. We can just take this right here, and if you use IDLE with Python 2.7.6 you can just hit your Run key, and you end up with this number right here, which turns out to be the actual Merkle root of this block.

Well, that seems to work pretty well, but let's dive into this code and understand a little better what's going on. Like I said, we've got all these transaction IDs, and the amazing thing about how SHA-256 works is that if I run this code as many times as I want, it's always going to calculate the same fingerprint. So let's run it again. Boom, exact same. Run it again. Boom, the exact same 64 hex characters, which make up the Merkle root for this block. But if I take any one of these and change just one character anywhere, so let's take the seven right here and turn it into a nine, then hit save and run this, I end up with a completely different hash. That right there is a really good example of how amazing SHA is. This is a unique identifier that can only be generated by this collection of transactions, and changing even one character ends up in this other fingerprint, this one that starts with 2FA. Now every time I run it, it will generate the exact same 2FA fingerprint. Hopefully I can undo the edit I made and get back to the original version. Yep, there's my seven. Now I save again, and just by changing one character I get back to the original hash I was looking for.

So let's take a better look at this code and see how it's putting these things together. One of the key lines here is this hashlib sha256 of two different items. It's doing what we talked about on the Merkle tree: this particular line is taking any two
of these and dropping them into one hash value. That same piece of code will work over here, or it will work even down here. So this piece of code is actually the key code for generating the SHA-256, and in fact it's the only point in the code where you see SHA-256 mentioned.

Quickly running down the code, we see: if the length of this hash list, which is 99 when you start, is equal to one, just return the first value, hashList[0]. So if there's only one transaction in the thing, you just return the hash of that particular transaction. If not, let's create this new list of hashes. What it's going to do, if you're a little familiar with Python, is say: for i in the range from 0, so starting at the very first transaction, to the entire length of this hash list, which in this case is 99, stepping by two, grab the first two and hash them together, grab the next two and hash them together. You'll see that right here: append to this new list the hash, this function right here, which is actually this function down here that does our hashing, stick these two guys together and hash them. So what you're going to do is go through this entire list of transactions and hash together the pairs, just like we talked about. Here's the first two, here's the second two, here's the third, all that. Because there are 99 transactions, they'll generate a much leafier and much taller tree.

We're going to run through this code really quickly, and then we're going to add some instructions to it to tease out some of its more interesting aspects. This line right here basically says: if there's one transaction left over, so if you have an odd number, and in this case we have 99, then you take the last transaction, transaction number 99, and you hash it with itself. Merkle trees need a new hash at every level, so you can't just keep passing along the transaction ID
that's stringing out at the end. You need new hashes. Then when you're done, when you've taken those 99 and chopped them down to, what, 50, you run the whole thing over again with the 50 that you just generated. So you end up with round two, where 50 goes to 25, then 25 goes to, I don't know, 13 because of the odd number, 13 goes to 7, and you end up with a bunch of recursions.

To better understand it, I've added a number of print statements in the middle of Ken's code that will tease out some of the ways this hashing works. We're going to run this with a much smaller set of transactions, and we're going to rename our transactions to really tiny things, so you'll see that the very beginning ones are small and then you'll see the SHA-256 hashes coming in. For the purposes of this demonstration we don't even need the real transactions. Our transaction list is right here, and, what is that, there are ten of them. So let's run the code with this much simpler set of transactions. Let's say go.

You see a lot of my print statements are generating some information, and at the bottom we get a Merkle root. It's not the Merkle root of the block we were looking at, but the Merkle root of these ten transactions with really tiny transaction ID names that I just put in. So the bottom is our Merkle root, which is always going to be 64 hex characters because we're doing a SHA-256 hash. SHA-256 will always come out as 64 characters, no more, no less, and they will always be hex characters. It generates 256 bits of information, and hex is obviously much shorter, at four bits per character, so you get 64 characters. What we see is that we get a number of rounds, or a number of levels, in our tree. If we go to the top, we see that our tree is much bigger, and as we go down, each section of the tree gets smaller and smaller until we end up with one Merkle root. And so really what
we're doing is exactly what we did in this drawing, where at the beginning we're saying this is AA, this is BB, CC, DD, EE, and then I did 11, 22, 33, and it continued over here. If we look at our output, branch one is AA, so it's the first transaction that we're dealing with. Well, it's truly a leaf in this case, because it's at the beginning, but we'll just call it a branch for now. Branch two is BB. You're hashing AA and BB together. If you just took the letters AA, no space, BB, and stuffed that into a SHA-256 hash, you would end up with this. Of course there is some little-endian and big-endian stuff, so you have to reverse those bytes in there, but you end up with this. If you're a coder, you can see some of that little-endian and big-endian handling right here. These are basically just flipping around the byte order. Bitcoin has a lot of reversals between big-endian and little-endian, which is kind of a pain in the butt. But that's the first hash, so this hash right here corresponds to this hash right here, and that's exactly which one it is.

Then branch three and branch four end up with this hash, so this 18AC would actually be right here, 18AC. Why? Because here's three, here's four, and CC and DD get hashed down into here. So it goes through all these transactions right here and ends up with a new list, and the new list starts with this hash, then this hash, then this hash, this one, and this one. It ends up with one, two, three, four, five. In the next round, it takes those particular hashes and hashes them. So it says branch one is 3392, which we get right from here, and branch two is 1AAC, and we get that from here. It hashes those two together and gets the next level of branching. Of course, this starts to get pretty obvious, so hopefully with this you begin to understand how the Merkle roots inside the block header are generated. They've got lots of great uses, and there are a lot of great reasons for them
being there. There's even lots of discussion of adding more levels of Merkle roots and Merkle trees inside the transactions themselves, and that's likely to happen. So it's really important to understand how these things work and not be afraid of them just because they're named after some guy, Ralph Merkle, who wrote some paper in 1987. So that's it for now, just a taste of Merkle roots. We'll be talking much more about this stuff in many other videos. Hope you enjoyed it. Please remember to like, comment, subscribe, do all those things you do, and we'll catch you at the next video.
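
For readers who want to try the pairing procedure described in the video, here is a minimal Python 3 sketch. It is not Ken Shirriff's original listing (that was Python 2), and the names hash_pair, merkle_root, and level_sizes are made up for this illustration. It assumes the transaction IDs are hex strings in the byte-reversed form that block explorers display, and it uses the double SHA-256 that Bitcoin applies when combining two hashes. The ten tiny leaf values are stand-ins in the spirit of the demo, not the exact list used on screen.

```python
import hashlib


def hash_pair(left_hex, right_hex):
    """Combine two hex-encoded hashes into their parent hash (hex string)."""
    # Explorers show hashes byte-reversed, so flip each input back to
    # internal byte order before concatenating.
    left = bytes.fromhex(left_hex)[::-1]
    right = bytes.fromhex(right_hex)[::-1]
    # Bitcoin hashes the concatenation with SHA-256 twice.
    digest = hashlib.sha256(hashlib.sha256(left + right).digest()).digest()
    # Flip the result back into display order.
    return digest[::-1].hex()


def merkle_root(hashes):
    """Reduce a list of hex hashes, level by level, to a single root."""
    if not hashes:
        raise ValueError("need at least one hash")
    level = list(hashes)
    while len(level) > 1:
        # With an odd count, the last hash is paired with itself.
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hash_pair(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def level_sizes(n):
    """Number of hashes at each level, starting from n leaves."""
    sizes = [n]
    while n > 1:
        n = (n + 1) // 2  # an odd level rounds up because of the self-pair
        sizes.append(n)
    return sizes


if __name__ == "__main__":
    # Hypothetical toy leaves, not real transaction IDs.
    toy_leaves = ["aa", "bb", "cc", "dd", "ee", "11", "22", "33", "44", "55"]
    print(merkle_root(toy_leaves))
    print(level_sizes(99))
```

Running level_sizes(99) shows the shrinking levels mentioned above (99, 50, 25, 13, 7, and so on down to 1). Changing any single character in any leaf and rerunning merkle_root gives a completely different root, which is the fingerprint behaviour shown in the demo.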

As found on YouTube