Have a question? Want to advertise? Something else? Contact us: podcasts@btcmedia.org

From the Front Page

Categories: General

Hashing, Football, and Bitcoin

Published on August 2nd, 2014 by therealtwig

Say you are presented with the list of National Football League (NFL) players with their teams omitted. You are then asked to name every player on the Indianapolis Colts, using only the list. You can guess as many times as you want until you are successful. However, many people are playing the game, and only the first person to guess correctly wins the prize.

Of course this would be easy if you had some familiarity with the NFL or a smartphone nearby. However, assuming you knew nothing about football and you didnt have any access to information otherwise, you may have to run through quite the amount of possibilities to succeed. In the NFL, there are 32 teams with 53 players on the active roster for each team, or 1696 possible players that could play for the Colts. So, this would be very time consuming, but still doable eventually.

What exactly is a Hash?

What I have just described for you is a very basic idea of a hashing function. A hashing function takes a bunch of items and converts them into a precious few. The term hash comes from an analogy of chopping and mixing. For instance, if you have ever had a breakfast hash, then you know there are elements like onions, peppers, potatoes, and corned beef that combine to make delicious dish. In the example of the NFL, the ingredients being hashed are all of the athletes, and the delicious dishes are all of the teams that make up the league.

Hashing functions are as arbitrary as the aforementioned NFL example. All that is required is that many elements are split into a fewer amount of outputs, or containers. So, anytime your mom asked you to sort your room when you were younger, technically she was asking you to create your very own hashing function to sort your junk. Calendars used to organize dates are also hashing functions. All of us are hashed into 365 containers (366 if you are a leap year baby) based on the day we are born. Ever heard of a "hashtag" on twitter? Many comments about a topic sorted into a few categories that describe it.

Mathematical hashing formulas are just as arbitrary. Say I have an input space of {0,1,2,3,4,5,6,7,8,9} and my hashing function is to add ten to each number and round to the nearest ten. The set of outputs would then be {10,10,10,10,10,20,20,20,20,20}, or by eliminating repeats {10,20}. All that is required to hash in general is to arbitrarily convert many elements to a few categories. An added security convenience of such a function is how hard it is to guess the original input based solely on the output.

Why is Hashing Important?

Hashing functions have a wide range of uses. Typically, they increase the efficiency of quickly locating an item. For instance, knowing which drawer I stored my socks helps me get dressed considerably faster than if my socks were mixed with all my other clothes on the floor. However, if I am looking for a specific pair of socks, it still may take some time to sort through the drawer. But, at least its considerably less time than if all my clothes are mixed together randomly on the floor.

For something like Bitcoin, we would never want have a hashing function like a sock drawer or an NFL team. Or, even worse, a sock drawer of an NFL team! This is because the many-to-few sorting that takes place within Bitcoin are the many private keys to the few public addresses that contain funds. If there were only 1696 private keys and 32 public addresses with Bitcoin as in the NFL example, then there would be some MAJOR issues.

First and foremost, imagine how mad you would be if you were the 33rd person in line at a bank where only 32 accounts could be created! So, any system looking to have many people using it should at least have enough space for each person to create at least one account. Furthermore, say we alter the rules of the Guess the NFL team game to be instead a scenario where naming any member of the Denver Broncos unlocks a vault representing the entire net worth of the team. Suddenly, your incentive to spend time making as many guesses out of all of the 1696 possibilities as fast as possible grows quite considerably. So, the sample space of private keys should be considerably larger than 1696. It should also not be as low as 32, because guessing any of the keys randomly would open up any of the vaults.

So How Does Bitcoin Use Hashing?

One of many clever ways the bitcoin protocol makes use of hashing algorithms is in the process of generating bitcoin addresses. Every bitcoin address has a private key and a corresponding public key. You can think of the public key as a storage locker and the private key as what enables someone to spend funds that are located in their locker. The choice of the protocol is to generate the storage locker from the private key, but how?

A private key is a 256 digit random number made up of a series of 0s or 1s. When you generate a new bitcoin address, you are taking one of the possible private keys and running it through a series of hashing algorithms to produce an output that makes it very difficult to guess its input. In fact, as we are about to see, it is so difficult that it is virtually impossible.

Before any hashing takes place, first the private key is put through something known as elliptic curve multiplication to generate a private/public key combination that are linked to one another. This result is then put through a gauntlet of several complicated hashing algorithms with really cool and intimidating names like SHA256 and RIPE-MD160. The result at the very end is one of possible public addresses. When you see a bitcoin address such as my tipping address below, it is just essentially a vanity plate that represents one of these possible bitcoin storage locker possibilities.

While it is true that hashing takes many items to just a few, the few in this example is actually quite large. Checking with my good friends over at Wolfram Alpha, the few is actually the number listed below.

I dont think we'll have any more problems if 33 people try to get an account!

Each of these containers are all that will ever exist to store a total sum of 21,000,000 possible bitcoins ever to be in existence. Every time you spend a bitcoin, you just move a coin (or part of a coin) from one container to the next.

So, What's the Big Deal?

Now, you might be thinking, "Hey...there are more private keys available than public containers...couldnt someone rob from my container!!!" And the answer is yes, it is of course possible. In the same way that it is possible to find out you won the lottery while vacationing on Mars. Any hashing function where the set of inputs is larger than the field of outputs will produce, by nature, collisions by the pigeonhole principle. Sometimes math doesnt have to be scary. Its just obvious. If you have more pigeons than containers, one container has to have more than one pigeon!

Even with this possibility being out there, somewhat simple algebra can explain just how unlikely it would be for someone to find another key that works for your locker.

If you take the number of private keys divided by the number of public containers, you get: , or private keys that correspond to the same bitcoin address.

Sure, you could brute force your way through these. Although, by comparison, it has been estimated that there are grains of sand on the entire earth. Trying to find another key at random that opens up someone else's locker is like searching through every grain of sand on the planet.' Sounds like way too much effort for me. Id much rather be spending my free time eating some great corned beef hash and watching my Indianapolis Colts play some football!

-Adam Terwilliger

Views: 2,689


Make sure to make use of the "downvote" button for any spammy posts, and the "upvote" feature for interesting conversation. Be excellent.

comments powered by Disqus