One of the first encounters I had with studying probability came at the tender age of ten, when I was introduced to the quintessential geek game of Dungeons & Dragons.
One of the things I found amazing about the game was that, at one point, one of the game designers felt the need to explain how probability worked in the context of the game rules. At the time, I was so enthralled by the idea that rolling different combinations of dice could produce different, predictably random results that I calculated out long tables of xdy, where x is the quantity of dice and y is the number of sides. Eventually I got bored and wrote a BASIC program to generate the histograms for me. (Can you see a future in computers for this ten-year-old? If not, get your foresight examined.)
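That BASIC program is long gone, but the same xdy histograms take only a few lines of Python today. This is my own reconstruction by brute-force enumeration, not the original program:

```python
from itertools import product
from collections import Counter

def roll_histogram(x, y):
    """Enumerate every outcome of rolling x dice with y sides
    and count how often each total occurs."""
    return Counter(sum(dice) for dice in product(range(1, y + 1), repeat=x))

# Text histogram for 3d6, the classic D&D attribute roll.
hist = roll_histogram(3, 6)
outcomes = sum(hist.values())  # 6**3 = 216 equally likely outcomes
for total in sorted(hist):
    count = hist[total]
    print(f"{total:2d} {count:3d}/{outcomes} {'#' * count}")
```

Running it prints the familiar bell shape: totals of 10 and 11 each occur 27 times out of 216, while 3 and 18 occur exactly once.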
One of the things that struck me later in life was that 3d6 was picked for character attributes because it maps to a normal bell curve distribution, which was also the basis for "grading on a curve". It seemed reasonable that there would be a range of human ability, and that most people would cluster about a hypothetical norm when sampled randomly.
But what doesn't seem normal is that universities pick their students, so ostensibly the middle of the curve should be filtered out. The same went for my high school, where only those in the 85th percentile and higher were admitted. Out in that top 15%, the bell curve is starting to look rather flat, and given the selection bias we should have no expectation that the randomness of the sample would hold. This thinking gave me a true appreciation for grades: they're a made-up fiction, points assigned in a game, and the objective of the game is to do the least amount of work to collect the most points.
By grad school, I had one class a day at 3pm, followed by drinks with the professor. I researched and wrote 16 papers and one thesis, and worked two hours a day at most. Why? Because I understood the game, and weighed the probability that doing any particular piece of work would impress my professors enough to warrant the points. Normal distribution be damned, I had learned the key to academic success was to be the only guy in the room.
Later in life, when I was facing the cold reality of website performance optimization, those same damn patterns showed up. Some sites had nice normal distributions, usually peaking right after lunch on the East Coast; some would plateau over a 24 hour period, as if they were rolling 3d8 rather than 6d4 (fewer, bigger dice make a flatter curve). Some looked like 6d4+/-4, the curve shifting one way or the other with seasonal usage.
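You can check how sharply any dice pool peaks by enumerating outcomes exactly; a quick sketch (the traffic analogy is mine, the arithmetic is just counting):

```python
from itertools import product
from collections import Counter

def peak_probability(x, y):
    """Probability of the single most likely total when rolling
    x dice with y sides, by exact enumeration of all y**x outcomes."""
    counts = Counter(sum(d) for d in product(range(1, y + 1), repeat=x))
    return max(counts.values()) / (y ** x)

# More, smaller dice concentrate the sum around the mean; fewer,
# bigger dice flatten it; a single die is perfectly flat (uniform).
print(f"6d4  peak: {peak_probability(6, 4):.3f}")   # ~0.142, sharply peaked
print(f"3d8  peak: {peak_probability(3, 8):.3f}")   # ~0.094, flatter
print(f"1d20 peak: {peak_probability(1, 20):.3f}")  # 0.050, uniform
```

The same counting applies to any sum of independent random factors, which is why traffic curves end up looking like dice curves.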
When you really think about these aggregate patterns, how a college professor grades his students, how your users visit your website, and how you roll a character's base stats in D&D are, mathematically speaking, equivalent activities when viewed in large enough volumes. At the heart of each are three or four factors which are essentially random from the observer's point of view, but which in any specific incarnation are purely deterministic.
This property of randomness is at the heart of cryptography and consistent hashing: being able to characterize safe upper and lower bounds, determine a peak load, and ensure an even (random) distribution from an infinite domain over a finite range. Secure hashes are secure because exhaustive domain searches are expensive and the distribution is even, providing few hints as to where to search. Consistent hashing works because typical values for keys collapse to finite ranges, allowing for fast lookups. In both cases we must hash to a uniform distribution for these properties to hold.
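A toy illustration of both ideas, using SHA-256 to map arbitrary keys into a fixed range and a bare-bones hash ring on top. This is a minimal sketch with made-up server names; production consistent-hashing schemes add virtual nodes and replication to even out the load:

```python
import hashlib
from bisect import bisect
from collections import Counter

def h(key: str) -> int:
    """Map an arbitrary string into the fixed range [0, 2**32),
    roughly uniformly, via the first four bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")

# Consistent hashing: place servers on a ring; each key is owned by
# the first server clockwise from the key's hash point.
servers = ["cache-a", "cache-b", "cache-c"]  # hypothetical names
ring = sorted((h(s), s) for s in servers)
points = [p for p, _ in ring]

def lookup(key: str) -> str:
    i = bisect(points, h(key)) % len(ring)  # wrap around the ring
    return ring[i][1]

# An "infinite" domain of keys collapses deterministically onto the
# finite range of servers, and the split only changes for keys whose
# owner moves when a server is added or removed.
load = Counter(lookup(f"user:{n}") for n in range(10_000))
print(load)
```

Every run produces the same assignment for the same key, which is the whole point: deterministic in the particular, random in the aggregate.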
When we build any system, we exploit these emergent properties of aggregation to make it reliable. If we can map an infinite domain to a finite range, we can design systems that are robust in the face of uncertain input. This does not mean our system cannot fail, just as the rules of probability do not rule out the possibility of flipping heads on every coin toss for the rest of our lives. It just means the probability of failure can be made vanishingly small.