Showing posts with label random numbers. Show all posts
Showing posts with label random numbers. Show all posts

Monday, January 11, 2010

More on gambling and random seeds

I always appreciate it when people write in with questions about something I've posted before. Gives me an excuse to keep this blog at least somewhat active.

Joe in Illinois writes:

Enjoyed reading your article on google regarding how to set a seed and randomness. Would different seeds contain different overall results. For example, some people would argue that a simple Jacks or Better video poker game returning 99.54% (in the long run, whatever that is) is still not purely random because the maker must still set this % ( over time), So would/could some seeds produce more winning combinations, maybe with fewer overall winning numbers in the seed. Players would not recognize one seed starting or ending. The next seed would/could have say, only 16% of winning combinations. Some people are overly concerned with this RNG, I say it's just doing its job, running constantly. But I also think that the player should hope for a positive seed, along with some luck and knowledge.

The short answer is, yes, different random seeds would lead to better or worse luck. But in the long run, with a good random algorithm, it wouldn't matter to you. Hoping for a "good" string of numbers coming from the RNG makes neither more nor less sense than hoping for good luck when you sit down at a physical card table.

Look at it this way. If you take a video recording of your session at a poker game, obviously you'll draw more good hands than bad hands sometimes. If you took different videos on successive days, and later you compared the tapes, then you could say "Oh look: on day one I had a lucky streak, and on day two I had an unlucky streak." But that's just the way random numbers actually behave: it's a rare string of randoms that don't show signs of patterns that appear meaningful but aren't.

Since the nature of a correctly designed random number generator is to simulate real random numbers, of course an RNG will create those same apparent patterns. But as long as there is no way for you to reverse engineer the algorithm or figure out what the seed actually was, hoping for a certain pattern is not much more than superstition.

Another interesting thing to note is that the nature of probability is that the larger your sample set is, the more likely it is to confirm to the expected distribution. In other words, if you only play three games in a row, it's not entirely unlikely that you could get a lucky streak and win all three games. Of course, it's slightly more likely that you would lose all three games; but still, if you believe in luck, betting a lot on a small number of hands makes some sense. The longer you play, though, the less the game's outcome will look like a sequence of wild streaks, and the more it will look like a typical bell curve distribution of some sort.

Thus, if you play 5000 hands of video poker, you are almost guaranteed to gradually lose money at the expected rate, as very lucky or very unlucky streaks will tend to cancel each other out.

Monday, April 13, 2009

Random numbers aren't random

In a previous post, I mentioned that you can add the appearance of intelligence to a game playing script by throwing in a random component. While the computer can always win tic-tac-toe by playing the same sequence of moves, it looks much less boring if the computer picks different winning moves in each game.

Following up on this, I got a special request to explain how random number generators work. I happen to know that the person who asked for this information is sort of a logical determinist, so she'll be delighted to hear that random numbers are not random at all, but generated by a very specific script.

When we think about random numbers, what we generally want is an unpredictable number in a range, with 0 or 1 at the low end, and some number "m" at the high end. How big m is depends on the application. For instance, to simulate a die roll, we want a random number from 1 to 6. To simulate drawing a card, we need a number from 1 to 52. For a roulette wheel, 1 to 38. For a game with two dice, like monopoly, we need to generate two random numbers and add them together. The rules for a role-playing game such as Dungeons & Dragons will often instruct you to roll 1d20 (roll a 20 sided die, i.e., generate a random from 1-20) or 3d4 (roll three four sided dice) to determine a monster's health, or whether they hit you on a particular round. Games like World of Warcraft have borrowed heavily from tabletop gaming rules in order to come up with event odds in the range that make sense for the game balance.

It's hard to generate a truly random number, since computer programs are fundamentally deterministic in the way they follow algorithms. So they do something slightly easier. Once we have one random number (or pseudo-random), we generate a sequence of randoms after that, where each one is based on the number before it.

There are many functions which can generate this sequence; one common example is this:
  • Xn+1 = (a*Xn+b) mod m
In this case, a and b are constants that you can pick, often large prime numbers. "m" is a maximum value for the resulting number.

Let's try a few examples. I'll pick arbitrary values a=107, b = 23, and m=101.

  • 1 (This comes before the first number. It's called the "seed," as it helps to determine which numbers will come up in the pattern.)
  • 107*1+23 = 130; 130 % 101 = 29
  • 107*29+23 = 3126; 3126 % 101 = 96
  • 107*96+23 = 10295; 10295 % 101 = 94
  • 107*94+23 = 10081; 10081 % 101 = 82
  • 107*82+23 = 8797; 8797 % 101 = 10
  • 107*10+23 = 1093; 1093 % 101 = 83
As you can see, this function produces a string of seemingly unconnected numbers from 0 to 100. However, they aren't unpredictable. If you know the formula, and the current number is 29, then you know the next number is always going to be 96.

Because of this, you wouldn't want to use 101 as your "m" value, if you were actually looking for a random number in the range of 0 to 100. You could always predict the next number in the sequence. If you're a company that writes video poker machines, and your random number generator is easy to reverse engineer in this way, you'd be in very big trouble.

So instead, you'd make all your constants, m, a, and b, much larger. Then you'd take the resulting random numbers and mod them by a smaller number to get program data, but you'd still retain the larger number to generate the sequence. That way, if people see the number "3", they have no way of knowing that the real number in your random sequence was generated by 5,495,110 mod 6 plus one, and therefore they can't deduce what the next number is going to be, even if they know the algorithm.

So the random sequence is masked from the program's end user by not telling them what the formula is. Nevertheless, a random number generator is still predictable based on the seed. If you start with a fixed seed number of 1, and you get the sequence of die rolls 4, 2, 6, 1, 3, then you will always get the same sequence after you shut down the program and restart it.

If an enterprising gambler were to discover that his video poker machine was programmed so foolishly, he could take advantage of it in the following way: First, unplug the machine and plug it back in to reset it. Then, bet small amounts of money while writing down the card sequences that you get. Then, reset the machine again, and bet large amounts of money against the cards that you know will turn up.

So predictability is the bane of random number generators. Clearly, it's important not only to have a disguised algorithm, but an unpredictable seed set at the beginning of your program.

How can you set the seed? It doesn't really matter, as long as it's not based on a known quantity. It needs to be a truly random number, but again, there is no such thing as a truly random number to a computer.

One approach is to use the internal clock, including milliseconds. A human can't really control the precise millisecond at which the program starts, so there's no way to predict what the seed will be.

There are wackier ways to handle it. For instance, an algorithm called "lavarand" operated by pointing a camera at a lava lamp, taking a digitized picture, and creating a hash from the bits. This is chaotic enough to provide something that appears to be truly random.

So to recap, in order to generate random numbers, you need
  1. An algorithm that is deterministic but unpredictable
  2. A random seed that is preferably discovered through some (also unpredictable) aspect of the external world.
There are, of course, also advantages to using predictable seeds. If you always use the same seed, then you can test the same set of data repeatedly, which you sometimes must do if you don't want to tear your hair out trying to nail down a problem with the algorithm that actually does something with the numbers. But in a real world environment, you need some kind of outside information in order to randomize the results and stymie the users.