Friday, October 16, 2009

Ugly numbers in a heap

As I was saying, a good solution to finding ugly numbers would be to somehow enumerate only numbers which are known to be ugly, removing them from memory after they have been read. I couldn't think of a way to do that. Internet to the rescue.

Here's what you do:
  1. Store the number 1 in a set.
  2. Consider the smallest item, "n", in the set. That is counted as the next ugly number.
  3. Add the numbers n*2, n*3, and n*5 to the set.
  4. Remove n from the set.
  5. Repeat steps 2-4 until you reach the desired ugly count (1500).
Each time you search for the next ugly number, you remove one number from the set but add three more, for a net gain of two. Thus, the number of values you are storing at step i is (2*i - 1). That's a manageable memory size. But there's still a snag.

Sets are unordered collections, and we want to always find the smallest number in the set as efficiently as possible. How do we do that? We can't just read every number in the set each time, or the running time quickly becomes O(n²). Can't have that.

Turns out there is an ideal data structure for handling this kind of thing, but it's something I haven't looked at since I was an undergrad. It's called a min-heap.

A heap is a special kind of binary tree, where every level of the tree is filled to maximum density. For instance, in a heap with 8 elements, you know that the top level has one node with exactly two children; each of those two children has two children (that makes 7), and the leftmost node at the third level has one child on the left.

The neat thing about using a heap is that because of its predictable structure, you can represent it using an expandable one-dimensional array. You don't have to mess with pointers or the other headaches of data structures. If you are looking for root.left.right, you just calculate the spot in the array and access element 4.
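Concretely, with 0-based array indexing the navigation works out to simple arithmetic. Here's a quick sketch (the function names are mine, not standard):

```python
# Array-backed binary heap: index arithmetic, 0-based.
def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def parent(i):
    return (i - 1) // 2

# root.left.right lives at index 4:
print(right(left(0)))  # 4
```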

A min-heap is constructed in such a way that the children of each node are guaranteed to be larger than the parent node. In other words, the smallest value is guaranteed to be at the root, and each subtree of the root is itself a min-heap, with its own smallest element at its root.

(If I'm going too fast for you, speak up. I don't believe I've ever done a post specifically on binary trees, so maybe I should.)

In any case... you can do two things with a min-heap: add a new element, which is rapidly sorted to the right place; and remove the root element, which then moves the next smallest element into the root and fixes the rest of the heap so that it is still a min-heap.

This is exactly what we need. It keeps the cost of finding the smallest known ugly number to O(log n), where n is the number of elements currently in the heap.

So to recap:
  1. Write your own min-heap class, which is the fun part. ;)
  2. Add the number 1 to the heap.
  3. Inspect the root of the heap, and count that number ("r") as ugly.
  4. Add 2r, 3r, and 5r to the heap.
  5. Delete the root.
  6. Repeat steps 3-5 until you reach the desired ugly count (1500).
Actually it turned out that wasn't quite the answer, because you can get duplicate ugly numbers on the heap. For instance, in the second pass you add 2*3, while in the third pass you add 3*2, which results in multiple 6's on the heap. But that's easy to fix. All we have to do is change step 5 to: "Keep deleting the root until it is no longer equal to the last recorded ugly number."
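Here is the whole recipe as a Python sketch, using the standard library's heapq in place of a hand-written min-heap class (the function name is my own):

```python
import heapq

def nth_ugly(n):
    heap = [1]  # step 2: start with 1 on the heap
    ugly = 0
    for _ in range(n):
        # step 3: the root of the min-heap is the next ugly number
        ugly = heapq.heappop(heap)
        # amended step 5: keep deleting the root while it duplicates
        # the last recorded ugly number (e.g. 6 = 2*3 = 3*2)
        while heap and heap[0] == ugly:
            heapq.heappop(heap)
        # step 4: seed the heap with the next candidates
        for factor in (2, 3, 5):
            heapq.heappush(heap, ugly * factor)
    return ugly

print(nth_ugly(11))    # 15, matching the sequence in the puzzle
print(nth_ugly(1500))  # 859963392
```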

And there you go. Small storage space, quick execution, no time wasted on non-ugly numbers, and it finishes well within the 3-second limit.


In case you wondered, the answer is 859,963,392.

Ugly numbers and prime numbers

The second wrong approach I tried taking to find ugly numbers was something called a Sieve of Eratosthenes. This was something that my dad taught me about in high school, and it helped me win a programming contest as a sophomore.

Here's where I should give a quick shout out to my favorite HS teacher, Thom Laeser, who I think might read this blog. Thom was kind of a Renaissance man as a teacher; across my four years in high school, I took one geometry class, two programming classes, and a journalism class from him. He also started the school's first Humanities course, which I did not attend. It wasn't until high school that I started really programming (rather than just dabbling in BASIC), and I credit Mr. Laeser with giving me that extra boost.

When I was a sophomore, he gave all his programming students a challenge, which was to be the one who could write a program printing the most consecutive prime numbers within five minutes. This was 1990 and the computer lab probably was limited to 80386 processors or something.

It was clear early on that the contest was going to be between me and a senior. The next day, we both proudly came in with programs that printed all primes from 2 to MAXINT in well under 5 minutes. This was in Pascal with 16-bit integers, so MAXINT would have been 32,767. Mr. Laeser shrugged and said that was very impressive, but we would have to keep going if we wanted to win.

So. Along the way, I had to learn to write my own large-integer output, and also save space by representing numbers with bit manipulation instead of wasting entire "int" values on booleans. More importantly, I learned about the Sieve of Eratosthenes, which I mentioned before.

The sieve is a method for quickly weeding the prime numbers out of a finite list of numbers. Here's how it works (simplified version): start with an array of booleans from 1 to, say, a million. The first prime number is two, so cross out all multiples of 2 other than 2 itself: 4, 6, 8, 10, etc. Then we move up to three, crossing out 6 (again), 9, 12, etc. Next we encounter 4, but 4 is already crossed out, so we skip it and move up to 5, eliminating 10, 15, 20, etc.

It turns out that you only have to filter on numbers up to the square root of the array size, which in this case is 1000 (sqrt(1000000)). After that, there's nothing left to cross out: any composite number up to a million must have a factor of 1000 or less, which you have already visited.
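In code, the simplified sieve described above might look like this (a Python sketch; the variable names are mine):

```python
def primes_up_to(limit):
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    i = 2
    while i * i <= limit:  # only need to filter up to sqrt(limit)
        if is_prime[i]:
            # cross out every multiple of i except i itself
            for m in range(i * i, limit + 1, i):
                is_prime[m] = False
        i += 1
    return [n for n in range(limit + 1) if is_prime[n]]

print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```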

Long story short, I won the contest, but it was a grueling arms race for several weeks as we kept improving our programs and then "stealing" new ideas by observing how the other's program worked.

Sieve of Ugly

So that was a roundabout personal story, but I haven't told it here before so I thought I might as well get it out. And the point is that my second pass at solving the ugly number problem was to use a sieve.

Sort of a simple modification, really. I didn't just use a boolean array, because there are now three kinds of numbers: 1: prime; 2: ugly; 3: neither prime nor ugly. So I made it an integer array, where each integer stores one of three flags. All numbers start as prime; then multiples of 2, 3, and 5 get marked as ugly. Then run another pass: for each remaining prime above 5, mark all of its multiples as non-ugly.

Using integers was super-overkill, as an int is 32 bits, and I only needed 2 bits to represent my flags: 00=prime, 01=ugly, 10=non-ugly. What I found was that I couldn't sieve up to the 1500'th ugly number, because the array was so huge that I ran out of memory.

So first I changed it to an array of bytes (8 bits). Then, still not satisfied, I improved performance by writing some bit-manipulation routines to simulate an array of two-bit numbers using four numbers per byte.
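I no longer have the original code, but the bit-manipulation trick can be sketched like this in Python, packing four two-bit flags into each byte (the class and method names are mine):

```python
PRIME, UGLY, NON_UGLY = 0b00, 0b01, 0b10

class TwoBitArray:
    """Simulates an array of 2-bit values, storing four per byte."""
    def __init__(self, n):
        self.data = bytearray((n + 3) // 4)

    def get(self, i):
        shift = (i % 4) * 2
        return (self.data[i // 4] >> shift) & 0b11

    def set(self, i, value):
        shift = (i % 4) * 2
        # clear the two target bits, then write the new value in
        cleared = self.data[i // 4] & ~(0b11 << shift) & 0xFF
        self.data[i // 4] = cleared | (value << shift)

flags = TwoBitArray(10)
flags.set(6, UGLY)
flags.set(7, NON_UGLY)
print(flags.get(6), flags.get(7), flags.get(8))  # 1 2 0
```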

It worked; I found the same number much faster than before, but still with a noticeable delay before the sieve finished running. 3 seconds was still way out of reach.

What next?

While I was coding the sieve, it occurred to me that it was a terrible waste of space to store ALL the numbers, both ugly and non-ugly, in an array until the entire program was finished. It seemed like there ought to be some kind of quicker way to figure out, given the current ugly number, what the next one in the sequence should be. At worst, I could maintain an array of only 1500 numbers to store only ugly numbers (instead of flags for every number up to nearly a billion).

I wracked my brains for this method; I wrote down the factorization of the first 10 numbers or so, looking for a pattern, but couldn't find one. I'm pretty sure there isn't one, but I was on the right track at this point. I know because that's when I "cheated" by looking to see what other people had done.

And maybe they're smarter than me, or maybe they spent more time on it, but there is still a certain grim accomplishment in not having to reinvent the wheel.

There's a well known "hard" problem in computer science called the traveling salesman problem, which goes like this: You have a bunch of cities, separated by known distances. You have to visit all of them while spending as little time as possible in transit.

To solve this problem by brute force becomes impossible very quickly as you add more cities. So the best known approaches use shortcuts: sneaky tricks that don't necessarily produce the "best" answer, but are guaranteed to produce a pretty good answer most of the time.

Anyway, I heard that some computer science genius was once asked how he would solve the traveling salesman problem, and he said: "I would put it on the back of a cereal box. 'Find the most efficient path through these points. Winner gets $100, mail your answer to the following address'..."

Well, that's one kind of divide and conquer solution right there. Hooray for the internet, though... now we don't even need cereal boxes.

Online programming puzzles and "ugly" numbers

Ran across this site while looking for general programming puzzles to stay in practice.

On the site, my attention was captured by this puzzle involving "ugly" numbers.

Ugly numbers are numbers whose only prime factors are 2, 3 or 5. The sequence

1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ...

shows the first 11 ugly numbers. By convention, 1 is included.

Write a program to find and print the 1500'th ugly number.

This isn't a particularly hard puzzle if you brute force it. All you have to do is write an algorithm to determine whether a particular number is ugly or not, sort of like this:

bool isUgly(long num) {
    while (num % 2 == 0) num /= 2;
    while (num % 3 == 0) num /= 3;
    while (num % 5 == 0) num /= 5;
    return num == 1;  /* a non-ugly number leaves a factor > 5 behind */
}

Then you can just loop through as many numbers as it takes, incrementing a counter as you find ugly numbers, until you reach 1500 numbers.
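Put together, the brute-force version might look like this (a Python sketch; the function names are mine):

```python
def is_ugly(num):
    # strip out all factors of 2, 3, and 5; ugly numbers reduce to 1
    for p in (2, 3, 5):
        while num % p == 0:
            num //= p
    return num == 1

def nth_ugly_brute_force(n):
    count, num = 0, 0
    while count < n:
        num += 1
        if is_ugly(num):
            count += 1
    return num

print(nth_ugly_brute_force(11))  # 15
```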

The problem with this approach is that there is a time limit of three seconds. The brute force approach requires you to do an increasingly long operation (dividing repeatedly to find the prime factors) on a series of numbers which turns out to be very large (I won't spoil it, but the answer is just short of one billion). Early numbers are computed quickly; then it slows down, so finding the 1500th number may take several minutes.

I confess I cheated to solve this puzzle. I came up with one other wrong approach after trying the brute force method, then went googling for discussions of this puzzle. As I hoped, I did find a protracted argument about the right way to solve it. One guy hit on a really clever solution, based on a data structure which I haven't used for years. It took me a few tries to implement, but it was very satisfying in the end.

I guess I could feel bad for "cheating," but I don't. I came close to finding the right answer, but really, if you want an optimal answer to a well defined puzzle then there's no harm in hunting around to see if someone did it better than you. Professionally, implementing a GREAT solution that you looked up is preferable to implementing a mediocre solution that you got with only the sweat of your brow. You can learn new tricks from other people that you'll have in your toolbox for next time.

As a tangent, I take the same approach to strategy games. I like to take my time and learn the rules of a game on my own, and I would scorn to look at a strategy guide during that period. However, there comes a point where you have a basic understanding of the rules and can improve much faster by considering input from others who have already mastered the game.

Anyway, more on the wrong approaches and the right one later.

Friday, August 28, 2009

Divide and conquer your project, part 1

James, who is a regular commenter, is a programmer looking for more experience. James has studied a few languages and probably has a decent grasp on the basic logic that it takes to write a program, but he wants to work on a personal project because he doesn't get enough variety of practical applications at work.

This is an admirable goal, first because developing your programming skills requires constant practice, and second because personal projects can make good resume padding. Potential employers like to hear that you program for fun, because it will make you more likely to throw yourself into your work. Also, if they ask "Have you worked with technology X?" you can say "Not professionally, but let me show you this project that I posted on my web page." It gives you the opportunity to talk intelligently about the technology.

However, starting a new project can be overwhelming, and it's hard to know where to start. You have all these things in your head that you would like the program to do, but you can't see how to get from what you have now (a blank screen) to what you would like to have (an awesome program with flawless logic and a clean interface). What do you do first?

I spent a few posts discussing the technique of "divide and conquer." As it turns out, that is also an excellent plan for approaching your programming projects. Here's a little secret of programming: you won't get it right the first time. Fuhgeddaboutit. Don't try. If you sit down for a couple of hours and try to write an enormous, complex program in one shot, compiling it for the first time only after you think it is done, you will be confronted by a screen full of errors and another few hours of headache, tracing back through what you did to figure out where each of the numerous bugs lives in your thousand-line program.

Actually, my proudest moment as an undergrad was writing a single assembly language module that executed perfectly with no debugging on my first try. But that's not something you should count on, and in any case, that module was one small piece of a much larger program.

When you are building a castle of air, you do it brick by brick. Except they're, you know, wind bricks. You don't wave your hands and get a fully formed castle. You break down the project in your mind into easily manageable pieces, and at each step you (1) look at that step to determine whether you should break it down even further before proceeding, (2) write complete code that does something visible, and (3) test that component thoroughly.

That third step is crucial, and it's a hallmark of a solid programmer that they develop testable and tested code at every step. When you test a piece of your code until you know it's solid, you will save yourself a lot of time later that would have been spent backtracking to figure out where your code is going wrong.

How do you break it down? Let's consider the project that James set for himself.

One idea I've had is a program that accepts inputs on your credit card balances, minimum payments, etc, and calculates the most efficient payment plan to snowball. I've got a spreadsheet I downloaded from somewhere that I'd model, but the spreadsheet can't jigger your payments to determine the best way to pay off the balances the soonest, or with the least amount of interest. I envision it as an Iphone App, although I learned that without a Mac I can't write it that way.

Actually a spreadsheet probably can do that, but writing a full featured program would be cooler. So let's consider what components this program has.

A popular way of dividing up pieces of a program is to start by splitting it into three layers: a presentation layer (the way your program handles input and output); a data layer (the information that your program will be manipulating); and a business layer (the program logic that figures out what to do). If you split your program up into pieces this way, then you can hopefully change up the pieces at any stage without risking ruining your entire program.

How you start your project will largely depend on which component is most interesting to you personally. In many projects, the graphics are the most interesting thing. In that case, your top priority would be to get something, anything, to display on the screen.

For example: My first real programming job was modeling equations for scientists. There was an eccentric old physicist who wanted me to graph some of the equations he wrote for me, using Fortran on a green and black terminal. As I progressed on getting the equations to display the right answer, he seemed distinctly bored with hearing about the programming details. But then I figured out how to generate a line diagram. It excited the hell out of him, and suddenly he couldn't stop talking about his ideas for what we could do with this.

The diagram, of course, bore almost no relation to the actual equations he had described. But it didn't matter. It was a launching point for his concept of the program. Similarly, as you write your own programs, hopefully you will keep on continually inspiring yourself by producing code that does something that will tickle your interest in the next phase of your project.

Now, in James' case, the interesting part of the project is probably not the graphics. It's the business logic. After all, the ultimate objective is not to create something that looks cool, but something that provides useful information. Since this is a program that you will be using yourself, the inspiration will come from your feeling that "Wow! This is really telling me some valuable stuff! But I bet it would be even more useful if..." And then whatever comes after "if" will determine which piece of the program you conquer next.

So the business layer is the most important part of your program. Therefore, I would recommend that you pick a language, and start with an extremely simple presentation layer and data layer. So for presentation, let's say you'll use plain text input and output. For data, let's say that you'll write a module or an object that hard codes the value of your credit card balances, and then you can tweak the code to change the values.

These aren't good solutions, but they are stepping stones for the program you really want to build. Write good business logic first, using crude presentation and data models, and then make it better later. How will you make it better? Well, you might decide to turn your text input into a web form interface, and perhaps add graphs. You might change your data from hard-coded values, to a separate text file that you will edit by hand, to a database table. As long as you keep these parts of the program reasonably well separated, you can build them up a bit at a time, focusing on the parts that strike your fancy.
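To make the three-layer split concrete, here's a rough Python sketch of how James' program might start out. All of the names, card data, and the "highest rate first" rule are placeholders I made up for illustration:

```python
# --- data layer: hard-coded values, to be swapped for a file or database ---
def load_cards():
    return [
        {"name": "Card A", "balance": 1200.0, "apr": 0.199, "minimum": 35.0},
        {"name": "Card B", "balance": 450.0, "apr": 0.159, "minimum": 15.0},
    ]

# --- business layer: the logic that decides where the extra money goes ---
def next_target(cards):
    # one possible rule: attack the highest interest rate first
    return max(cards, key=lambda card: card["apr"])

# --- presentation layer: plain text for now, a fancier UI later ---
def report(cards):
    target = next_target(cards)
    for card in cards:
        note = "  <-- pay extra here" if card is target else ""
        print("%s: $%.2f at %.1f%%%s" % (card["name"], card["balance"],
                                         card["apr"] * 100, note))

report(load_cards())
```

Because each layer only talks to the others through a couple of functions, you can later replace `load_cards` with a file reader, or `report` with a web page, without touching the business logic.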

Whew, that was a mouthful. Pausing for feedback now. James, do you think this is enough to get you started? Do you have an idea which language you'll start in? Any thoughts or questions?

Monday, August 17, 2009

Q&A open thread

After a two month spell without posting, I'd like to get more out of this blog again, but I don't have any pressing topics that I feel I need to write about. So I'd like to invite questions in the comments section of this post in order to inspire new topics that we could explore. Any pressing questions about software development that need some discussion? Have I posted anything in the past that I ought to expand on? Is anyone anxiously awaiting a continuation of the chess series of posts?

I haven't spoken at all about my current position, which is a lot different from past jobs I've had. I'm currently doing contract work on a product that helps people automatically activate and configure their high speed internet connections. The work is all done through customized web pages that use ActiveX to deal with low level PC operations. As you may have noticed if you've been following either my Twitter or Facebook feeds, I spend a lot of time in a computer lab working with custom configurations for devices like DSL modems, on a variety of setups including release candidates of Windows 7. I've never really been a fan of hardware, but this involves a lot of interesting network puzzles and some really advanced ways to use JavaScript.

Anyway, if you've got any questions, fire away. Other commenters may answer you before I get to them; at the very least, it will help me feel out some directions to take potential upcoming posts.

Friday, August 14, 2009

Question about web security

Wanted to put out a question to readers. I recently happened to tune in to the Kim Komando show and heard something that sounded like a mistake. However, I'm not a security expert, so I'd like to find out more.

A caller had been making use of his neighbor's unsecured wireless connection when his own internet was down. He was wondering about the security hazards of doing this.

Kim's answer was mostly correct: anything you transmit or receive over an unsecured wireless network is visible to everyone on that network. Using freely available software, your neighbor could potentially steal your information. He could run a packet sniffer and log the information on his computer, or even in the router.

Kim then went on to say "So obviously you wouldn't want to do any banking over the unsecured line, because you don't know who's looking at your data."

These struck me as solid words of caution, but potentially incomplete. Presumably your bank would make you log in over an SSL connection. My own understanding is fuzzy, but I believe this means that the data gets encrypted at each end before being sent over the network, and only your client and the bank's server have the keys necessary to decrypt it.

Can anybody confirm or deny that my understanding is correct? Is there some way you know of where the neighbor could read your secure data as plain text? If not, what information is your neighbor lacking that would allow him to see it?

Sunday, May 31, 2009

Artificial intelligence in games, part 3

As I warned when I started this blog, I've allowed it to lie dormant for a while, but I feel like it's high time to pick up where I left off and finish the series about playing chess by look-ahead.

So to review the posts that led up to this one:
Now we're going to turn our attention to chess.

Although chess is a much more complex game than tic-tac-toe, it's easy to see that you can apply the same tactics to systematically list all possible games of chess, in theory. Of course I'm not going to attempt to diagram them, but there's definitely a finite number of things you can do on each move. If you're the white player, on your first turn you have eight pawns that you can move forward one or two spaces, and then you have two knights that can move to two spots (forward two, left or right one). So the total possible moves on the first turn is (8*2)+(2*2), or 20.

Then it gets a little more complicated. Black has the same choices, but he has the same 20 choices for each of the 20 possible moves that white could make. So after each player has made one move, there are 20*20=400 possible states the board could be in. It only gets worse from there.

For instance, if white performed the traditional P-K4 opening, on the second move he has nearly all the same choices remaining (7 pawns to move one or two squares, plus two knights two ways, 18 moves), but he also has a whole lot of new choices. The king's knight may move into the spot vacated by the pawn (19). If the black king's pawn doesn't block it, the white king's pawn may advance one space (20) or, depending on what black did, capture an enemy pawn.

Also, the white king's bishop is now free to move to any of 5 spaces toward the upper left (25), and the queen can move to any of 4 spaces toward the upper right (29). So the number of possible moves keeps increasing as the board opens up, and so does the complexity of the tree. Now it looks like by the time white has moved twice, there are about (20*20*29) possible states the board could be in. (Some openings obviously allow fewer than 29 second moves, and there is more than one way to get to the same state; i.e., P-K4 followed by P-Q4 has the same result as P-Q4 followed by P-K4.)

As we observed in the tic-tac-toe problem, you can calculate roughly how many possible games there are in two ways: by multiplying out the possibilities for all moves, or by figuring out every possible state of the board. The second method tends to give a smaller number because it collapses the various ways of GETTING to those states. However, in the case of chess, figuring out how many moves are possible from each state is massively complicated, so we will go with the second approach.

Each side has 16 pieces. The pieces are not all unique; for instance, the eight pawns are interchangeable with each other and so are the two rooks. To simplify the calculation, let's go further and treat all 32 pieces as interchangeable, counting only which squares are occupied. There are 32 pieces on the board, and 64 squares where they could be located. The first piece could be on any of the 64 squares, the second on any of the remaining 63, the third on any of the remaining 62, and so on down to 33; then, since we don't care about the order in which we placed them, we divide by the 32! possible orderings. That is to say, the answer is "64 choose 32", or 64!/(32!*32!).

Side note: Cool feature of Google. Type "64 choose 32" into the search bar and it will answer the question for you. It's about 1.83*10^18.
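If you'd rather not trust the search bar, Python can confirm the count (math.comb is available in Python 3.8 and later):

```python
import math

# number of ways to choose 32 occupied squares out of 64
positions = math.comb(64, 32)
print(positions)  # 1832624140942590534, i.e. about 1.83 * 10^18
```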

Is this number an accurate portrayal of all chess boards? Not exactly. In the first place, many of the moves are not legal. This set of all possible boards includes boards where the two kings are next to each other, or in an impossible multiple check; it includes pawns on the first row, and positions where all eight pawns are on their home squares, but the king and queen are not.

On the other hand, it does not include any position where one or more pieces have been captured, nor does it include positions with two or more queens due to promotion. So there may be more positions that are truly possible, or there may be fewer, but on the order of 10^18 possibilities is a fair guess.

Now, 10^18 is a very large number. In point of fact, it is larger than the number of seconds since the universe began (about 4.4*10^17... although it's only 189 billion if you're a creationist. Har har.) It's close to the number of grains of sand on earth, and much larger than the number of stars in the Milky Way.

In theory, you could have a computer run through the win/loss state of every one of those positions, but not in a reasonable amount of time. If you could analyze one position per second, it would of course take longer than the universe has been around to decide on one move. Computers can do much better than that, of course. Currently the world's fastest supercomputer can perform 596 TeraFLOPs -- 596 trillion floating point operations per second. That means that even if you were so incredibly optimistic as to believe that analyzing one board position could be done in a single operation (believe me, it can't), the world's fastest supercomputer would still take about 51 minutes (1.83*10^18 / 5.96*10^14 ≈ 3,100 seconds) to analyze the outcomes for a single move. And that level of optimism is ridiculous.

Granted, computers have become a lot more powerful since this topic was discussed in the 1970's, but it's still fairly impractical to play a decent game of perfect chess, unless you have several supercomputers lying around doing parallel computations. Your desktop PC would take years. More realistically, thousands of years. Needless to say, no computer plays chess based on the tic-tac-toe strategy of exploring every possible move.

Next time: What's a heuristic, and how come a computer can beat the human chess champion of the world?

Monday, April 13, 2009

Random numbers aren't random

In a previous post, I mentioned that you can add the appearance of intelligence to a game playing script by throwing in a random component. While the computer can always win tic-tac-toe by playing the same sequence of moves, it looks much less boring if the computer picks different winning moves in each game.

Following up on this, I got a special request to explain how random number generators work. I happen to know that the person who asked for this information is sort of a logical determinist, so she'll be delighted to hear that random numbers are not random at all, but generated by a very specific script.

When we think about random numbers, what we generally want is an unpredictable number in a range, with 0 or 1 at the low end and some number "m" at the high end. How big m is depends on the application. For instance, to simulate a die roll, we want a random number from 1 to 6. To simulate drawing a card, we need a number from 1 to 52. For a roulette wheel, 1 to 38. For a game with two dice, like Monopoly, we need to generate two random numbers and add them together. The rules for a role-playing game such as Dungeons & Dragons will often instruct you to roll 1d20 (roll a 20-sided die, i.e., generate a random number from 1-20) or 3d4 (roll three four-sided dice) to determine a monster's health, or whether it hits you on a particular round. Games like World of Warcraft have borrowed heavily from tabletop gaming rules in order to come up with event odds in ranges that make sense for the game balance.

It's hard to generate a truly random number, since computer programs are fundamentally deterministic in the way they follow algorithms. So they do something slightly easier: once we have one starting number (the seed), we generate a sequence of pseudo-random numbers, where each one is based on the number before it.

There are many functions which can generate this sequence; one common example is this:
  • X[n+1] = (a * X[n] + b) mod m
In this case, a and b are constants that you can pick, often large prime numbers, and m caps the range: the result is always between 0 and m-1.

Let's try a few examples. I'll pick arbitrary values a=107, b = 23, and m=101.

  • 1 (This comes before the first number. It's called the "seed," as it helps to determine which numbers will come up in the pattern.)
  • 107*1+23 = 130; 130 % 101 = 29
  • 107*29+23 = 3126; 3126 % 101 = 96
  • 107*96+23 = 10295; 10295 % 101 = 94
  • 107*94+23 = 10081; 10081 % 101 = 82
  • 107*82+23 = 8797; 8797 % 101 = 10
  • 107*10+23 = 1093; 1093 % 101 = 83
As you can see, this function produces a string of seemingly unconnected numbers from 0 to 100. However, they aren't unpredictable. If you know the formula, and the current number is 29, then you know the next number is always going to be 96.
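Here's that generator as a few lines of Python, reproducing the sequence above:

```python
def lcg(seed, a=107, b=23, m=101):
    # linear congruential generator: X[n+1] = (a * X[n] + b) mod m
    x = seed
    while True:
        x = (a * x + b) % m
        yield x

gen = lcg(1)  # seed of 1, as in the example
print([next(gen) for _ in range(6)])  # [29, 96, 94, 82, 10, 83]
```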

Because of this, you wouldn't want to use 101 as your "m" value, if you were actually looking for a random number in the range of 0 to 100. You could always predict the next number in the sequence. If you're a company that writes video poker machines, and your random number generator is easy to reverse engineer in this way, you'd be in very big trouble.

So instead, you'd make all your constants, m, a, and b, much larger. Then you'd take the resulting random numbers and mod them by a smaller number to get program data, but you'd still retain the larger number to generate the sequence. That way, if people see the number "5", they have no way of knowing that the real number in your random sequence was generated as 5,495,110 mod 6 plus one, and therefore they can't deduce what the next number is going to be, even if they know the algorithm.

So the random sequence is masked from the program's end user by not telling them what the formula is. Nevertheless, a random number generator is still predictable based on the seed. If you start with a fixed seed number of 1, and you get the sequence of die rolls 4, 2, 6, 1, 3, then you will always get the same sequence after you shut down the program and restart it.

If an enterprising gambler were to discover that his video poker machine was programmed so foolishly, he could take advantage of it in the following way: First, unplug the machine and plug it back in to reset it. Then, bet small amounts of money while writing down the card sequences that come up. Finally, reset the machine again, and bet large amounts of money against the cards you now know will turn up.

So predictability is the bane of random number generators. Clearly, it's important not only to have a disguised algorithm, but an unpredictable seed set at the beginning of your program.

How can you set the seed? It doesn't really matter, as long as it's not based on a known quantity. It needs to be a truly random number, but again, there is no such thing as a truly random number to a computer.

One approach is to use the internal clock, including milliseconds. A human can't really control the precise millisecond at which the program starts, so there's no way to predict what the seed will be.

There are wackier ways to handle it. For instance, an algorithm called "lavarand" operated by pointing a camera at a lava lamp, taking a digitized picture, and creating a hash from the bits. This is chaotic enough to provide something that appears to be truly random.

So to recap, in order to generate random numbers, you need
  1. An algorithm that is deterministic but unpredictable
  2. A random seed that is preferably discovered through some (also unpredictable) aspect of the external world.
There are, of course, also advantages to using predictable seeds. If you always use the same seed, then you can test against the same sequence of numbers repeatedly, which is invaluable when you're trying to nail down a problem in the code that actually does something with those numbers. But in a real-world environment, you need some kind of outside information to randomize the results and stymie the users.

Monday, April 6, 2009

Artificial intelligence in games, part 2

In the previous post, we discussed the fact that tic-tac-toe is a finite game: you can even calculate all the possible states of the board (19,683 of them), reachable through at most 362,880 different sequences of moves.  Now we can talk about how to take advantage of this calculation to play a game you can never lose.

Let's imagine that we pit two perfect computer opponents against each other.  The first opponent, playing as "X", will want to evaluate each of the nine possible moves on a blank board.  How does it know which is best?

Well, starting in the corner, the question is: "Do I have a sure win on this path?  If so, then I will definitely make this move.  If not, then does my opponent have a sure win?  If so, then I will definitely not make this move."

But how do you know if you or the opponent has a sure win?  Simple: Imagine that you already made this move, and then try to make the best move for the opponent.

You can easily see that this is a recursive process.  It has a recursive step ("Let me try this move") and it definitely has a terminating condition.  After all, if you come to a step where (a) the board is full, (b) X has won, or (c) O has won, you don't need to look ahead to any other moves.

Here's an example of how the computer would evaluate the first possible path.

 X |   |        X | O |        X | O | X
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |

 X | O | X      X | O | X      X | O | X
---+---+---    ---+---+---    ---+---+---
 O |   |        O | X |        O | X | O
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |

 X | O | X
---+---+---
 O | X | O
---+---+---
 X |   |

This game ends in a win for the X player.  So if you are the X player, you like this game.  If you  are the O player, you do not like this game.  So the computer then backs up by two moves ("waking up," in the dream analogy) and assumes that O would not go there, but would instead do this:

 X | O | X
---+---+---
 O | X |
---+---+---
 O |   |

Of course, this game will also result in a win for X if you follow it to the conclusion, which is this:

 X | O | X
---+---+---
 O | X | X
---+---+---
 O | O | X

You get the idea.  The computer will play every game to its ending, at which point it will know exactly which games guarantee a win for one player or the other.

Now the pruning begins.  The X-playing intelligence will try to reach one of the game states that guarantees a win for X.  If it cannot, it will at least avoid reaching any game state that guarantees a win for O.

Writing an unbeatable tic-tac-toe opponent in this way is actually pretty straightforward.  You can write it to pick the first "good" move, where "good" means "First try to pick a move that guarantees a win; then try to pick a move that doesn't guarantee a loss."  If you do this, you'll find that your tic-tac-toe game will always play in the same way.  For instance, in the first move, the upper left corner is as good a move as any (the O player can never force a win).  So your program may well always start in the upper left.  This program will not appear to be intelligent.  But it will never lose.

If you want your program to be more interesting, throw in a randomizer.  If there are multiple moves that guarantee a win, then pick one at random.  Otherwise, if there are multiple moves that don't guarantee a loss, pick one of those.  Then you'll have a program that never loses, and also it will make moves that appear unpredictable.
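
Putting the whole scheme together, here is a minimal Python sketch of the look-ahead. The board is a 9-character string, and the names (`outcome`, `choose_move`) are my own, not a standard API:

```python
import random
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for i, j, k in LINES:
        if board[i] != ' ' and board[i] == board[j] == board[k]:
            return board[i]
    return None

@lru_cache(maxsize=None)
def outcome(board, player):
    """Result of perfect play from this position: 'X', 'O', or 'draw'."""
    w = winner(board)
    if w:
        return w                     # terminating condition (b) or (c)
    if ' ' not in board:
        return 'draw'                # terminating condition (a): board full
    other = 'O' if player == 'X' else 'X'
    results = {outcome(board[:i] + player + board[i+1:], other)
               for i in range(9) if board[i] == ' '}
    if player in results:
        return player                # some move guarantees a win
    if 'draw' in results:
        return 'draw'                # no sure win, but the loss is avoidable
    return other                     # every move loses against perfect play

def choose_move(board, player):
    """Pick randomly among the moves with the best guaranteed result."""
    other = 'O' if player == 'X' else 'X'
    moves = {i: outcome(board[:i] + player + board[i+1:], other)
             for i in range(9) if board[i] == ' '}
    for goal in (player, 'draw', other):
        best = [i for i, result in moves.items() if result == goal]
        if best:
            return random.choice(best)

print(outcome(' ' * 9, 'X'))   # draw: two perfect players always tie
```

The `random.choice` at the end is exactly the randomizer described above: the program is still unbeatable, but its play no longer looks canned.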

Even though tic-tac-toe is an unwinnable game with two perfect players, some moves are better than others.  Against novice opponents, there are tricks you can use to moderately increase the likelihood that your opponent will make a mistake and lose.  This program doesn't know that.  It doesn't evaluate different positions as tactically stronger.  It just knows whether or not one player is guaranteed to win.

By the way, needless to say, this tic-tac-toe program does not have sufficient reasoning capacity to prevent it from launching a global thermonuclear war.  ;)  The computer doesn't have abstract thoughts; it's just following instructions.  Completely deterministic instructions for winning if it can.

Next time: Chess as a deterministic game.

Saturday, April 4, 2009

Separating form from content

In a previous post I highlighted a video showcasing some experimental wearable technology. One of the features in this video was that a person picked up a product from the shelf, scanned it, and received some information (green, yellow, or red lights) indicating whether the product was environmentally friendly.

A friend asked:

The thing that interests me the most about this is the impact it will claim to have on shoppers' ability to make more eco-friendly purchasing decisions. Where will the database be kept? Does it already exist? Also wondering how companies would be held to objective standards about the environmentally friendly nature of their products. Are there standards already in place? Will the database include packaging environmental impacts?

The important thing to recognize about the gizmo in the video is that it is not an information creator. It is simply a display device. In the current web 2.0 world, it's important to recognize the separation of form and content.

Essentially, the device consists of a wireless receiver, a camera, and a projected screen. All computers and computer-like devices require three basic components: input, output, and data. They've gotten more complicated over time, but they can still be best understood as variations on those basic components.

As a kid in 1982, I had an IBM PC 8086. For output, it had a screen that would display only four colors (red, green, blue, and of course black), and it had a speaker that could generate various pitches of annoying beeps. For input, it had a keyboard that made loud solid clicky noises; no mouse. It also had no hard drive. All data was stored to 5-1/4" floppy disks, which would turn on a red light when writing and produce an audible "kachunk kachunk" noise.

Many of my friends had Atari 2600s. Ataris didn't come with their own screen or speakers, of course; instead, they had a dongle that would plug into a standard TV and provide output. Input was limited to a joystick with one button and four directions, none of it sensitive to the level of force you applied. Data came in the form of interchangeable cartridges. You could not save the state of your games; your progress existed only in memory while the console was on.

Now the components of a gaming system are getting more and more elaborate, especially at the high end: output includes complex 3D graphics and digitized sound, while input may include a mouse, keyboard, camera, gamepad, and microphone. Game files (data) take up several gigabytes, sometimes read from your hard disk and sometimes directly from a CD. Saved games may take several megabytes.

But something else is happening too.

The most important aspect of this is the internet. Thanks to the internet, the data component has shifted away from individual computers and onto servers that can be accessed from anywhere in the world. I don't have to be actually at my home computer anymore to look up some important information that's saved in an email. I don't have to be holding a physical copy of "Consumer Reports" in my hand to check out a product rating.

At the same time, wireless devices -- basically a fancy method of receiving additional input -- are infiltrating everything. We've got wi-fi hubs popping up all over the place, in coffee shops and restaurants and airports and malls. If you're not near a hub and you're willing to pay extra (I'm not yet), your cell phone will pull down an internet connection from satellite. And they're coming up with new methods to connect people, like Mobile Ad Hoc Networks, which can turn a room full of people with wireless devices into a grid of amplified signals.

So, how does the device in the video generate the information about environmental products? It doesn't. The information doesn't take up space in the device's memory. It comes from the internet, from sites that are maintained by someone else.

As I envision the guts of this device, you probably install a minimal amount of middleware -- software which "knows" where to get raw data from and what is the best way to display it. The database that keeps track of individual products would probably be very large, but that needn't concern you, any more than the inner workings of your favorite web pages concerns you. You ask for a specific piece of information -- what is THIS product? -- and that gets interpreted into a web request, and the server sends it back as a stream of formatted information, to which the middleware then adds some graphical bells and whistles, for the display.
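
As a toy illustration of that separation, here's a Python sketch. The "server" payload is a hard-coded JSON string, and the field names and rating thresholds are entirely invented for illustration, but the middleware's job is the same: turn raw data from somewhere else into a green, yellow, or red light:

```python
import json

# Hypothetical response from some eco-rating service. In real life this
# would arrive over the network; the schema here is made up.
payload = '{"product": "Acme Cereal", "eco_score": 72}'

def display_color(raw):
    """Middleware step: interpret raw server data as a display decision."""
    record = json.loads(raw)
    score = record["eco_score"]      # invented field name
    if score >= 70:
        return "green"
    if score >= 40:
        return "yellow"
    return "red"

print(display_color(payload))        # green
```

Swap in a different data source and the display logic doesn't change at all; that's the separation of form and content in miniature.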

So then the question is, where does the environmental information come from? Does it already exist now, or is it just a fantasy from the video?

Well, yes it does, I think. I'm not real up on this stuff, but I googled around and came up with this and this. The data may not yet be formatted in a way that is useful to a mobile device, but the information exists in some format. If nothing else, somebody could easily write a program to scan the web page and throw out most of the HTML, pulling out only relevant information and turning it into a graphical display.

So the remaining question is: what is the standard for this information? Are these sites reliable? Well, that's up to you to decide. The main thing that has been accomplished by web 2.0 is that many competing organizations can generate this sort of content in a standardized format. Whether you trust the information depends on whether you trust the organization that generated it.

I can understand the suspicion if it is, for instance, a government agency that is known to have been strongly influenced by a political agenda. Likewise, if it's a corporate site, and the corporation is influenced by advertising dollars. But there are also consumer advocacy groups with a reputation to uphold. And while I am not a pure capitalist by any means, this really is a case where if there's a strong enough demand, the market provides what you want. "The market" in this case may be profit driven, or it may just be a bunch of activists who want to make this data available to further their agenda. Whatever the case, it's likely that you can find the data you're looking for.

The bottom line is that technology is a tool for getting at the information that already exists, or can be created. It gives you quick access to the same data from anywhere in the world.

I don't think the device shown in the video is perfect. After having some discussions about it, I think that waving your arms around is a clunky and tiring way to interact. I bet the screen is hard to read in a brightly lit area. That's not the point, though. The point is that it's becoming easier to apply technological solutions to pull in information about the world around you, and get back useful background information that's not immediately obvious to your senses.

Friday, March 27, 2009

Wearable technology

I want one of these setups. NOW.

Friday, March 20, 2009

Artificial intelligence in games, part 1

"Chess is not a game. Chess is a well-defined form of computation. You may not be able to work out the answers, but in theory there must be a solution."
- John von Neumann

I'd like to talk about some basic concepts of Artificial Intelligence, by talking about games. In a few posts I'm going to tell you how to play a perfect game of chess. Not just a good strategy, but a perfect strategy. With this strategy, you will always beat a non-perfect opponent. If no win is possible, you will most likely be able to end the game in a draw.

Unfortunately, once I've explained this strategy, you will realize that you can't actually use it. So before I talk about that, I'm going to try something less ambitious. I'll tell you how to teach your computer to play a perfect game of tic-tac-toe, also known as "noughts and crosses" in England.

Here's some quick background for those who don't know how to play. Tic-tac-toe is played on a three-by-three grid. Two players mark their squares, one with the letter "X", the other with the letter "O". Traditionally the X player goes first, and may place a mark on any open square. The players continue to alternate marks until one of two conditions is met. A player can win by placing three of his or her marks in a row -- either horizontally, vertically, or on a diagonal. Much more commonly if the players are experienced, the board will fill up, no player will get three in a row, and the game ends in a draw.

Complexity of tic-tac-toe

How many possible games of tic-tac-toe are there? One way to do it might be to count all possible sequences of moves. The first player has nine possible moves to make. The second player can only go in an unoccupied square, so that's eight moves left. The first player can then go in any of seven remaining squares. And so on, so the number of possible games is at most 9*8*7*6*5*4*3*2*1.

This is a common mathematical operation called a factorial. "9 factorial" is abbreviated as "9!" Factorials are very big. Some algorithms have a running time of O(n!), which can be extremely dangerous -- but I'll get back to that in another post. Luckily, in this case n is only 9, and 9! is a fairly manageable number: It's 362,880.

That seems like a big number to people who are good at thinking in the abstract but bad at storing large amounts of information. However, computers are the opposite, and 362,880 is a very small, manageable number of objects to explore.

It is also an upper bound. We know that there are no MORE than 362,880 possible games of tic-tac-toe, but most of the games resemble each other. Another trick we could apply is to calculate possible states of the board without thinking of moves. Each square can be in one of three states. Either it is empty, or it has an X, or it has an O. Thus, the total number of possible states of the board (legal or not) is 3^9, which is 19,683. So actually, there are no more than this many different ways to set up the board.

In fact, there are even fewer possible states, because many of them aren't legal. For instance, the board could contain all X's, but that's not the state of a real game, because the "O" player never got to move. So there are actually far fewer than 19,683 states, but all we need is a rough order-of-magnitude estimate to see how big a problem we have ahead of us.
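
Both bounds, and the exact count of positions reachable in legal play, are easy to check with a short Python sketch (the helper names are mine):

```python
from math import factorial

print(factorial(9))   # 362880: upper bound on distinct move sequences
print(3 ** 9)         # 19683: board colourings, legal or not

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def has_win(board, mark):
    return any(all(board[i] == mark for i in line) for line in LINES)

# Walk every legal game from the empty board, collecting each distinct
# position exactly once.
start = ' ' * 9
seen = {start}
stack = [start]
while stack:
    board = stack.pop()
    if has_win(board, 'X') or has_win(board, 'O'):
        continue                   # the game already ended; no more moves
    mover = 'X' if board.count('X') == board.count('O') else 'O'
    for i in range(9):
        if board[i] == ' ':
            child = board[:i] + mover + board[i+1:]
            if child not in seen:
                seen.add(child)
                stack.append(child)
print(len(seen))                   # far fewer positions than 19,683
```

The enumeration confirms the point of the paragraph above: the number of positions that can actually occur in play is a small fraction of the 19,683 raw colourings.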

Calculating all possible games of tic-tac-toe

If you go first in a game of tic-tac-toe, there are nine moves that you can make. The possibilities break down like this:

 X |   |          | X |          |   | X
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |

   |   |          |   |          |   |
---+---+---    ---+---+---    ---+---+---
 X |   |          | X |          |   | X
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |

   |   |          |   |          |   |
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |
---+---+---    ---+---+---    ---+---+---
 X |   |          | X |          |   | X

Actually, these nine positions do not really represent unique games. Really, there are three unique moves that you can make: Go in the center, go in a corner, and go on an edge. The rest of the moves simply lead to rotated or mirrored versions of the same games. Some algorithms might take advantage of this similarity and reduce the search space, but for our purposes, it still doesn't take too long to treat all combinations as unique.

Now, if you go in the upper left corner, there are eight moves that the "O" player could make. Here are a few.

 X | O |        X |   | O      X |   |
---+---+---    ---+---+---    ---+---+---
   |   |          |   |        O |   |
---+---+---    ---+---+---    ---+---+---
   |   |          |   |          |   |

Again, these do not represent unique games. For all practical purposes, the first move and the third move are the same game. But it's easier to program (though slower to run) if you don't bother trying to analyze the games.

In the next post, I'll explain how we can use this system for categorizing every possible game to write an unbeatable opponent.

Wednesday, March 18, 2009

Recursive sorting 2: QuickSort

I got good feedback from my last post about recursive sorting, some of it by email and in person.  I also heard some serious concerns about my world domination scheme. You can all relax, because my Evil Generals are all away on extended vacation for the foreseeable future. This post probably won't be as amusing as the last one, as I've already explained the basic groundwork. But I'll try to give you the basics so you can do QuickSorting yourself.

Here are the fundamentals of how QuickSort works. In Merge Sort, we split a deck of cards into two halves of equal size, and then sorted each half. How exactly you "sort each half" is up to you. If the divided-up decks are still large, you can just divide them in half again and run two more Merge Sorts. But if the decks are small, you can sort them in another way. Even Selection Sort might be a good choice if you have no more than, let's say, 10 cards. Or, you can just keep dividing and dividing until you get down to two or three cards, and then perform a couple of swaps to get the cards in order.

QuickSort is similar, but a little different. We still split up the cards, but in this case we split them around a card value instead of at a fixed size. Here's an example. Start with unsorted cards:
  • 8, 4, K, A, J, 5, 9
In QuickSort our plan is to start with the first card (8) and put it in the correct place.  You remember that in Merge Sort we did the recursive step -- the splitting up into smaller problems -- first, and then the action of actually moving cards around was saved until we reached the lowest level (merging the cards).  That's when we actually do something constructive... like telling an evil private "Go kill that guy."  In this one, we do the reverse.  The action step comes first, and then the recursive step second.  After the 8 is in the right place, we'll sort the cards around it, using another call to QuickSort.

When I say we want the 8 in "the right place,"  what I mean is that we want all the cards that are less than 8 to be on the left side of the 8, and we want all the cards that are greater than 8 on the right side.

How do we get there? The 8 starts in the low position, so put your left hand on the 8, then put your right hand on the right side of the list. That's 9. 9 is bigger than 8. No switch is necessary yet, so move your right hand one space to the left, toward the 8. You're now on 5. 5 is smaller than 8, so it's in the wrong position.  Do a swap and you get this:
  • 5, 4, K, A, J, 8, 9
So now the 8 is on the high side. It is correctly positioned with respect to some of the cards: it is between 5 and 9, which is where it should be. But you're not done yet. Now you have your left hand on 5, right hand on 8.  Move your left hand up one space, to the 4.

4 is less than 8. Keep going.

King is more than 8. It's in the wrong place.  Time to swap.
  • 5, 4, 8, A, J, K, 9
Now the 8 is correctly sorted with respect to the cards you already looked at. It is more than 5 and 4, but less than king and 9. Nearly done.

The 8 is now on the low side. With your right hand on the king, move backwards. The jack is in the right place. The ace is smaller than the 8. Swap.
  • 5, 4, A, 8, J, K, 9
...your hands now meet in the middle, and you're done. The 8 is in the precise position where it should be.

Now what? Here's the recursive step. You have the 8 in the right place, but you have three cards on the left that are totally scrambled, and three cards on the right that are also scrambled. How do you unscramble them? You QuickSort them, of course.

First you sort this array:
  • 5, 4, A
Then you sort this array:
  • J, K, 9
Do it in the same way as before. For the first set of three cards, put your left hand on the 5, and move it to the proper place (the third position). For the second set, put your left hand on the jack, and move it to the proper place (the second position, between the 9 and the king). By the time you've finished, you'll discover that all the cards are sorted. Quick, wasn't it?

Seriously, get your own deck and try this on 20 cards. It's easy and fun once you get the hang of it. Lay them out in a line. As you work, you'll put a card at a time in its final location. When you do this, push the card up slightly so it will stand out from the others. Then work on the unsorted cards from left to right until you're finished.
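
Here is how the two-hand walk might look in Python, using numbers for the cards (A=1, J=11, K=13). The structure mirrors the walkthrough above; it's a sketch of that specific partition scheme, not the canonical textbook version:

```python
def quicksort(a, lo=0, hi=None):
    """In-place QuickSort using the two-hand partition described above."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a                       # zero or one card: already sorted
    i, j = lo, hi                      # left hand and right hand
    pivot_at_left = True               # the pivot card starts under i
    while i < j:
        if pivot_at_left:
            if a[j] < a[i]:            # right-hand card belongs left of pivot
                a[i], a[j] = a[j], a[i]
                pivot_at_left = False  # the pivot jumped to the right hand
            else:
                j -= 1
        else:
            if a[i] > a[j]:            # left-hand card belongs right of pivot
                a[i], a[j] = a[j], a[i]
                pivot_at_left = True
            else:
                i += 1
    quicksort(a, lo, i - 1)            # recurse on the cards left of the pivot
    quicksort(a, i + 1, hi)            # ...and on the cards to its right
    return a

print(quicksort([8, 4, 13, 1, 11, 5, 9]))   # [1, 4, 5, 8, 9, 11, 13]
```

Trace it with the example hand and you'll see the 8 land in the fourth slot after exactly the swaps described above, before the two recursive calls clean up each side.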

Complexity of QuickSort

Starting with the first card, you made one complete pass through the array in order to put it in the right place. So this algorithm is at least O(n). Then you performed this pass through the array multiple times. How many?

In the best case, QuickSort does the same thing as Merge Sort: it splits the array almost exactly in half. (In the above case, our first number, 8, landed smack in the middle of the array, with three numbers on each side. That's a perfect split.) So basically, you're dividing by two each time, and after (log₂ n) divisions, you're down to sorting one card -- just like what we did in a binary search. So we simplify this and say we examined "n" cards approximately "log n" times, so as I indicated earlier, this algorithm is O(n*log n).

If you're lucky!

But what happens if you start with an array that's already sorted?
  • A, 4, 5, 8, 9, J, K
Well, the first card is already the lowest. So you'll start from the king and start moving left, but eventually you'll get to the ace, having examined all the cards but not moved any.

Then you'll "split" around the ace... but there are no cards at all on the left, and (n-1) cards on the right. So you'll do almost as many comparisons as before against the 4. Then against the 5. And so on, until you've compared every card to every other card. This takes O(n²) time. Needless to say, that's bad.

So what we see here is that there's a difference between the complexity of a typical execution of an algorithm and the worst case scenario. Knowing that, we can do things to plan for the worst case.

Unfortunately, sooner or later it's highly likely that you will accidentally try to sort an array that is already sorted. So although it is counter-intuitive, we can guard against this by scrambling the array first and then sorting it. Scrambling is easy... all you do is go through the entire array (n operations), pick a random number (picking random numbers takes constant time, so it's O(1)), and swap the current number with the one at the random location.

This takes 1*n moves, so it's O(n) time. Then "scramble + sort" would take O(n log n + n) time. But n is of lower complexity than n log n, and at high numbers it will be trivially small in comparison. So the "scramble + sort" routine is simply expressed as O(n log n).
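
The scramble pass is a single loop with one random swap per element. A Python sketch (note this is the simple swap-anywhere scramble from the text, not the stricter Fisher-Yates shuffle):

```python
import random

def scramble(a):
    """One pass down the array, one random swap per element: O(n) total."""
    for i in range(len(a)):
        j = random.randrange(len(a))   # picking a random slot is O(1)
        a[i], a[j] = a[j], a[i]        # swap current element into that slot
    return a

deck = list(range(1, 14))              # ace through king
scramble(deck)
print(sorted(deck) == list(range(1, 14)))   # True: same cards, new order
```

After scrambling, a pre-sorted input is no more likely than any other arrangement, which is exactly the guard we wanted.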

Of course, you could be super unlucky and wind up with a scrambled array that happens to be perfectly sorted.  But as you get a very large number for n, the chances of that happening become extremely small.

Another way to randomize this thing is by picking a random pivot point.  Instead of always trying to position the first card, pick a RANDOM card, move it to the leftmost location, and put that one in position.  This way you fix the problem of potentially starting with the smallest card every time.

Saturday, March 14, 2009

Recursive sorting, part 1

I thought I could get away with just a brief mention of QuickSort, but apparently some people really want to hear details.  So here you go.  Today I'm going to talk about something that is not QuickSort, but it is a closely related technique called a Merge Sort.  This is tangentially related to the discussion about Big O notation, but it's not really my main focus here.  I'll just tell you in advance that, like QuickSort, this takes a time of O(n log n).

Along the way, I'm going to have to explain the basics of recursion to the novices, who may have been a little uncertain about what to make of the story about my dream.

Understanding recursion

Imagine that I am playing the board game "Risk" and I have an objective to take over the world.  No wait, better idea.  Imagine that I am an Evil Genius, and my actual objective is to take over the world.

The world is a big place.  Too big to plan my attack in every minute detail by myself.  How am I going to go about this?

Presumably I need military commanders, so I'll assume that I have a number of Evil Generals at my Evil Headquarters, who are expert military strategists.  Luckily, the world is already divided up into several logical categories, known as continents.  So I formulate a plan.  First I'll conquer Europe.  Then I'll conquer Asia.  Then Africa.  Then Australia.  Then North America.  Then South America.  Conquering each region will help me gain additional bases, which will aid me in the remainder of my nefarious plans.

So I pick my six finest generals, and I say "Okay, listen up minions!  YOU.  Conquer Europe.  I don't care how you do it, just get the job done and report back to me.  YOU.  Conquer Asia."  And so on.  I assign one continent to each general.  Then I sit back and relax, confident that my brilliant generals will each do their job thoroughly and well.  My part in the plan is finished.

But continents are still big places, so my Generals now have quite a job in front of them.  Consider the poor General whose job it is to conquer Europe.  Where does he start?  Well, like me, the logical thing to do is divide up the work, and the way to do that is country by country.  "First I'll conquer France," he says.  (Ha ha, get it?  That's apparently screamingly funny if you're a neoconservative.  I aim to please a wide audience.)  "Then I'll conquer England.  Then Germany."  Etc.

So my Evil General calls in all of his Evil Colonels, and he says "Okay, you get France, you get England, you get Germany," etc.

The Colonel in charge of conquering England then calls all his Evil Majors and explains "You conquer London, you get Southampton, you get Lancashire," etc.

You can expect this process to trickle down through several more layers of authority, but here's the thing: you can't go on subdividing the territories forever.  You can't run an organization with only a bunch of managers, because sooner or later somebody has to do the actual work.  You can't have some little underling running around saying "Okay, you conquer this microbe, and you conquer that microbe."

What you need is a terminating condition.  On some level, you simply have to have a whole lot of Evil Privates, and the Evil Sergeants will point at the enemy and say "Go now and kill that guy over there."  Privates don't get to delegate.

So my Privates kill people, which is something that is in some way completely different from what everyone else is doing, which is just splitting up the territories and telling somebody else to do the work.

Eventually, all my Privates report to all my Sergeants, announcing "Territory secured, sir!"  Then all the Sergeants report to their respective Evil Lieutenants, and so it goes.  Up the chain, eventually reaching my Colonels, who say "Country secured!" and my Generals then come to me and say "Continent secured!" and then I chortle with glee in my Evil Lair.  "I've done it!" I say.  "I own all the continents, and the world is now mine.  MINE!  Mooha, mooha, mwahahahahaha haaaaaaaaaaa!!!"

Or words to that effect.  I'll work on my speech when the time comes.

Today, Merge Sort.  Tomorrow, the world

We've just covered a fundamental principle of recursion, known as "divide and conquer."  In this case, we talked about literally dividing and conquering territories of the world.  Now we're going to bring it back to the domain of sorting cards.

I give you an entire deck of cards to sort, and I say "Put these cards in order.  You have three minutes."  I know, those instructions are a little unclear because the suits can come in any order. Typically in a new, unshuffled deck, all the spades come first, then the hearts, then diamonds, then clubs; and then within the suits, the cards are ordered from Ace to King.  We'll assume that you and I both know and agree on the order of a "correctly" sorted deck.

So you have three minutes to sort the cards.  Now, you know your card-sorting abilities really well, and you think to yourself, "Hmmm, I know how fast I can work.  I can finish about half a deck in two minutes.  If I do that I'll finish almost all the work, but not quite.  I'm gonna need some help."

So you call in a friend, who also knows everything there is to know about correctly sorted decks, and works just as fast as you do.  You say to your friend "Hey, glad you're here.  Here's what I want you to do.  I'm going to split these cards in half, and give half to you.  You sort your half, I'll sort my half.  Then when I'm done, we'll deal out our two sorted half-decks into one pile, or 'merge' them, and we'll have a whole sorted deck."

So you divide up the deck into two.  You have 26 cards, which you sort using your favorite technique (you can use selection sort if you want).  Your friend sorts his pack of 26 cards, and then you get together.

You have the ace of spades.  You lay it face up on the table.  Your friend has the two of spades.  He lays his down.  Your friend also has the three, so he lays another one down.  You have the four, so you lay it down.  You continue until the whole pack is done; flip the pack and you can deal them out in perfect order, ace first.

You finished your deck in two minutes, and your friend finished his deck in two minutes.  Both of you worked in parallel, so the two of you were done in just two minutes, not four.  Merging your decks goes pretty quickly, only taking up another minute.  Three minutes have elapsed, and you're done.
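
The card-table procedure translates almost line for line into code. Here's a minimal Python sketch (the function name and list representation are mine; a production version would merge with index pointers rather than `pop(0)`):

```python
def merge_sort(cards):
    """Sort by splitting the pile in half, sorting each half, then merging."""
    if len(cards) <= 1:
        return cards                    # a single card is already sorted
    mid = len(cards) // 2
    left = merge_sort(cards[:mid])      # you sort your half...
    right = merge_sort(cards[mid:])     # ...your friend sorts his
    merged = []
    while left and right:               # deal the two sorted piles together
        if left[0] <= right[0]:
            merged.append(left.pop(0))
        else:
            merged.append(right.pop(0))
    return merged + left + right        # one pile ran out; add the rest

print(merge_sort([8, 4, 13, 1, 11, 5, 9]))   # [1, 4, 5, 8, 9, 11, 13]
```

Each level of recursion does O(n) merging work, and halving the pile means there are about log n levels, which is where the O(n log n) running time comes from.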

Double your fun

I tell you: "Great job sorting that pack, now I've got more for you to do.  I have two different packs of cards this time, all mixed together.  I want you to sort them out into two piles, one with blue backs and one with green backs.  I know this will take more time... I'll give you five minutes."

How long does it take to sort a whole deck?  From the last job, we know it takes three minutes.  Clearly five minutes won't be enough for two decks, so we're going to need more friends.

So we get two more people in on the action.  You tell your friend: "I'm going to split up these 104 cards into two piles of about 52 cards each.  You get your half sorted, and I'll get my half sorted.  Sort them the same way you did before, but this time always put blue cards before green cards."

But selection sort is way too slow, so you then take your 52 cards and turn to another friend.  You say "Now I have 52 cards.  You take 26 and I'll take 26, and we'll each sort them in two minutes."

Two minutes later, you're both done.  You take another minute to merge the decks.  Now you have 52 sorted cards.  It's been 3 minutes.

You turn to your friend and discover that, with the help of the fourth guy, he also has a sorted pack of 52.  You merge the cards -- there are more than before, but you finish dealing them together in another 2 minutes.  Five minutes are now up, and you have a single pile of perfectly sorted cards.  Then all you have to do is separate them in the middle, to get two decks.

Generalizing the solution to "n"

By now I'm sure you've got the idea.  I can sort four decks, or eight decks, or any number of cards I want.  All I need is more friends.  Your friends aren't "free" -- they take up a lot of space, demand lunch breaks, and if you're asking for a lot of work they might want to get paid.  If you don't like that, then you're free to go back to selection sorting the whole deck.  But with one person performing an O(n^2) algorithm, the job will take hours if you have a lot of cards.  Much better to keep dividing and subdividing your packs until you have a manageable little pile, and then it gets easier to merge a few piles at a time.
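The whole divide-and-conquer process -- split until the piles are trivial, then merge on the way back up -- is merge sort. A minimal Python sketch of the idea, with the merge step written inline so it's self-contained:

```python
def merge_sort(cards):
    """Sort by splitting the pile in half, sorting each half,
    and dealing the two sorted halves back together."""
    if len(cards) <= 1:               # a one-card pile is already sorted
        return cards
    mid = len(cards) // 2
    left = merge_sort(cards[:mid])    # "you sort your half"
    right = merge_sort(cards[mid:])   # "I'll sort my half"
    # Deal the two sorted halves out into one pile.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]
```

The splitting goes log n levels deep, and each level does O(n) work merging, which is where the O(n*log n) total comes from.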

The analogy to taking over the world should be clear.  "Divide and conquer" is in fact a well known program-writing technique.  Splitting the deck and handing the job to friends is directly analogous to filtering the world by smaller pieces of land, and then handing them to lower ranking underlings.

Another point this illustrates is the frequently recurring theme of trading space for time.  As I said before, the more friends you have sorting cards, the more costly the job is.  Inside the guts of your computer, what this means is that using a recursive algorithm eats up more of your computer's memory.  This can cause everything else in your operating system to run slower, if you start using up too much memory.

On the other hand, computers these days have a lot of memory, and unless you're trying to run World of Warcraft while you're computing, most of your memory is often sitting idle.  Why not throw more storage at the problem, if the time saved is so great?

Better yet, thanks to advances in parallel computing, it's now possible to physically divide up the work among multiple machines.  You order one computer to sort half the cards, and another computer to sort the other half.  Then it costs only a little network traffic to have them report to a master server, which consolidates the results.
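That arrangement can be sketched with Python's standard library. Here threads stand in for the separate machines (so this illustrates the structure, not a real distributed system -- genuine speedup would need processes or multiple hosts), and `heapq.merge` plays the role of the master server dealing the sorted halves together:

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

def parallel_sort(cards):
    """Split the deck, hand each half to a worker, then merge the results."""
    mid = len(cards) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Each "worker machine" sorts its own half independently.
        halves = list(pool.map(sorted, [cards[:mid], cards[mid:]]))
    # The "master server" merges the two sorted streams into one.
    return list(heapq.merge(*halves))
```

The same shape scales out: more workers, smaller piles each, plus a merge step at the end.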

One example of a popular program that takes this approach is the SETI@home project.  We may never discover intelligent life elsewhere in the universe.  But if we want to try, we've got a massive task: collecting readings from different areas of space.  SETI takes advantage of the idle time on computers whose owners have voluntarily installed its program.  Your desktop runs the program when it enters screen saver mode, analyzing a different area of space, and the SETI server gathers whatever interesting information you discover.

Now that everyone's up to speed on the basics of divide-and-conquer, I'll go over QuickSort next time.

Thursday, March 12, 2009

Big-O Notation 3: Showing off how you think

Some job interviewers ask you to describe your career history in great detail.  Some want you to explain the most obscure facets of the language you'll be working with.  And some will give you a puzzle and ask you to design a program that solves it.

This last category is by far my favorite sort of opportunity in an interview.  I'm a programmer because I love solving puzzles, so showing off that love is far more likely to make me look good than struggling under pressure to remember whether it's ArrayList or Vector that is thread-safe.  (Tip: It's Vector.)

When you get asked a puzzle problem, remember above all else that the interviewer is much more keen to know how you think than whether you get the "right" answer.  There may be some specific technique that they are hoping you know.  But even if the answer is not immediately obvious, don't panic.  If you sit there and explain your reasoning, refining your technique out loud, you can make up a lot of ground by showing that you can think on your feet.

A good example is the question I got from the Director of Engineering at Digital Motorworks.  Not only did I get the job, but that portion of the interview went so smoothly that I wrote down the question, my answer, and his reaction to save for posterity.  Here is a recap of the problem:

You're responsible for implementing a method called "sumExists."  This method takes two arguments: an integer, and an array of integers.  The return value is a boolean, which is true if and only if there exists some combination of two distinct elements in the array that sum to the original integer.

For instance, suppose you call the method in this way:
sumExists(5, [7, 1, 2, 6, 4])
The array contains the elements 1 and 4, and 1+4 = 5, so the method should return true.

I immediately jumped to the most obvious solution -- not because I expected it to be right, but because I wanted to find some answer quickly and then improve it.  I said:

"If you wanted to use brute force, you could just inspect every element of the array, and compare it to every other element of the array, and see if they sum to the first number.  But that would require you to perform O(n^2) operations, which is dumb.  So I would like to find a better solution.

"Suppose that we sort the array.  That would take O(n*log n) time (QuickSort, remember?) to complete.  Where does that get us?  Well, then we could quickly find the largest number and the smallest number of the array and compare them to each other.  We'll put counters on the left and right ends of the array, and add those values.  If they add up to the target, then we're done.  (Note: in the example, the sorted array is [1, 2, 4, 6, 7].  The high and low numbers are 1 and 7, which add up to 8.  Too high.)

"If the resulting number is too high, then the biggest number must be too big.  So we'll move the right hand counter to the left and try adding a smaller number to the smallest.  On the other hand, if the sum is too low, then the smallest number must be too small.  So we'll move the left hand counter to the right and try again.

"By doing this we'll be able to test pairs of high and low numbers until either we find the sum we need, or the counters meet in the middle.  If the sum is correct then we return true.  Otherwise, return false.

"Scanning the list in this way requires only one pass, which takes O(n) time.  But the sort already took O(n*log n) time, and that's the dominant term: for a large array, a single scan is much cheaper than the sort, so we ignore the smaller factor.  This solution takes O(n*log n) time."
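The answer above translates almost line for line into code. Here's a sketch of the two-counter approach (only the name sumExists comes from the problem statement; the rest is illustrative):

```python
def sumExists(target, numbers):
    """Return True if two distinct elements of `numbers` sum to `target`.
    Sort first (O(n log n)), then walk counters in from both ends (O(n))."""
    numbers = sorted(numbers)
    lo, hi = 0, len(numbers) - 1
    while lo < hi:
        total = numbers[lo] + numbers[hi]
        if total == target:
            return True
        elif total > target:
            hi -= 1      # the biggest number must be too big
        else:
            lo += 1      # the smallest number must be too small
    return False
```

On the example from the post, sorted([7, 1, 2, 6, 4]) is [1, 2, 4, 6, 7]: 1+7 = 8 is too high, 1+6 = 7 is too high, and 1+4 = 5 hits the target.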

As I said, I wrote down the response and saved it.  So this is more or less a direct quote: "I've been doing this for a long time, and there is not one person in a hundred who has given as thorough and correct an answer as you just did."

There were other difficulties with getting this job, based on my short experience with specific technologies.  I'm convinced, though, that this was the best part of my series of interviews, and it got my foot wedged pretty firmly in the door.

Writing software is fundamentally different from working on an assembly line.  You don't solve the same problem twice.  You don't write the same code twice.  If you are repeating yourself, then you are in some way failing to leverage past experience.  You should be moving the duplicated code into a common function, or a common library, or something, so you can stop working on problems that have already been solved, and get the business of making your program do something new and interesting.

Because of this non-repetitive approach, it's important to be able to think about every problem from multiple angles, to try and relate it to past experience and evaluate different kinds of solutions.  Big O Notation is one tool for being able to evaluate how you're doing.

Wednesday, March 11, 2009

Count on this

Let's switch things up a bit and just do something for fun.  How high can you count on your fingers? Ten, you say? Amateurs. I'm going to teach you how to count much, much higher... without even taking your shoes off.

However, you're going to have to learn to count in binary first. Again, this is a very basic topic to computer science students, but I hear that there are plenty of non-programmers who read this blog. For their benefit, I'm going to give a primer. If you already know binary, feel free to skip down to the next bold header.

How to count in binary

When you count by tens, the digits can be represented as wheels on the odometer of your car. The right-most wheel goes:
  • 1, 2, 3, 4, 5, 6, 7, 8, 9…
Then what happens? There is no "10" printed on the wheel. So where do we go next? The rightmost wheel flips from "9" back to "0", and as that happens, it also pushes the next wheel up by 1. So you have a 1 and a 0, and that's ten.
  • 11, 12, 13, 14, 15, 16, 17, 18, 19…
Then the first wheel again flips over from 9 to 0, which in turn advances the second wheel from 1 to 2. Twenty. Right? Don't tell me this is hard, my son Ben understands this and he's in first grade.

So anyway, things are pretty much the same from here on out, until we get up to:
  • 97, 98, 99…
And now? Well, the first wheel flips to zero, which in turn advances the second wheel. But the second wheel is already maxed out at nine, so it flips to zero, which also advances the third wheel, so now that wheel reads "1". 1, 0, 0. A hundred. Got it?

Okay, that's base 10. Binary is a similar system, except that it's got a severe shortage of numerals. Whereas we were using all the numbers from zero to nine, in binary you only have two possible values: one, and zero. Like a light switch: on, or off. Circuit open, or circuit closed. That's how a computer works. But with just ones and zeroes, you can represent every positive integer if you have the patience.

Let's imagine we have an odometer like before, but this one has very small wheels. They flip from 0 to 1. Then they flip from 1 back to 0. When they flip to 0, they advance the next wheel. Okay?

So your odometer reads:
  • 0: 00000
Advance the rightmost counter by one:
  • 1: 00001
Advance the counter again, and it flips to zero while dragging the second wheel along:
  • 2: 00010
So that's two. Ten is two, in binary. Makes perfect sense, right? Advance the right hand wheel one more.
  • 3: 00011
Next time we advance, both of the first two wheels will advance the next one over.
  • 4: 00100
One hundred is four. I think you can take it from here.

In base ten, every digit represents a value that is ten times the previous digit. So the first digit is worth one, the second digit is worth ten, the third digit is worth a hundred, the fourth is worth a thousand, and so on.

Similarly, in base two, every digit represents a value that is two times the previous digit.
  • 1 -> 1
  • 10 -> 2
  • 100 -> 4
  • 1000 -> 8
  • 10000 -> 16
Makes sense, right? You can write out any string of ones and zeroes and figure out what number it represents in binary by adding up the place values of the digits that are set to 1. For example,
  • 10110 = 16 + 4 + 2 = 22
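If you want to check any of these conversions, Python will do the bookkeeping for you (`int` accepts a base as its second argument):

```python
# Parse a binary string into a number, and format a number back as binary.
print(int("10110", 2))    # 16 + 4 + 2 = 22
print(format(22, "b"))    # "10110"
```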
Okay, here's the fun part.

How to count to 1,023 on your fingers

  • Hold out your hands, palms upward. Now make them fists. No fingers are extended. That's zero.
  • Extend your right thumb. The right-most digit is now set to "on". That's one.
  • Close up your thumb, but extend your right index finger. That's two.
  • Extend your thumb again, and also leave your index finger out. Three.
  • Close up your thumb and index fingers, but then extend your middle finger. That's four. It is also an obscene gesture to most people, so if you're in public, make sure your hands are under a desk or something.

That's really all there is to it. The highest you can go with this system is 1111111111 in binary, which is 1,023 in regular counting (2^10 - 1).
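The finger scheme maps straight onto bits: each finger is one binary digit, right thumb lowest. A toy sketch (the list-of-bits representation is just my illustration):

```python
def fingers_to_number(fingers):
    """`fingers` is a list of ten 0/1 values, right thumb first.
    Each extended finger contributes its power of two."""
    return sum(bit << i for i, bit in enumerate(fingers))

# Right thumb + right index finger extended: 1 + 2 = 3
print(fingers_to_number([1, 1, 0, 0, 0, 0, 0, 0, 0, 0]))   # 3

# All ten fingers extended: 2^10 - 1
print(fingers_to_number([1] * 10))   # 1023
```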

Other numbers that form obscene gestures to Americans: 128; 132

Numbers that may be obscene to Europeans: 6; 384; 390. They may also be interpreted as peace signs or a Richard Nixon impression.

What can you do with this? Not much. You can kill about ten to twenty minutes. You can also probably perform a cheap mind-reading trick if you have an accomplice who can also do it. ("I'm thinking of a number from one to a thousand." "583. I certainly wasn't paying attention to the funny way you're holding your hands.") And finally, you can waste a perfectly good afternoon writing a silly blog post.

By the way... if you DID take your shoes off, you could count to 1,048,575 (2^20 - 1).

Tuesday, March 10, 2009

Big O Notation 2: Sorting

As a Card-Carrying Nerd (you have to mail away exactly 314 cereal box tops for your card) I find entertainment in some very unusual activities.  For example, I like to execute algorithms by hand. When I find a pack of cards lying around, I often find myself shuffling them, dealing out about 10-20, and operating a QuickSort on them. Sometimes when I have a stack of old bills and I want to eliminate duplicates while retaining the latest copy for my records, I'll even lay them out in a grid on the floor and perform a bucket sort.

Sorting algorithms are material from Intro to Computer Science 1, but I'm going to give a very condensed overview of the topic as background material to what I'm going to go over in a future post. If you're already rock solid on sorting algorithms, I suggest you skip or skim this one, and I promise I'll get to more practical discussions later.

You'll remember that in the last post on Big O, I explained that a "dumb dictionary search" (just viewing each word one at a time) requires O(n) time, while a more efficient search (also known as a "binary search") requires only O(log n) time, which is a lot shorter. But the binary search only works if the words are in some kind of order already. If they're all scrambled up -- if "xylophone" comes before "cereal" comes before "pheasant" -- then you have no choice but to execute a dumb search. There's no other way to find your word besides looking at every entry to make sure you didn't miss any.

Therefore, it's often helpful to sort a list of things (such as dictionary words). Of course, sorting a dictionary might take longer than actually searching it. But if you have to search the dictionary multiple times, it's worth it. You eat the cost of sorting only once, but you get to do binary searches as many times as you want after that.

Suppose we take a pack of cards and deal out "n" cards at random. Let's make this quick and pick 5. I'll imagine they came out like this:

  • 5, A, 7, K, 3

How long will it take to put these cards in the correct order? Well, naturally it can't take less than O(n) time. It's easy to see why. Suppose you had a magic function which automatically skimmed the entire list in no time at all to find the smallest number remaining. Then you could scan these cards and find the Ace first, and put it in the first spot. Then you find the 3, and put it in the second spot. And so on. Then the sequence of events would look like this:

  • A, 5, 7, K, 3 (we put the ace in the first spot, but we needed to move the 5 out of the way, so we just swap their positions)
  • A, 3, 7, K, 5 (3 swapped with 5)
  • A, 3, 5, K, 7
  • A, 3, 5, 7, K (sorted!) 

This is the most efficient sort possible, but even with our magic function you still have to move each card to the correct location. There are n cards, so it takes n moves. So no matter what happens, we can't do better than O(n). (Technically, it took "n-1" moves to finish, because once the first four cards are sorted, the king MUST be in the right place. There's nothing to swap with. But remember how I said before that constant multipliers don't matter? Subtracting a 1 definitely doesn't matter. In fact, we ignore any constants that are added to or subtracted from the outcome. Order of magnitude is all that counts.)

Unfortunately, we don't have a magic function that takes no time to find the Ace, so we actually have to read the cards in order to find it. How do we do this? Well, we could read the entire list manually, finding the Ace by inspecting every card. This is a technique known as a "selection sort," for the obvious reason that we are looking through the unsorted list to select the smallest number. Like in the dumb dictionary search, that would take O(n) reads to accomplish (technically n/2 on average, but constant multipliers don't matter).

But it's worse than that. Because we have to move n cards, AND we have to read n cards each time we move one. So we do n*n things, which means that this sort is O(n^2).
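Written out as a sketch, the selection sort described above looks like this; the inner loop is the "read every card" step, and the swap at the end is the "move":

```python
def selection_sort(cards):
    """In-place selection sort: O(n^2), because each of the n positions
    requires a scan over all the remaining cards to find the minimum."""
    for i in range(len(cards) - 1):
        smallest = i
        for j in range(i + 1, len(cards)):    # read every remaining card
            if cards[j] < cards[smallest]:
                smallest = j
        # Move the smallest remaining card into position i.
        cards[i], cards[smallest] = cards[smallest], cards[i]
    return cards
```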

Terrible. Remember, our dictionary had 170,000 words. You square that, and you're doing almost thirty billion operations. There must be a better way.

There is, and one example is QuickSort. QuickSort is a recursive algorithm that is a little bit tedious to explain in a blog post, but it's so successful that it has been implemented as a standard feature of many languages. If you type "sort @mylist" in Perl, you're invoking a built-in sort that was a QuickSort for years (more recent versions of Perl default to a merge sort). If you have some time to kill, I recommend that you look up QuickSort on Wikipedia, lay out some cards, and try to figure out how it works. It's kind of cool.

QuickSort usually requires a time of O(n*log n), and you can't do much better than that. Well, there are some tricks you can use to speed it up if you know exactly what sort of data you're dealing with, even getting pretty close to O(n) in some cases. But for the most part, you're stuck with n*log n. And this makes sense: the dictionary example showed that finding the right place for one item takes O(log n) time, and there are n items to put in place, so in the best case you're doing a log n search for each of n placements.
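For the curious, here is a minimal QuickSort sketch. It's not the in-place version real libraries use, but it shows the same idea: pick a pivot card, split the pile around it, and recurse on the two sides.

```python
def quicksort(cards):
    """Average time O(n log n); worst case O(n^2) on unlucky pivots."""
    if len(cards) <= 1:
        return cards
    pivot = cards[len(cards) // 2]
    smaller = [c for c in cards if c < pivot]
    equal   = [c for c in cards if c == pivot]
    larger  = [c for c in cards if c > pivot]
    return quicksort(smaller) + equal + quicksort(larger)
```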

If you sort a 170,000 word dictionary in O(n*log n) time, how long will it take?  The base-2 log of 170,000 is about 17, and multiplied by 170,000 itself that's about 3 million.  That's not fast, but it's nowhere near as bad as 30 billion.

This is the last purely academic post I'm going to write about Big O notation. By now, I'm sure some readers are asking "What's with all this theory? Am I ever going to use this professionally?"  My answer is: "You might.  For one thing, you can use it to show off in job interviews."  Next time I post about Big O, I'll explain how.