Monday, September 19, 2011

Operation HackMaster Crit Tables, episode 2

Now that I've explained what data structures are for, I can finally explain how I approached the problem of deciphering those ponderous HackMaster tables.

First of all, I discovered to my dismay that the tables were only available in the form of PDF files -- images, not text.  Just as I was about to tell my friend that I didn't want to waste time converting six pages of tiny print to very usable data, my lovely assistant Lynnea (the aforementioned fiancee who actually plays the game) stepped in and volunteered to do it.  This is one thing about her personality that I've never been able to understand, but she loves doing data entry, filling out forms, etc.  I think it's some kind of OCD thing.  But for whatever reason, she's extremely enthusiastic, diligent, and thorough about this kind of work.  And by the way, if you need this kind of work done, she's available to hire!  Ask me for a resume.  :)

With this powerful slave human resource at my disposal, I wrote up a few sample lines of text in a spreadsheet to show how I wanted them, wound her up and let her go at it.  She cranked the rest out in a surprisingly short time.  I then converted the results to standard comma-separated value format, some of which you can download from here: Hacking weapon table part 1; List of effects.  In case you're not familiar with them, .csv files are a generic text-only format which can be read in a spreadsheet program like Excel, or any standard text editor.

Working with just my sample rows, I set out to work out what the abstract properties of the data were.  The first thing to consider is the way a body part is selected.  In the hacking weapon table, you can see that if you roll a 1-100, you get hit in the "Foot, Top"; if you roll 101-104, you get hit in the heel, 105-136 is Toe, and so on.

This is like a hash table, almost, but it's not one.  If it was a hash table, you'd usually have one body part per number: 1 -> Foot Top, 2 -> Heel, and so on.  Here, we're working with a range of numbers corresponding to each lookup value.

I decided to start with a generic lookup table, where you start with objects which contain a "low" value, a "high" value, and a generic object which can get returned from the lookup. The declaration looks like this:
public class RangeLookup<T>
{
   private List ranges;

   private class Entry
   {
      protected T item;
      protected int low, high;

      public Entry( T i, int l, int h )
      {
         item = i;
         low = l;
         high = h;
      }
   }
   ...
}
In Java using the "<T>" notation means that "T" could be anything.  Even though I wasn't going to be using this lookup table more than once, I like to keep structures as all-purpose as possible.  That's partly because I might want to reuse them in the future, and partly because I want to be able to test how the component works without making it dependent on the equally complex item which will be retrieved by the lookup.

Every structure needs an interface -- a means of communicating with it that only does what you want and hides the guts of it from the rest of the program.  I created an "addEntry" function to the RangeLookup class, so that you could insert a new entry with a high, a low, and a retrieved object of type T.  Then I added a "lookup" function where you send in a number, it gives you an object.  In my implementation, the lookup function simply walks through all of the possible results and checks whether the requested number is between the high and the low.  This would be inefficient if there were going to be a lot of entries, so I might have come up with some kind of hashing structure or tree search; but since there are only about 20 or so body parts, it wasn't worth the extra effort and runs fine as is.

After verifying that this was working right, I created the following additional structures:

  • Looking at the Effects table, it is a basic mapping (in my case, placed in a HashMap) from one string to another.  You put in the code "f", and the resulting effect is "fall prone and drop items".  So, I created a simple object called an "effect," containing "key" and "description."
  • It's a bit more complicated than that, though.  Often the table will contain numbers, but the effects will contain only the symbol "X".  For instance, if the table says "d4" then the relevant effect is "dX", which means "reduce Dexterity by X".  Therefore I made another class called an "Outcome," which contains an Effect AND a number (which may be zero if it's not necessary).
  • I made an EffectTable, which implements the HashMap of Effects.
  • Almost ready to create an actual table object, I first made a class called "CritTableEntry."  This represents a cell in the table.  It contains: a low roll, a high roll, the name of a body part, and a List of effects (because each cell may result in several outcomes, not just one).
  • A CritTable class to put them all together.  This class has an addEntry method and another method for retrieving the entries.

As a final step, I created a "Reader" class which did the heavy lifting of reading and interpreting the CSV files and adding one row at a time into a generated table.  I don't like to reinvent the wheel, so I googled class libraries which would read CSV files and interpret them as lists.  I settled on using OpenCSV.  I could have written my own parser, but when the task is as common as reading a CSV, I tend to assume that somebody has already done all the work before me and has already been through the process of making all the mistakes and catching the bugs which come up.

Notice that none of these objects deals with input and output directly.  It's preferable to test each component of your program separately as much as possible BEFORE trying to decide what kind of user interface to make.  Your interface should be tailored to the problem space.  As it turns out, I wound up creating several different interfaces before I settled on created a web application.  I'll discuss these concerns in a later post.

When testing your data structures it's a good idea to create unit tests.  A unit test is a small, self contained application which is designed to test one thing at a time.  You need to think about every possible way that your program might break, create a unit test for each one, and make sure that it works right at the boundary conditions.

Off the top of my head, here are some boundaries of the crit tables that needed to be tested:

  • Spot check several "body part" rolls with random(ish) numbers and see that the returned information matches the table.
  • Spot check several "outcome" rolls in one row and see that the returned effects match the table.
  • Test the boundaries of some rolls.  For instance, on the table I linked, "4301-4492" corresponds to "Arm, upper inner", and "4493-4588" corresponds to "Elbow".  Therefore I have to make sure that a roll of 4492 returns a different part from 4493.
  • Test when happens when the body part roll is 0 (invalid), 1, 10000, and 10001 (invalid).
  • Test what happens when the effect roll is 0, 1, 24, and 25.

Keep all your unit tests around forever.  If something breaks, that's a quick way of figuring out which part is not working.  If it's a problem with your data model rather than your user interface, the unit tests will catch it.

Next time I'll be talking about all the different ways of making an interface on the same models.

Wednesday, September 14, 2011

A bit about data structures

I wanted to write another HackMaster post, but what I wanted to write about was the way I approached deciphering the data in the tables and converting them into data structures.  Then I skimmed through some older posts looking for reference points about data structures, and it occurred to me that I've never written any. In order to provide a foundation for the rest of the HackMaster breakdown, I'll have to digress and talk about structures in the abstract.

Whenever you are presented with a problem of modeling some numbers in conceptual space, the first thing you have to figure out before you write a single line of behavioral code is what kind of data structures you are going to use.  Going all the way back to the beginning of this blog, I've emphasized the importance of considering the efficiency of your design and the effect that it has on the Big-O performance of your program.  Thinking about proper data structures can buy you a lot of speed, and it can also make it really easy to visualize your program in small chunks as the complexity increases.

So what's a data structure?  The first thing programmers learn is how to use variables for individual chunks of information, like this:
int x = 3;
String str = "Hello world.";
 (Technically, of course, a String object in Java is a whole bunch of characters, which makes it a data structure in itself.  But the nice thing about object-oriented programming is that you don't have to think about it if you want to.)

To understand data structures, consider an array.  An array is one of the first slightly more advanced concepts that a beginning programmer will run into.  Instead of storing just one integer, it can store several.  For example, here's a simple representation of part of the fibonacci sequence:
int[] fib = new int[10];
fib[0] = 1;
fib[1] = 1;
fib[2] = 2;
fib[3] = 3;
fib[4] = 5;
fib[5] = 8;
fib[6] = 13;
fib[7] = 21;
fib[8] = 34;
fib[9] = 55;
When you create a single "int," you're asking the program to set aside a chunk of space in memory, large enough to hold one number.  When you create an array like this, you're asking the program instead of set aside a bigger chunk of memory ten times that size, plus (for some languages) a little bit of extra information about size constraints and such.

But arrays can be wasteful.  What if you want to set aside space that sometimes houses a hundred numbers, and sometimes houses just a few?  You could create an array of size 100, but most of the time that space would be wasted.  That's when you want to use a linked list, where you ask for new memory only at the moment that you actually need it.

I'm not dedicating this whole post to the implementation fundamentals of lists, but interested beginners should go check out the Wikipedia article to find out how this works.  (Sidebar: While relying on Wikipedia for information about controversial topics is often unwise, most of the technical topics that are covered are really good.)

Besides linked lists, there are lots of other data structures that you can use depending on your situation:
  • A tree (which may or may not be binary) will hierarchically organize information for you, much like the folder structure on your computer does, shortening the search time as long as you know where you are going.
  • A hash table or map is a structure which will find a value associated with a key, usually very quickly.  An example would be a dictionary search: you supply a word, and the program would retrieve a definition.
You can write your own versions of these structures, or if your language supports it, use predefined classes that create common structures.

Understanding what purpose the various structures serve, and when to use each one, is a very key skill in programming interviews.  Often when you are asked "How would you solve this problem?" the best answer is not to blurt the first notion that comes into your head, but to start applying data structures to model the problem space: lists (or specifically, stacks or queues), trees (binary or otherwise), tables (sometimes you can just assume the existence of a database, which is centered around associative tables).

When I hear a problem that lends itself to this, I usually make a beeline to the whiteboard and start thinking out loud: "You're asking about a list of items, so let's describe what's in an item first... then build a linked list out of items..."  Then I'll be either writing code to illustrate what I'm thinking, or (if the interview is shorter) just sketch out diagrams so that the interviewer understands the description and will probably accept that I know how to implement it.

Software is built a piece at a time.  If you start explaining how you visualize the problem in your head, you can give a much better insight into how you think than if you just start solving the problem directly.  In fact, if you start off strong with this approach but then go off on the wrong track, often the interviewer will be eager to guide you towards his concept of the solution because he's being carried along with your thought process.  This often changes the dynamic of the interview entirely.  Instead of being a room with an interrogator and a suspect, the interviewer may start thinking of himself as your ally and not your judge.  And that's exactly where you want to be when you're looking for work.

Digression's over.  Next time I'll illustrate this when I get back to decoding HackMaster tables.

Thursday, September 8, 2011

Operation HackMaster Crit Tables, episode 1

This post is about a project I recently did for fun.  Although the project itself has extremely limited application, it inspired me to do something I've been meaning to do for a long time, namely upgrade the web server for my apollowebworks.com domain to something which supports Java applications with Tomcat.  (I chose a company called Arvixe based on scanning the features they offer and reading a bunch of reviews.  Shout out to my boyz at Arvixe!)

Because I'll be touching on a wide range of topics, I'll split them up into multiple posts, and that should keep me from neglecting this blog for a little while.  So, partly to tell you what's coming and partly just so I can keep track for myself, here's a road map of the topics I plan to hit with this discussion.

  • General geekery about using programming to automate complex tasks.
  • Building a project from the ground up, starting by visualizing the data structures instead of just diving in blindly.
  • How separating interface from implementation makes it easier to translate your program to multiple platforms.
  • Good riddance to the bad old days: How the web has made program content delivery easier.
  • A primer for noobs on what web servers do, and why I bought another one.

First let me explain the problem.  My fiancee, Lynnea, has gotten into paper-and-pencil roleplaying games.  She has been running a light Dungeons & Dragons campaign occasionally for me and my son Ben over the last few months. She also plays with a group of friends one night a week.  For my part, I've played RPGs a few times before but never been a strong part of that scene, so I just listen to the stories of her sessions.  Currently they're playing with a rules system called "HackMaster."

As I understand it from skimming a few chapters of the rulebook, HackMaster was written by someone who hates doing things the easy way.  When D&D was first published in 1974, it was at first an ever-expanding set of complicated rules involving tons of die-rolling for all kinds of special situations.  Or as a German exchange student my family hosted in high school put it, "D&D isn't a role-playing game, it's a roll-playing game."  Har.  Har.  Very droll, those gamers.  (Oh hey, how are things going, Max? ;)

Still, as the years went by, D&D started to reverse the trend and become more streamlined, or so I've heard.  The latest release, fourth edition rules, was clearly heavily influenced by World of Warcraft, and reduces a lot of the unnecessary choices in favor of more interesting combinations of focused options.

The HackMaster author just hates that.  Not that I would recognize the difference, but I have the impression that if anything, he's aggressively chosen to make every little action even more complicated than it would have been under old-school D&D, to the point where a party can spend thirty minutes meticulously checking a single room for traps before seeing any combat.  And the critical hit tables are an utter monstrosity.

Critical hits (or "crits") for you non-gamers, means that for every attack there is a chance (1 in 20 under D&D rules) that it will be a super-strong attack which does extra damage.  In WoW, this is handled automatically by the game; you can build up gear that will increase your crit chance, but usually there's a flat formula that says "If you crit, your strike does twice as much damage."  Rules in 4th edition D&D are pretty close to the same.

Not HackMaster.

In HackMaster, you roll a number between 1 and 10,000 for the body part where your blow lands, and there are all kinds of "realistic" outcomes that result if you land a powerful blow on that body part.  Furthermore, rolling a 20 is just your cue to pull out a ginormous, six page tables full of teeny little numbers and two accompanying rules pages.  You then roll another number that might range from 1-24 to see just how badly your crit hurt the monster (or the monster's crit hurt you).

For instance, if you get hit on the top of the head, the crit damage can range from "8 extra damage" to "twice normal damage AND you lose some hit accuracy AND you lose some dexterity AND you fall down and drop everything you're carrying"... to "brain goo."  Do not pass go.

Each one of those individual clauses, by the way, is represented in the table by a little symbol like "dX" which you have to look up on a smaller table of effects.  I think you have to look at a sample to see what I'm talking about.

(Click for a larger image.)

Not to put too fine a point on it, this sounds like a game that I would personally hate playing.  But no accounting for taste, you know?  Even so, problems of managing huge sets of data are piles of fun to solve with a computer.  NO, that's NOT sarcasm, you non-programmers, that's fun.  Shut up.

So when the DM tentatively asked if it would be possible to write a program that could handle the die rolls and just spit out some nice, clean output, I said "Piece of cake!"  I wouldn't even accept his offer to pay for it.  Just like when I wrote a Sudoku solver, sometimes figuring out how to automate gameplay is more fun than actually playing.  (Disclaimer: I used to claim that Sudoku is a fun math problem and a crappy game.  Now I like playing it.  It's my Android Evo's fault, damn it... it's the perfect game for a handheld, and it has built in strategy advice.  Once I started improving at harder puzzles, there was no escape.)

And so it began.

To be continued...