Thursday, April 18, 2013

A few more thoughts about Bitcoin and numeric values


Sorry to keep harping on this one topic, but I was reading the "Myths" page on the Bitcoin wiki and something struck me as kind of funny and strange.

Bitcoin has been compared to the gold standard, because the coins "exist" (so to speak) in limited supply. Indeed, believers in Bitcoin and believers in precious metal currencies seem to be the same people in a lot of cases.

But here's the interesting part: some of the arguments for Bitcoin openly undermine and negate the arguments for using gold and silver as currency. Check this out.

Wednesday, April 17, 2013

Further discussion about Bitcoin

My previous post about Bitcoin invited discussion, but much of that discussion took place on Facebook and Google+. A lot of good insights and links were offered, but brief messages on social media are hard to search and learn from in the future. So I'm writing a second post to acknowledge these responses, clear up some of the misconceptions I had in the first, and offer useful links for anyone who wants more information in the future.

Saturday, April 13, 2013

Open discussion on Bitcoin

I've been hearing a lot of stories lately about Bitcoin, and while I don't fully get it, I'm currently on the side of people who think it's ultimately going to wind up being a very technically innovative Ponzi scheme. I'm not making this post to argue about that. Realizing there are a lot of cheerleaders for Bitcoin out there, this post will be heavily moderated. Sales pitches for Bitcoin will not be approved, nor will posts calling me names, although pro-Bitcoin posters are welcome as long as they can contribute to explaining either the technological details or the details of economic distribution. A lot of the purpose of writing this post is to foster some technical discussion and assemble my thoughts about how it works.

Bitcoin is an "alternative currency," which is something political Libertarians are always saying we need. It is meant to be decentralized, not controlled by any one government or individual. At the present time, one Bitcoin is worth around $100, down from a high last week of around $200. I am personally not particularly interested in getting involved with Bitcoin, either by trading my dollars for bitcoins and spending the bitcoins, or by speculating on buying low/selling high, or by mining for them. However, with my computer science degrees I'm mildly interested in working out exactly how the system works. I found a FAQ page, and I found the original technical paper by Satoshi Nakamoto, but I still don't have a full handle on it yet. Eventually I'm probably going to have to download the open source code and look at it, but I'm still trying to decide if that's worth my while.

Saturday, May 19, 2012

Thoughts on piracy

According to Forbes, Game of Thrones will likely be the most pirated show of all time. There are many reasons for this, one being HBO's business model, as summed up nicely in this comic by The Oatmeal.


I myself am watching Game of Thrones legally, yet not paying for it. My sister records the episodes and I watch them at her place. In effect, I'm piggybacking on her account and we're getting two views for the price of one subscription, since I don't have cable at all.

Personally, I've had mixed feelings about piracy for a long time, and I'm still not sure what my position will eventually evolve into. I've got friends in two camps on this. My artistic friends (game designers, people involved with film and music, and those who can draw well) generally think that piracy is one of the greatest sins of the modern world. Meanwhile, my techie friends seem to have not even the slightest trace of guilt about it.

A great case in point: I watched Breaking Bad for the first time last year, on the advice of a coworker. First he said "You should check it out." I said I'd see if it was on Netflix; it was, and I got hooked. I feel good about watching things on Netflix, because in effect I've already paid for it, and the money that I pay indirectly gets back to the studio via whatever contract they've negotiated.

When I'd finished the third season, I realized that the fourth and final season (so far) was not available yet. So I told my coworker, "Well, I'm gonna wait a while until Netflix uploads the next season." I got funny looks. He said "Why don't you just torrent it?" -- as if there is no reason in the world not to do that, and I must be some kind of Amish hippy or something to not have thought of it.

I'm not saying I took the moral high ground here. I held out for a few more days before I talked myself into torrenting it. But at least I'm aware that there is a moral issue. I discussed it a few times with said coworkers over lunch, and they at least acknowledge that it might be a problem for the studios. I have a strong suspicion that most people under the age of 20 would not even go as far as recognizing that it's illegal.

To be clear: Most of the movies I watch are either in theaters, purchased or rented DVDs, or legally endorsed streaming sources. Most of my music is from CDs I own or MP3s I purchased online. Most games I've played and enjoyed are either free or paid for. Most, but definitely not all.

Let me play "angel's advocate" and try to fairly represent the side of my art friends.
  • Being an artist, of one sort or another, is hard work, usually for low pay.
  • Making money as an artist depends on some sort of reliable revenue.
  • Without a business model that produces reliable revenue, big budget art will not be viable. Movies like The Avengers have to show in theaters and sell DVDs, or there's no economic incentive (hence no ability) to make them. Games like Diablo III and Skyrim need to pay their designers, actors, modelers, and developers, and that means they can't afford to give it away. Musicians need to make enough money to live on. And so on.
  • When you consume art for free that you have been asked to pay for -- watch a movie, play a game, put music in your collection -- you are stealing it. The artist deserves to make money for producing things that you enjoy, and you are taking advantage of them by not paying for it. (This is one of the points I am a little ambivalent about. Just bear in mind that I'm trying to accurately represent the artist side of the equation in making these points.)
  • The more art people steal, the more difficult it becomes to make money as an artist. It's a tragedy of the commons situation. Eventually we may get to the point where the quality of art declines dramatically, because the really talented people will not be able to produce art full time, nor will the budgets be there for big projects.
  • Therefore, by pirating, you're hurting everyone in the long run.
I don't really want to post the pirate's justification for pirating in much detail. I've heard them presented in many conversations; I've even tried using a few myself. But even to me, they ring a little bit hollow. They strike me as the rationalizations of someone who knows they're doing something wrong but wants to keep doing it.

For the sake of putting them out there, here are some briefer hits on the pro-piracy arguments:
  • It's not really stealing if you copy something without destroying the original.
  • Information should be free anyway.
  • I wouldn't pay for it even if I couldn't pirate it, I'm too poor or it's not that good.
  • I tried to give HBO my money but they made it too hard. (See the cartoon above.)
Many of the arguments come up in this Reddit conversation about "Thrones," which the Forbes article also links.

I don't want to pretend that these are really strong arguments from an ethical point of view, but I do want to point out a few things about managing incentives properly.

There is a saying among economists, that you put a lock on your bike to keep honest people from stealing it. In other words, leaving your bicycle unlocked is just too tempting, and some people who wouldn't normally steal a bicycle may succumb if it's just sitting there unlocked. Meanwhile, a really determined criminal will still steal your bike with or without the lock. It's just that if you have the lock, the probability that your bike will be stolen on any given day goes way down.

In other other words, we all have a certain moral threshold, some lower than others. I'm pretty sure I wouldn't steal a bike, with or without a lock; and yet I stole season 4 of Breaking Bad. Where does the moral calculus lie?

It seems to me that people decide on a course of action based on a variety of factors, of which the primary motivators are
  1. How great the benefit is for doing something unethical (If there's no benefit, the choice would be easy) versus how great is the fear of being punished for your actions (taking into account both the likelihood of being caught and severity of punishment).
  2. How difficult the action is to perform. (In the bike lock example, moral objections plus the difficulty of breaking a lock will be enough to deter some people from stealing a bike, whereas without the difficulty factor, they succumb.)
  3. What magnitude of harm they think their actions might cause. (Robin Hood is a prime example. In this case, the principle that "Stealing is wrong" butts up against the observation that it will do more good than harm. Most people would place "Stealing $1,000 from a billionaire" as a lesser evil than "Stealing $1,000 from a person who needs that money to eat.")
There may be more factors there, but let's just take these three to start with. On these axes, media piracy falls in an area for most people that makes it really easy to rationalize.

How great is the benefit of piracy? Well, not really that great. You can watch a movie that you would otherwise miss. It might not even be a very good movie, or else you'd be more likely to pay for it. There is benefit, though, as The Oatmeal points out. Sometimes you're saving the cost of a ticket or DVD, and sometimes you're seeing something now that won't even be available for a year or more.

But what about punishment? Despite a few well publicized cases in the last decade that turned out to be a PR disaster for the RIAA, people generally know that the chances of being caught, and of charges sticking, for something that millions of people do are minuscule at best.

How difficult is it? The first time you try it, it takes a little research. On subsequent tries, it's trivially easy, requiring only some nearly free bandwidth, and a few hours of slightly slower internet access.

What magnitude of harm? This is the major point of dispute. Even if we completely grant that it's wrong to pirate, and even if we accept the fact that artists need that money, the individual harm that my pirating a movie causes is still very small. Depending on when they say it, the MPAA claims that piracy costs their industry $250 billion, $58 billion, or $6 billion per year. A piece in Ars Technica suggests that it's not nearly as bad as any of those.

Certainly, if someone pirates a movie that they would otherwise have paid $20 for, then the studio loses $20. But part of the effect of piracy is that people wind up seeing a lot more movies than they would actually buy, and most people see their "theft" of any individual movie as being worth a buck or two at most... and one TV episode being worth far less. Again: I'm not saying any of this to argue that it's actually okay to do this, just pointing out how the calculation shakes out for people who pirate regularly. When a pirate steals an episode of a show, they probably think of that individual action as costing a few cents.

In HBO's case, on one hand I think they're being a little bit foolish by making customers jump through so many hoops to get a copy of the episodes. Yes, if they force somebody to subscribe to their full service then they wind up with a lot of money. But if somebody doesn't subscribe even though they would have been willing to buy some episodes on Netflix or Hulu for, say, $20, then that's $20 that HBO could have had and simply doesn't get. I'm not the CEO of HBO, of course, and they might have calculated the difference. But if their business model is to force customers to stay subscribed to traditional cable service forever, I don't think that model will last very far into the future. I know I'm not the only one who's just abandoned cable entirely in favor of paid online entertainment, and I imagine this will become more common going forward.

On the other hand... the fact that HBO doesn't charge a reasonable price for their services doesn't automatically entitle people to steal them. As with all capitalistic offers, your options within the law are to accept the services, find a legitimate bargain, or just not use them.

But despite all that rationalizing, we still have those issues in the background that make piracy an easy thing to do. The act of pirating is ridiculously easy, and will only get easier. There are steps that many game companies are taking to prohibit piracy, such as the growing trend to do what Blizzard has just done with Diablo III, which is to require that even mostly single player games connect to a server at all times. However, in the case of music, media and text... I'm fairly well convinced that it will never get more difficult to pirate them from a technical perspective.

The reason is simple. No matter how many safeguards they put on DVDs and ebooks, eventually you have to let paying customers see and hear it. That means that you have to allow every customer to decode it and play it back visually and audibly, and that means that you can capture the output to a file as well as the screen. Look at it this way: in the worst case scenario for pirates, even if the software was completely flawless, it wouldn't be able to prevent external recording devices from just taking a video of the video.

When you look at it that way, legal wrangling like SOPA and PIPA are all that media companies really have to turn back this tide, and they're not good tools. They'll never make media harder to copy, and they won't convince people that the pirated video is costing them more than a couple of bucks per "theft". So the cost/benefit calculation of pirating is their only weapon -- trying to impose draconian punishments on people who get caught, so that they won't do it.

Yet SOPA and PIPA had all kinds of problems because they overreached, causing companies which do not encourage piracy to protest that this would hurt their business model. In effect, the cost to the society for implementing those measures was worse than the cost to the media companies. In my example above, I said you could copy movies with a video camera. Suppose the MPAA decided to push Congress for a law that made owning a video camera a federal violation subject to a hefty fine. That might solve some of their problems, but it wouldn't be a solution that citizens would stand for, for many legitimate reasons.

So as I've felt for years, I don't know what the solution to piracy is. Since it is a tragedy of the commons problem, everyone who pirates contributes a small amount to the problem, but the overall consequences are large. And I don't want HBO to go away. From what I've read about them, it seems to me like it would be extremely difficult for any other company to undertake such large scale projects with such a generous lack of censorship. As popular as it is to talk about "crowdsourcing" these days, I think people underestimate the magnitude of shared resources that has to go into a huge entertainment project.

There is, of course, a lot of fat that can be trimmed out of the publishing industry in general. Greta Christina recently proved that you don't have to go through a giant publishing corporation to make money on book sales, and Joss Whedon showed that you can make money on a silly one-off independent film project. But that's probably not a fair standard, since he's Joss Whedon: already an established Hollywood presence (more so now that he's directed one of the biggest box office hits of all time), with fairly big name actors willing to work with him pro bono.

It could be that some art forms will simply be unavailable, or will decline sharply in quality, because it's no longer feasible to produce expensive things and make money with them. That'll be sad. But with all that said... I may be a hypocrite, but torrenting is still fun.

Wednesday, October 12, 2011

Troubleshooting my own user idiocy

Random dumb tech story: For months my computer microphone has had a lot of loud static on it. Every time I say something in a game like Left 4 Dead 2, people complain and tell me not to talk again.

I finally got around to investigating the issue, which involved unplugging things, looking at the sound controls, etc.  I was bewildered to find that the computer was recording noise even when the mic was unplugged.  Then I realized: I have a USB webcam that I rarely use.  It's plugged into the back, it has its own built in mic... and it's hanging behind my desk RIGHT NEXT TO THE FAN.

Everything's crystal clear now.

Monday, September 19, 2011

Operation HackMaster Crit Tables, episode 2

Now that I've explained what data structures are for, I can finally explain how I approached the problem of deciphering those ponderous HackMaster tables.

First of all, I discovered to my dismay that the tables were only available in the form of PDF files -- images, not text.  Just as I was about to tell my friend that I didn't want to waste time converting six pages of tiny print to very usable data, my lovely assistant Lynnea (the aforementioned fiancee who actually plays the game) stepped in and volunteered to do it.  This is one thing about her personality that I've never been able to understand, but she loves doing data entry, filling out forms, etc.  I think it's some kind of OCD thing.  But for whatever reason, she's extremely enthusiastic, diligent, and thorough about this kind of work.  And by the way, if you need this kind of work done, she's available to hire!  Ask me for a resume.  :)

With this powerful slave human resource at my disposal, I wrote up a few sample lines of text in a spreadsheet to show how I wanted them, wound her up and let her go at it.  She cranked the rest out in a surprisingly short time.  I then converted the results to standard comma-separated value format, some of which you can download from here: Hacking weapon table part 1; List of effects.  In case you're not familiar with them, .csv files are a generic text-only format which can be read in a spreadsheet program like Excel, or any standard text editor.

Working with just my sample rows, I set out to work out what the abstract properties of the data were.  The first thing to consider is the way a body part is selected.  In the hacking weapon table, you can see that if you roll a 1-100, you get hit in the "Foot, Top"; if you roll 101-104, you get hit in the heel, 105-136 is Toe, and so on.

This is like a hash table, almost, but it's not one.  If it was a hash table, you'd usually have one body part per number: 1 -> Foot Top, 2 -> Heel, and so on.  Here, we're working with a range of numbers corresponding to each lookup value.

I decided to start with a generic lookup table, where you start with objects which contain a "low" value, a "high" value, and a generic object which can get returned from the lookup. The declaration looks like this:
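import java.util.List;

public class RangeLookup<T>
{
   // Every range in the table, in the order it was added.
   private List<Entry> ranges;

   // One row: rolls from low to high (inclusive) map to item.
   private class Entry
   {
      protected T item;
      protected int low, high;

      public Entry( T i, int l, int h )
      {
         item = i;
         low = l;
         high = h;
      }
   }
   ...
}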
In Java, using the "<T>" notation means that "T" could be any object type.  Even though I wasn't going to be using this lookup table more than once, I like to keep structures as all-purpose as possible.  That's partly because I might want to reuse them in the future, and partly because I want to be able to test how the component works without making it dependent on the equally complex item which will be retrieved by the lookup.

Every structure needs an interface -- a means of communicating with it that only does what you want and hides the guts of it from the rest of the program.  I added an "addEntry" function to the RangeLookup class, so that you could insert a new entry with a high, a low, and a retrieved object of type T.  Then I added a "lookup" function where you send in a number and it gives you an object.  In my implementation, the lookup function simply walks through all of the possible results and checks whether the requested number is between the low and the high.  This would be inefficient if there were going to be a lot of entries, in which case I might have come up with some kind of hashing structure or tree search; but since there are only about 20 or so body parts, it wasn't worth the extra effort and runs fine as is.
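To make that concrete, here is a minimal sketch of what such a class might look like.  It is not the code from the project: the parameter order of addEntry, the choice to return null when nothing matches, and the RangeLookupDemo class are all assumptions made for illustration.  The three sample ranges are the ones quoted from the hacking weapon table above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a range-based lookup table: each entry covers an inclusive
// [low, high] range of rolls and carries one item of type T.
class RangeLookup<T> {
    private final List<Entry<T>> ranges = new ArrayList<>();

    // One row of the table.
    private static class Entry<T> {
        final T item;
        final int low, high;

        Entry(T item, int low, int high) {
            this.item = item;
            this.low = low;
            this.high = high;
        }
    }

    // Insert a new entry covering low..high inclusive (argument order is an assumption).
    public void addEntry(T item, int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        ranges.add(new Entry<>(item, low, high));
    }

    // Linear scan over every entry; returns null when no range contains the roll
    // (returning null is an assumption, not something the post specifies).
    public T lookup(int roll) {
        for (Entry<T> e : ranges) {
            if (roll >= e.low && roll <= e.high) {
                return e.item;
            }
        }
        return null;
    }
}

// Hypothetical demo class, only here to show usage.
class RangeLookupDemo {
    public static void main(String[] args) {
        RangeLookup<String> parts = new RangeLookup<>();
        parts.addEntry("Foot, Top", 1, 100);
        parts.addEntry("Heel", 101, 104);
        parts.addEntry("Toe", 105, 136);
        System.out.println(parts.lookup(102));
    }
}
```

The linear scan costs one comparison pair per entry, which is perfectly fine for a couple dozen body parts.  With thousands of ranges, a sorted structure would be the more natural choice.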

After verifying that this was working right, I created the following additional structures:

  • Looking at the Effects table, it is a basic mapping (in my case, placed in a HashMap) from one string to another.  You put in the code "f", and the resulting effect is "fall prone and drop items".  So, I created a simple object called an "effect," containing "key" and "description."
  • It's a bit more complicated than that, though.  Often the table will contain numbers, but the effects will contain only the symbol "X".  For instance, if the table says "d4" then the relevant effect is "dX", which means "reduce Dexterity by X".  Therefore I made another class called an "Outcome," which contains an Effect AND a number (which may be zero if it's not necessary).
  • I made an EffectTable, which implements the HashMap of Effects.
  • Almost ready to create an actual table object, I first made a class called "CritTableEntry."  This represents a cell in the table.  It contains: a low roll, a high roll, the name of a body part, and a List of effects (because each cell may result in several outcomes, not just one).
  • A CritTable class to put them all together.  This class has an addEntry method and another method for retrieving the entries.
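As a rough sketch of how the first three of those pieces could fit together (not the project's actual code), the classes below model an Effect, an Outcome and an EffectTable.  The method names add, resolve and describe are hypothetical, and so is the assumption that a table cell is always a symbol optionally followed by digits.

```java
import java.util.HashMap;
import java.util.Map;

// A code from the Effects table plus its text, e.g. "f" or "dX".
class Effect {
    final String key;
    final String description;

    Effect(String key, String description) {
        this.key = key;
        this.description = description;
    }
}

// An Effect paired with the number from the table cell (0 if unused).
class Outcome {
    final Effect effect;
    final int amount;

    Outcome(Effect effect, int amount) {
        this.effect = effect;
        this.amount = amount;
    }

    // Fills in the X placeholder for keys such as "dX".  Simplification:
    // assumes the description contains no other capital X.
    String describe() {
        if (effect.key.endsWith("X")) {
            return effect.description.replace("X", Integer.toString(amount));
        }
        return effect.description;
    }
}

// Wraps the HashMap of Effects, keyed by effect code.
class EffectTable {
    private final Map<String, Effect> effects = new HashMap<>();

    void add(Effect effect) {
        effects.put(effect.key, effect);
    }

    // Assumption: a cell code is a symbol optionally followed by digits,
    // so "d4" means the effect "dX" with X = 4.
    Outcome resolve(String code) {
        int split = code.length();
        while (split > 0 && Character.isDigit(code.charAt(split - 1))) {
            split--;
        }
        String symbol = code.substring(0, split);
        String digits = code.substring(split);
        String key = digits.isEmpty() ? symbol : symbol + "X";
        Effect effect = effects.get(key);
        if (effect == null) {
            return null; // unknown code
        }
        int amount = digits.isEmpty() ? 0 : Integer.parseInt(digits);
        return new Outcome(effect, amount);
    }
}
```

With "f" and "dX" loaded, resolve("d4") yields an Outcome whose describe() substitutes 4 for the X, while resolve("f") carries a zero that is simply ignored.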

As a final step, I created a "Reader" class which did the heavy lifting of reading and interpreting the CSV files and adding one row at a time into a generated table.  I don't like to reinvent the wheel, so I googled class libraries which would read CSV files and interpret them as lists.  I settled on using OpenCSV.  I could have written my own parser, but when the task is as common as reading a CSV, I tend to assume that somebody has already done all the work before me and has already been through the process of making all the mistakes and catching the bugs which come up.

Notice that none of these objects deals with input and output directly.  It's preferable to test each component of your program separately as much as possible BEFORE trying to decide what kind of user interface to make.  Your interface should be tailored to the problem space.  As it turns out, I wound up creating several different interfaces before I settled on creating a web application.  I'll discuss these concerns in a later post.

When testing your data structures it's a good idea to create unit tests.  A unit test is a small, self contained application which is designed to test one thing at a time.  You need to think about every possible way that your program might break, create a unit test for each one, and make sure that it works right at the boundary conditions.

Off the top of my head, here are some boundaries of the crit tables that needed to be tested:

  • Spot check several "body part" rolls with random(ish) numbers and see that the returned information matches the table.
  • Spot check several "outcome" rolls in one row and see that the returned effects match the table.
  • Test the boundaries of some rolls.  For instance, on the table I linked, "4301-4492" corresponds to "Arm, upper inner", and "4493-4588" corresponds to "Elbow".  Therefore I have to make sure that a roll of 4492 returns a different part from 4493.
  • Test what happens when the body part roll is 0 (invalid), 1, 10000, and 10001 (invalid).
  • Test what happens when the effect roll is 0, 1, 24, and 25.
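Here is a sketch of what a few of those boundary checks could look like in plain Java, without a test framework.  The bodyPartFor method is a hypothetical stand-in for the real table, holding only the three rows quoted in this post, and treating an out-of-range roll as null is an assumption.

```java
// Hypothetical, self-contained boundary checks; not the project's real tests.
class CritBoundaryChecks {
    // Three rows taken from the post; the real table has many more.
    private static final int[] LOW = {1, 4301, 4493};
    private static final int[] HIGH = {100, 4492, 4588};
    private static final String[] PART = {"Foot, Top", "Arm, upper inner", "Elbow"};

    // Returns the body part for a roll, or null if no row covers it.
    static String bodyPartFor(int roll) {
        for (int i = 0; i < LOW.length; i++) {
            if (roll >= LOW[i] && roll <= HIGH[i]) {
                return PART[i];
            }
        }
        return null;
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void runAll() {
        // Neighboring rolls on either side of a row boundary.
        check("Arm, upper inner".equals(bodyPartFor(4492)), "4492 should be the arm");
        check("Elbow".equals(bodyPartFor(4493)), "4493 should be the elbow");
        // Lowest valid roll and the invalid rolls just outside the table.
        check("Foot, Top".equals(bodyPartFor(1)), "1 should be the top of the foot");
        check(bodyPartFor(0) == null, "0 is invalid");
        check(bodyPartFor(10001) == null, "10001 is invalid");
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("all boundary checks passed");
    }
}
```

Each check targets one specific way the lookup could be off by one, so a failure message points straight at the boundary that broke.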

Keep all your unit tests around forever.  If something breaks, that's a quick way of figuring out which part is not working.  If it's a problem with your data model rather than your user interface, the unit tests will catch it.

Next time I'll be talking about all the different ways of making an interface on the same models.

Wednesday, September 14, 2011

A bit about data structures

I wanted to write another HackMaster post, but what I wanted to write about was the way I approached deciphering the data in the tables and converting them into data structures.  Then I skimmed through some older posts looking for reference points about data structures, and it occurred to me that I've never written any. In order to provide a foundation for the rest of the HackMaster breakdown, I'll have to digress and talk about structures in the abstract.

Whenever you are presented with a problem of modeling some numbers in conceptual space, the first thing you have to figure out before you write a single line of behavioral code is what kind of data structures you are going to use.  Going all the way back to the beginning of this blog, I've emphasized the importance of considering the efficiency of your design and the effect that it has on the Big-O performance of your program.  Thinking about proper data structures can buy you a lot of speed, and it can also make it really easy to visualize your program in small chunks as the complexity increases.

So what's a data structure?  The first thing programmers learn is how to use variables for individual chunks of information, like this:
int x = 3;
String str = "Hello world.";
(Technically, of course, a String object in Java is a whole bunch of characters, which makes it a data structure in itself.  But the nice thing about object-oriented programming is that you don't have to think about it if you don't want to.)

To understand data structures, consider an array.  An array is one of the first slightly more advanced concepts that a beginning programmer will run into.  Instead of storing just one integer, it can store several.  For example, here's a simple representation of part of the Fibonacci sequence:
int[] fib = new int[10];
fib[0] = 1;
fib[1] = 1;
fib[2] = 2;
fib[3] = 3;
fib[4] = 5;
fib[5] = 8;
fib[6] = 13;
fib[7] = 21;
fib[8] = 34;
fib[9] = 55;
When you create a single "int," you're asking the program to set aside a chunk of space in memory, large enough to hold one number.  When you create an array like this, you're asking the program instead to set aside a bigger chunk of memory ten times that size, plus (for some languages) a little bit of extra information about size constraints and such.

But arrays can be wasteful.  What if you want to set aside space that sometimes houses a hundred numbers, and sometimes houses just a few?  You could create an array of size 100, but most of the time that space would be wasted.  That's when you want to use a linked list, where you ask for new memory only at the moment that you actually need it.
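For a quick illustration, here is a sketch that uses Java's built-in java.util.LinkedList rather than a hand-written one.  The fibUpTo method and the GrowingListDemo class are names made up for this example.  The point is that nothing in the code has to guess in advance how many numbers will be stored.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo: a list that grows one node at a time.
class GrowingListDemo {
    // Collects Fibonacci numbers no larger than limit, without choosing a size up front.
    static List<Integer> fibUpTo(int limit) {
        List<Integer> fib = new LinkedList<>();
        int a = 1, b = 1;
        while (a <= limit) {
            fib.add(a);       // a new node is allocated only at this moment
            int next = a + b;
            a = b;
            b = next;
        }
        return fib;
    }

    public static void main(String[] args) {
        System.out.println(fibUpTo(55));
        System.out.println(fibUpTo(5).size());
    }
}
```

The tradeoff is that a linked list gives up the array's instant access to element number N; getting to the Nth item means walking there from the front.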

I'm not dedicating this whole post to the implementation fundamentals of lists, but interested beginners should go check out the Wikipedia article to find out how this works.  (Sidebar: While relying on Wikipedia for information about controversial topics is often unwise, most of the technical topics that are covered are really good.)

Besides linked lists, there are lots of other data structures that you can use depending on your situation:
  • A tree (which may or may not be binary) will hierarchically organize information for you, much like the folder structure on your computer does, shortening the search time as long as you know where you are going.
  • A hash table or map is a structure which will find a value associated with a key, usually very quickly.  An example would be a dictionary search: you supply a word, and the program would retrieve a definition.
You can write your own versions of these structures, or if your language supports it, use predefined classes that create common structures.
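In Java, for instance, the predefined classes live in java.util.  The sketch below uses HashMap for the dictionary example and TreeMap (a tree-based map that keeps its keys sorted) to show the difference in ordering.  The words, the definitions, and the StructureDemo class are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo of two predefined structures from java.util.
class StructureDemo {
    // Hash map: word -> definition, with fast lookup by key.
    static Map<String, String> buildDictionary() {
        Map<String, String> dict = new HashMap<>();
        dict.put("list", "a sequence that grows one node at a time");
        dict.put("array", "a fixed-size block of same-typed values");
        dict.put("tree", "a hierarchy of nodes with parents and children");
        return dict;
    }

    // Supply a word, get back its definition (or a placeholder if it is missing).
    static String define(Map<String, String> dict, String word) {
        return dict.getOrDefault(word, "(no definition)");
    }

    // Copying into a TreeMap sorts the keys, so the first key is the smallest word.
    static String firstWord(Map<String, String> dict) {
        return new TreeMap<>(dict).firstKey();
    }

    public static void main(String[] args) {
        Map<String, String> dict = buildDictionary();
        System.out.println(define(dict, "array"));
        System.out.println(firstWord(dict));
    }
}
```

A HashMap makes no promise about the order of its keys, which is why the sorted view needs the tree.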

Understanding what purpose the various structures serve, and when to use each one, is a key skill in programming interviews.  Often when you are asked "How would you solve this problem?" the best answer is not to blurt the first notion that comes into your head, but to start applying data structures to model the problem space: lists (or specifically, stacks or queues), trees (binary or otherwise), tables (sometimes you can just assume the existence of a database, which is centered around associative tables).

When I hear a problem that lends itself to this, I usually make a beeline to the whiteboard and start thinking out loud: "You're asking about a list of items, so let's describe what's in an item first... then build a linked list out of items..."  Then I'll either write code to illustrate what I'm thinking, or (if the interview is shorter) just sketch out diagrams so that the interviewer understands the description and will probably accept that I know how to implement it.

Software is built a piece at a time.  If you start explaining how you visualize the problem in your head, you can give a much better insight into how you think than if you just start solving the problem directly.  In fact, if you start off strong with this approach but then go off on the wrong track, often the interviewer will be eager to guide you towards his concept of the solution because he's being carried along with your thought process.  This often changes the dynamic of the interview entirely.  Instead of being a room with an interrogator and a suspect, the interviewer may start thinking of himself as your ally and not your judge.  And that's exactly where you want to be when you're looking for work.

Digression's over.  Next time I'll illustrate this when I get back to decoding HackMaster tables.