tag:blogger.com,1999:blog-7359791525336117032.post6800330301294035964..comments2020-04-24T20:58:07.807-05:00Comments on Castles of Air: Speed up your programs with Big O (part 1)Anonymoushttp://www.blogger.com/profile/05324968314168283095noreply@blogger.comBlogger4125tag:blogger.com,1999:blog-7359791525336117032.post-83614946284362123372009-03-05T09:15:00.000-06:002009-03-05T09:15:00.000-06:00Great questions, James.Does your system only work ...Great questions, James.<BR/><BR/><I>Does your system only work if your database is both sorted and static?</I><BR/><BR/>The reason database tables read information faster than ordinary text representations or spreadsheets is that they are indexed. There are two kinds of indexes: <A HREF="http://www.sql-server-performance.com/articles/per/index_data_structures_p1.aspx" REL="nofollow">clustered and non-clustered</A>. A clustered index will physically sort your entire table. Obviously you can only have one clustered index. A non-clustered index is simply a sorted list of one column value (for instance, last name) which then points to the memory location of the correct table row.<BR/><BR/>If you have a database search that is running very slowly, it's usually because you're doing your initial search on a non-indexed column, which means that you have to read nearly every item in your table and examine the values one at a time. It's a dumb dictionary search.<BR/><BR/><I>This could be just my lack of math skills, but in your first example you identified 170,000 entries can be found in 18 searches (n=170,000 -> log n = 18), but the second example you said (n=1,000,000 -> log n = 6). What am I missing?</I><BR/><BR/>When I say it's "18" or "6" that's really a rough estimate, because remember we eliminated the constant multipliers. So we said that the dumb search was O(n), even though it "really" takes 85k operations and not 170k operations.<BR/><BR/>The difference between (log_2 n) and (log<BR/>_10 n) is also a constant multiple -- there's a factor of about 3.32 between them. For example:<BR/>16 = 2^4 = 10^1.204<BR/>and<BR/>4/1.204 = 3.32<BR/>Similarly, 18 is 3 times bigger than 6, so you can how they're similar orders of magnitude. When we eliminate that factor of 3.3, the calculation will be correct.<BR/><BR/>Maybe we could make the dictionary search "really" take (log_10 n) operations instead of (log_2 n) operations if we divide up the dictionary into ten equally sized parts instead of just two. Then it would indeed take 6 operations and not 18. But there are additional hidden costs associated with dividing the dictionary up in pieces, so you wouldn't gain any performance that way.<BR/><BR/>Because it's a headache to price out all these different hidden costs, and not very reliable anyway, that's why big O is kind of a rule of thumb and we just ignore constant multipliers.Anonymoushttps://www.blogger.com/profile/05324968314168283095noreply@blogger.comtag:blogger.com,1999:blog-7359791525336117032.post-67606566533797256072009-03-05T08:33:00.000-06:002009-03-05T08:33:00.000-06:00Okay, a couple of questions from a prog newb.Does ...Okay, a couple of questions from a prog newb.<BR/><BR/>Does your system only work if your database is both sorted and static?<BR/><BR/>This could be just my lack of math skills, but in your first example you identified 170,000 entries can be found in 18 searches (n=170,000 -> log n = 18), but the second example you said (n=1,000,000 -> log n = 6). What am I missing?James A. Brownhttps://www.blogger.com/profile/12580984579089010427noreply@blogger.comtag:blogger.com,1999:blog-7359791525336117032.post-7844907299357258432009-03-04T15:01:00.000-06:002009-03-04T15:01:00.000-06:00If you fiddle with that equation, you'll find ...<I>If you fiddle with that equation, you'll find that<BR/>18 > log2 170,000</I><BR/><BR/>I'd work it from the other end: with one try, you can search a 2-word dictionary (is the first word the one that I want? If not, it must be the second one). With two tries, you can break up a dictionary into two sub-dictionaries that can be searched with one try, i.e., 2×2 = 4 words. With three tries, you can break it up into two 2-try-sized sub-dictionaries: 2×4 = 8.<BR/><BR/>More generally, using k tries, you can search a dictionary of n=2^k words. Solving for k gives k = log_2(n).arensbhttps://www.blogger.com/profile/15251547886605570242noreply@blogger.comtag:blogger.com,1999:blog-7359791525336117032.post-24324904920454974672009-03-03T23:22:00.000-06:002009-03-03T23:22:00.000-06:00I was actually relearning (previously learned this...I was actually relearning (previously learned this in my freshman level CS course about two years ago) this concept in my discrete math course today. What my professor taught was pretty much the same, but I really do like your presentation better.<BR/><BR/>Thanks for the blog, I really enjoy it.<BR/><BR/>- An undergrad CS majorBrithttps://www.blogger.com/profile/05478278976899021983noreply@blogger.com