Feb 18, 2010

Github Issue Closing

A little thing I discovered today, when you push something to Github and you're using their issue tracker, you can put "fixes #X" (X being the issue number) in the commit message and Github will close the issue. I know Redmine does this with SVN, and it's cool to see Github does it too.

Feb 15, 2010

Statistics For Programmers III: Differences of Means

This entry is part of a larger series of entries about statistics and an explanation on how to do basic statistics. You can see the first entry here. Again I will say that I am not an expert statistician, so feel free to pick apart this article and help bring it closer to correctness.

One thing that I've noticed programmers (including myself) do when comparing two versions of code is they run it once for both versions and compare the time it takes. If one is faster, then they conclude that that one is faster and take it.

It is possible that this conclusion is correct, that the faster one is indeed actually faster, however the evidence here is not very good. It is also highly possible that the faster one is only faster because some randomness in the execution time made it seem faster. Take a look at this picture:


Suppose this is the graph of the runtimes of two versions of the same function. The variance is unchanged at 16 (so the standard deviation is 4, which means that the average distance from the mean for a test is 4). The green version of the function has an average runtime that is twice as much as the red version, meaning that it is twice as slow. However it is highly possible that your test for the green function ends up in the left tail and the test for the red function ends up in the right tail, meaning that you'll wrongly conclude that the green version is faster. (Note that another good idea here would be to try and reduce the variance of your function).

In order to get a good idea of which one is actually faster, you need to do some analysis, and for this analysis you need to know the variance of the running time. Unfortunately you cannot get an estimate of the variance with one test (mathematically it results in a divide by zero) so you need to have several tests for both versions in order to create a decent sample. How many tests? That depends on a great many factors, but if we assume that the randomness is a random variable that normally distributed (I'll give a tiny bit of info at the end about what happens if it isn't) then the number of tests depends on what level of confidence you want in your results, and how precise you want your estimates to be. If you want to be 95% sure you are correct instead of 90% sure, you'll either need to accept a lower level of precision or run more tests. On the other hand if you want to be more precise, you'll need to either accept a lower level of confidence in your results, or run more tests.
Finally, the number of tests you'll need depends on the variance of the randomness in your tests. A higher variance will need more tests in order to be more precise.

So let's say we've got some data for our two versions of the code and we want to know which one is faster. Unfortunately it is not as simple as taking the average from one and subtracting the other, since this doesn't take into account the variance of the two distributions.

One way to do this is to compute a new dataset containing the differences between the two distributions. You would do this with the formula (assuming xt is the tth observation from one version of the code and yt is the tth observation from the second version):
dt = xt - yt
You can then take the average and variance of the ds and make a conclusion. We want to test if the difference is equal to zero, because if it is then we cannot say that there is a difference between these two samples. We do this by taking a t-statistic, which uses the following formula (assuming d is the difference, and n is the size of d):
t = average(d) / sqrt(variance(d) / n)
You then compare this value to the t-distribution with n - 1 degrees of freedom (I'll explain what the degrees of freedom are in a future post) to see what is the probability that we would get this t-statistic assuming that the actual mean of the difference is zero. If this probability is very low, then we can say with a certain level of confidence that we are wrong with our assumption that the mean of the difference is zero. This probability is called the p-value (some people call it the power of the test), and using R you can compute this p-value:
2 * pt( -abs(t), df = n - 1)
This will give you a value between 0 and 1, which is the probability that if the mean of the difference was actually zero, there is a p-value percent probability of us getting a sample that gives us this t-statistic. If this probability is really low, then we can say that it is more likely that our assumption of zero mean is wrong. If the difference is not zero, that means that one of the two versions of our code is faster, which one being the one with a lower average runtime.

You can also do this using confidence intervals. This is where you compute a range, and if zero is in this range then you cannot say that the difference is not zero. With a confidence interval you need some t-critical value for your level of confidence. If you want to be C% confident, you can use the formula in R to get the t-critical value (you divide by 2 because the qt() function in R is for a one-tailed test):
abs(qt((1 - C) / 2, df = n - 1))
You can then create your confidence interval:
lower_bound = average(d) - tcrit * sqrt(variance(d) / n)
upper_bound = average(d) + tcrit * sqrt(variance(d) / n)
If zero is in this interval, then you cannot say that there is a difference between the runtimes of the two functions.

A quick note on normality: we assume here that the randomness in the runtime is normally distributed. If we don't know that it is normal, that means that the "t-statistic" we calculate does not actually follow a t-distribution and the results we get from computing t-statistics and p-values is pretty meaningless. However this problem can be remedied by using a large sample, because as the sample size goes to infinity the distribution of our "t-statistic" converges to the normal distribution, and we can still use the p-value. This also requires a few assumptions, which I will talk about in later posts.

« Statistics For Programmers II: Estimators and Random Variables | Statistics For Programmers IV: Ordinary Least Squares »

Feb 14, 2010

Jarm

Well, it's time to announce yet another pet project of mine. This time it is a little Javascript game using gameQuery where you wander around planting stuff. I guess it's kinda like that Farmville game that is making its way around Facebook (I personally have no clue what that game is about, I avoid Facebook applications like the plague), however I got the idea more from a board game called Agricola, although this game and my game are not all that similar.

It's still far from finished, all you do right now is go around finding seeds, plant them, watch them grow, and then pick them. I'm hoping to add some sort of money system so that you can buy stuff to advance your garden-growing skills, and also some random events that could happen that will hinder your garden's growth. Hopefully someday I'll get some sort of points system so that maybe after X amount of time the game will end and your points are tallied up, and you have to somehow beat your old high score.

You can check out the source at GitHub if you're interested in playing it. It runs just fine in Firefox 3+ and Chrome (both tested on Linux only, not sure about anything else).

Feb 11, 2010

gameQuery

I've been working recently with a jQuery plugin called gameQuery, which is a basic 2D game development library for Javascript. It is designed to help you make simple browser games without the use of Flash or Java - which is nice, since Flash development doesn't work so well in Ubuntu (Adobe does provide an ActionScript compiler for Linux, but it doesn't really have any tools with it) and Javascript is a lot easier to program in than Java.

Right now it is still very basic, however it handles certain primitives for you quite well. Animations are implemented for you using tiled images (this is where you put all your frames into a row/column in a single image) which is very handy and much better than using animated .gifs. It will also do the preloading for those animations so that they don't need to be done during the gameplay. There are also mechanisms for grouping different animations into layers so that you can have different background/foreground groups, sprite groups, etc.
There's also some nice little addons for the game loop and input handling.

Other than that, there isn't really much to it. It has sound, although it is experimental and is not guaranteed to work (I haven't tested that part out yet). Supposedly it has collision detection, but it doesn't seem to work for me. Maybe I'll fix it and submit a patch.

I've been expanding on it a lot, I've made up two view classes that handle a scrolling background. You can either just straight up scroll by calling scroll(dx, dy), or you can have a "locked view" where it follows a particular sprite (it nicely handles the edge of the screen!).
I've also built-in a more efficient object management system using quad-trees, which are a space-partitioning data structure. It means that when you do location-dependent things like collision-detection you don't need to query every single object in the game, you just need to update the visible ones. This might also be good for animation rendering since you wouldn't need to update animations that are not visible.

Anyway give it a shot if you like, but remember that it is still pretty alpha! And don't bother with IE since it is really too slow for something like games.

Feb 9, 2010

On Converting People to Ubuntu

Personally I don't care too much if others like Ubuntu or not, so I don't really attempt to push it on any of my friends - back when I was still a Windows guy I absolutely hated Linux fanboys always trying to force it on me, citing irrelevant reasons like "it's free!" or something like that (my response was "yes, but does it work?", although that leads into a complicated discussion of the definition of "working" and we won't go there today). These days freedom is a lot more important, but I still don't think that it should be forced on them - "you have to understand, most of these people are not ready to be unplugged":

I guess this video isn't completely relevant to what I'm saying, I just like the message in that scene :) also the ending of that clip is good.

Anyway getting to my main point here, I think that many of the Linux geeks who preach about Linux would do well to read Ubuntu's guidelines on converting users: https://wiki.ubuntu.com/ConvertFriends.

Now I'll make a playful jab at my Mac user friends ;) scroll up a bit and read my sentence that says "freedom is a lot more important." That's one of my main reasons for not going with Apple - I feel bad enough owning an iPod Shuffle.

Back to Jaunty

I manage to brick my Ubuntu install today by switching the home partition from XFS to ext3. Don't ask why I had my home partition as XFS, it was one of those "it seemed like a good idea at the time" things and I really don't have much of a reason other than that. Why did I switch it back? Well, I like having the ability to shrink a partition and since my Windows one is full of games, the home partition is really the only one left. I was thinking of sticking the Lucid alpha in there to see what happens. Second, I couldn't seem to find an easy-to-install XFS driver for Windows, and sometimes it is handy to be able to access your home folder from Windows - read-only of course, I have some kind of paranoia about Windows and I'm afraid that if I allow the Windows partition to have write-access to my Ubuntu partitions then all hell will break loose.

Then I switched back to Jaunty because my Karmic LiveCD wouldn't boot.

How do I feel now that I'm back to Jaunty? This is probably because it's a brand new install (or maybe ext3 is just that much faster than XFS), but it's speedy. It boots way faster than Karmic did. Things are much more snappy, Firefox starts right away, and things actually respond when I click on them and not 5-10 seconds later. It's nice. Oh, and my webcam works again since I'm back on a 2.6.28 kernel.
What's bad? Old versions of things - I'm back on Firefox 3.0 which is a bit slower than 3.5. I can probably upgrade, but that is something for later. I've also noticed some slight rendering differences on pages that I had open before and after the reinstall, which is interesting. I didn't think that there was that much of a difference. Other than that, I don't really have a problem with going back, and I might just hang out until Lucid comes out to give that a shot!

Feb 6, 2010

Programming the Wiimote in Ubuntu

This post has been a long time in coming, but I finally got up off my ass and decided to figure out how to code using a Wiimote in Ubuntu.

First, you need a Wiimote (obviously). Note that you don't actually need a Wii to do this, you can just head on down to Future Shop or whatever and grab a Wiimote. It should still work.
Second you need a Bluetooth adapter. I've got a RocketFish micro-dongle thing which works just fine, I'll leave it to you to find something for your machine.
Third, I'm going to assume you have Bluetooth working on your computer, otherwise there isn't much hope for you since the Wiimote is a Bluetooth device.

Let's start by grabbing the required libraries. In Ubuntu you can go:
sudo apt-get install lswm libcwiid1-dev libbluetooth-dev wmgui
To test to see if your adapter can detect the Wiimote, just run this command and press the 1 and 2 buttons:
lswm
If it displays a bunch of hex characters, then your Wiimote has been detected and you're set to go. If it doesn't display it right away and the LEDs are still flashing on the Wiimote, run it again since lswm doesn't wait very long and you might have missed it.

You can also test it out using wmgui, which basically shows you all the various sensors the Wiimote has in it. This includes the buttons, accelerometer and IR sensor. It's pretty easy to get it going, the only issue comes when you want to start using the IR. Since Wii sensor bar is actually just a bunch of infrared LEDs, you just need an IR light source. Unfortunately since the Wii bar doesn't plug into my computer (it has some weird plug that connects to the Wii), I went with something simpler - a candle. If you actually start programming using the IR it'd probably work better if you used two candles so that you can detect the orientation of the Wiimote, but it still works fine with one.

Now let's get to some code! I'll be using C with gcc. There's a Python version of this library but I haven't tested it and don't know how it works.
First, we include the needed libraries:
#include <bluetooth/bluetooth.h>
#include <cwiid.h>
We then set up the various structs we need to interact with the Wiimote:
bdaddr_t bdaddr = *BDADDR_ANY;
cwiid_wiimote_t * wiimote = NULL;
We obviously need a cwiid_wiimote_t struct in order to call functions on it, we also need the bdaddr_t struct to give us the Bluetooth code for the Wiimote (remember those hex digits that lswm printed out?).

Next we set up the Wiimote. Normally you would wrap these up in if statements for error checking, but I'm going to leave those out for clarity:
/* Connect to the Wiimote - the CWIID_FLAG_MESG_IFC
means that we want to use a callback method for
interacting with the device */
wiimote = cwiid_open(&bdaddr, CWIID_FLAG_MESG_IFC);

/* Now we register a callback. I'll define this later. */
cwiid_set_mesg_callback(wiimote, &wii_callback);

/* Set the reporting mode. This says which messages we want
to receive in our callback. Let's only get buttons for now. */
cwiid_set_rpt_mode(wiimote, CWIID_RPT_BTN);
Then we define our callback function. I'll just make it so that when you press B the program will quit:

void wii_callback(cwiid_wiimote_t * wiimote, /* The Wiimote struct */
int msg_count, /* The number of messages received */
union cwiid_mesg msgs[], /* The messages themselves */
struct timespec * timestamp /* The timestamp of the callback */){
int i;
for (i = 0; i < msg_count; i++){
if (msgs[i].type == CWIID_MESG_BTN){
if (msgs[i].btn_mesg.buttons & CWIID_BTN_B){
printf("B button pushed!\n");

/* Close the connection to the Wiimote and quit */
cwiid_close(wiimote);
exit(0);
}
}
}
}
The way the callback works is that it sends you an array of messages that happened. The message is a union of different types of messages, determined by the type field. For the button message, it includes an integer that for each bit it says whether or not each button is pushed. There are many other types of events, you can check it all out by browsing the cwiid source code (in Ubuntu it installs to /usr/include/cwiid.h) and by looking at the cwiid docs. Have fun playing with the Wiimote!

If you want to enable other functions, you just use a binary OR for the reporting mode:
/* Enable the buttons, accelerometer, and nunchuk */
cwiid_set_rpt_mode(wiimote, CWIID_RPT_BTN | CWIID_RPT_ACC | CWIID_RPT_NUNCHUK);