Jan 31, 2010

Chrome In Linux

For those who care, Chrome is now available for Linux. You can grab it here. It seems to work all right, although I don't use it very much, so you're probably better off just trying it for a while on your own.

Jan 28, 2010

The Touch Book

I figured that with all this hubbub about the iPad, I'd give my somewhat overdue comments on the Touch Book, a netbook/tablet from a small company called Always Innovating.

I've had this little gadget for a few weeks now, although I ordered it last June - which is fine, because I haven't had a strong need for it until now (convenient the way things happen!). It's a pretty good deal, although for a netbook it's a bit on the steep side. It starts at $399 if you buy the complete deal, but you're going to end up paying about $460-470 after exchange rates, shipping and customs. Not a problem though - we'd be paying that much in taxes anyway if we went out to Future Shop or Best Buy.

What are the good aspects?
  • Tablet Mode - The screen comes off. Every part of the computer is actually inside the screen; the only things in the lower half are the keyboard, touch pad, and an extra battery (which means if you get the keyboard too, you'll have double the battery life). Once the screen is detached it becomes a touch screen, and you do everything you normally would with the machine - read an eBook, write notes, etc. It comes with software that is great for this: the program called "Note Taker" by default gives you a ruled piece of paper that you can draw on to your heart's content, and it even opens up PDFs for you to draw on - keep in mind it does not copy the PDF (which is a good thing, since it won't alter the PDF), so you can't delete the original PDF.
    Anyway, when you're in tablet mode it will automatically pop up an onscreen keyboard for you to use.
  • Battery Life - You can get about 7 hours of actual usage out of it before needing a recharge. Why/how? It uses an ARM CPU, which has lower power consumption than Intel/AMD CPUs. This also means that you can't run anything x86 on it, so don't think about using Windows - you're limited to things that run on ARM (the built-in Touch Book OS is pretty good, and Ubuntu has an ARM port, so we don't really have a problem here!).
  • Solid State - the hard drive is actually an SD card, so there's no hard drive noise. Combine this with the fact that - as far as I can tell - there are no fans in this thing, and there isn't really any noise that comes from it at all.
  • Tinkerable - the back panel of the screen comes off, revealing the SD card slot and several internal USB ports. This means that if you have a USB drive that you want to keep in the computer for a while but don't want it poking out the side, you can just stick it inside and it is nice and out of the way.
  • It's a small company - as an economist-in-training, I naturally dislike monopolies and small oligopolies. It is good to see a small company marketing a product so that we don't always have to go with Apple or HP or whatever for a laptop.
Anyway there are still some drawbacks, and this wouldn't be a good review if I didn't go over them a bit.
  • Software - while I do like Linux, I've been a bit spoiled by how bug-free and smoothly things run in Ubuntu. You can put Ubuntu onto this machine, although I think I'd rather keep the niftier features that come with the Touch Book OS. There are a few UI problems and other bugs; for example, when I double-click the title bar of a window, the window decorator seems to crash - all my windows are still open, they just don't have title bars or borders (UPDATE: this bug has been fixed in the latest version, which I will try to update to). Another problem is that the only way to check free space (that I can find) is through the 'df' command-line utility, since the bottom of the file browser says "check the Storage Control Panel", which unfortunately does not exist - or if it does, I can't find it. Finally, the upgrader it comes with doesn't work, so to upgrade you need to grab the image of the new OS off their website and copy it onto the SD card. I'm not sure how to do that yet (it will be my project for the next little while), but it definitely won't be happening on another computer, since the only things in my house that can read an SD card other than this laptop are my printer and my camera.
  • A bit sluggish - I know it's a netbook and not supposed to be blazing fast, but one of the main things I'd like to use this for is as an eBook reader, and when it takes 20-30 seconds to go to the next page of an 11MB PDF it interrupts the flow a bit (good thing I haven't tried Russell & Norvig yet - that thing clocks in at 36MB!).
  • Tablet UI - this isn't about the special UI they have, but the regular one in tablet mode. While the touch screen is quite precise when you're pressing near the centre, you can't quite reach the edges. This is really bad for scroll bars and close buttons, so when reading a PDF page that doesn't fit on the screen you can't actually read the bottom of the page, because you can't scroll down. It would be nice if dragging your finger along the screen scrolled it, but the only apps that really do that are Midori (the browser that comes with it) and the Note Taker - although now that I think about it, I can probably just try opening the PDF in the Note Taker and not writing on it...
  • A bit flimsy - one issue is that the flap on the cover doesn't fit back on perfectly; you can actually fit your fingernail underneath and lift it up without unlocking it. I think this is a disaster waiting to happen - a piece of paper slips in there, or something gets caught and pulls up on the flap, etc. Another is that it doesn't quite stay closed after you close it; if you bump something, the cover sorta bounces up and down. A simple fix would be to put some kind of latch there, although I'm not sure how that would fit in with the tablet idea, so maybe it isn't a great fix.
  • Not quite as advertised - I think the pictures and statements on the site aren't quite as accurate as they seem. As I mentioned before, the battery life is definitely not 10 hours, despite that number being plastered all over their site.
    Second, it is not "always on" - it turns on and off like a regular computer. You can put it into standby by pushing the power button, and then it is instant-on/off, but that drains the battery big-time. If you used this like a phone, you would be recharging the battery on a daily basis.
    Third, although they say 7 USB ports, 2 of those are mini-USB ports, and while there are three internal full-size USB ports, two of them are already in use by what I'm guessing are the wireless adapter and the Bluetooth adapter. It's not a huge issue; I just think they're hiding a bit of the detail from the front page.
    Fourth, the screen does not bend all the way back and around (or maybe it does and mine just made bad noises when I tried, which scared me); instead, you take the screen off, turn it around, and put it back on.
    Finally, I wish they'd say on the front page that it is still beta software - it might not be officially beta software, but it certainly feels like it.
So what is my general opinion of the Touch Book? It is very, very Linux. It is very customizable and has some pretty cool gadgets, but the UI is a little weird and broken in places, and there are a few bugs. There is a small community around it that populates the IRC channel #touchbook and fills up the wiki, which is pretty cool, but it is mainly technical users. So while I am happy with the Touch Book, at the moment I don't think it is really ready for prime time.

Jan 27, 2010

Firebug 1.5.0 in 64-bit Ubuntu

Looks like the latest version of Firebug (as of today, Jan. 27, 2010) crashes Firefox in 64-bit Ubuntu when you try to open it. This is unfortunate; to fix it you have to install an older version of Firebug:

1) Uninstall Firebug from the addons menu.
2) Go here: http://getfirebug.com/releases/firebug/1.4/
3) Grab firebug-1.4.5.xpi from the bottom, and install it.

Jan 26, 2010

Max/Min Problems in LaTeX

Suppose you want to write a maximization or minimization problem in LaTeX, and you want the variable you are maximizing over to appear directly underneath the "Max", the way the index sits under a summation sign.

Unfortunately, LaTeX can't do this out of the box, but there's a handy little trick. In the preamble, right before \begin{document}, put:
\usepackage{amsmath}
\DeclareMathOperator*{\Max}{Max}
\begin{document}
Then, inside a displayed equation (the subscript only lands underneath the Max in display math), you can write the max problem as:
V(p, y) = \Max_x U(x) : p \cdot x \le y
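For completeness, here's a minimal document (my own sketch, nothing official) that puts it all together; note that the formula goes in display math, since that's where the subscript lands underneath the Max:

\documentclass{article}
\usepackage{amsmath}
% The starred form puts the subscript underneath the operator in display math.
\DeclareMathOperator*{\Max}{Max}
\begin{document}
\[
  V(p, y) = \Max_x \; U(x) : p \cdot x \le y
\]
\end{document}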

Jan 25, 2010

Statistics For Programmers II: Estimators and Random Variables

This entry is part of a larger series of entries about statistics and an explanation on how to do basic statistics. You can see the first entry here. Again I will say that I am not an expert statistician, so feel free to pick apart this article and help bring it closer to correctness.

The whole point of statistics is that there are some things that we don't know and can't really measure directly - maybe because it costs too much, or there's a lot of measurement error, or it's actually impossible to measure the thing accurately. With statistics we take a sample of something and attempt to derive results about the population as a whole based on that sample. For example, you might want to test how long it takes for your code to process something. Unfortunately, if you run it over and over you're not going to get the same result each time. How do you know what the actual value is? And more importantly, how do you compare that actual value with other actual values? Another example: you want to compare the processing time across two different commits - how do you know that one is faster than the other? You could run each one once and see which is faster, but the problem with that is you could just hit a statistical blip and draw the wrong conclusion. Statistics will help you decide, to a certain probability, which one is actually faster.

Now that we've got these unknown values that we would like to know about, how do we go about getting information about them? We do this using things called estimators. An estimator is basically a function that takes your sample (and perhaps a few other values) as input and returns an estimate of the value you are trying to learn about. One of the main things I will be talking about over this series is how to obtain these estimators, and how to analyze them to find out how good an estimate they produce.
I've also mentioned this idea of a sample; for the most part I will be referring to a random sample, which means the observations are drawn at random from the population.
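To make that a little more concrete, here's a small Ruby sketch (my own made-up workload and sample size, not anything official) of the timing example above: run the code a bunch of times, treat those runs as a random sample, and use the sample mean as an estimator of the true average running time.

require 'benchmark'

# The thing whose "true" average running time we want to know about.
def workload
  100_000.times { Math.sqrt(rand) }
end

# Collect a random sample of timings.
n = 30
sample = Array.new(n) { Benchmark.realtime { workload } }

# The estimator: a function of the sample that spits out an estimate.
sample_mean = sample.sum / n

puts "estimated average running time: #{sample_mean.round(4)} seconds"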

One of the major mistakes people make when doing statistics is to assume that estimators can be manipulated just like other functions. This is incorrect, because usually the estimator is a random variable, often because it is a function of a random error term (which, as has been pointed out, is not always random). This random error term is the combination of all the unobserved aspects of a variable; for example, when you're measuring the execution speed of a program the random error might be caused by how many other processes are running, how much CPU time those processes used, how hot your computer is running, that one-in-a-million time that something had to swap, etc. Basically, anything that you do not explicitly put into your equation is captured by the random error.
What is this random variable thing? A random variable is a variable that doesn't take a single value the way regular variables do; instead, each time you observe it, it can take a different value. You analyze these by associating a probability distribution with the random variable, which lets you calculate certain properties. Examples include the binomial distribution (the number of sixes in a handful of rolls of a non-weighted die follows this one) or the normal distribution. For a more precise definition of a random variable, look on Wikipedia.
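As a quick toy example (mine, not from any textbook), here's a random variable you can actually play with in Ruby - a single roll of a fair die - along with its observed distribution:

# Each observation of the random variable takes a different value,
# but the values follow a (discrete uniform) distribution.
rolls = Array.new(10_000) { rand(1..6) }

rolls.tally.sort.each do |face, count|
  puts "#{face}: #{(100.0 * count / rolls.size).round(1)}%"   # each face comes up ~16.7% of the time
end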
There are certain hiccups when it comes to working with random variables, in that the standard mathematical operations don't quite work the same as with regular variables. For example, if you do a least squares regression (I'll explain this in detail in a later post) to obtain an estimate of ln(y), you can't just exponentiate that to get an estimate of y. It doesn't work that way. This type of thing comes up when you have heteroskedasticity, which is when the variance of the error term is a function of something you are measuring - for example, they could be linearly correlated.
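Here's a tiny numeric sketch (my own made-up numbers) of why exponentiating an estimate of ln(y) doesn't give you an estimate of y - the average of exp(x) is not exp(the average of x):

# Three hypothetical observations of ln(y).
log_y = [1.0, 2.0, 3.0]

exp_of_mean = Math.exp(log_y.sum / log_y.size)                # exp(2.0), about 7.39
mean_of_y   = log_y.map { |v| Math.exp(v) }.sum / log_y.size  # about 10.06

puts exp_of_mean   # exponentiating the mean of ln(y)...
puts mean_of_y     # ...systematically underestimates the mean of y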

This entry was a bit boring and very theory-heavy. The next entry will have some practical examples that you can use with the programs you write.

« Statistics For Programmers I: The Problem | Statistics For Programmers III: Differences of Means »

Jan 24, 2010

Statistics For Programmers I: The Problem

This entry is part of a larger series of entries about statistics and an explanation on how to do basic statistics.

After seeing Greg Wilson's talk at CUSEC and reading Zed Shaw's rant about statistics, I think it is time to write up a little bit for all of you on how to do this stuff so that you don't make too many mistakes. If you've gone to university, then you probably took a class on statistics (from my very small sample of computer science programs, Bishop's and Concordia both require a statistics class, and McGill has one in its "choose one of the following 4 classes" list). If you didn't go to university, chances are you haven't seen much statistics beyond really basic stuff like means/variance and z-scores.

In either case, I don't think the level of statistics taught really shows you how to use and analyze them properly - kinda like how an intro Java class doesn't really show you how to use and analyze programs written in Java; you need to practice it on your own. The difference is that you tend to use your Java/other-random-language a lot more than you use statistics in a computer science program. Furthermore, statistics has a lot more pitfalls than Java programming does, and those pitfalls don't really jump up and smack you in the face like they do in programming. Instead of a program crashing or becoming difficult to maintain once you begin to scale (problems that are very obvious when you hit them), the problems with statistics lead you to wrong results, and it is quite difficult to tell when you are wrong unless you actually know how to analyze your results. Finally, even when you do know your statistics well, your human biases can kick in and tell you that you might be doing it wrong because the results don't match up with what you believe to be true (see cargo cult science).

Oh, and even if you do everything right, you might just be unlucky and get a crappy sample. Gotta love statistics.

So what is the problem? Well, there are actually two problems. The first is that most programmers out there seem to have an insufficient education in statistics to use it properly. The second is that programmers don't seem to use statistics to properly evaluate their claims. Unfortunately I can really only do something about the first problem, and that is the goal of this mini-series. I can attempt to write some basic software that helps analyze statistics, spits out some numbers, etc., but there is only so much I can do there. How does that old saying go - "you can lead a horse to water but you can't make it drink"? Something like that.

Did anybody spot the hypocritical aspect in the last paragraph? If you did, good job! Basically my conclusions on both problems are based on my own anecdotal evidence, and what I've heard smart people say at conferences or on their blogs. This isn't really a great sample from which I can draw conclusions, so my results may be quite incorrect. However, even if I am wrong and most programmers know and use statistics well, I figure it couldn't hurt to talk about it anyway and perhaps people will tell me where I am wrong.

Finally, I don't claim to be an expert in statistics. In fact, the more I learn about statistics, the more I realize how little I have learned! Basically my knowledge of statistics is based on the classes I have taken in my economics program - four undergrad classes to date, and currently a grad class. Those classes are on econometrics, which I don't think is quite the same as statistics, but it still deals with analyzing messy data to extract information.

Statistics For Programmers II: Estimators and Random Variables »

Jan 23, 2010

Rhythmbox Auto-Add Files

There's a feature in Rhythmbox that I discovered a while back that's pretty handy. If you go to Edit -> Preferences and choose the Music tab, there is a checkbox that says "Watch my library for new files"; when you check it, Rhythmbox will automatically add any new files you put into your music folder. I really like it because I can just download a song and not have to add it to the collection myself.

CUSEC: Day 2 and 3

Day 2 of CUSEC started off much like the first, except there was a presentation in the morning instead of just hanging out. There were more interesting talks to be had; check them out:
  • First was Rob Tyrie from NexJ, a company that builds CRM software for the financial industry. He spoke about his startup experiences, mergers and acquisitions, and so on. I like hearing stories about successful startups, so I enjoyed the talk.
  • Greg Wilson, one of the editors of Beautiful Code and the other books in that series, gave a presentation about empiricism in software development, or the lack thereof. Many programmers (myself included) make claims that tend to be based on popular opinion or anecdotal evidence rather than good data, which is not really a good way to make statements. I really liked this presentation because it showed me that even though I've done a fair bit of statistics in the economics program at Concordia, I haven't really applied that knowledge to determining whether or not my beliefs about software development are correct. In fact, I liked this thought so much that I think I'll write a few posts about how to apply some statistical techniques to analyzing software.
  • Dominic Duval from Red Hat gave a presentation on how to get started in Linux kernel hacking. Some of it was review for me, but other stuff was pretty interesting. Perhaps I will try to get my webcam working sometime.
  • At the end was Douglas Crockford, the guy who came up with JSON. His presentation was called "Quality", and the main point was that there is a lot of snake oil in the software industry, and no silver bullets. We come up with better ways of doing things, but they don't change the fact that we still have bugs, missed release dates, projects that go over budget, etc. The difference is that newer methodologies let us build software of greater complexity.
These were some good presentations. Unfortunately, today I only caught the last presentation since I got to the conference late. It was given by Jacqui Maher from the NY Times, who spoke about doing programming work for NGOs in the third world. It sounded like a really great experience and would definitely be a great way for us nerds to improve the world a bit. You can check out the stuff she did on github, and also CrisisCamp, which is an organization that focuses on using technology to help people in crisis.

It was another good year for CUSEC, I certainly enjoyed the talks and hope that next year is just as good!

Jan 21, 2010

CUSEC: Day 1

Today was the first day of CUSEC 2010, and despite my being tired and hopped up on caffeine all day (this entry might sound really odd, and is susceptible to slight changes when I am more coherent), it was an excellent start - in fact the day hasn't finished yet; there are still drinks to be had this evening!

The speakers today were:
  • Matt Knox - You'd probably know him better as that guy who wrote the adware that was posted about on Slashdot a while back, and he'll probably be known as that for the rest of his programming career ;) The talk was quite interesting. He began by talking about his adware career and how he was slowly talked into doing shadier and shadier things for the company. The most interesting part of this section was the various Windows security exploits he spoke about (they scared the shit out of me, and really rationalize my decision to not use Windows anymore). One of them is CreateRemoteThreadEx, which - to paraphrase Matt - lets you say, "hey, process over there, please execute this arbitrary code!" So basically you don't even need your own process running anymore to have your code still executing. The second one (that I can remember) was that while Windows stores strings internally as 16-bit Unicode strings, the Win32 API uses null-terminated ASCII strings. So if you have a null byte in, say, a filename or a registry entry, programs written against the Win32 API can see the file/registry entry but can't actually do anything about it. I don't know if this is true or not - I'd have to do some research - but that is how I remember it.
    The talk then went on to explain the Milgram experiment, which I will leave to the reader to explore further. He explained that these tests show that about 70% of people will do evil if they are made to by an authority figure, and described this as basically a remote security exploit in 70% of the installed base. But, he wondered, if people have security exploits that cause them to do evil, is it possible that they have security exploits that make them do good? It was an interesting question, but what makes you (or perhaps only me) wonder more is this: if people knew they had an "exploit" that caused them to do good, would the exploit still work? So yes, it was an interesting moral speech, and Matt is an entertaining speaker, so when (if ever) they post the videos I recommend checking it out.
  • Pete Forde (music warning) from Unspace was one of the corporate speakers. Unfortunately for these speakers, two of them present at the same time, so I could only see one of them speak, but oh well. Pete spoke about his life, risk-taking, doing new things, etc. I really enjoyed the talk, even though I wasn't paying attention for half of it - when he started talking about side projects I'd start thinking about my own side projects and forget that I was at a conference. I hope to get a copy of the notes, since there were a lot of suggestions for books and blog articles that I'd like to read but couldn't remember.
  • Sergei Savchenko from EA (I don't know of a link to put for his stuff) gave a talk about video game programming, focusing on network topologies and various memory management techniques. It was pretty neat since I'm interested in that kind of thing, though I felt it was a bit more of a lecture than a conference presentation.
  • Reg Braithwaite (slides) - it seems they saved the best for last. While I did like all the presentations, this one was packed full of insight in Reg's style of taking your brain out and prodding at it to figure out what makes it work and how to make it better. I feel like once they publish the recording of this one I could download it, cut it up into 10-minute slices, watch each slice individually, and after each slice get a glass of wine, sit down in my thinking chair (yes, I do have a thinking chair) and dig down into what he is saying to determine whether he's "a guy who smoked too much weed in the 70's" or a guy with some really good advice to give. He started off with a Ruby example and how to use his extension methods to fix the problem. However, he said that the important thing about the example was not the extension methods themselves, but the fact that they were necessary in the first place. Basically, if we're having to put dirty patches onto things in order to make them work, it is a pretty good indication that those things are broken. Another point was that if you follow the single responsibility principle, then by using extension methods or monkey-patching you're breaking that principle; but by breaking that principle and successfully creating good software with it, you're showing that perhaps it is not you that is the problem, but that the single responsibility principle itself is broken. Or to be more general: how much of what is considered "good practice" isn't really good practice, but is rather holding us back from creating something better? It makes you think - what else are we taking for granted? Not only in software, but in the rest of our lives? The issue is, even once we decide that there are things we can do better, how do we find those things?
    There was a lot more, but I will wait until the video comes out before I talk about it in any more detail (all that coffee is having an effect on my memory). For anyone who hasn't run into the term, there's a tiny made-up example of monkey-patching right after this list.
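In case "monkey-patching" doesn't mean anything to you, here's a trivial made-up example (not the one Reg used) of reopening a core Ruby class to bolt a new method onto it:

# Reopen a class that already exists and add a method to it.
class String
  def shout
    upcase + "!"
  end
end

puts "hello".shout   # => HELLO!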
This gives me great hopes for what tomorrow will bring!

Jan 19, 2010

Best Thing About Rails...

Q: What's the best thing about being a Rails programmer?
A: Hitting on homophobic Django programmers and watching them freak out.

Jan 16, 2010

The Joel Paradox

The Joel Paradox: n. A situation where a software company is determined to only hire the best developers, but fails to offer work which the best developers would find interesting.

Jan 12, 2010

Programmer's Cookbook

I've started up another site here called "The Programmer's Cookbook", where I take recipes for basic foods and turn them into code for my own amusement. Check it out if you want, and if you feel like submitting any recipe code, go right ahead ;)

I'm using the syntax gem for HTML highlighting, which is a great little tool for displaying Ruby in non-Ruby contexts (like your web browser).
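From memory (so check the gem's documentation rather than trusting me on the exact API), basic usage looks roughly like this:

require 'syntax/convertors/html'

# Turn a snippet of Ruby source into HTML that you can style with CSS.
convertor = Syntax::Convertors::HTML.for_syntax('ruby')
puts convertor.convert('puts "hello, world"')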

Jan 11, 2010

What I'm Checking Out At The Moment

I'm revisiting a few things that I heard about a while back; after re-watching Avi Bryant's CUSEC 2009 talk I decided to give them another whirl:

Rubinius: an alternative VM for Ruby - I'm sure most readers have already heard of it. Last I checked it wasn't at a 1.0 release; now it is, so I'll be checking it out again.

Seaside and, consequently, Smalltalk. Seaside is a web development framework written in Smalltalk that is supposedly "heretical", which is a bold enough claim to inspire interest (again). I haven't really done much with it yet, but right off the bat I am amazed by two things:
1) The installation process - there isn't really one. Grab the one-click installer here to see for yourself. The installer comes with a lot of extra stuff, including the entire Pharo Smalltalk implementation and IDE (with Smalltalk there is no separation between the language and the IDE, which I find pretty cool) as well as the framework itself - not bad, considering it clocks in at about 36MB. Anyway, just run the executable and you're set - the zip includes executables for Linux, Mac and Windows.
2) The speed - it takes about the same amount of time for the VM, the web server, and the entire dev environment to start up as it takes for GVim to start up. Some things inside are a bit slow, like dragging windows around, but other than that it is blazingly fast. Seriously. You have to see it to believe it.

There are also some things I'm not revisiting, but checking out for the first time. I've recently started grad studies (one of my profs was wondering last semester what the hell I was doing in another undergrad, and told me to apply for the Master's program; I got in, which is pretty cool), so I'm looking for something to write my thesis on. I know it's a bit early - I've technically only been in the program for a week - but I figure it couldn't hurt to get going on it. I'm looking at doing stuff in computational economics, and more specifically agent-based methods. Instead of the traditional approach to economics, where you set up firms and individuals as rational agents that maximize some mathematical objective function, you put them into some sort of search space and they explore it in an attempt to find a better situation than the one they're currently in. They don't necessarily have to do this - you could introduce some level of irrationality like fear of change, in which case the potential gain in utility from switching to another outcome would have to overcome some threshold - but now I'm rambling and could probably continue like this all day.
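To make the agent-based idea a bit more concrete, here's a toy Ruby sketch (entirely made up, not based on any real model): agents wander around a one-dimensional "situation" space looking for higher utility, but a fear-of-change threshold means a candidate move has to be noticeably better before they'll take it.

# Some arbitrary single-peaked utility function over situations.
def utility(x)
  -(x - 3.0)**2 + 10.0
end

SWITCHING_THRESHOLD = 0.5   # the "fear of change" parameter

# Ten agents starting at random positions in the search space.
agents = Array.new(10) { rand(-10.0..10.0) }

100.times do
  agents.map! do |x|
    candidate = x + rand(-1.0..1.0)              # explore somewhere nearby
    gain = utility(candidate) - utility(x)
    gain > SWITCHING_THRESHOLD ? candidate : x   # only move if it's worth the switch
  end
end

puts agents.map { |x| x.round(2) }.inspect       # most agents settle in the neighbourhood of 3.0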
Anyway, if you're interested in this kind of thing, there seems to be a fair bit going on at Iowa State University of all places, and they have it fairly well organized. And of course there's the Santa Fe Institute, which does this sort of work, although if you're interested in something specific you'll probably have to dig a bit.

Jan 7, 2010

Nearly Free Speech

I thought I'd give a little thanks to a company called NearlyFreeSpeech.NET, which is a hosting company with a rather unusual business model. Most hosting companies will give you some space for a fixed amount per month; some even go as cheap as $5/month (CAD) - I'd provide a link here but I've forgotten who it was.

I like these guys because you only pay for the bandwidth you use. You put a certain amount of money onto your account, and then as your site(s) use bandwidth, they bill a few cents per day. You have to pay extra for dynamic sites and for a MySQL instance, but it's something like an extra 3 cents per day, which works out to about a dollar a month.
You can then easily set up extra sites and tie them to the same billing account. If you end up making a lot of small sites, this type of thing is really handy!

One catch though: they don't support anything that requires a persistent process other than the web server and database. So no Rails, unfortunately. Also they don't support mod_python, mod_ruby, or a few other things. Which kinda sucks, but oh well.

So if you're needing a small hosting company, maybe give these guys a try and see if you like them!