Dec 29, 2008

A Disconnect from the Real World

After reading both Jeff Atwood's article and Joel Spolsky's response to a discussion topic, I'm wondering if these guys really live in the real world of programming or not. Atwood lives off of advertising revenue on his blog, and Spolsky runs a company. While I'm glad they've managed to get themselves to such good positions, I think the vast majority of programmers are not likely to find themselves in a similar position. These two are excellent writers, and Spolsky obviously has some business acumen considering that he does have a successful business. They are not in their positions due to programming skill. The "mainstream" programmer may not possess the same skills to elevate themselves to similar positions.

These two also don't seem to understand that there is a difference between programming and software development. I define programming as writing the code you want to write; software development is writing whatever somebody else needs so that you can support yourself. One is fun, the other is not. Both Spolsky and Atwood are in the first boat: they program what they want to program. Unfortunately for the rest of us, the world can't have a startup or a popular ad-supported blog for every programmer who doesn't want to work a boring corporate job. I'd love to either do a startup or land a sweet job. However, the startup idea requires, well, an idea, plus startup capital. Without some good cash inflow, doing a startup might be a little difficult when the landlord comes knocking. There's always venture capital or loans, but then you must weigh the crappiness of a boring job against the depressing feeling of owing tons of cash to somebody much bigger than you. And that's assuming people will actually lend to you, which in the current economic environment is not likely.

The one thing that really set me off was Atwood suggesting that if you don't absolutely love being a programmer, you should get the hell out and make room for somebody who does (although given the labour shortage for programmers, I don't think this makes good economic sense).
I love programming. I'm guessing, based on salary increases and job offers, that I'm fairly good at it. But by Atwood's analysis, the blog post linked above, which I wrote about 9 months ago, would suggest that I should get out of the way for somebody who likes it more. And the more I think about it, the more I agree with him. Let somebody else be a drone, working away for people who don't listen to you or who disregard your advice; people who don't understand what it is that you do, but who have great expectations without giving respect. Because that (from what I've seen in my short time since graduation) is what the software industry is.

Dec 21, 2008

Rails and Processing Uploaded .zip Files

I'm doing a little side project at the moment that allows a user to upload .zip files and the Rails app will process the contents. It turns out to be quite a pain in the ass to get going! The main reason is that the Ruby gem that handles .zip files only works with files, and with Rails you're not actually guaranteed to get a File object when someone uploads a file to you.

Let's start with the basics. Suppose you have an action like this:
def upload_to_me
  file = params[:the_file]
end
Assume that the_file is an item uploaded from a form via a file input field. Rails will automatically process all this for you, handling the temp file creation and all that. However, one optimization Rails makes is that if the file is smaller than 10 kB, it just sticks the data in an UploadedStringIO object, which is not a file - so there is no temporary file.

Let's expand our action. We want to open up this file (assume it is a .zip) and take a peek at the contents:
Zip::ZipFile.open(file.path) do |zip|
end
The ZipFile object only accepts a filename. There is no way for you to pass in anything else, like say an IO object. So we have a predicament. The UploadedStringIO object we have is raw zipped data, but we can't actually unzip it because it is not a file.

What's the solution? It's ugly, but turn it into a file:
if file.is_a?(UploadedStringIO)
  temp_file = Tempfile.new("some_temp_name")

  temp_file.write file.read
  file = temp_file
  temp_file.close
end

# now file is a File object and can be treated as such
We use Ruby's Tempfile object, which stores things in a temporary folder (by default on Ubuntu it appears to be /tmp) and is designed to be thread-safe so that you don't have to worry about people clobbering each other's temp files.

I suppose since I have access to both Rails' code and the Zip gem's code, I could probably hack this stuff to make it work properly without being ugly, but this small fix should be enough for now. A good improvement would be to add something to ZipFile so that it can accept an IO object and not just a filename.
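Until the gem accepts IO objects, the workaround can be wrapped up in a helper. Here's a minimal sketch - the ensure_file name is mine, and StringIO stands in for Rails' UploadedStringIO:

```ruby
require "tempfile"
require "stringio"

# Sketch: return something with a usable #path, spilling in-memory
# uploads out to a Tempfile. (ensure_file is a hypothetical name.)
def ensure_file(upload)
  return upload if upload.respond_to?(:path)

  temp_file = Tempfile.new("uploaded_zip")
  temp_file.binmode            # zip data is binary
  temp_file.write(upload.read)
  temp_file.flush              # flush so the path can be read right away
  temp_file
end

# StringIO stands in for the small-upload case:
f = ensure_file(StringIO.new("fake zip bytes"))
puts File.read(f.path) # => "fake zip bytes"
```

Then something like Zip::ZipFile.open(ensure_file(params[:the_file]).path) would cover both cases.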

Dec 9, 2008

Power of the Big O

The other day I posted an article about Hash#only and Hash#except. My friend posted a comment on it with another way of doing it that was IMO a lot more elegant, using a combination of reject and include? as opposed to my use of tap and array diffs/intersections.

Later on I decided to check out which is faster. My intuition told me that they'd be approximately the same, since the reject/include? version was quite obviously O(mn), with m and n being the sizes of the black/whitelist and the hash, and the &/- version looked like O(mn) as well. I wrote up a little benchmark and tested it out. To my astonishment, the &/- version was a few orders of magnitude faster. I had to drop a zero from both the size of the array and the size of the hash just to get the slow version to finish in a reasonable time; here are the results:
  0.090000   0.030000   0.120000 (  0.122033)
 10.920000   0.020000  10.940000 ( 10.960146)
Why is it like this? Well, it turns out that &/- are O(max(m,n)), not O(mn). What these functions do is convert one of the arrays to a hash, which is O(n) [1]. Then they iterate across the other array, checking to see if the current element is in the hash; that takes O(m), since a hashtable lookup is O(1). Since the two steps are not nested, the whole function is O(max(m,n)). The reject/include? version is O(mn) because reject is O(m) and include? is O(n), and since include? is nested in the reject block you get O(mn).
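The hash trick is easy to sketch by hand. Something like this - fast_intersect is my own illustrative name, not what Ruby's C implementation literally does:

```ruby
# O(m + n) intersection: hash one array (O(n)), then filter the other
# with O(1) lookups (O(m)) -- the same idea Array#& uses internally.
def fast_intersect(a, b)
  seen = {}
  b.each { |x| seen[x] = true }
  a.select { |x| seen.key?(x) }
end

puts fast_intersect([1, 2, 3, 4], [2, 4, 6]).inspect # => [2, 4]
```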

I did a quick optimization and changed my friend's version to this:
class Hash
  def except(*blacklist)
    blacklist = Hash[*blacklist.map { |i| [i, 1] }.flatten]
    self.reject { |k, v| blacklist.key? k }
  end

  def only(*whitelist)
    whitelist = Hash[*whitelist.map { |i| [i, 1] }.flatten]
    self.reject { |k, v| !whitelist.key? k }
  end
end
This is O(max(m,n)), and is of similar performance to my original post.
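To convince yourself, here's a quick Benchmark sketch along the same lines as my original test (the method names and sizes here are mine, for illustration):

```ruby
require "benchmark"

class Hash
  # O(m*n): include? scans the whole blacklist for every key.
  def except_slow(*blacklist)
    reject { |k, v| blacklist.include?(k) }
  end

  # O(max(m,n)): constant-time lookups against a prebuilt hash.
  def except_fast(*blacklist)
    blacklist = Hash[*blacklist.map { |i| [i, 1] }.flatten]
    reject { |k, v| blacklist.key?(k) }
  end
end

h = Hash[(1..5000).map { |i| [i, i] }]
keys = (1..5000).step(2).to_a

Benchmark.bm(5) do |b|
  b.report("slow") { h.except_slow(*keys) }
  b.report("fast") { h.except_fast(*keys) }
end
```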

I found this really interesting because it's one of the few times that computer science has actually played a direct role in my job, and because it let me dig into Ruby's internals a bit. It goes to show that even in web development, knowing the lower level details of things and some comp sci knowledge really pays off. Finally, it shows that elegance is not always worth it.

[1] I'm fully aware that hashtables have some boundary cases where they are a lot slower than something like a tree, but for simplicity's sake let's say that they are as advertised.

Dec 8, 2008

Responsibility

Once upon a time, when I was in my second year at university, I was taking these two courses (I was taking more than two courses of course, but these are the two that have any relevance to my story). One of them was called Computer Graphics, where you learn about how computer graphics are done. Things like transformation matrices, perspective, anti-aliasing techniques, things like that. While the lectures were mostly theoretical using formulae, diagrams and pseudo-code, the assignments were to be done in C++ with OpenGL and GLUT - and for the most part the only function in OpenGL you were allowed to use was glDrawPixels. This meant that we were going to be working with pointers a lot, and doing pointer arithmetic and things like that since glDrawPixels only accepts a single-dimensional array of pixels. Not a big deal, pointer arithmetic is pretty damn basic. On top of that, we had a required first-year course on assembly language where if you want to get anything done you use mostly pointers (aka memory addresses) so pointers weren't really a foreign concept.

The second one was a class on programming languages. We learned about things like static vs. dynamic typing, functional programming, parsing, etc. It was pretty interesting (it was during this course that the FP light bulb went on in my head). However about halfway through we had to do a small unit on pointers and pointer arithmetic because the Graphics prof was complaining that nobody understood them and students were failing the assignments because of this - and in a class where there is no final, the assignments have quite a heavy weight.

I found it somewhat sad that people actually needed this. It's not like we were doing anything advanced with C++, the objects we were making were very basic, the standard libraries we used were no more advanced than std::list or std::vector. If you want to get familiar with pointers there is this thing called a search engine that you can use to find this stuff out, or another thing called the school library which was full of books on C++. Yet people blamed the prof for using things that they hadn't been taught in class - it is important to note that the C++ course was not required for the graphics course, probably because the scope of C++ that you use (pointers) is covered by a single lecture.

The thing that was holding these people back was their lack of responsibility. The lack of understanding that not everything is spoon-fed to you and that you actually need to go out and learn things on your own time (consider it homework). Isn't that what university is all about? Learning things? How about learning how to learn things?

Here's a fact (might be widely known, might not be). A computer science education does not give you the direct skills you will need in the workplace. You'll probably learn the basics of Java and some of the libraries for it like Swing or the collection classes, but it's doubtful you'll be able to use just that to develop enterprise applications. You might take a course on PHP, but that won't tell you how to build quality websites - if your university has a class anything like the one at the university I went to, chances are the stuff you'll be learning is well out of date. They have classes that teach you Haskell or Prolog, which have a gigantic market share and will grab you a job in no time.
Then there's other things - no class ever taught me how to use version control. Or how to do unit testing. Or how to use vi/emacs (some universities do force you to use these, mine didn't).
So if they don't teach you the things you'll need to know, how are you supposed to get a job? This is where that responsibility comes in. You have a lot of free time when you're at school - at least that's the way it seemed to me. You have lots of resources at your disposal. Any of these things, be it a language, a piece of software, or a technique, can be learned just by sitting down in a lab for a little while and looking online. I learned how to manage Ubuntu because there was a lab run by students, and you could volunteer to manage a machine (due to dropping enrolment/interest, by my 4th year I ended up administering all the machines). I learned how to use SVN because I was working on a personal project and decided it should be under version control.

Responsibility doesn't just apply to the computer world - although it is really relevant here. Not happy with your job? Find something else. Think you're overweight? Go to the gym. Something bothering you? Figure out why it is bothering you, and attempt to find a solution (preferably a solution that solves the problem, not just puts it off) instead of sitting back complaining about it.

We live in a (mostly) free society. Your choices are ultimately the ones that direct what happens to you, so the only thing that really holds you back is yourself. I'd guess that the main thing holding people back is fear. It's what holds me back most of the time. I'm afraid right now, that after I post this article people will read it and leave nasty comments saying how dumb I am, or how inexperienced I sound, or how I'm completely wrong about everything.
That's part of learning. There have been several times when I've written something and somebody has left an insightful or informative comment telling me how I'm wrong. As much as I hate being wrong, it is a good experience, and after the initial annoyance subsides, I feel like I've learned something and am a better person due to my failure.

So if you're young and unhappy/unsatisfied, now is the time to go out and take risks. Ignore your fear of failure. What have you got to lose at this point? It's not like you have dependants or anything (if you do, ignore that last comment). Your life at this stage is mostly a blank slate, and what becomes of it is what you make of it. Don't let others dictate what goes on it, take responsibility for your own actions.

Dec 4, 2008

Ruby Hash#only, Hash#except

I was wondering the other day if the built-in Ruby Hash class had a way of applying a blacklist or a whitelist to filter out certain keys. I couldn't find anything, so I rolled up a little thing of my own, in case you're interested:
class Hash
  def except(*blacklist)
    {}.tap do |h|
      (keys - blacklist).each { |k| h[k] = self[k] }
    end
  end

  def only(*whitelist)
    {}.tap do |h|
      (keys & whitelist).each { |k| h[k] = self[k] }
    end
  end
end
Now you can do things like this:
h = {:a => 10, :b => 34, :c => "hello"}
h.only :a, :b #=> {:a => 10, :b => 34}
h.except :a, :b #=> {:c => "hello"}
Note that the code only works with Ruby 1.8.7 or higher, or with the andand gem installed, because it depends on Object#tap.

If anybody has suggestions/improvements, or knows that this has been done already and can point me to it, feel free to comment.

Dec 3, 2008

Christmas Coffee

I got up yesterday all ready to start coding (or blog reading, whichever happens first when I sit down at the computer). After making coffee, I realized that I had used up the last of the milk for my cereal and consequently did not have milk for my coffee. Now this is downright heresy, so I quickly peeked into the fridge for a proper substitute. My eyes settled on the carton of eggnog sitting there on the shelf. I figure it's mostly milk and sugar anyway, so let's try it.

It's actually quite good, and I'll recommend it to anybody.

Today I'm trying it with nutmeg.

Maybe tomorrow I'll add some rum.

UPDATE: The nutmeg was gross.

Nov 27, 2008

Rails 2.2 and GetText

So Rails 2.2 has come out recently, and it's touting some nice features, including internationalization (i18n) and thread safety. The thread safety aspect isn't so great for C Ruby, but is excellent for JRuby, which uses a different threading system.

There is one problem though, and that's for anybody (like us) who is using ruby-gettext, which is an i18n gem for Ruby and Rails for doing translations. The problem is that gettext depends on plenty of the little things in Rails that were not thread-safe, and are consequently no longer there. If you even attempt to load the Rails environment with the gettext gem included (by doing something like script/server, or a rake task), it will blow up in your face.

The solution right now is not that cool. You've got a few options:
1) Revert to 2.1.2. Kinda sucks, since we're using JRuby for our app and I was looking forward to the fancy schmancy new stuff.
2) Fix the gettext issues yourself. Really sucks. While like a good FOSS boy I should be all over this one, time is my most limited resource and well, it has to go to things that either keep me fed or keep me sane.
3) Wait for the people behind ruby-gettext to fix it. Also sucks, because we have no idea when that will happen.
4) Rip out all the gettext stuff and replace it with Rails' new i18n stuff. Probably the best solution, however this takes more time than 1) and so is not on the immediate horizon.

So if you're using gettext, remember not to update to Rails 2.2+ without first ditching gettext. Or if you were thinking of using gettext, I'd recommend saying "screw that" and going with Rails built-in stuff.

UPDATE (Jan. 11/09): We have gone through and removed the gettext stuff from our app. Fortunately we weren't dependent on any gettext-related tools, so this was not a difficult task: just replace all the _() calls with t() calls (this can be done easily in NetBeans using its refactoring tools), plus a few configuration details. Pretty simple, fortunately.

Nov 26, 2008

Kingdom of ... Lennoxville?

I was talking to a friend from university yesterday and we started playing Kingdom of Loathing again. For those who haven't played it, it's a browser-based role-playing game where you control a character who goes around fighting monsters and doing quests (that is kinda implied by the term "role-playing game"). The difference is that it's quite satirical: it makes a lot of jokes based on puns, plays on words, pop culture, things like that.

We were thinking that it'd be really fun to have something like that, except based on our experience at Bishop's, or on college life in general. It'd be based in the little town called Lennoxville which is where Bishop's is, and so the places you would explore would likely be places around there. However the game itself and most of the adventures would be more oriented towards college life in general.

I'm wondering if anybody would be interested in this idea, maybe not necessarily for coding (if you want to do that, it'd be cool too, but it's a fair bit of work and so I don't expect anything) but for ideas of things that would be stereotypically college, or ideas on the mechanics of the game. Or even just if you'd be interested in playing.

Nov 24, 2008

FOSS and the Software Industry

I read an article on Slashdot about how open source is slowly eroding a lot of the commercial applications out there. This isn't really anything new, it's been happening for years. However it seems like the quality of open-source is continually getting better and it's just a matter of time before it becomes "good enough" for other people to want to use it over the proprietary equivalents. Even the Economist thinks that open-source is going to be bigger - not that we really needed them to tell us it was, we knew it already.

I'm not trying to say open-source will eventually wipe out the market for proprietary software. Here's what I think will happen. Anything that's fun or interesting to code will eventually be taken over by open-source. These are things that programmers such as myself would not mind doing in our free time as one of those things, you know... what are they called? Oh right, a hobby. Like how some people spend tons of time building model trains or burning wood, we sit at our computers and churn out code that does cool stuff - well some people do, I tend to do more experimentation with random things and blog about them.

So what does that mean for professional software developers? Probably that demand for us will shrink. People won't need us to develop their software, since there are open-source versions that are as good as whatever we'll put out. People will still need custom-made software, or they won't want to comply with the GPL, or will want to make games (note to self: write about how the open-source development model doesn't really fit gaming), or whatever. There will still be jobs for us.
What I'm worried about is that most of the interesting jobs will be gone. We'll all be stuck either working for slave shops, fancy new web 2.x startup ideas, or content-management systems. Or things like that.
So I'll stop and make a confession here. I'm going back to school. Not in anything really related to computers. I've been taking classes on and off since I graduated, but I'm starting to look into it more seriously. My rationale is this: I enjoy programming, but there is such a thing as too much of something you enjoy, to the point where you don't enjoy it any more. Therefore, I want to work in something else I enjoy, and then come home and mess around with software. Much more fun that way.

Nov 22, 2008

In The Office vs. Home

A few months ago I quit my job at the office to work for a startup, which has involved me working from home most of the time. I've worked at home before, and have sorta flip-flopped on how I feel about it. Here's my current analysis:

Pros:
  • Flexibility - If I have to go to some place that is only open on weekdays between 9 and 4, I don't have to rush during my lunch break. If I want to stay up late and sleep in, I don't have to wait for the weekend. I can take classes if I want.
    The flexibility of working at home is the main reason why I like it.
  • Freedom to Choose - I don't have to adhere to a particular software just because everybody else is using it. I can choose to work under Ubuntu with Vim instead of Windows and some IDE.
  • No commute - Well, there is a bit of a commute. I have to walk all the way from my bedroom to my computer room in the morning. Sometimes I even have to stop by the bathroom and the kitchen on the way, which means going really far out of my way since the computer room is next to the bedroom. Damn.
    This one is really awesome in the winter here in Montreal. No more having to jump from a metre-high snowbank into the bus, only to spend an hour and a half on a normally 15 minute bus route. No more having to walk down a frozen sidewalk because it is faster to walk from the metro line than take the bus.
  • Choose a time frame - One interesting thing about coding is that your good coding periods don't necessarily line up with standard working hours. I don't really have a single time when I'm most productive; it changes from day to day. Sometimes I can churn out good stuff at midnight, other times at 11am. It depends on the day, and the office doesn't accommodate that.
  • Claim things on taxes - since I work at home, I can claim things like Internet and hydro on my taxes. Awesome, since in Quebec you get raped when the tax man comes around.
And now, the cons:
  • Takes self-discipline - There's a Wii with Rock Band in the other room. I have Starcraft installed. I have beer here. The temptation to do any of these is quite high when you're working at home. It takes some good self-discipline to not do these while you're working.
  • No boundary between work and play - When working in an office, you have a clear distinction between what is work and what is not - the location. When you're not at the office, you don't have to give a shit about work and can sit back and relax. For me, I always have this feeling that I haven't got enough work done, so even if I have done 40 hours for the week I feel like I can put more in. Doesn't give me much time to relax.
These are the ones I can think of right now. Anybody have anything else?

Nov 19, 2008

Analyzing Blogs

This is completely unrelated to anything I usually write about. I found a random site that analyzes a blog and tells you about the thought processes of the author.

Here's what I got:

ESTP - The Doers


The active and play-ful type. They are especially attuned to people and things around them and often full of energy, talking, joking and engaging in physical out-door activities.

The Doers are happiest with action-filled work which craves their full attention and focus. They might be very impulsive and more keen on starting something new than following it through. They might have a problem with sitting still or remaining inactive for any period of time.

I'm really not sure what to think about this. The second paragraph is pretty fitting, but the first? Hmm...
It is beta after all...

Nov 15, 2008

Using Reduce as a Decorator

A while back I wrote a post about reduce, which is a technique used in functional languages to get a single value from a list.

While one main use of this is for things like summation or finding a max/min of a list, it can be used in other ways too. One neat thing that I discovered you can do is apply the decorator pattern with reduce (this example is in Ruby which calls it inject):
decorators = [Mocha, Whip, TonsOfSugar]
new_coffee = decorators.inject(old_coffee) { |coffee, dec| dec.new coffee }
I thought this was pretty cool, and realized that it can be used more generally. If you have a set of transformation functions in an array, you can apply them all using just this one line:
result = transforms.inject(original) { |o, t| t.call o }
One example I found is if you have a string, and a set of replacements to apply to it (could be stored as a hash), you can do it like this:
replacements = {"a" => 1, "b" => 2, "c" => 3}
original = "abcdefg"
result = replacements.inject(original) { |string, repl| string.gsub(repl[0].to_s, repl[1].to_s) }
puts result # outputs 123defg
This is probably old news for FP folks, but to us young'ns it's an interesting discovery!
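For a self-contained version of the coffee example above, here's a sketch with stand-in decorator classes (Coffee, Mocha and Whip are hypothetical):

```ruby
class Coffee
  def cost
    2.00
  end
end

# Each decorator wraps a drink and adds to its cost.
class Mocha
  def initialize(drink)
    @drink = drink
  end

  def cost
    @drink.cost + 0.50
  end
end

class Whip
  def initialize(drink)
    @drink = drink
  end

  def cost
    @drink.cost + 0.25
  end
end

# inject threads the drink through each decorator in turn:
decorators = [Mocha, Whip]
new_coffee = decorators.inject(Coffee.new) { |coffee, dec| dec.new(coffee) }
puts new_coffee.cost # => 2.75
```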

Nov 13, 2008

Rock Band Drums in Linux, Part II

I went back to my Rock Band drums code and decided to actually whip something up out of it. It's a tiny little app that uses some OpenGL to render the drum pads, and they bounce and things like that when you hit them. The kick pedal makes everything shake and gives a flash. It's pretty cool.

The link to the code is here: http://code.google.com/p/opendrums/

You can check out the code if you like, I released it under an MIT license so you can really do whatever you want with it.

There are still a couple of issues. The main one is that the audio latency can be annoying, sometimes getting to be half a second off. Since I don't know much about programming audio, I can't fix it too much, but I'm doing my research and will try to get this resolved as fast as I can.

I also tweaked the old code because there was a bug where if you hit two pads at once, only one of them would play. This is no longer the case: you can hit as many pads as you like and they will all play.

The code is done with SDL and OpenGL, so it should work just fine on Windows or Mac too, you just need to compile it because I haven't made a compiled version for either of those platforms.

UPDATE (01/20/09): It seems there has been some trouble getting opendrums to compile. On Ubuntu you need the following packages:
  • build-essential - In order to compile C++
  • libsdl1.2-dev - For SDL
  • libsdl-mixer1.2-dev - For SDL_Mixer
  • libgl1-mesa-dev - For OpenGL
On other distributions the package names might be slightly different. On Windows/Mac you'll need a C++ compiler which will likely come with OpenGL, but you'll need to install SDL and SDL_mixer manually - although I believe Mac has some package management system that you might be able to use.

Nov 12, 2008

Montreal on Rails

I'm doing a presentation at the upcoming Montreal on Rails gathering, on Nov. 18. My presentation is a simple introduction to jQuery and jRails, and why you might want to use them instead of Prototype and Scriptaculous. I haven't thought of a cool name for it yet, stay tuned for that.

If you're in the Montreal area and like/use/are-interested-in-using Rails, you should come check this out as there are plenty of interesting talks and cool people to meet.

Ruby Makes You Lazy

I was messing around the other night with some code, working on a simple artificial life simulator thing to play Conway's Game of Life. I figured I'd take the opportunity to learn how to use RubySDL, which provides Ruby bindings for the SDL media library. It took about 45 minutes to an hour to get everything rolling, and before long I was seeing the shapes run around everywhere on my screen.

It was pretty nifty, although after I upped the size of the world to 200x200, Ruby could no longer process everything quickly enough to maintain the 1 FPS framerate I was working with - which I found pretty funny really, because there isn't that much processing going on. But whatever.

I decided to port the thing over to C, because C is fast, and when it comes to SDL most of the calls are the same. It was then that I really realized how much Ruby makes you lazy. Maybe not lazy, but it spoils you. For example, in Ruby you can do this:
pieces = File.read("my_file").split("\n").map { |l| l.split(//).map(&:to_i) }
What this does (for those who don't know Ruby) is take the file, split it into an array of its lines, and convert each line into an array containing the integer version of each character in the line. It's magical! Try doing this stuff in C:
FILE * f;
int c;
int pieces[SIZE][SIZE];
int i = 0;

f = fopen("my_file", "r");

while ((c = fgetc(f)) != EOF) {
  if (c == '\n')
    continue; /* skip line breaks, like split("\n") does */
  pieces[i / SIZE][i % SIZE] = c - '0';
  i++;
}
fclose(f);
It's so much longer in C! Why do I have to write so much?

The answer is because it is faster. Like ridiculously faster. I put up the screen size to 800x800 (didn't make it higher because the window was only 800x800, but I could probably go higher) and it still maintained the same framerate.

This whole experience taught me something. While it is really nice to work with Ruby all the time (or languages like Ruby, say Python, Perl or dare I say it, PHP), these languages spoil you in ways that in time, you forget. However these nice things do come at a price, and when we need to do some work that is CPU-time-bound, I'm afraid that we may no longer have the skills to speed things up. It is important that when we're working with "productivity" languages like Ruby, we should still work with faster languages to keep us sharp.

Nov 10, 2008

Glassfish with RMagick

Deploying a JRuby on Rails app with Glassfish is a relatively simple process, and in my opinion much easier than with Mongrel. Takes about 15 minutes.

Unless, of course, you have a fancier Rails app with dependencies. Then Warbler doesn't always link the proper files, and you end up with a WAR file that can't properly connect to your gems. The one I had a big problem with was RMagick. See, this one is already an issue for JRuby, since it was written in C and JRuby is in Java, so you can't load it. I have mentioned before, though, how to get around that with a little gem called rmagick4j, which ports the functionality of RMagick (supposedly there are a lot of things not built yet, but I haven't had any problems).

UPDATE: You can fix this by using the warbler config:
cd /path/to/Rails/app
jruby -S warble config
In the generated file config/warble.rb, find a line that goes:
# config.gems += ["activerecord-jdbcmysql-adapter" ...
Uncomment it, and add "rmagick4j" at the end of the array. This will fix the rmagick4j thing. However, feel free to keep reading, as I learned a bit of stuff from all that and you might too.
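For reference, the edited section of config/warble.rb ends up looking roughly like this (the adapter gem is the one from the commented example; your list will differ):

```ruby
Warbler::Config.new do |config|
  # ... other generated settings ...

  # Gems to be packaged in the WAR; rmagick4j added at the end.
  config.gems += ["activerecord-jdbcmysql-adapter", "rmagick4j"]
end
```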

REST OF POST:

Unfortunately, the gem is not recognized by Glassfish. There is an alternative to RMagick called ImageVoodoo, which you can use if you like; however, I need RMagick because I use other gems that depend on it (like gruff).

Here's how we fix it. First, you need to freeze the rmagick4j gem. There's a nice little thing called Gems on Rails, which freezes your gems in your Rails project folder. Install and use it like this:
jgem install gemsonrails
cd /path/to/Rails/app
gemsonrails
jruby -S rake gems:freeze GEM=rmagick4j
Now you're almost set. There's a problem with Rails: it seems to think that since we have this gem called rmagick4j in our app, there should be an rmagick4j.rb file in the gem's lib folder. It's a reasonable assumption, since that is the convention, but rmagick4j has to break the convention in order to be compatible with the original RMagick. So you have to do a tiny hack to get things working.

Create a file called rmagick4j.rb with this in it:

require File.dirname(__FILE__) + "/RMagick"

Put it in the vendor/gems/rmagick4j-0.3.6/lib folder (replace 0.3.6 with the version of rmagick4j that you have). Now you can use Warbler to package up your app, and it will work fine.

Nov 9, 2008

CUSEC 2009

I just bought my ticket to CUSEC 2009, a conference for Canadian university students in software engineering. While I'm not a student in software engineering, anybody is welcome, and the presentations are usually really good. I've gone for the last two years and they were awesome; if you are anywhere near Montreal, I recommend you attend too. Bring warm clothes though, Montreal is cold in January. And a toque too.

They've got some pretty big speakers picked out so far, such as Richard Stallman (founder of GNU, and if you're reading an Ubuntu blog and haven't heard of this guy, get out from under your rock) and Dan Ingalls, one of the chief guys who built Smalltalk back in the day.

So if you're a developer in Montreal (or somewhere close to Montreal, like Quebec City, Toronto, New York, or Boston) you should come check it out; it's a blast and you'll likely learn a lot. If you're a student at a Canadian university (or an American one, you're welcome too), see if your school has a head delegate, and if not, maybe become one and head on out for good times.

Plus, the parties are wicked.

Nov 8, 2008

Welcome to Intrepid

This would not be an Ubuntu blog if I didn't make a post about the most recent release, Intrepid Ibex. It was released not too long ago, but I usually give these things a few days before I test it out.

So upgrading was a little tricky, but not really. Hardy, being an LTS release, by default only offers upgrades to other LTS releases. Intrepid is not LTS, so you don't get any option to automatically upgrade, or even to upgrade using apt-get dist-upgrade. Fortunately it is easily fixed: go to System->Administration->Software Sources, choose the Updates tab, and set "Show new distribution releases" to "Normal releases". After that, you should be able to auto-upgrade. I was scared for a sec because I thought I would have to download the ISO and install from that. However, then I remembered that I've had to do that for every other version of Ubuntu that I've used, so it wouldn't have been any different.
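If you'd rather do this from the command line, I believe the same toggle lives in /etc/update-manager/release-upgrades as a Prompt line (an assumption worth double-checking on your release). A sketch, run against a stand-in copy of the file so nothing system-wide changes:

```shell
# stand-in for /etc/update-manager/release-upgrades so this is safe to run
printf 'Prompt=lts\n' > /tmp/release-upgrades

# flip the upgrade policy from LTS-only to normal releases
sed -i 's/^Prompt=.*/Prompt=normal/' /tmp/release-upgrades

cat /tmp/release-upgrades   # Prompt=normal
```

Edit the real file with sudo, and then do-release-upgrade should offer Intrepid.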

Another good thing to do is switch your download location to one of the mirrors, that way you aren't hitting the regular Ubuntu servers which are completely overloaded. You can do this in the same Software Sources dialog from the Ubuntu Software tab. Just go to "Download from", click "Other" and then you can pick a server near you. Makes it much smoother, and for me, much faster since USherbrooke (the mirror I'm using) has a fat pipe.

This was the first install that actually went smoothly. The upgrader didn't crap out and corrupt my install like it did with Gutsy and Hardy, it actually did all its stuff properly and left me with a usable system. There are a few differences with menus and what-not, the fonts are slightly changed and the arrows for menus are freakin' huge.

I did notice the tabbed browsing in Nautilus; it will take a little while to get used to, but it will probably be handy. If you're using the console a lot, though, you probably won't notice. Also, I'm sure the fact that it doesn't really use xorg.conf anymore will be huge when my video card dies. Whether it is huge in a good way or a bad way is yet to be seen.

The only real problem I'm having now is that I get a bunch of errors during boot which apparently mean nothing, because I'm not having any problems with hardware. The Ubuntu splash screen is gone, but I'm sure I can find a way to fix it up.

All-in-all this release is pretty good, although I'm thinking they put more focus on the server edition. I'm hoping that this is not the start of a trend, I would like to see more focus put on the desktop edition as that is where Linux needs the most work.

Nov 6, 2008

JRuby and OpenOffice

UPDATE: This code works for OpenOffice 2.4, but has not been tested with OpenOffice 3+ or with LibreOffice. Your mileage may vary.

I wrote a post not too long ago about how OpenOffice doesn't have much support for basic statistics, and that I would like to create something to fix that.

The OpenOffice API is written largely for Java and, seemingly to a lesser extent so far, C++. However, they do try to make it work for most languages, including C#/Mono and Python.

I thought to myself, "JRuby can use Java classes, so can JRuby use the OpenOffice API?" And the answer is, "Yes, it can!"

So I'll tell you how to do it. It's not really that hard, although you'll have to deal with the Java way of doing things a fair bit, so you'll end up writing a fair bit more code than you would if it were a Ruby API. Also, there are a few cases where you have to convert from Ruby basic types to Java basic types, which can catch you if you're not careful.

The first thing you need to do is install the OpenOffice SDK. You can get it here, or if you're on Ubuntu you can install it like this:
sudo apt-get install openoffice.org-dev openoffice.org-dev-doc
The doc package is optional, however I find that documentation is a handy thing to have if you need it.

Once you have that, you're pretty much ready to go! You'll need to check for a few things first, to ensure that the proper JARs are there. Go to your OpenOffice install directory (if you installed using apt-get, this is /usr/lib/openoffice) and check under program/classes for juh.jar and unoil.jar. These are necessary to get the correct classes for this sample.

So let's get to the coding! We want to make something useful, so we'll create something that I complained about in the other post: a random number generator. It'll be a little script that reads 3 arguments from the command line: the number of values to generate, and the lower and upper bounds. We'll skip error checking for simplicity; feel free to add it in if you like.

First thing we need to do is load in all the Java stuff:
require "java"
require "juh.jar"
require "unoil.jar"

include_class "com.sun.star.uno.UnoRuntime"
include_class "com.sun.star.comp.helper.Bootstrap"
include_class "com.sun.star.beans.PropertyValue"
include_class "com.sun.star.sheet.XSpreadsheetDocument"
include_class "com.sun.star.sheet.XSpreadsheet"
include_class "com.sun.star.sheet.XSpreadsheetView"
include_class "com.sun.star.table.XCell"
include_class "com.sun.star.frame.XModel"
include_class "com.sun.star.frame.XComponentLoader"

def queryUno(klass, obj)
UnoRuntime.queryInterface(klass.java_class, obj)
end
What this does is fairly obvious: it loads in the JAR files, and the classes that we will need for this example. Fortunately, JRuby only needs to load the classes that we reference directly. Classes that are used behind the scenes but never named in our code don't need to be included this way - otherwise we'd have to add 5-10 more include_class lines to this.
I also created a little helper function there to save you some typing. The queryInterface function asks an object that you pass (obj) for a reference typed as a specific interface (klass). It's not really necessary with Ruby, but since we're working with an API built around static languages it is something we need to do for now.

Let's set some variables based on the command line. Let's assume only 3 variables were passed, and that they were all integers:
N, A, B = ARGV.map(&:to_i)
Next thing we need to do is actually connect to OpenOffice. We do this by creating a remote context which connects to OpenOffice itself, and a service manager which gives us access to various components of OpenOffice. We want to access the "com.sun.star.frame.Desktop" service which handles the documents that are loaded. After that, we want to load up Calc so that we can start mucking with things.
# bootstrap the environment and load desktop service
remoteContext = Bootstrap.bootstrap
desktop = remoteContext.getServiceManager.createInstanceWithContext("com.sun.star.frame.Desktop", remoteContext)

# get us something to load the Calc component
componentLoader = queryUno(XComponentLoader, desktop)

# load the Calc component
calcComponent = componentLoader.loadComponentFromURL("private:factory/scalc", "_blank", 0, [].to_java(PropertyValue))
calcDocument = queryUno(XSpreadsheetDocument, calcComponent)
Wow, all that code. As we can see, we're making heavy use of the Factory pattern. We connect to OpenOffice using the Bootstrap class, which spits out a remote context object that we can use to access the various services of OpenOffice. We then create a desktop object, which lets us access the Desktop service - which we need in order to edit documents.
After that we create a component loader (read: Component Factory) and use that to create us a component for Calc. Note that I use this:
[].to_java(PropertyValue)
Since the loadComponentFromURL method is expecting a Java array, we need to pass it a Java array. The little snippet there is the equivalent to this in Java:
new PropertyValue[]
Finally, we create an object for our Calc document. We can now start editing things! But first, let's see if this actually works for you. Save the file as calc.rb, and run this line to execute it (replace OOHOME with your OpenOffice install directory, /usr/lib/openoffice on Ubuntu):
jruby -IOOHOME/program/classes calc.rb 10 1 10
If you run this, what you should get is a blank OpenOffice Calc window opening up.

Assuming everything is working, let's start entering our data. We want to create a new sheet:
sheets = calcDocument.getSheets
sheets.insertNewByName("Random Numbers", 0)
sheet = queryUno(XSpreadsheet, sheets.getByName("Random Numbers"))
Fairly straightforward: we get the sheets of the document, insert a new one called "Random Numbers" at the beginning, and get the object for it.

Let's add our random numbers:
N.times do |i|
cell = sheet.getCellByPosition(0, i)
cell.setValue( rand(B - A + 1) + A )
end
If we run this code, we now have a new sheet in our open Calc window called "Random Numbers", and the first column has 10 random numbers in it. Cool! Now we can do whatever we want with those numbers.

Finally, if we want to automatically switch to the new sheet:
model = queryUno(XModel, calcComponent)
controller = model.getCurrentController
view = queryUno(XSpreadsheetView, controller)
view.setActiveSheet(sheet)
OpenOffice uses an MVC structure for its information. We need to get access to the view, which is done through the controller, which is accessed through the model. Then we tell the view to set our sheet to the active one.

If you have any questions, let me know! I'm no expert by far, but I've been doing a bit of digging and this is what I've got so far.

Nov 4, 2008

X.org, Wayland, WTF?

I read on Slashdot about this new X server for Linux that attempts to clean up historical cruft and make things smaller (sounds like what X.org was trying to do over XFree86). One thing I noticed is that I can't actually find this Wayland thing to view any of the source, let alone try it out.

It seems like another one of those things that a bunch of geeks go, "oh, a shiny new idea!" (even if it isn't really a new idea) and get all excited about it. I'd be interested in checking it out if I could, but I can't, so no checking out.

This Wayland thing got me thinking, and now I'm wondering about something: what is the usefulness of network transparency for the majority of things we do with a computer? I'm sure there are a few cases where it would be handy, but not that many. There have been maybe two occasions where I've had to use it: once when I wanted to run a program from home while I was at work - which failed horribly, because it seems any modern Linux app is so resource-heavy that running it over the Internet is ridiculously slow; it was easier to just install a VM - and once to read User Friendly, because for some reason my box couldn't access it but our dev server could. Basically the only feasible way to use network transparency is over a LAN, because over the Internet it's just too damn slow.

In a world of mainframes and slow-ass terminals, I can see how network transparency could be useful. Have the big box in the middle handle all the processing, and just send the display info to the terminals. This is much more feasible when your terminal is really slow, because the network slowdown is not so much the bottleneck as the CPU of the terminal.

Those days have long passed with the advent of personal computers (hell, I don't even remember the days of mainframes and terminals; we had a desktop computer for as long as I can remember). Our computers nowadays can handle our processing needs (my computer here is probably more powerful than most mainframes from when X was created), and the network delay is really the main slowdown for this kind of computing.

I haven't yet addressed the issue of distributed computing. There are plenty of reasons why we'd want a centralized system for our computing. Things like data storage, a central location for our apps so we don't need to install them on our system, etc. These are all very useful tools, and the whole personal computer thing doesn't handle it very well.
However, neither does X's model of computing. There are many reasons, and here are a few that I can think of:
  • Ease of use - X isn't exactly the most user friendly of environments. Believe it or not, there are non-geeks out there that may need to use a distributed computing environment. Say, the marketing team for your company, or your customers. Making them have to use X would probably make them less productive, or drive them away.
  • Programmer productivity - A lot of the time you have to re-invent the wheel when you're doing things like working with the window being resized, etc. Or you have to deal with memory management (even with Java or Ruby - Java is a huge memory whore). These things slow down the pace of a developer. It may not seem quite so apparent at this point in the essay, but I'll get to the alternative soon.
  • Scalability - Can X scale to say, thousands of users per day? Millions? If I want to market my app over the Internet, this might be a requirement.
Many of these problems have been solved by a different distributed computing technology: the Web. It's nice. It is easy to use - in the sense that accessing a web application is easy, no having to do X11 forwarding or all that crap.
Applications are simpler. You design your application with a quick execute-and-dump idea, as opposed to a more state-based system. Web technologies seem to acknowledge that web apps aren't usually CPU bound, and so can cut all sorts of corners and provide you with a nicer coding experience - look at Rails, or even PHP.
It is scalable. While scaling is not easy, it is definitely easier to make a web app scale to thousands of concurrent users than with a regular app.

Basically, my point is that the need for network transparency is not as high as the Linux geeks seem to think. Most of it is covered either by traditional desktop computing, or by "the cloud" of web apps out there that do pretty much the same thing more easily. Maybe when redesigning something like X, they should focus on splitting the network forwarding from the display system - instead of one program that does everything, have several programs that each do one thing very well. Seems kinda UNIX-like to me.

Oct 29, 2008

OpenOffice and Stats

I've had to reinstall Microsoft Office recently. It made me sad. Why? Because OpenOffice couldn't do what I wanted it to do.

Basically there are two things I want to do: regression analysis and random number generation. Both are doable using built-in functions of OpenOffice and some know-how about the formulas involved, but that's a pain in the ass. For the most part, there's a set of info that's handy to have for a regression, like t-statistics and R², that you can't just spit out the way Excel does. And no, I'm not pulling the whole "Microsoft product X does it this way so the open-source one should do it this way too"; I'm saying that the Microsoft product's way of doing it is actually better, and perhaps emulating it might be a good idea.
Regression analysis is not that bad, although the optimization solver for OpenOffice leaves much to be desired. In fact, if I checked "Assume linear model", it would always tell me that the optimal solution is 0 for all variables, which was not the case. If I didn't assume a linear model (which is wrong, since it was a linear model), it would come close enough to the real values for me to use them, but not that close (plus or minus 0.5, which is fine if the number is large, but if it is 1 then you're in trouble).
Random number generation is the next problem. OpenOffice has a built-in RAND() function, which spits out a random value between 0 and 1. It's easy enough to scale it to whatever interval you need, but still a bit annoying. The real problem is when you want normally distributed random values. I found some formulas online to approximate this kind of stuff, and it started to get nasty. Plus every time you change a cell, it recalculates all the random values - slightly annoying when you're working with graphs, since after changing a cell the graph no longer reflects the data that you have.
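For the curious, here's roughly what those spreadsheet gymnastics boil down to, written in Ruby instead of cell formulas: scaling a uniform rand to an interval, and the Box-Muller transform for normal values (one standard approach, not necessarily what the formulas I found online used):

```ruby
# uniform random value in [a, b)
def uniform(a, b)
  a + rand * (b - a)
end

# normally distributed value via the Box-Muller transform
def normal(mean = 0.0, sd = 1.0)
  u1 = 1.0 - rand   # shift to (0, 1] so we never take log(0)
  u2 = rand
  z = Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
  mean + sd * z
end

puts uniform(1, 10)
puts normal(5.0, 2.0)
```

Four lines of actual math, versus a pile of nested spreadsheet functions.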

I ended up trying to create something in Ruby and exporting data to CSV and loading it into Calc, but it was a bit of a pain to do. It was easier just to reinstall Office on my XP partition - better yet, I might install it in VirtualBox to save me the trouble of restarting.
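The throwaway script was along these lines - nothing fancy, just dumping values to a CSV that Calc can open (the file name and the 1-to-10 range are arbitrary examples):

```ruby
require 'csv'

# 100 random integers between 1 and 10, one per row
CSV.open("random.csv", "w") do |csv|
  100.times { csv << [rand(10) + 1] }
end
```

It works, but round-tripping through CSV every time you want fresh numbers gets old fast, which is why I gave up and went back to Excel.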

Of course, I did search Google for this kind of stuff. For the most part I just found blog entries talking about how advanced DataPilot is - yes, if you consider variance advanced - or how there are better alternatives than OpenOffice, like R. I did find a set of macros, but unfortunately they didn't want to install due to dependence on a package that it thought wasn't installed but actually was, etc. Might have worked if I spent a few more hours on it.

Does anybody know of a good way of doing this with OpenOffice? If not, would anybody be interested in helping build a plugin? I'm thinking that if I had so many problems with this, a lot of other people who are less computer-savvy than I may have similar problems, and a plugin that allows for this stuff would be mighty-handy.

Oct 22, 2008

Using Gruff with JRuby

A while back, I had a need in one of my Rails apps to generate a graph. Not anything fancy, just a little thing to spit out some lines to make the data easier to visualize. I discovered this gem called Gruff which lets you do some pretty nice things like line charts, pie charts, etc. It's very handy for creating simple graphs that look pretty good. I used it for a graph on this post a while back.

Now, for my current project I have a need to use Gruff again, but this time I'm using JRuby. Normally gems work fine with JRuby, you just go "jgem install x" to install it. However, Gruff depends on RMagick, which (to the best of my knowledge) is written in C and therefore is not compatible with JRuby without a heck of a lot of hacking.

Fortunately, some folks out there somewhere created a gem called rmagick4j, which is a drop-in replacement for RMagick - well at least I think it is, I haven't found anything it does wrong yet. So, here's how to use Gruff with JRuby:
jgem install rmagick4j hoe gruff
And done! Now you can make fancy graphs to impress your boss.

Unfortunately in this day and age, people come to expect more. If you look at the graphs on Google Analytics, they're much fancier than anything that Gruff could ever come up with. Chances are they're using Flex or something, which means you can have a much higher degree of interactivity than you can with just a simple image.
The tradeoff is that with Gruff, you can have a nice-looking graph with about 3-4 lines of Ruby, whereas Flex would probably take you quite a bit longer.

Oct 15, 2008

TDD

I have a confession to make: I've never really been huge on testing. At least not automated testing. I haven't been in the biz that long, so it's probably not a huge issue, but in hindsight I think a fair bit of unit testing on previous projects definitely would have helped.

With PHP if you're not using a pre-built framework or something, it might not be so easy to do unit testing. From my limited experience, you need to break things up into small units in order to do unit testing (gee, whoda thunk). While this is usually good development practice, in PHP it is not really enforced. I can think back to some projects in my younger days when things were kinda nasty.

Then you might start implementing some sort of basic unit testing by separating out some common functions and writing tests to see if they work. You might even separate stuff into distinct chunks called actions. Those can be tested fairly well for behaviour, because you can just fire some $_GET or $_POST crap at them and then check the DB to see if the correct things happened. It's a bit harder to check the display though; you'll have to dig through a fair bit of outputted HTML to see if things are correct. Unless you're using some sort of template engine, in which case you might have to get some hooks in there to see what is getting passed over.

Jump over to Rails. You've got everything really broken up into bits: there are the controllers, which simply spit out raw Ruby objects that are very easily testable. There are the models, which are also very easily testable. Finally there are integration tests, which make sure that if you do a whole bunch of actions in a certain order, the correct result happens and nothing explodes in your face. It might be that each specific piece works fine, but suppose a bunch of actions are going off in succession and somehow one borks the data that a later one wants and everything explodes... this is much easier to catch through integration testing.

So now I've got a Rails app with a whole whack of tests (and believe me, it's not enough) that test a lot of basic functionality. Then the boss comes along and says, "we need this changed! And this and this and this! By tomorrow1!" So I pile on all this functionality and make a big fat thing with code shooting out everywhere and the other coders are crying because there's all this stuff everywhere and they're having trouble following it. The next plan is to sit down and have a good refactor. Clean out some crap that is no longer in use, merge a couple other things that are doing something very similar, blah blah blah. The main problem with refactoring is that stuff breaks. In places you never knew it could possibly happen, like when you work out your arms at the gym and the next day your thigh hurts or something. Fortunately, you've got this wonderful command called "rake test" which then runs all your tests and gives you a nice little error report of everything that died. Then you quickly go through the list, fixing the little things here and there and run the test again. So much faster than this: I-make-change, fix-obvious-bugs-that-I-find, user-finds-bugs, I-track-them-down, I-fix, user-finds-more-bugs, etc.
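For anyone who hasn't seen one, a unit test is just this sort of thing: a small class plus assertions about its behaviour. This sketch uses Minitest (which ships with Ruby) and a made-up Cart class standing in for real application code:

```ruby
require 'minitest/autorun'

# the (hypothetical) class under test
class Cart
  def initialize
    @prices = []
  end

  def add(price)
    @prices << price
  end

  def total
    @prices.inject(0) { |sum, p| sum + p }
  end
end

class CartTest < Minitest::Test
  def test_total_sums_item_prices
    cart = Cart.new
    cart.add(3)
    cart.add(4)
    assert_equal 7, cart.total
  end
end
```

Run the file and the framework reports every assertion that fails - that's all "rake test" is doing, just across your whole suite at once.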

The more I do this stuff, the more I wonder how I could have possibly coded without all this junk. I guess coming from my static typing world of C++ I had that wonderful thing called the compiler check my code for correctness, but once we're out here in PHP/Ruby-land we don't have such a luxury and need something else to watch our backs (not that unit testing isn't needed in C++, just slightly less). How much time would I have saved had I used more test-driven development from the beginning?

1 This is a bit of an exaggeration, but I'm sure you get the idea.

Oct 14, 2008

Taconite

I'm working with a jQuery plugin called Taconite (not sure how to pronounce it, is it a soft C or hard C, long A or short A?) that is very handy for doing a whole bunch of jQuery commands with an AJAX request. Basically what you have to do is use XML, and the XML tags are the commands that you want to be executed.

There are a few things that I really like about it. First off, it makes it easy to have multiple updated divs. See with a regular updater object (like in Prototype) you can specify a URL and an HTML element ID, and it will dump the result of an AJAX request into that HTML element. Fairly handy. Taconite takes it two levels higher in that you can dump parts of the result in say three HTML elements, or seven thousand. You do this by passing a jQuery selector as a target, so if you go ".pickles", it replaces all HTML elements that have a class "pickles". Nice.

There's more than just replacing that you can do. Suppose you want to hide an element:
<hide select = "#thingy-id" />
All sorts of other commands can be used. Makes it nice and easy to do fancy stuff. You can also use <eval> tags to execute random Javascript.

The icing on the cake is that it is all automatic. You just have to include the Taconite js file, and all your AJAX requests will be Taconite-enabled. However, this doesn't mean every one will be processed by Taconite, only if it is XML-valid (perfect enforcer of standard XHTML) and it contains the <taconite> tag. Other than that, nothing happens - which is slightly annoying at times if you haven't managed to get your XHTML right, but that's what a validator is for.
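Putting the pieces together, a complete Taconite response is an XML document along these lines (the selectors and content here are made-up examples; check the plugin docs for the full command list):

```xml
<taconite>
  <!-- replace the contents of every element with class "pickles" -->
  <replaceContent select=".pickles">
    <p>Fresh content here</p>
  </replaceContent>

  <!-- hide the spinner, then run a bit of Javascript -->
  <hide select="#spinner" />
  <eval>alert("done!");</eval>
</taconite>
```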

So using this led me to some issues with Rails. How do we serve up XML? It's fairly easy, you just have to specify it. Go like this:
respond_to do |wants|
  wants.xml
end
And Rails will look for action.xml.erb. Easy enough. The next problem is a little trickier, although the solution is fairly simple and intuitive. Since some of the "XML" that goes into this file is actually XHTML (it had better be XHTML, or you're gonna be having headaches - remember to put the / in <img />), you're going to be coding part of your view in the Taconite XML file. It works just fine with the eRuby stuff; code like you always have. The problem comes when you start trying to render partials in your XML file. Suppose you go
render :partial => "mypartial"
Since you're currently rendering XML, Rails will look for _mypartial.xml.erb, which may or may not exist. For me, the partials that I wanted to render in the XML were also being rendered in HTML files elsewhere in the app. The first option is to just copy the file from _mypartial.html.erb, but then we have code duplication, which is so not cool dude. Maybe we could use a symlink, but that seems rather hackish and won't work if someone on your team is using Windows - speaking of which, I should write another article on Linux at work, seeing as how I haven't used Windows for work in over two months. The solution is very easy. Instead of the regular render statement, you go like this:
render :partial => "_mypartial.html.erb"
And it will work. Now you can have your cake and eat it too.

I just realized something: I haven't posted an article about how jQuery is so much more awesome than the other Javascript libraries I've used (Prototype/Scriptaculous or MooTools). More on this another day.

Oct 10, 2008

Election Time Baby

It's that time again. The time when our tax dollars get used to put up pictures of people who go on TV and squawk about what they think Canadians want. The problem is that Canadians have very widespread opinions, meaning there probably isn't a single set of issues that the "average Canadian" wants - which probably explains why we have five major parties.

Even given five parties (and several smaller ones that don't usually get many votes), I'm still not sure which one to vote for. They all seem to annoy me.

First we have the current rulers, the Conservatives. They're all about big business, deregulation, etc. Their response to a falling economy is "sweet, a good time to buy stocks!" While I was starting to warm up to these kids, their fearless leader, Stephen Hitler, er, Harper proposed a nasty bill which from my inspection restricts the market far more than it helps it. This gives me the impression that when they say "the economy" what they really mean is "my corporate buddies". So these guys kinda suck.

Then there is the main opposition, the Liberals (they make it nice and clear to us which is left and which is right, although the Liberal party isn't really all that liberal by today's definition). I don't really have much to say about these guys; they're the ones I'm most likely to vote for.

Alright, let's get started with the NDP. Actually, let's not, or I'll be here ranting all day. These are the ones who support the people who think that life is unfair, that they're being exploited, yadda yadda. It's these types of people who make it so that a bus driver gets paid more than a computer programmer. They wouldn't recognize a budget sheet if it came up and stuck itself up their asses (probably where they'd put it if they found one anyway). Not to mention how Jack Layton plasters his picture everywhere; you'd think he got the idea from Stalin. At least he's better looking than Stalin.

I don't support Quebec separatism - I would have to quit my job and move if they separated, which would be really annoying - so the Bloc Quebecois is out.

Finally, the Greens. They haven't actually won a single seat in parliament, despite getting around 5% of the vote last election, and when I read the paper yesterday they were catching up to the NDP in the polls. For a while, they were being refused entry to the election debates. One debate here in Montreal refused the Greens entry, saying "they're not a real party." How democratic of them. I'd consider voting for this group, if simply to lend them a hand.

Of course, what I think really doesn't matter. We've got this wonderful first-past-the-post system that makes it so that if I don't vote for the party who will win in my riding, my vote really doesn't count that much. If history is any guide, the Liberals will win here, given that in the last election they got over 20 000 votes, compared to their closest competition who got about 9000. The previous elections were similar. What to do?

Oct 4, 2008

The Firebug Attack

I've been thinking lately about a possible website attack which has to do with submitting phony data in a form, in an attempt to punch through some weak spots in a web application. For example, suppose in your database you have a field called "admin", which is a simple 0 or 1 which determines access to your admin section, or CMS, or whatever. The default value is 0 (set using the SQL default keyword), which means no access. In Rails you might write that like this:
t.boolean :admin, :default => false
Then when you deploy, you can create yourself a user and set the user's admin flag to 1.

Then suppose you have a signup section. You use a nice and easy way of doing things like what Rails does. In your HTML form:
<input name = "user[username]" /> ...
and in your controller:
user = User.new(params[:user])
This can be done in PHP too:
$user = new User($_POST["user"]);
Assuming that the "user" array is then passed to your object and the fields are loaded in. This increases productivity because you don't have to go in and type out all the damn fields that users fill out on the signup form - this might not actually be a lot, but you never know.

There is a possible exploit here. What if someone duplicates your form on their own computer, adds a hidden input like this:
<input type = "hidden" name = "user[admin]" value = "1" />
Uh-oh: someone could possibly set their admin flag to 1. The SQL default won't save you here unless you manually set the admin flag to zero before it is saved to the database. But do we always do that? We might not have even thought of it.

One thing that Rails now has to protect against XSRF is the token_tag function. With newer versions of Rails (not sure since when, but 2.1 does it) authenticity tokens are enabled on forms by default, so any form you submit must include the authenticity token in a hidden field. Fortunately the Rails form helpers put this field in for you automatically, but if you are rolling your own form it is not difficult to insert
<%= token_tag %>
into the form somewhere. The authenticity token is session-based, so it is nearly impossible to forge it without actually going through the site. This prevents XSRF, but it also prevents the type of attack I mentioned earlier.

All is sunshine and butterflies now, right? Wrong! See, someone with a good heart created this wonderful thing called Firebug. It does all sorts of wonderful things, like Javascript/AJAX debugging, file transfer statistics for images and CSS/JS files, and on-the-fly CSS/HTML editing. It's the last one that is the killer. You can edit the HTML of a page on-the-fly without needing access to the original code, or refreshing the page - a fun trick, edit your buddy's Facebook page, put a nasty picture up, and take a screenshot. Imagine the shock.

The editing HTML aspect is wonderful, I use it a lot to rapidly create features for showing to the non-techie people at work. It can also be used to insert things into a form. You could open up the code for a form that has the nice authenticity tokens there, and plug in the hidden input field that I wrote before. I tested this on my box, and you can make as many fields as you want with Firebug and they will be submitted to the server.

So when you're working with a framework that lets you pass a hash to a constructor to set fields, you should always sanitize the hash first, or reset any sensitive values afterward. This could be a potentially nasty attack vector to a site.
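Rails 2.x actually has attr_protected/attr_accessible in ActiveRecord for exactly this. Framework aside, here's a minimal sketch of whitelisting the incoming hash before it ever reaches a constructor (the field names are hypothetical):

```ruby
# Only these keys from the signup form are allowed through.
ALLOWED_FIELDS = ["username", "email", "password"]

# Drop any key that isn't on the whitelist - including "admin".
def sanitize(params)
  params.select { |key, _| ALLOWED_FIELDS.include?(key) }
end

attack = { "username" => "mallory", "admin" => "1" }
safe = sanitize(attack)
# safe no longer contains the "admin" key
```

Whitelisting beats blacklisting here: if you only strip the fields you thought of, you're one forgotten column away from the same hole.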

Oct 2, 2008

Sick of Web 2.0

Is anybody else getting sick of hearing all about Web 2.0? (pronounced web two-dot-oh or two-dot-zero or two-point-oh or ... oh I give up) All this crap about social networking and advertising based models and revolutions and blah blah blah.

IMHO, this is a phase. Facebook is boring. Nowadays I only use it because people respond to their Facebook faster than their emails. That's pretty much it. I signed up for Twitter, and still fail to see the appeal. Every time I sign in I get blasted by this huge barrage of messages from the people I am following, basically just saying what they are doing. That's nice. Close.

Then there's the look of things. Facebook looks pretty good, but Twitter and all its clones look like they were made by Dr. Seuss on an acid trip and all he had around were a bunch of crayons. The text boxes and fonts are so freaking huge with so much damn padding that I feel like I'm sent back to Giant Land in Super Mario 3. It seems like they've ditched this whole concept of screen real estate - mainly because there isn't really that much to show, so I guess they have to blow everything up.

A lot of the names leave something to be desired. Plurk? Spoink? This sounds like something they do at the end of a porn video (think about it this way and then look at this page). Kids, this is what happens when you smoke marijuana.

Anybody else sharing the same opinions?

Oct 1, 2008

Models of Computation

A random thought: currently our idea of computability revolves around the idea that a Turing machine can be created to accept or reject the input (you could also use the lambda calculus or recursive functions, but for the purposes of this blog entry I will ignore them). For example, I can create a Turing machine to calculate the square of a number. But I cannot create a Turing machine that takes as input another Turing machine t and an arbitrary input x and says whether t will halt on x or not.

A question: Given a computable problem P, is it possible to have a Turing machine that will output a Turing machine that solves P? In English, can we make a computer program that is a programmer?
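For any single, fixed computable P the answer is trivially yes: a Turing machine can just print out a hard-coded solver for P. The interesting version of the question is going from an informal specification to a program, which is what human programmers actually do. The trivial direction, sketched in Ruby (a program that emits and then runs another program):

```ruby
# A trivial "program that programs": it emits Ruby source code
# for a squaring function as a string.
def generate_squarer
  "def square(x); x * x; end"
end

# Load the generated program and run it.
eval(generate_squarer)
puts square(7)  # => 49
```

Of course, this dodges the hard part - the solver was written by a human and baked into the generator, which is exactly the gap between "outputs a program" and "is a programmer".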

A brain can do this. A brain (IMHO) is an advanced device that can compute things. If a brain can do it, can a Turing machine? If not, perhaps there is a more powerful computational automaton than a Turing machine, just as a Turing machine is more powerful than a push-down automaton, which in turn is more powerful than a finite-state automaton?

Thinking about this kind of thing makes me want to go back to school.

Sep 30, 2008

Ruby Scoping Gotcha

One little thing you may need to remember when coding in Ruby. Consider this little program:
myArr2D = [ [2, 2], [2, 2] ]

myArr2D.each do |m|
  x = m.map { |m| m * 2 }

  puts m.class.to_s
end
Intuitively, this should output this:
Array
Array
However it outputs this:
Fixnum
Fixnum
What happens here is that in the map block, the parameter m is the same variable as the outer m, so each iteration of map reassigns it - after the call to map, m holds the last inner element, a Fixnum. (This is Ruby 1.8 behaviour; in Ruby 1.9 and later, block parameters are always local to their block, so this code prints Array as you'd expect.) This might lead to some unexpected side effects, so make sure you keep it in mind...
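The fix is simply to give the inner block parameter a distinct name, which behaves the same in every Ruby version:

```ruby
myArr2D = [ [2, 2], [2, 2] ]

classes = myArr2D.map do |row|
  doubled = row.map { |n| n * 2 }  # row and n are distinct names, no shadowing
  row.class.to_s                   # row is still the inner Array here
end
# classes is ["Array", "Array"]
```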

Sep 29, 2008

Vimperator

A few days ago on a previous post I posted about how I was getting used to Vim and was liking it. A commenter mentioned Vimperator, which is a Vim-like plugin for Firefox. It basically takes Firefox and gives it a Vim-like interface.

It's pretty good. One may wonder why you would want something like this for a web browser, which is an inherently mouse-based application.

If you think about it, what are the main things you do with a web browser? Open URLs, scroll, type text into text areas, and click links. At least that's what I mainly do. What Vimperator does is put most of these into keyboard commands.

For typing a URL, instead of pressing Ctrl+L (or Alt+D) to focus the address bar, you push 'o'. It is a much less awkward shortcut. What it does is begin entering the :open command, and then you type whatever it is you want. It even combines the address field with the search field, so that if you enter something that isn't a web page it sends you to a Google search. It also has tab-completion, which is nice.

For typing things into input boxes or text areas, and for clicking links, you have "hints mode". Press 'f', and it pops up a little number next to all the interactive components on your screen. You then hit the number and it acts as though you clicked the component (to open something in another tab, use 'F' instead). Pretty neat! The only problem with it that I've found is that in GMail half the "links" are actually span tags with onclick attached to them. This confuses Vimperator, as it doesn't realize that they are interactive components and doesn't give them numbers in hints mode. Note that you can still use everything the way you used to.

Finally, scrolling works the way it always has, but you can also use the hjkl shortcuts to move around the page. Sounds useless, but it saves you the effort of moving your hand all the way over to the arrow keys, and then moving it back when you're done scrolling.

My other Firefox plugins - Firebug, Greasemonkey, HTML Validator - all work as they used to.

It's probably not for everybody; it is a power tool. By default it takes away all the stuff at the top like the back button, menu bar, etc. You can get these back, but it is not the default. I wouldn't recommend it to people who won't take the few minutes to get used to it. I've had people sit down at my computer with Vimperator enabled who had no clue how to type in a URL (just press 'o'). So that's your warning. I like it, but you may not.

UPDATE: I stopped using Vimperator because it makes Firefox very slow. For a while I just thought it was Firefox, but it ended up being Vimperator so I scrapped it.

Sep 28, 2008

JRuby and SQLite

I've heard some fuss about JRuby not supporting SQLite. Personally, this doesn't bother me since I've stuck with MySQL, but some people might be interested. Here's how to get it working (this is for Ubuntu, but I don't think it'd be that different on other systems so long as you know how to edit files and install gems).
jruby -S gem install activerecord-jdbcsqlite3-adapter
This installs the JDBC Sqlite adapter, and should include any dependencies. If it doesn't, you need:
jruby -S gem install jdbc-sqlite3
After that, just edit config/database.yml to use the jdbcsqlite3 adapter instead of the normal one:
development:
  adapter: jdbcsqlite3
  database: db/development.sqlite3
  timeout: 5000
Presto! You're done.

Note that I'm assuming you're using Rails here; if not, you don't need the activerecord gem.

Sep 27, 2008

Ubuntu Game Experiment

On occasion I like to revisit the Linux gaming scene. No, it is not because I like to see horrible failures, rather I've thought of an interesting experiment.

Linux will not catch up to Windows or consoles in terms of hard-core games, at least in the next few years. These games take a lot of manpower to produce, and if the leaders of that manpower do not want to release on Linux, then it won't get released on Linux. Even under wine, the performance sucks a bit - I have Oblivion and Guild Wars working fine under wine right now, but they get a much lower FPS. I'd rather just reboot to Windows and use my graphics card to its full potential (I paid for it didn't I?).

I think too many people are trying to get the hardcore gamers to switch to Linux by making their games work. Personally, I think this is a bad idea. Hardcore gamers are among the biggest bitches I've ever seen, just go on Battle.net or something and listen to them talk. It's retarded. Why would we want these people polluting Ubuntu forums with their crap?

What more focus should be put on is the other 90% of gamers. The ones who like Frozen Bubble or Those Funny Funguloids, and only play once in a while - among this 90% are those people called girls, which last I checked are severely lacking in the Linux world.

Anyway, the moral of my story is that instead of targeting the niche market of hardcore gamers as would-be Linux converts, why not focus on everybody else?

Sep 25, 2008

Rails Fixtures Order

So I've been having some trouble with Rails and fixtures. What I have in my fixtures are two tables, we'll call them t1 and t2. There is also a join table between these two tables, which has some info about the relationship between rows in the two tables.

Now suppose I have n1 fixtures for t1, and n2 fixtures for t2. That means there are up to n1 × n2 fixtures in the join table - in my case, one for every pair. It would be a huge pain in the ass to enter all that data into the fixture manually. So what I do is just
t1 = Table1.find(:all)
t2 = Table2.find(:all)

t1.each do |r1|
  t2.each do |r2|
    # output fixture YAML
  end
end
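Fleshed out, the generator might look something like this (the table and key names are hypothetical, and I'm faking the find(:all) results with plain id arrays so the sketch runs standalone):

```ruby
require 'yaml'

# Stand-ins for Table1.find(:all) and Table2.find(:all)
t1_ids = [1, 2]
t2_ids = [1, 2]

# One fixture entry per (t1, t2) pair, keyed by a generated name.
fixtures = {}
t1_ids.each do |a|
  t2_ids.each do |b|
    fixtures["join_#{a}_#{b}"] = { "table1_id" => a, "table2_id" => b }
  end
end

puts fixtures.to_yaml
```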
There is one problem with this. If the fixtures for Table1 and/or Table2 are not run before the fixture for JoinTable, then you're going to run into problems.

There are two things to do. The first one is in any controller test class, when you put your fixtures thing at the top, you put it like this:
fixtures :table1, :table2
fixtures :join_table
This seems fairly intuitive, but my first intuition was to put it like this:
fixtures :table1, :table2, :join_table
Then I had an epic brain fart trying to figure out why it wasn't working.

The second thing only needs to be done if you use the db:fixtures:load rake task, which loads your fixtures into your development database - very handy for coding. When you're in the development phase of your app, you don't need to create new migrations for your DB; just edit the old migration and run db:migrate:reset. Makes things cleaner and easier to follow IMO.

However, this loads things in alphabetical order. You could name all your tables to be in the order that they should be loaded, but this is slightly annoying. The solution is to tweak your environment.rb file. Just add (EDIT: the other one didn't always work for me, I changed this so it does work):
ENV["FIXTURES"] ||= "table1,table2,join_table"
to config/environment.rb, and you will get the correct loading order.

EDIT: Always remember that when you add a new model, you'll need to manually add it to this list or your fixtures for that model will not be loaded. Learned this one the hard way, wondering why the fixtures were being loaded properly for tests, but not for the dev database.

EDIT (again): This doesn't always work, but it seems to work more often than if you didn't put this. The best option would be to load in fixtures, load whatever time-dependent stuff you have manually, and write a small script to export the DB into the YAML fixtures. That's what I ended up having to do finally, and it works like a charm.
If you don't want to do this, another option is to create a script to do it for you. So instead of running the rake task directly, you can have a script like this:
`rake db:fixtures:load`

# run whatever tasks you need ...
Run this script with script/runner so that it has access to your Rails models and what-not, and you'll be able to generate data automatically. You'll also have to load your file in from test/test_helper.rb during the setup() method so that your fixtures get loaded properly into tests.

Sep 24, 2008

WEBrick and Authentication

Picture this scenario: You are working on a Rails project. Your team (not just dev people, but any others like marketers, etc.) is distributed, so you're not all in the same office - and hence can't have any sort of internal network. You have a server somewhere for centralizing things via SVN, and you put other tools on it like Trac. That kind of thing is pretty easy, and with Apache you can just throw up some AuthType Basic stuff to keep unwelcomes out.

However, I want to make it so that the development version of the web app is viewable to non-dev people. Now for dev people, it's a requirement that they can get the code onto their machine and use it without relying on the central server to do work. So they have to be able to get MySQL up and running, install Ruby (or in the case of my project, JRuby), and anything else. But for the non-techies, how do they get everything up and running? They're probably running Windows too (it's funny, the entire dev team that I'm working with runs Mac, except me, who runs Ubuntu), which means that installing MySQL and all that will be a pain in the ass.

The first thought is maybe use Glassfish or something to deploy the semi-finished app, and then put some password lock. But that sounds like a lot of work. You need to WAR that shit up, and re-deploy it every time you do an update. Not cool. Why not just use WEBrick, which comes with every Rails project, and is as simple as going 'jruby script/server'?

The problem is when you want to password protect everything. Ideally, we don't want to have to make code changes. We want it so that on our local machines, we don't have to enter a password to see the site.

The first solution was to use Apache for authentication, then proxy over to WEBrick, whose port (3000) is not open to the outside world. This would work in theory, except that mod_proxy gets invoked before any authentication can happen. So even with the auth statements in there, it still just proxies over to WEBrick without asking for anything. Not cool.

Next solution: authenticate, then rewrite. Put in some authentication stuff, then mod_rewrite everything to localhost:3000. Authentication worked, rewrite didn't. I have no idea why. I would put in [P], but that would give a 404. Using anything else would result in a direct rewrite, and would redirect you to your own localhost:3000, which obviously would not give anything unless you had WEBrick running on your local machine (good thing it wasn't, or I would've been mightily confused until I looked at the address bar).

So my final solution was to modify the code. This in itself was a pain in the ass. There are many different ways to use HTTP authentication with Rails. Rails has HTTP authentication baked in, but not against our htpasswd file. This meant that everybody had to have another username and password stored with the application just to access this little thing. As a coder, I find this level of duplication revolting, so I attempted to write a little bit of code to check the entered password against our htpasswd file. On Linux, by default, htpasswd uses the system's crypt() function to encrypt things, which in Ruby translates to String#crypt. It unfortunately takes a salt (well, fortunately for security reasons, unfortunately for me since I didn't know the salt). I couldn't figure out the salt, so that ended up being wasted effort.
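(For the record, I later learned that with traditional crypt() hashes the salt is stored as the first two characters of the hash itself, so you can verify a password by passing the stored hash back in as the salt. A sketch, assuming a DES-crypt htpasswd entry:)

```ruby
# With crypt(3), the first two characters of the stored hash ARE the
# salt, so crypting the candidate password with the stored hash as the
# "salt" reproduces the hash exactly when the password matches.
def htpasswd_match?(password, stored_hash)
  password.crypt(stored_hash) == stored_hash
end

stored = "secret".crypt("ab")  # what htpasswd would have stored
puts htpasswd_match?("secret", stored)
puts htpasswd_match?("wrong", stored)
```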

Then I found this beautiful thing. It is a plugin for Rails that lets you use an htpasswd file for HTTP authentication. It probably does more than that, but this is exactly what I wanted - well almost, I didn't want to make any code changes, but c'est la vie. It was one line of code:
htpasswd :file => '/path/to/passwords'
Put that in app/controllers/application.rb, and you've got your password locking. Now I can make WEBrick accessible to the world, and only the people with a username/password can see anything. Awesome.

I learned a lot during this adventure, about Apache and Rails Authentication and (rant alert!) how frigging useless #rubyonrails is when you have anything slightly advanced to do. I've spent a fair bit of time in there, and can answer the majority of questions people ask, because for the most part they are asking the questions because they're too lazy to read a good Rails book or google for an answer (which is what I do sometimes when I don't immediately know the answer). Every time I have asked something in there it has been something relatively advanced, and the response is either "figure it out for yourself", or silence. The first is a reasonable enough answer, given the standard questions that get asked in there, but not entirely helpful...what do they think I've spent the last hour or so trying to do? Silence is ok too, since if you don't know the answer then you're not expected to say anything. But still, both results are pretty useless.

What is still on the table: How to get WEBrick to run as a daemon with JRuby. The JRuby implementation has disabled the use of fork(), so using the -d flag for WEBrick is not an option. I'll have to write a daemon script or something.

Sep 22, 2008

Vim, revisited

A few months ago (just over 3, in fact) I posted about having begun to learn vi1. Since then, I've learned a lot about it, discovered several plugins, and caught myself many times hitting Esc after typing into Firefox, or using hjkl to navigate in a text area. The only other program I now use to edit text is OpenOffice, mainly because it's a little difficult to get fancy charts and things into Vim. Also, I once submitted a paper in monospaced font with no formatting; the prof might have been a little annoyed. You might think it overkill to use Vim for something simple like jotting down notes, but the funny thing is that Vim starts a fair bit faster than any other graphical text editor on my machine, like GEdit or Kate.

I've replaced my IDEs with it. I used to use Quanta, but it is slow to boot and unstable once in a while - sometimes it crashes when I use the built-in FTP. And when I say crash, it takes down not only Quanta, but the entire X server. Slightly annoying.

Your productivity is improved by a fair bit when you start using this. Due to the mode-based editing, it is much easier to type commands than using Ctrl/Shift/Alt/some-combination-of-the-three, especially when you want more complex things. Want to delete a line? Press dd. Swap two characters? xp. One I use a lot is Ctrl+6 (or Ctrl+^ without pressing Shift), which opens the last file you had open. Kinda like how in Half-life you press q to get to the last weapon. On that note, I wonder how games would play if you could differentiate between q and Q... I guess you wouldn't be able to use Shift for sprint anymore.

You can even record a set of keystrokes and bind it to a key: press q, then the key, call it k. Every key you type will be recorded. Then press q to stop recording. Later on when you want to replay the recording, press @ then k. My only problem with this is that @ is a little awkward to hit over and over, but there is probably a way to rebind it.

It doesn't just end with the keyboard shortcuts. There are plenty of plugins for Vim. I have three favourites:
- VTreeExplorer - it is a window in Vim (btw in Vim you can split windows, just like most fancy editors) that shows a directory tree. Very handy. Others have written about it too.
- Surround - alongside Vim's motions (like dw or d$, which are delete-word and delete-to-end-of-line, respectively) you can now use s, which affects the surroundings around a bit of text. Type ds( to delete the parentheses around something. Type cs{[ to switch curly brackets to square brackets.
- Vim's Rails plugin - This does more than just syntax highlighting. It adds some very helpful things for file navigation (something that is a fair bit annoying in Vim). If your cursor is over a model or controller name, you can press gf to go to that file. If you're in a view or model, press :Rcontroller (this uses tab auto-completion too, so just type :Rcont and hit tab) to jump to the controller. Similarly for jumping to models. If you're in a controller action, you can jump to the view. It's all pretty handy, and there are probably plenty of shortcuts that I don't know about. You can read here to learn more about Vim+Rails.
Note: I know at least one person is going to mention Textmate. Two things: I don't use a Mac (nor do I intend to any time soon), and I don't like to pay for software.

So it's been over 3 months and I'm not turning back. In fact, this was pretty much the case after a few weeks, and I am continually learning more. I recommend it to any programmer. You can also try Emacs too, I think it does the same kind of stuff and it is mostly a matter of preference - kinda like Ruby vs. Python ;).

1 Technically, it's GVim, which is the graphical version of Vim, which is an open-source remake of an older text editor called vi, but these are just details.

Sep 18, 2008

The 64-bit Revolution

I've been wondering recently: how long after 32-bit processors came out did it take for everyone to switch to 32-bit? 64-bit processors have been out for quite some time now, and pretty much any new computer you buy nowadays is 64-bit. Yet support for 64-bit systems, while growing, is still fairly limited - unless you're on Ubuntu, where pretty much everything works; I'm told Flash is a big pain to set up on 64-bit Gentoo.

In my efforts with FreeGamage, I discovered many games which wouldn't work right on 64-bit. I dug through the code a bit, and a lot of the time it's because they assume pointer sizes are the same as int sizes, which on my machine they are not: pointers are 64-bit, ints are 32-bit. This leads to problems, as they'll assign a pointer value to an int, which cuts off the upper 4 bytes, and then there are segmentation faults when they attempt to convert the int back into a pointer and dereference it. This is bad practice, but it doesn't stop people from doing it.
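You can see the damage in a couple of lines - here in Ruby, where integers are arbitrary-precision, so we mask explicitly to simulate storing a pointer in a 32-bit int (the pointer value is made up):

```ruby
# A hypothetical 64-bit pointer value with bits set above bit 31
ptr = 0x00007f3a1c2b0010

# Storing it in a 32-bit int keeps only the low 32 bits
truncated = ptr & 0xFFFFFFFF

puts truncated == ptr  # false - the upper 32 bits are gone
puts "%x" % truncated  # 1c2b0010
```

Cast `truncated` back to a pointer and dereference it, and you're reading some unrelated (or unmapped) address - hence the segfaults.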

I believe that the big switch to 64-bit computing probably won't come for a while. Why? What are the benefits of switching to 64-bit? It's hard to say, but I don't know if it is that much faster than 32-bit. You lose compatibility with many 32-bit programs. You need different drivers - even in Windows this is a problem.

The major limitation of 32-bit that I can see is when it comes to RAM. Let's do a little math. In theory, a 32-bit register (ie. the memory address register used to access RAM) can hold 2^32 possible values. 2^32 = 2^2 * 2^30 = 4 * 2^30. As most of us geeks know, 2^30 bytes is a gigabyte. So the maximum amount of RAM that can be addressed with a 32-bit register is 4GB. In practice it's slightly smaller than this - Windows XP on my desktop can only see 3.2GB of the system's 4GB, since part of the address space is reserved for devices. On a 64-bit system, you can theoretically have up to 2^64 = 16 * 2^60 bytes = 16EB of RAM. We don't even see EB in practice yet, not even on a hard-drive. Hell, we don't even see PB in practice yet, which is about 0.1% of an EB. An EB is an exabyte: a million TB, or a billion GB. That's a lot of RAM. Maybe we'll see EB hard-drives by 2020, but that much RAM is still pretty far away.
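The arithmetic, double-checked in Ruby (where big integers are exact):

```ruby
GB = 2**30
EB = 2**60

puts 2**32 / GB  # 4  -> a 32-bit address space tops out at 4GB
puts 2**64 / EB  # 16 -> a 64-bit address space tops out at 16EB
```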

Because of the limitations on RAM, anywhere you see RAM intensive processes, you should see it switching over to 64-bit. On the desktop, I'm thinking one of the first ones to do this will be games. Games demand gobs and gobs of RAM, and coupled with Vista, which also demands gobs and gobs of RAM, you're probably going to be hitting your 32-bit peak pretty quickly. A nice 64-bit system will be able to handle all the RAM that games will require in the next few years or so.

Sep 15, 2008

Coding Tests in Interviews

I'm reading this Slashdot article about IT professionals being tested in interviews. There seem to be two schools of thought: some people hate it and think it's disrespectful, others think it's mandatory. What about you?

I'll be honest, I've never been asked to write code in an interview (as far as I can remember). This is probably because either I did so poorly on the pre-code sections that they didn't even want to bother, or I did so well on the pre-code sections that it was just assumed I could code. One company I worked for started giving coding questions after I had started, but I never had to take it (I probably would have mucked up a few questions).

This is a difficult question. The argument Slashdot puts forward is that companies don't ask other professionals these kinds of questions in interviews, so why should they ask IT professionals? It seems fairly disrespectful, they say. If somebody has several years of experience, a degree, certifications, etc., that should be a fairly good indication that they can code.

On the other hand, there are a lot of bad coders out there. Some of them even made it through university. Imagine random guy Joe in high school, with excellent interpersonal skills and a passing interest in computers. He hears one day through some survey that graduates of computer science (or a related discipline that has computer or software in its name) get paid a fair bit more than everybody else. He goes to school to study this stuff, picks a school with a relatively easy program, and squeaks through with 50's. He goes off and works for some smallish company for a while doing simple in-house software, doesn't do it too well but they keep him around anyway because as mentioned before, he's a nice guy. He's now got a few years experience under his belt, plus a shiny degree, which can pad up his resume fairly well (it'd probably look better than mine).

Now Joe goes and applies at a new place, hoping for a higher paycheque. What happens if they don't give him code samples? He could probably get in fairly easily. He's got experience, he's got a degree, and he's a nice guy. What more could you want? Well, Joe can't write a program that swaps the values of two variables. Nor can he write FizzBuzz. Yet he gets the job over some new grad hotshot who went through with A's but has no real work experience.
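For anyone who hasn't seen them, both of Joe's stumbling blocks really are throwaway questions. A sketch of each in Ruby:

```ruby
# Swapping two variables is one line in Ruby (no temp needed):
a, b = 1, 2
a, b = b, a

# FizzBuzz: multiples of 3 -> "Fizz", of 5 -> "Buzz", of both -> "FizzBuzz"
def fizzbuzz(n)
  (1..n).map do |i|
    if i % 15 == 0 then "FizzBuzz"
    elsif i % 3 == 0 then "Fizz"
    elsif i % 5 == 0 then "Buzz"
    else i.to_s
    end
  end
end

puts fizzbuzz(15).join(" ")
```

If a candidate with years of "experience" can't produce something like this in a few minutes, that's exactly the signal the coding test is there to catch.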

So this is a perfectly good reason why companies would give coding tests during interviews. Do I blame them? Not really. However, it does depend on the coding questions. The company I mentioned above (which happened to be the first job I got out of university) gives a CSS question: make a box on a web page with rounded corners that can expand to fit any amount of text. Pretty easy, right? I probably wouldn't have gotten it. My CSS skills when I graduated were sadly lacking - I'm still not incredible with it, which makes me wonder why I'm a web programmer. I probably would have given some crap with <img> tags relatively positioned a bit, with some IE-specific file somewhere. I didn't know about the background-image property. Yet it took about 2 minutes of looking at how someone else did it for me to be able to replicate it. Some coding tests, from what I've seen, fail to take into account that people are capable of learning.

This is where the 3-month probation period comes in. I think that this time is good enough for both the company and the employee to see if they are a good fit for one another. If the coder sucks, then the company can let them go after the 3 months. If the company sucks, the coder can leave. It's not perfect, but it is a whole lot better than letting the interview alone decide everything.