
Jan 19, 2010

Best Thing About Rails...

Q: What's the best thing about being a Rails programmer?
A: Hitting on homophobic Django programmers and watching them freak out.

Jun 4, 2009

Glassfish, JRuby and Initialization Parameters

Sometimes in your Rails app you may want to pass some variable to your program, via an environment variable or something. With a regular Rails app, you can just do `export VAR_NAME=something` and it will pick it up just fine. However, if you're deploying to a container-based server like Glassfish or Tomcat, you'll have to pass the config values a different way, because the container doesn't seem to pick up the environment variables.

You can do this with your warble config. If you haven't already done so, create your warble config file like this:
jruby -S warble config
This will create a file at config/warble.rb in your Rails app. Inside there you'll see all sorts of config settings for the JRuby environment.

What I wanted to do was output the SVN info on the site, so that when we were testing in the Glassfish environment we could see what revision we were working with. So I just had this little snippet of code:
`svn info` =~ /Revision: (\d+)/
revision = $1.to_i
Spit that out on the page somewhere and everyone can see what revision they're working with!
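Pulled out into a little helper, the parsing step looks like this (parse_revision is my name for it, not something from the original snippet):

```ruby
# Parse the revision number out of `svn info` output.
# Returns nil if the output doesn't look like svn info.
def parse_revision(svn_info_output)
  svn_info_output =~ /Revision: (\d+)/ ? $1.to_i : nil
end

parse_revision("Path: .\nRevision: 4211\nNode Kind: directory\n")  # => 4211
parse_revision("not svn output")                                   # => nil
```

Keeping the regex in one place like this also makes it trivial to reuse in both the warble config and anywhere else you want to show the revision.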

However the hard part is getting this into the WAR. What you have to do is pass it as a parameter in the warble config, like this:
config.webxml.svn_revision = revision
When warble runs, it will output this into the web.xml file within the WAR, which will get passed to your server.

To access this value from within the Rails app, you need something like this:
if defined?(JRUBY_VERSION)
  revision = $servlet_context.getInitParameter("svn_revision")
end
If you're only ever running in JRuby then you don't need the if defined? junk.

The $servlet_context object is a Java object that represents the servlet context. I actually don't really know what that is, but a quick search of the docs gives a handy bit of info and the one you want is the getInitParameter method.

Feb 25, 2009

JRuby on Rails and Development Efficiency

I've been working with JRuby for about six months now, and it has been pretty good. It has native thread support, and you have a number of different options available from the Java world.

As a deployment strategy, JRuby is pretty solid IMO. However, from a development perspective it is a bit slower than MRI. The biggest thing you notice is JVM startup time. This is fine if you're just running Mongrel or WEBrick or something, but when you have a bunch of small scripts or Rake tasks to run, or something to play around with in irb, it is quite annoying to wait that extra few seconds for the JVM to load. Also, for some reason my test suite takes way longer to run in JRuby than with MRI. Oh well, whatever.

Another problem is that many gems are native, and therefore not available to JRuby. At this point JRuby has enough of a following for popular gems to have a JRuby port somewhere, but the matter of finding it and getting it to work on all your developers' machines is a pain in the ass. Better to just do it once on the deployment machine(s) and be done with it. Some examples of gems that don't work in JRuby: rcov, RMagick, mysql, anything to do with datamapper. The memcache-client gem used to work: version 1.5.0 works fine, but the latest one fails.
EDIT: There's been a bit of confusion about what I meant here. What I mean is that the gems in the repository do not work with JRuby, so going 'jgem install GEM' does not work; you have to find the port online. This isn't usually that difficult, but it is a bit more time-consuming than the standard way of doing things.

However I'd say JRuby is great for production for a few reasons. First off, it has access to native threads. I believe Ruby 1.9 uses native threads, but my Rails app currently does not work with the Ruby 1.9 available in the Ubuntu repositories and I'd rather not have to maintain a new Ruby install unless absolutely necessary.
JRuby also has access to a wider range of application servers. Mongrel works well with JRuby, and any other web server written in Ruby should work fine as well. JRuby can also be deployed as a WAR with any application server that uses WAR files. We're using Glassfish, but I think you can do it with Tomcat and others too.
Finally, JRuby has access to Java libraries. Say what you will about Java the language, there are a ton of Java libraries out there. For basic stuff, Ruby has pretty much everything it needs, but when you want to move outside of web development things get sparse quickly. Want to write an OpenOffice plugin? JRuby can do it by using OpenOffice's Java API. Want to use an NLP tool like GATE? The API is in Java. Where are things like this for Ruby?

Anyway, IMO the ideal setup is:
development - Ruby, unless you're using some Java libraries like I mentioned above
production - JRuby
This may change as Ruby 1.9 gets better, but at the moment I'm liking the above setup.

Dec 21, 2008

Rails and Processing Uploaded .zip Files

I'm doing a little side project at the moment that allows a user to upload .zip files and the Rails app will process the contents. It turns out to be quite a pain in the ass to get going! The main reason is that the Ruby gem that handles .zip files only works with files, and with Rails you're not actually guaranteed to get a File object when someone uploads a file to you.

Let's start with the basics. Suppose you have an action like this:
def upload_to_me
file = params[:the_file]
end
Assume that the_file is an item uploaded from a form in a file input field. Rails will automatically process all this for you and handle the temp file creation and all that. However, one optimization Rails makes is that if the file is smaller than 10kB, it just sticks the contents in an UploadedStringIO object, which is not a file - so there is no temporary file.

Let's expand our action. We want to open up this file (assume it is a .zip) and take a peek at the contents:
Zip::ZipFile.open(file.path) do |zip|
end
The ZipFile object only accepts a filename. There is no way for you to pass in anything else, like say an IO object. So we have a predicament. The UploadedStringIO object we have is raw zipped data, but we can't actually unzip it because it is not a file.

What's the solution? It's ugly, but turn it into a file:
if file.is_a?(UploadedStringIO)
  temp_file = Tempfile.new("some_temp_name")
  temp_file.binmode # zip data is binary, so don't let newlines get mangled
  temp_file.write file.read
  temp_file.close
  file = temp_file
end

# now file has a real path on disk and can be treated like a File
We use Ruby's Tempfile object, which stores things in a temporary folder (by default on Ubuntu it appears to be /tmp) and is designed to be thread-safe so that you don't have to worry about people clobbering each other's temp files.
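Here's a runnable sketch of that conversion, with a plain StringIO standing in for Rails' UploadedStringIO (ensure_file is a name I made up for illustration):

```ruby
require 'tempfile'
require 'stringio'

# Anything that already has a real path on disk is returned as-is;
# anything else (like an in-memory upload) gets copied into a Tempfile.
def ensure_file(upload)
  return upload if upload.respond_to?(:path)
  temp_file = Tempfile.new("uploaded_zip")
  temp_file.binmode              # zip data is binary
  temp_file.write upload.read
  temp_file.close                # keeps the file on disk; path stays valid
  temp_file
end

file = ensure_file(StringIO.new("PK\x03\x04 fake zip bytes"))
File.binread(file.path)          # => "PK\x03\x04 fake zip bytes"
```

After this, `file.path` can be handed to Zip::ZipFile.open regardless of how Rails delivered the upload.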

I suppose since I have access to both Rails' code and the Zip gem's code, I could probably hack this stuff to make it work properly without being ugly, but this small fix should be enough for now. A good optimization would be to add something to ZipFile so that it can accept a IO object and not just a filename.

Nov 27, 2008

Rails 2.2 and GetText

So Rails 2.2 has come out recently, and it's touting some nice features. These include internationalization (i18n) and thread safety - the thread safety aspect isn't so great for C Ruby, but is excellent for JRuby, which uses a different threading system.

There is one problem though, and that's for anybody (like us) who is using ruby-gettext, which is an i18n gem for Ruby and Rails for doing translations. The problem is that gettext depends on plenty of the little things in Rails that were not thread-safe, and are consequently no longer there. If you even attempt to load the Rails environment with the gettext gem included (by doing something like script/server, or a rake task), it will blow up in your face.

The solution right now is not that cool. You've got a few options:
1) Revert to 2.1.2. Kinda sucks, since we're using JRuby for our app and I was looking forward to the fancy schmancy new stuff.
2) Fix the gettext issues yourself. Really sucks. While like a good FOSS boy I should be all over this one, time is my most limited resource and well, it has to go to things that either keep me fed or keep me sane.
3) Wait for the people behind ruby-gettext to fix it. Also sucks, because we have no idea when that will happen.
4) Rip out all the gettext stuff and replace it with Rails' new i18n stuff. Probably the best solution, however this takes more time than 1) and so is not on the immediate horizon.

So if you're using gettext, remember not to update to Rails 2.2+ without first ditching gettext. Or if you were thinking of using gettext, I'd recommend saying "screw that" and going with Rails built-in stuff.

UPDATE (Jan. 11/09): We have gone through and removed the gettext stuff from our app. Fortunately we weren't dependent on any gettext-related tools, so this was not a difficult task: just replace all the _() calls with t() calls (can be done easily with NetBeans using refactoring tools), plus some configuration details. Was pretty simple, fortunately.

Nov 12, 2008

Montreal on Rails

I'm doing a presentation at the upcoming Montreal on Rails gathering, on Nov. 18. My presentation is a simple introduction to jQuery and jRails, and why you might want to use them instead of Prototype and Scriptaculous. I haven't thought of a cool name for it yet, stay tuned for that.

If you're in the Montreal area and like/use/are-interested-in-using Rails, you should come check this out as there are plenty of interesting talks and cool people to meet.

Oct 15, 2008

TDD

I have a confession to make. I've never really been huge on the testing. At least not automated testing. I haven't been in the biz that long, so it's probably not a huge issue, but in hindsight I think a fair bit of unit testing on previous projects definitely would have helped.

With PHP if you're not using a pre-built framework or something, it might not be so easy to do unit testing. From my limited experience, you need to break things up into small units in order to do unit testing (gee, whoda thunk). While this is usually good development practice, in PHP it is not really enforced. I can think back to some projects in my younger days when things were kinda nasty.

Then you might start implementing some sort of basic unit testing by separating out some common functions and writing tests for those to see if they work. You might even separate stuff into distinct chunks called actions. Those can be tested fairly well for behaviour, because you can just fire some $_GET or $_POST crap and then check the DB to see if the correct things happened, etc. Bit harder to check the display though; you'll have to dig through a fair bit of outputted HTML to see if things are correct. Unless you're using some sort of template engine, in which case you might have to get some hooks in there to see what is getting passed over.

Jump over to Rails. You've got everything really broken up into bits, there's the controllers that simply spit out raw Ruby objects which are very easily testable. There's the models, which are also very easily testable. Finally they have integration tests, which make sure that if you do a whole bunch of actions in a certain order, the correct result happens and nothing explodes in your face. It might be that each specific piece works fine, but suppose a bunch of actions are going off in succession and somehow one borks the data that a later one wants and everything explodes...this is much easier to catch through integration testing.

So now I've got a Rails app with a whole whack of tests (and believe me, it's not enough) that test a lot of basic functionality. Then the boss comes along and says, "we need this changed! And this and this and this! By tomorrow1!" So I pile on all this functionality and make a big fat thing with code shooting out everywhere and the other coders are crying because there's all this stuff everywhere and they're having trouble following it. The next plan is to sit down and have a good refactor. Clean out some crap that is no longer in use, merge a couple other things that are doing something very similar, blah blah blah. The main problem with refactoring is that stuff breaks. In places you never knew it could possibly happen, like when you work out your arms at the gym and the next day your thigh hurts or something. Fortunately, you've got this wonderful command called "rake test" which then runs all your tests and gives you a nice little error report of everything that died. Then you quickly go through the list, fixing the little things here and there and run the test again. So much faster than this: I-make-change, fix-obvious-bugs-that-I-find, user-finds-bugs, I-track-them-down, I-fix, user-finds-more-bugs, etc.

The more I do this stuff, the more I wonder how I could have possibly coded without all this junk. I guess coming from my static typing world of C++ I had that wonderful thing called the compiler check my code for correctness, but once we're out here in PHP/Ruby-land we don't have such a luxury and need something else to watch our backs (not that unit testing isn't needed in C++, just slightly less). How much time would I have saved had I used more test-driven development from the beginning?

1 This is a bit of an exaggeration, but I'm sure you get the idea.

Oct 14, 2008

Taconite

I'm working with a jQuery plugin called Taconite (not sure how to pronounce it, is it a soft C or hard C, long A or short A?) that is very handy for doing a whole bunch of jQuery commands with an AJAX request. Basically what you have to do is use XML, and the XML tags are the commands that you want to be executed.

There are a few things that I really like about it. First off, it makes it easy to have multiple updated divs. See with a regular updater object (like in Prototype) you can specify a URL and an HTML element ID, and it will dump the result of an AJAX request into that HTML element. Fairly handy. Taconite takes it two levels higher in that you can dump parts of the result in say three HTML elements, or seven thousand. You do this by passing a jQuery selector as a target, so if you go ".pickles", it replaces all HTML elements that have a class "pickles". Nice.

There's more than just replacing that you can do. Suppose you want to hide an element:
<hide select = "#thingy-id" />
All sorts of other commands can be used. Makes it nice and easy to do fancy stuff. You can also use <eval> tags to execute random Javascript.

The icing on the cake is that it is all automatic. You just have to include the Taconite js file, and all your AJAX requests will be Taconite-enabled. However, this doesn't mean every one will be processed by Taconite, only if it is XML-valid (perfect enforcer of standard XHTML) and it contains the <taconite> tag. Other than that, nothing happens - which is slightly annoying at times if you haven't managed to get your XHTML right, but that's what a validator is for.

So using this led me to some issues with Rails. How do we serve up XML? It's fairly easy, you just have to specify it. Go like this:
respond_to do |wants|
  wants.xml
end
And Rails will look for action.xml.erb. Easy enough. The next problem is a little trickier, although the solution is fairly simple and intuitive. Since some of the "XML" that is put into this file is actually XHTML (it better be XHTML, or you're gonna be having headaches - remember to put the / in <img />) you're going to have to be coding part of your view in the Taconite XML file. It works just fine with the eRuby stuff, just code like you always have. The problem will come once you start trying to render partials in your XML file. Suppose you go
render :partial => "mypartial"
Since you're currently rendering XML, Rails will look for _mypartial.xml.erb, which may or may not exist. For me, the partials that I wanted to render in the XML were also being rendered in HTML files elsewhere in the app. Now the first option is to just copy the file from _mypartial.html.erb, but then we have code duplication which is so not cool dude. Maybe we could use a symlink, but that seems rather hackish and won't work if someone on your team is using Windows - speaking of which, I should write another article on Linux at work, seeing as how I haven't used Windows for work in over two months. The solution is very easy. Instead of the regular render statement, you go like this:
render :partial => "_mypartial.html.erb"
And it will work. Now you can have your cake and eat it too.

I just realized something, I haven't posted an article about how jQuery is so much more awesome than the other Javascript libraries I've used (Prototype/Scriptaculous or Mootools). More on this another day.

Sep 28, 2008

JRuby and SQLite

I've heard some fuss about JRuby not supporting SQLite. Personally, this doesn't bother me since I've stuck with MySQL, but some people might be interested. Here's how to get it working (this is for Ubuntu, but I don't think it'd be that different on other systems so long as you know how to edit files and install gems).
jruby -S gem install activerecord-jdbcsqlite3-adapter
This installs the JDBC Sqlite adapter, and should include any dependencies. If it doesn't, you need:
jruby -S gem install jdbc-sqlite3
After that, just edit config/database.yml to use the jdbcsqlite3 adapter instead of the normal one:
development:
  adapter: jdbcsqlite3
  database: db/development.sqlite3
  timeout: 5000
Presto! You're done.

Note that I'm assuming you're using Rails here, if not then you don't need the activerecord gem.

Sep 25, 2008

Rails Fixtures Order

So I've been having some trouble with Rails and fixtures. What I have in my fixtures are two tables, we'll call them t1 and t2. There is also a join table between these two tables, which has some info about the relationship between rows in the two tables.

Now suppose I have n1 fixtures for t1, and n2 fixtures for t2. That means there are O(n1n2) fixtures in the join table, in my case I have an entry for every pair. It would be a huge pain in the ass to enter all that data into the fixture manually. So what I do is just
t1 = Table1.find(:all)
t2 = Table2.find(:all)
t1.each do |r1|
  t2.each do |r2|
    # output fixture YAML
  end
end
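Fleshed out a bit, the generation step might look like this (table and column names here are made up for illustration, with plain ids standing in for the ActiveRecord rows):

```ruby
require 'yaml'

# Emit one join-table fixture entry per (table1, table2) pair,
# keyed by a generated fixture name.
def join_fixtures(table1_ids, table2_ids)
  fixtures = {}
  table1_ids.each do |a|
    table2_ids.each do |b|
      fixtures["join_#{a}_#{b}"] = { "table1_id" => a, "table2_id" => b }
    end
  end
  fixtures.to_yaml
end

puts join_fixtures([1, 2], [10, 20])   # n1 * n2 = 4 fixture entries
```

Redirect that output into test/fixtures/join_table.yml and you never have to type the pairs by hand.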
There is one problem with this. If the fixtures for Table1 and/or Table2 are not run before the fixture for JoinTable, then you're going to run into problems.

There are two things to do. The first one is in any controller test class, when you put your fixtures thing at the top, you put it like this:
fixtures :table1, :table2
fixtures :join_table
This seems fairly intuitive, but my first intuition was to put it like this:
fixtures :table1, :table2, :join_table
Then I had an epic brain fart trying to figure out why it wasn't working.

The second thing you need to do only needs to be done if you use the db:fixtures:load rake task. What this does is it loads your fixtures into your development database, which is very handy for coding. When you're in the development phase of your app, you don't need to create new migrations for your DB, just edit the old migration, and run db:migrate:reset. Makes things cleaner and easier to follow IMO.

However, this loads things in alphabetical order. You could name all your tables to be in the order that they should be loaded, but this is slightly annoying. The solution is to tweak your environment.rb file. Just add (EDIT: the other one didn't always work for me, I changed this so it does work):
ENV["FIXTURES"] ||= "table1,table2,join_table"
to config/environment.rb, and you will get the correct loading order.

EDIT: Always remember that when you add a new model, you'll need to manually add it to this list or your fixtures for that model will not be loaded. Learned this one the hard way, wondering why the fixtures were being loaded properly for tests, but not for the dev database.

EDIT (again): This doesn't always work, but it seems to work more often than if you didn't put this. The best option would be to load in fixtures, load whatever time-dependent stuff you have manually, and write a small script to export the DB into the YAML fixtures. That's what I ended up having to do finally, and it works like a charm.
Another thing if you don't want to do this is to create another script to do it for you. So instead of doing the normal rake task, you can have a script like this:
`rake db:fixtures:load`

# run whatever tasks you need ...
Run this script with script/runner so that it has access to your Rails models and what-not, and you'll be able to generate data automatically. You'll also have to load your file in from test/test_helper.rb during the setup() method so that your fixtures get loaded properly into tests.

Sep 24, 2008

WEBrick and Authentication

Picture this scenario: You are working on a Rails project. Your team (not just dev people, but any others like marketers, etc.) is distributed, so you're not all in the same office - and hence can't have any sort of internal network. You have a server somewhere for centralizing things via SVN, and you put other tools on it like Trac. That kind of thing is pretty easy, and with Apache you can just throw up some AuthType Basic stuff to keep unwelcomes out.

However, I want to make it so that the development version of the web app is viewable to non-dev people. Now for dev people, it's a requirement that they can get the code onto their machine and use it without relying on the central server to do work. So they have to be able to get MySQL up and running, install Ruby (or in the case of my project, JRuby), and anything else. But for the non-techies, how do they get everything up and running? They're probably running Windows too (it's funny, the entire dev team that I'm working with runs Mac, except me, who runs Ubuntu), which means that installing MySQL and all that will be a pain in the ass.

The first thought is maybe use Glassfish or something to deploy the semi-finished app, and then put some password lock. But that sounds like a lot of work. You need to WAR that shit up, and re-deploy it every time you do an update. Not cool. Why not just use WEBrick, which comes with every Rails project, and is as simple as going 'jruby script/server'?

The problem is when you want to password protect everything. Ideally, we don't want to have to make code changes. We want it so that on our local machines, we don't have to enter a password to see the site.

The first solution was to use Apache for authentication, then proxy over to WEBrick, whose port (3000) is not open to the outside world. This would work in theory, except that mod_proxy gets invoked before any authentication can happen. So even with the auth statements in there, it still just proxies over to WEBrick without asking for anything. Not cool.

Next solution: authenticate, then rewrite. Put in some authentication stuff, then mod_rewrite everything to localhost:3000. Authentication worked, rewrite didn't. I have no idea why. I would put in [P], but that would give a 404. Using anything else would result in a direct rewrite, and would redirect you to your own localhost:3000, which obviously would not give anything unless you had WEBrick running on your local machine (good thing it wasn't, or I would've been mightily confused until I looked at the address bar).

So my final solution was to modify the code. This in itself was a pain in the ass. There are many different ways to do HTTP authentication with Rails. Rails has HTTP authentication baked in, but not against our htpasswd file. This meant that everybody had to have another username and password that was stored with the application just to access this little thing. As a coder, I find this level of duplication revolting, so I attempted to write a little bit of code to check our htpasswd file to see if the right password was entered. On Linux, by default, htpasswd uses the system's crypt() function to encrypt things, which in Ruby translates to String#crypt. It unfortunately takes a salt to encrypt things (well, fortunately for security reasons, unfortunately for me since I didn't know the salt). I couldn't figure out the salt, so that ended up being wasted effort.
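For what it's worth, with the traditional crypt(3) scheme the salt isn't secret at all: it's stored as the first two characters of the hash itself. So a check could have looked something like this (crypt_match? is my name for it, and this assumes old-style DES crypt entries, not the MD5/SHA variants):

```ruby
# With traditional crypt(3) entries, the first two characters of the
# stored hash are the salt, so verification re-crypts with that prefix.
def crypt_match?(stored_hash, password)
  password.crypt(stored_hash[0, 2]) == stored_hash
end

stored = "secret".crypt("ab")      # what htpasswd would have stored
crypt_match?(stored, "secret")     # => true
crypt_match?(stored, "wrong")      # => false
```

Note that String#crypt is deprecated in newer Rubies, so this is more of a historical curiosity than a recommendation.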

Then I found this beautiful thing. It is a plugin for Rails that lets you use an htpasswd file for HTTP authentication. It probably does more than that, but this is exactly what I wanted - well almost, I didn't want to make any code changes, but c'est la vie. It was one line of code:
htpasswd :file => '/path/to/passwords'
Put that in app/controllers/application.rb, and you've got your password locking. Now I can make WEBrick accessible to the world, and only the people with a username/password can see anything. Awesome.

I learned a lot during this adventure, about Apache and Rails Authentication and (rant alert!) how frigging useless #rubyonrails is when you have anything slightly advanced to do. I've spent a fair bit of time in there, and can answer the majority of questions people ask, because for the most part they are asking the questions because they're too lazy to read a good Rails book or google for an answer (which is what I do sometimes when I don't immediately know the answer). Every time I have asked something in there it has been something relatively advanced, and the response is either "figure it out for yourself", or silence. The first is a reasonable enough answer, given the standard questions that get asked in there, but not entirely helpful...what do they think I've spent the last hour or so trying to do? Silence is ok too, since if you don't know the answer then you're not expected to say anything. But still, both results are pretty useless.

What is still on the table: How to get WEBrick to run as a daemon with JRuby. The JRuby implementation has disabled the use of fork(), so using the -d flag for WEBrick is not an option. I'll have to write a daemon script or something.

Mar 14, 2008

Maybe Rails isn't so slow

After seeing sites like Twitter and Penny Arcade that run on Rails, maybe it isn't so slow after all. I know about the basic things like caching and FastCGI to speed things up, but after those I found that my sites still went a bit slow - I may attribute it to the fact that I'm still running on a dev server, but that's beside the point.

However in the long run, it is not the language that you write the software in that will slow it down, it is the design of your site. I've seen sites written in Java/JSP - supposedly super fast - that run slow as shit. There are other sites like Facebook or Youtube that run on PHP or Python, and are super fast. It doesn't take a Ph.D. in Computer Science to know that Java is faster than PHP or Python, and so obviously the latter sites are doing something else right that the first site(s) are doing wrong.

From my experience, the speed bottlenecks tend to come more from the database. Making too many database queries (especially writes) will really slow down your site. The problem with Rails here is that it is so easy to make lots of database calls. On top of that, the Rails community (as far as I've seen) tends to see SQL as a filthy beast that only hackers and PHP developers like to touch. This means that you will be seeing things like User.find(:all, ...) everywhere in your code, which then use the nice little belongs_to constructs to use more queries to do everything that could have been accomplished by a simple join (I could write a whole blog entry on how annoying the built-in :joins parameter is in Rails, but we won't go there).

So, Rails developers, use SQL for complicated things including belongs_to relationships. It was designed for a purpose, by very smart people. On top of that, it is probably a better query optimizer than you are. If you need to do a complex query for sophisticated relationships, you need to either a) rework your DB structure to a simpler system, or b) use SQL for what it is good at. SQL is considered dirty because it is verbose for simple things: UPDATE users SET all_my_stupid_fields WHERE stupid_criteria, so that is where you use the Rails built-in stuff.

Next, memcached! Oh my god. I cannot express how awesome this software is. I've found a nice tutorial on how to use it with Rails elegantly. It is basically a hashtable in memory that you can use to store things. Like ActiveRecord objects. That means that if you want to get the user with ID 342, you can get it from RAM instead of the database. This is a lot faster than accessing the hard drive, and since it's a hashtable and not a B-tree or whatever you're using in the DB, it is O(1) to access what you want.
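The read-through pattern at the heart of this is tiny. Here's a sketch in plain Ruby, with an in-memory hash standing in for a real memcached client (MiniCache is made up for illustration; the real client's get/set calls follow the same shape):

```ruby
# Check the cache first, fall back to the expensive lookup on a miss,
# and remember the result for next time.
class MiniCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

cache = MiniCache.new
db_hits = 0
2.times { cache.fetch("user:342") { db_hits += 1; "row for user 342" } }
db_hits   # => 1, the second lookup never touched the "database"
```

The win is that every fetch after the first is a RAM lookup instead of a database query.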

EDIT: It's been pointed out here (thanks Guillaume) that hashtables are not always faster than B-trees. I agree with many of the points made, and from what I gather the efficiency of hashtables is highly implementation dependent. I don't claim to know how memcached implements their hashtables, but considering the app is open-source I'd assume they have (or soon will have) a good implementation. This article also notes that for a small n, B-trees perform better than hashtables. This is true, but if a site is at the point where it needs to use memcached, the n is not going to be very small. Finally I will point out that sometimes you will not be able to use only an indexed field in your query, so your time complexity for DB lookup increases to O(n).

Given all these little speed up things, mixed with a good server farm and smart developers who don't abuse Rails' nice features, you could have a super-high performance site with Rails up in no time. Now if only I could convince the people at work to switch to it...