Mar 14, 2008

Maybe Rails isn't so slow

After seeing sites like Twitter and Penny Arcade that run on Rails, maybe it isn't so slow after all. I know about the basic things like caching and FastCGI to speed things up, but even with those in place my sites still ran a bit slow. I could attribute that to the fact that I'm still running on a dev server, but that's beside the point.

However, in the long run, it is not the language you write the software in that will slow it down; it is the design of your site. I've seen sites written in Java/JSP - supposedly super fast - that run slow as shit. There are other sites like Facebook or YouTube that run on PHP or Python and are super fast. It doesn't take a Ph.D. in Computer Science to know that Java is faster than PHP or Python, so obviously the latter sites are doing something right that the former are doing wrong.

In my experience, the speed bottlenecks tend to come from the database. Making too many database queries (especially writes) will really slow down your site. The problem with Rails here is that it makes it so easy to issue lots of database calls. On top of that, the Rails community (as far as I've seen) tends to see SQL as a filthy beast that only hackers and PHP developers like to touch. This means that you will be seeing things like User.find(:all, ...) everywhere in your code, followed by the nice little belongs_to constructs, which fire off one more query per record to do what a simple join could have accomplished in a single query (I could write a whole blog entry on how annoying the built-in :joins parameter is in Rails, but we won't go there).
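To make the query-count difference concrete, here is a minimal sketch in plain Ruby. It does not use Rails at all: FakeDB is a made-up stand-in that only counts how many "queries" it receives, and the users and groups data is invented for illustration. The first approach mimics fetching every user and then following a belongs_to-style lookup per row; the second mimics a single joined query.

```ruby
# Toy stand-in for a database that counts every "query" issued against it.
# FakeDB and its methods are hypothetical; they are not part of Rails.
class FakeDB
  attr_reader :query_count

  def initialize(users, groups)
    @users = users
    @groups = groups
    @query_count = 0
  end

  # Roughly like User.find(:all): one query returning every user row.
  def all_users
    @query_count += 1
    @users.map(&:dup)
  end

  # Roughly like following a belongs_to association: one query per call.
  def find_group(id)
    @query_count += 1
    @groups.find { |g| g[:id] == id }
  end

  # Roughly like one SELECT ... JOIN: a single query returning combined rows.
  def users_with_groups
    @query_count += 1
    groups_by_id = @groups.each_with_object({}) { |g, h| h[g[:id]] = g }
    @users.map { |u| u.merge(group_name: groups_by_id[u[:group_id]][:name]) }
  end
end

groups = [{ id: 1, name: "admins" }, { id: 2, name: "members" }]
users = (1..50).map { |i| { id: i, name: "user#{i}", group_id: (i % 2) + 1 } }

# Approach 1: fetch all users, then look up each user's group separately.
lazy_db = FakeDB.new(users, groups)
lazy_names = lazy_db.all_users.map { |u| lazy_db.find_group(u[:group_id])[:name] }

# Approach 2: ask for users and groups together in one joined query.
join_db = FakeDB.new(users, groups)
join_names = join_db.users_with_groups.map { |r| r[:group_name] }

puts "per-row lookups: #{lazy_db.query_count} queries"
puts "single join:     #{join_db.query_count} query"
```

Both approaches produce the same list of group names; only the number of round trips differs, and that number grows with the row count in the first approach.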

So, Rails developers, use SQL for complicated things, including belongs_to relationships. It was designed for a purpose, by very smart people. On top of that, the database's query optimizer is probably better at planning a query than you are. If you need to do a complex query for sophisticated relationships, you need to either a) rework your DB structure into a simpler system, or b) use SQL for what it is good at. SQL is considered dirty because it is verbose for simple things: UPDATE users SET all_my_stupid_fields WHERE stupid_criteria. That is where you use the Rails built-in stuff.

Next, memcached! Oh my god. I cannot express how awesome this software is. I've found a nice tutorial on how to use it with Rails elegantly. It is basically a hashtable in memory that you can use to store things, like ActiveRecord objects. That means that if you want to get the user with ID 342, you can get it from RAM instead of the database. This is a lot faster than accessing the hard drive, and since it's a hashtable and not a B-tree or whatever you're using in the DB, it is O(1) on average to access what you want.
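The lookup pattern described here can be sketched in plain Ruby. This is only an illustration under loud assumptions: ToyCache is an in-process Hash standing in for a memcached client (a real client talks to a memcached server over the network), and SlowUserStore is a made-up stand-in for the database that counts how often it is hit. Neither comes from the tutorial mentioned above.

```ruby
# In-process Hash standing in for a memcached client (assumption: a real
# client would store values on a separate memcached server instead).
class ToyCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Hypothetical slow data source that counts how often it is queried.
class SlowUserStore
  attr_reader :hits

  def initialize
    @hits = 0
  end

  def find(id)
    @hits += 1
    { id: id, name: "user#{id}" }
  end
end

# Try the cache first; on a miss, fall back to the store and remember
# the result so the next lookup for the same ID skips the store.
def cached_user(cache, store, id)
  key = "user:#{id}"
  user = cache.get(key)
  return user if user

  user = store.find(id)
  cache.set(key, user)
  user
end

cache = ToyCache.new
store = SlowUserStore.new
first = cached_user(cache, store, 342)   # miss: goes to the store
second = cached_user(cache, store, 342)  # hit: served from the cache
puts "store hits after two lookups: #{store.hits}"
```

The catch with this pattern is staleness: whenever a user record changes, the cached copy under the same key has to be overwritten or deleted, otherwise readers keep getting the old version.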

EDIT: It's been pointed out here (thanks Guillaume) that hashtables are not always faster than B-trees. I agree with many of the points made, and from what I gather the efficiency of hashtables is highly implementation dependent. I don't claim to know how memcached implements its hashtables, but considering the app is open-source I'd assume it has (or soon will have) a good implementation. This article also notes that for a small n, B-trees perform better than hashtables. This is true, but if a site is at the point where it needs to use memcached, the n is not going to be very small. Finally, I will point out that sometimes your query will not be able to rely on an indexed field alone, in which case the time complexity of the DB lookup increases to O(n).
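A rough way to see the three lookup costs side by side is to count the steps each one takes. The sketch below is plain Ruby over invented data, and it says nothing about how memcached or any particular database is implemented: a Ruby Hash plays the hashtable, a binary search over a sorted array is a crude stand-in for a B-tree index, and a linear scan plays the query that cannot use an index.

```ruby
# Invented table of 1000 rows, sorted by id.
rows = (1..1000).map { |i| { id: i, email: "user#{i}@example.com" } }

# Hash index on id: a single bucket lookup on average, whatever the size.
index_by_id = rows.each_with_object({}) { |r, h| h[r[:id]] = r }

# Binary search over the sorted rows, a crude stand-in for a B-tree
# index: the number of steps grows like log2(n).
def binary_search_steps(rows, id)
  lo = 0
  hi = rows.length - 1
  steps = 0
  while lo <= hi
    steps += 1
    mid = (lo + hi) / 2
    return [rows[mid], steps] if rows[mid][:id] == id

    if rows[mid][:id] < id
      lo = mid + 1
    else
      hi = mid - 1
    end
  end
  [nil, steps]
end

# No usable index: examine rows one at a time until a match turns up,
# so the number of steps grows like n.
def scan_steps(rows, email)
  steps = 0
  rows.each do |r|
    steps += 1
    return [r, steps] if r[:email] == email
  end
  [nil, steps]
end

by_hash = index_by_id[1000]
by_tree, tree_steps = binary_search_steps(rows, 1000)
by_scan, scan_count = scan_steps(rows, "user1000@example.com")

puts "hash lookup:   1 bucket lookup"
puts "binary search: #{tree_steps} steps"
puts "linear scan:   #{scan_count} steps"
```

Step counts are not wall-clock time, which is part of Guillaume's point: constant factors such as hashing cost and memory layout decide which structure wins at a given n.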

Given all these little speed-ups, mixed with a good server farm and smart developers who don't abuse Rails' nice features, you could have a super-high-performance Rails site up in no time. Now if only I could convince the people at work to switch to it...


Guillaume Theoret said...

You might want to re-think your assumption about hash tables vs b-trees though:

Robert Fischer said...

The trade-off with Rails is that you're going to be developing really fast, but you end up having to throw a lot more horsepower at it (particularly memory and supporting applications) in order to get it to run quickly. It is certainly very possible to get an application to run quickly, but the fact of the matter is that raw Ruby on Rails deployments are resource hogs.

JRuby deployments as WARs and page caching with a smart reverse proxy are pretty convincingly the best answers for speed improvements at this point, which you haven't really gotten into.

IllegalCharacter said...

Truth is, I've never actually worked with JRuby. I think it's something I should probably look into soon.

One thing that I didn't get into much is the economics of Rails. The trade-off, as you said, is ultimately fast development time vs. horsepower. With falling computer prices and high programmer wages, the cost of development time relative to the cost of hardware is pretty high, making Rails a much more viable option.

Robert Fischer said...

Google around for JRuby performance stats -- particularly JRuby on Rails performance stats -- and it's pretty impressive.

You might want to start your learning on Headius's blog:

He presented at Ruby.MN a while back, and it was incredible.

Robert Fischer said...

Here's a post recommended by Headius:

IllegalCharacter said...

Hmm...from the benchmarks I've been seeing it looks like JRuby is actually slower than both Ruby 1.8 and Ruby 1.9.

These are a couple I looked at:

I have to say that these examples do not really do justice to JRuby's multi-processor support. I'm not sure how YARV handles multi-processing, although it claims to have better thread handling than MRI.

Given the direction of CPU hardware, we might all have to switch over to Erlang on Rails if we want to get a good speed boost ;)

Robert Fischer said...

Compare a compiled JRuby vs. a scripted Ruby. Scripted JRuby vs. scripted Ruby loses.

There's also some slowness in the Regex stuff, even in the compiled version, but that's targeted for improvement already.

IllegalCharacter said...

Hmm, I'll have to check it out on my own sometime. It's somewhat intuitive that the compiled JRuby would be faster, but as I've found, it's not usually the language that is the bottleneck. I've worked on some pretty high-traffic sites done in PHP, and the language is rarely the cause of the slowdown.

Robert Fischer said...

There's certainly a lot to that -- you need to have a pretty popular site before the language is the problem.