Jun 27, 2008

Some Neat Things in Ruby

One thing I love about Ruby is how I'm always discovering more language features. I never had a "formal" tutorial on Ruby, so many of these things may be well known to you.

Some nice little things I've recently discovered:
  • Backticks! You can use backticks to execute something on the command-line and put the result into a variable. Apparently this is from Perl. Check it out:
    output = `ls $HOME | grep someregex`
  • The % operator for strings. I learned about this when trying some stuff in Python a while back, and thought "wow, it would be cool if Ruby had this". It turns out it does. Go like this (the parentheses matter, because % binds tighter than +):
    sql = ("SELECT * FROM some_table WHERE id = %d " +
           "AND some_other_column = '%s'") %
          [params[:id].to_i, some_string]
    Of course this is a trivial example, but I'm sure you can think of some nastier SQL queries where this would come in handy. If you're like me and don't like the #{ .. Ruby code .. } stuff then the % operator is your friend.
  • Regex variables - I learned this one a while back after reading some other programmer's Ruby code. If you have a regex with capture groups (the bits in parentheses), then after a match you can use the $1, $2 variables to access what they captured:
    puts $1 if html_code =~ /<a href="(.*?)">/i
    This code will check some HTML code for links (I used a simpler regex for links to illustrate the example, this is insufficient for scanning real HTML) and output the URL of the first link it finds. Very handy.
  • Named parameters. I saw Rails do it, but never really understood how it worked. Now I do:
    def foo(p1, params = {})
      # ... some code ...
    end
    foo("asdf", :random_thing => 5, :test => "hello")
    When you call foo(), the first argument you pass gets put into p1, and the trailing key => value pairs get collected into params, which is a hash of all the extra stuff you send. (If you declare it as *params instead, you get an array with that hash inside it.) This is great, and I love it. Wish PHP had it, I hate having to go array(...) all the time for named parameters.
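
Here is a small sketch pulling the four tricks above together. The echo command, the sample values and the greet method are all made up for illustration, so treat it as a rough demo rather than anything definitive:

```ruby
# 1. Backticks run a shell command and return its standard output as a String.
#    (echo is just a harmless stand-in command.)
output = `echo hello`.chomp

# 2. String#% fills in a format string. The parentheses matter:
#    % binds tighter than +, so without them only the second
#    piece of the string would be formatted.
sql = ("SELECT * FROM some_table WHERE id = %d " +
       "AND some_other_column = '%s'") % [42, "abc"]

# 3. After a successful match, $1 holds the first capture group.
html_code = '<p><a href="http://example.com/">a link</a></p>'
url = $1 if html_code =~ /<a href="(.*?)">/i

# 4. Trailing key => value pairs are gathered into a single Hash argument.
def greet(name, options = {})
  punctuation = options[:punctuation] || "."
  "Hello, #{name}#{punctuation}"
end
greeting = greet("world", :punctuation => "!")

puts output, sql, url, greeting
```

The options = {} default means greet("world") still works when no extra options are passed.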

Some other stuff that I'm still wondering about:
  • What the heck do $< and $> do? I'm pretty sure they're some kind of I/O things, but I'm not sure what. Unfortunately googling for "ruby $<" doesn't find anything about it since Google seems to ignore $<.
  • What does & do when used as a unary operator? I'm still stuck in my C mindset where seeing & in front of something means "address of". From reading Raganwald, I think it calls the to_proc function of something, but I'm not sure. EDIT: Reg Braithwaite of Raganwald has answered this question for us.
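
For anyone else wondering the same things, here is a minimal sketch of the answers as far as I can tell. The sample words and the shout lambda are made up for illustration:

```ruby
# Unary & in an argument list turns its operand into a block by calling to_proc.
# Symbol#to_proc builds a proc that sends that method to its argument.
words = ["ruby", "perl"]
with_symbol = words.map(&:upcase)
with_block  = words.map { |w| w.upcase }   # the long-hand equivalent

# & works with anything that responds to to_proc, including a lambda.
shout = lambda { |w| w + "!" }
shouted = words.map(&shout)

# $> is the default output stream (the same object as $stdout), and
# $< is ARGF: the files named on the command line, or standard input.
$> << "written via $>\n"
same_output = $>.equal?($stdout)
same_input  = $<.equal?(ARGF)
```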

If anyone has some answers to these things I don't know, or has anything neat they've discovered, feel free to comment.

Jun 26, 2008

Intuitive vs. Efficient

Many people, myself included, have complained about the lack of intuitive user interfaces in the Linux world when compared to the proprietary world. Here is my hypothesis on why this is the case.

First, what is intuitive? At reference.com, intuitive is defined as "direct perception of truth, fact, etc., independent of any reasoning process; immediate apprehension." Basically this means that if you fire up a new program, you should know how to use it right away.

What is efficient? From reference.com again, we see "performing or functioning in the best possible manner with the least waste of time and effort; having and using requisite knowledge, skill, and industry; competent; capable." For software, this means that we can do what we want with the least amount of time/effort.

Are these two things mutually exclusive? Not necessarily. A low-complexity interface can be both intuitive and efficient. Look at Super Mario. D-pad to move, A to jump. Doesn't take long to figure this one out.
For more complex tasks, however, it becomes more and more difficult to come up with a good interface. Generally, increasing both efficiency and intuitiveness costs more development time. So for some fixed amount of development time, there is a trade-off between moving toward efficient and moving toward intuitive.

Proprietary software is usually made to provide revenue, usually by units sold or by advertising. Both of these are dependent upon the number of users using the software. So by making the software intuitive, you generally increase your target market. The time cost in developing software is better put to making the software intuitive, because it will increase overall revenue for the same amount of cost.

Compare this to open-source software. There is no revenue. Therefore, there is no real incentive to pick intuitive vs. efficient, since the developer really doesn't care how many people use the software. However, if the software being developed is to be used by the developer, there will be a time-cost in the future created from using an inefficient interface. Therefore, if a developer is developing an open-source program, they will focus more on making it efficient than intuitive. Many open-source programs follow this trend. They have a bit of a learning curve, but once you know what you're doing, the software meshes really well with the user in terms of productivity. An added benefit to being open-source is if there is a particular aspect that is slowing the user down, the user can modify the code itself to provide a productivity improvement.

Let's look back at the intuitive interface for a bit. Is there such a thing? Mario was pretty intuitive, but it does take some button pressing to figure out what does what. The B button doesn't do anything if you're standing still and don't have a fire flower, so that's not that intuitive. The only time you'll figure out that the B button does something is if you're moving while you push it, or when you get a fire flower.
Take a look at a word processor - we'll use Wordpad/Gedit as an example. It's pretty easy to write things, just type. By default, the cursor is in the editable area, so your text appears there. Then later on, you can move the cursor around with the arrow keys and backspace to get rid of text. You can even use the mouse to select text.
This interface, as it turns out, is horribly inefficient. What if you have stuff you want to move? That's easy if you only have a few words to move: just delete them and retype them. But what if you want to switch Chapters 3 and 17, which are each 10-20 pages long? Here's where copy/cut-paste comes in. However, if you are just diving into a word processor for the first time, how do you know about copy-paste? Usually somebody has to tell you or show you.
You could muddle around with it yourself, but here's what might happen (note that when I say "click", I mean "go to the Edit menu, go down to the option and click it"). You click Cut. Nothing happens. You click Copy. Nothing happens either. You click Paste. Again, nothing happens. Amazing.
Now let's try it with something selected, on the rare chance that someone who doesn't know what they're doing has text selected when they try the buttons. Cut removes the text. I remember doing this when I was like 8, thinking that Cut was a button to delete things, and never wanting to touch it again. Fortunately my wise father then told me how Cut worked, and so I grew up with 1337 cutting skillz. Next, you try Copy. Again, nothing happens. Copy doesn't do anything visible. Not very intuitive, eh? Now here's where you figure things out. You click Paste, and it pastes the copied text. So you go "aha!" (hopefully) and realize how copy-paste works. You still don't know where the cut text went, but hopefully you can figure it out eventually, based on your new knowledge of Copy.
However, the kicker here is the order in which you click the actions. If you click Cut or Copy before you click Paste, then you're in business. Other way around? Not so much. Fortunately many editors list them in this order: Cut, Copy, Paste. So there is not really any problem. Some of them even grey out the Paste button until there is something on the clipboard.

Now the next step is for them to figure out Ctrl+X/C/V. Ctrl+X and Ctrl+V aren't really that intuitive, but they are efficient since they are all in the same area and can be used with the left hand while the right hand is on the mouse. Some intuitiveness has been given up for the sake of efficiency.

The main point of this example is to say that even something that most of us consider basic (like copy-pasting) is not so basic to someone who has never done it before, and most of the time people need to be shown how to do it. Yet we don't go for something more "intuitive". Why not? Because people are capable of learning. People know that computers generally need some getting used to before you can get good at them. Same goes for pretty much everything else in the world. Want to play guitar? You have to sit down and learn it. Want to ride a bike? Gotta get on there and learn it. Want to drive? Get in there and learn it. Why should software be any different?

A second point I would like to make is that people confuse the two words "unintuitive" and "different". Take vi, for example. It predates the Ctrl+X/C/V standard. It even predates (to the best of my knowledge) people calling it copying and pasting. In vi, cutting is called deleting, copying is called yanking, and pasting is called putting. What are the keyboard shortcuts? d for delete, y for yank, and p for put. Comparing this to Ctrl+X/C/V, I would say that vi's shortcuts are more intuitive. However, since it is different, it is labelled as unintuitive by most people.
Now I'm not saying vi is an intuitive piece of software. It has some other problems. There are two modes (I'd say there are even three modes, command mode has regular commands and : commands). People accustomed to other text editors don't like this. However, just like copy-pasting, it is something that can be learned fairly quickly when taught. So don't confuse unintuitive with different.

So next time you are writing a piece of software, think about whether you want to put more effort into making it intuitive or making it efficient. Many people nowadays seem to point to the intuitive side, but by moving to that extreme you may be crippling the power users (imagine playing Starcraft without hotkeys).
Also, next time you criticize a user interface for being unintuitive (I am very guilty of this, and I apologize), think to yourself if it is efficient or not. The developers may have cared more about their productivity than whether or not you want to use their software.

Jun 19, 2008

On Interviews

My company is currently interviewing for programmers to join my little group. However, what they say they are looking for makes me wonder about how we all think about the interview process. It seems like, judging from my experience and what I've read on blogs, it is always the company that makes the decisions or gets to have the last say about what happens. Of course, the company does incur the cost of paying the worker, but the worker also incurs the opportunity cost of getting better pay/benefits/experience/enjoyment at a different job. I believe that the interview is not necessarily the interviewer-interviewee relationship that everybody seems to see it as, but rather a discussion between two parties to see if they would like to work together for a time.

Remember, I'm talking about software companies here; I have less experience looking at other fields.

Companies it seems want to hire the best and brightest. They also want experience. Finally, they want to pay the person as little as the person is willing to accept. But, the people at the company seem to think that the best and brightest will actually want to work for them. For example, when I graduated, I wanted to make video games and so I applied at Ubisoft and EA here in Montreal. They told me that they wanted at least 3-5 years experience making games. But now, after only one year of professional experience, I would probably decline any offer from Ubisoft or EA. Why? Because it would suck to work there. I wouldn't want to work for 70+ hours per week, I want to have a life. I don't want to be some mindless drone churning out software for a huge company, where my work will only result in a slightly higher paycheque, and the hard work that I do is taken and owned by the company. I'd rather have some sort of stake in the earnings that my work would produce, or be able to give some of my hard work to the community in the form of open-source, so that I can have that wonderful feeling that I've bettered the world for nothing in return (except that wonderful feeling). Compare EA to Fog Creek Software: you get stock plans, retirement plans, 4 weeks vacation (nice!), flextime, etc. Way better than most jobs I've seen. The work doesn't look incredibly thrilling, but most places are like that.

There are varying degrees of programmer skill out there, as many of us are aware. Same goes for experience, although I'd argue that you don't need much experience to be better than those with lots of experience - unfortunately HR people don't seem to understand that concept. So lots of companies want to hire the top notch guys/girls to work for them. But you have to think to yourself, "if I had the requirements that this company asked for, would I want to work for them?" When I ask this question about Ubisoft/EA, I say "fuck that!" If I had 3-5 years of experience in game programming I'd go somewhere that I get treated like a person and not a machine.
So just like there are varying degrees of programmer competence, there are varying degrees of job sucking-ness. The good people don't want to work at a job that sucks. Or even a job that is mediocre. And believe it or not, your company may not be the amazing place to work that you think it is. So when you're doing your interviewing, maybe lower the bar a little. The hotshots that you interview are probably going to go somewhere else, unless you're one of the rare companies that would actually be a pleasure to work for.

For programmers applying for jobs, remember that your skills are valuable. Without them, the company would probably not survive - especially if it is a software company. You should evaluate the company you're looking at and ask yourself: is it a place that I would enjoy working at? Ask questions, because the interview is as much you seeing if you like the company as it is them seeing if they like you.
Of course, when you're fresh out of university with loads of debt, you'll probably want to get the first job that comes your way. That's fine, but most of the time you won't get lucky and grab a good job, so you should keep an eye out for things that are a bit more interesting. Use the first job to support yourself, reduce your debt (if you have it) and gain a bit of experience, but don't feel like you owe them anything other than your 40 hours. Unless of course, it benefits you to give them more.

EDIT: http://www.igda.org/qol/whitepaper.php is an interesting study on the game development world.

Jun 18, 2008

Wheee

I picked up a Wii yesterday (or more specifically, my girlfriend picked up a Wii and I owe her half of the cost). Figured I'd give it a shot; it looks pretty cool. Ours came with a Nunchuk controller and Wii Sports, and we picked up a second Wii-mote so that I could kick people's asses at tennis. Also grabbed one of those recharger packs; hopefully it will be better than getting batteries all the time.

It is pretty cool. It is a great example of innovation. How many games have you played that made you sweat? Other than on your hands where you hold the controller, and with the exception of DDR, not many. Doing the boxing, and to a lesser extent tennis, you really start to work up a sweat. It's cool.

So I've come up with a list as to why the Wii is so much more awesome than the other consoles:
  • Cost - This is a big one. Now in the long run the overall cost if you buy everything yourself is not that much cheaper. The Wii itself runs at $100-$200 less than a PS3 or the 360, depending on which model of PS3 or 360 you get. The games are about the same price, same with the add-ons. So if you plan on just buying the thing yourself, then you're probably not going to save that much in the long run.
    This is where target audience comes in. Your non-gamer roommate (or in my case, girlfriend) is much more likely to pitch in to buy a Wii than the other consoles. That could potentially halve the price of the console, or more. Furthermore, since there are more Wii owners than owners of the other consoles, it's more likely that if you want to borrow a game from a friend you're going to be able to borrow it. So you don't have to shell out the cash to buy it. A lot of the people at my office have been doing this.

  • Exercise - With Wii Sports (I haven't played any other games yet) I got a good workout. My arm was sore after playing too much baseball, and all the ducking and punching in boxing got me right tired. Much better than sitting on your butt mashing buttons and wiggling a joystick.
Then there are the problems I can think of:
  • Lack of games - most of the "cool" games aren't for Wii. They're on PS3 or 360. You know what? I don't really care. Most of the "cool" games are also on PC, and obviously I already have one of those. Plus I can do all sorts of other things on my PC, like program things or look at porn.

  • Pain - For once, there is more at risk to your body than getting carpal tunnel syndrome or blisters from joystick-rotating. You can actually pull muscles waving things around. Or you can whack a friend in the head (accidentally this time) while you're trying to hit their serve back. Or you can fall over. Etc. Watch out!
I'm really impressed with the Wii so far. Definitely thinking I went with the right console.

Jun 9, 2008

The Greatest Invention in Computer Science

I was reading Jeff Atwood's article titled "The Greatest Invention in Computer Science" (coincidentally the same title as this blog entry), where he states that the routine is the greatest invention in computer science.

It makes me wonder one thing: "Does Jeff Atwood really know that much about computer science?"

Routines are an excellent addition to software development, but contrary to the opinion of many, computer science is not just about software development.

Personally I'd say there are several greatest inventions in computer science. Routines are good, but pale in comparison to the concept of a high-level programming language. Hard to imagine software development without those. Networking was another one (although this is a bit of a cooperation between electrical engineering, computer engineering and computer science). Software development has been done without it, but it has definitely made a big difference, computers being able to talk to one another without the use of removable media. And the list goes on, from the various sub-domains of computer science that I have missed here. One of my favourites is the concept of reduction - converting a problem to another problem that we've already solved. (This reminds me of a quick joke: A computer scientist sees a house burning down. Solution? Call fire department. Later, that computer scientist sees a house that is not burning down. Solution? Light the house on fire, thus reducing it to a previous problem.)

Perhaps it's my youth, I haven't been around long enough to use a language that doesn't have routines. Oh wait, that's not true. I did assembly language in university. We had to write recursive algorithms for tree processing in assembly. Not too hard, but you have to remember all this stuff about pushing your registers and return address onto the stack to save their state when you recursively call the function, then remember all this stuff about what offset of the current stack pointer your value is located at, etc. Yes, routines are nice, except when you have to do them in assembly language - this is another reason why we don't use assembly language.

Computer science extends far beyond the realm of writing software, but many people seem to lose sight of that fact. They think, "oh, you took computer science in school, you must be able to fix my computer." I tell them that we didn't learn that kind of thing in school. Nor did we learn about how to use Excel. Or what all those business technical acronyms mean (although if one business major could tell me what SAT stands for, I'd be impressed). We learn about how the stuff we use in software development and other computer-related fields works today, so that we may improve upon those methods to make better stuff 10 years from now.

So, routines are nice, but to claim they are the greatest invention in computer science? Please.

Jun 5, 2008

We're All Bad Coders

Yes, I said it. We all suck. Unless of course, you're not a coder. You can stop reading now, this stuff is trade secrets.

Do you write enough tests before you start coding to test all the various requirements of the application? Do you plan out how the interface will work? Do you do all this other crap that constitutes "good" coding?

I didn't think so. Neither do I. Even "famous" coders like Reg Braithwaite or Zed Shaw probably write bad code sometimes (heaven forbid). Why? Because we're lazy. Because we want to get things done. Writing good code tends to require more work up front that is better in the long-run but more annoying in the short-run. It keeps us from seeing the results that we crave so much. A major part of the desire to code is to see things work. The joy of seeing things work is one of the reasons why we code. So a lot of the time we just try to get the code done as fast as possible so that we can see the results.

This is a problem that we are aware of. Things like static typing and garbage collection are great examples of things that are put into languages to protect us from ourselves. There are no pointers in newer languages. Why not? Because too many people messed them up. There are no more goto statements in modern languages. Scala even does away with break and continue. Why? Because they make people write code that is difficult to follow (Scala has more technical reasons to do away with them, due to those fancy things called closures). Yet people still write bad code. And guess what? I bet even with more and more fancy things to "improve developer productivity" we will still write bad code. The development platform doesn't necessarily save us from being human.

So yeah, suck it up. You're a bad coder. The least you can do is accept it. After that, try to improve. There is always room for improvement.

On another note, sometimes the development platform does matter:
foreach ($result_set as $row){
    $filename = getFilename($row);
    if (!file_exists($filenmae))
        deleteRow($row);
}
Let's hope there was a backup.
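
For contrast, here is roughly the same loop as a Ruby sketch (the row data and the stale_rows helper are made up for illustration). A misspelled local variable in Ruby raises a NameError instead of quietly evaluating to an empty value, so the loop blows up before anything gets deleted:

```ruby
# Hypothetical stand-in for the PHP loop above: pick the rows whose file is gone.
def stale_rows(rows)
  rows.select do |row|
    filename = row[:filename]
    !File.exist?(filenmae)  # same typo as the PHP version
  end
end

begin
  stale_rows([{ :filename => "/tmp/some_file" }])
  outcome = "rows selected for deletion"
rescue NameError => e
  # Ruby refuses to treat the unknown name as an empty value.
  outcome = "stopped: #{e.class}"
end

puts outcome
```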

Jun 4, 2008

VirtualBox and Linux at work: Day 1

For some reason, the sysadmin at work refuses to let me use Ubuntu on my computer. I think it has to do with all the DNS stuff or LDAP stuff, or things like that. I doubt that would be much of a problem to set up, but since I don't know how to do it, he is the one who would be doing it. My guess is he doesn't know how, which is why he won't let me use Ubuntu.

We are, however, allowed to install software on the machine. Pretty much whatever we want. So I installed VirtualBox on the Windows box and set up my very own Ubuntu system. With their Guest Additions software, it integrates very well with my dual-monitor setup. You can even copy-paste things from Windows to Linux.

So now I'm attempting to do my web development at work in Ubuntu, to see how it goes. I'm running Quanta (I tried Geany but I didn't really like its project handling compared to Quanta's) for editing/FTP and eSVN as a rather nice SVN client. It's not as nice as TortoiseSVN for Windows, but it does the job and is much better than the command-line SVN client. There's also KolourPaint for basic image editing and Samba for connecting to the network drives. All-in-all, it's a pretty good start!

I'll post in a few weeks about how well this setup is going for me.

See my follow-up post.

EDIT(S): It's slow as shit...

Jun 2, 2008

vi

I actually started to learn how to use vi after reading this article which explains why people use it - I found it by Googling "why use vi?" if you're wondering.

I'm using the graphical GVim one (or whatever it is called) for reference, apparently different versions do things differently.

It's actually not that bad, once you get over the first few bumps. You have to know about command-mode vs. insert mode, and how to go between them (i to go from command mode to insert mode - this isn't the only way but it is the simplest, Esc to go from insert mode to command mode). You also need to know the important commands :w and :q. Other than that, you don't really need a whole lot. Learning more commands though, is the main reason for using vi. They're actually quite handy.
So far I have only used a few, but they're handy. Typing dd (note that all these commands are done while in command mode) deletes a line - this is better than having to select the line and delete it, or going Home, Shift+End, Backspace, Backspace. o opens a new line below the current one and puts you into insert mode on it so you can just start typing. cc clears the current line and puts you into insert mode, and cw clears from the cursor to the end of the word and puts you into insert mode right there. These are all just little tiny things that add up to make editing code that much easier.

The other thing that is very handy is that vi has been around for a long time. I mean, it predates Windows 95. It came before there was Google (believe it or not, there was a time like this). And in that time, I don't think it has changed much. Anything you need to figure out how to do - at least all the basic stuff I wanted to do - you just Google for it and the answer is right there. Want to turn on syntax highlighting? Google for "vi syntax highlighting" and you will have it. It's very easy to find information on it.

So yeah, there is a learning hump. However, I think that if you actually sit down and learn the basic commands to edit things, then you'll be set. Most people seem to just open it and expect that it works like every other text editor out there. Unfortunately this approach will fail horribly, and then you'll be left with the impression that vi is unusable. I'd say this is wrong, but it is a bit tricky and unintuitive at first. You have to actually sit down with a tutorial and learn it. However, I'm starting to think that for a young programmer who will probably be coding for the next few decades, it is a worthwhile investment to make.

Jun 1, 2008

Frameworks Galore

It seems like one of the cool things to do these days is to roll a web framework. Now that Rails has popularized this notion, frameworks seem to be coming out of the woodwork.

My company built a framework to manage several sites at once. The framework was good, although it was rather complex.
I started work on a new high-traffic site, and the framework just didn't cut it. It didn't have great support for memcached or our distributed cluster system.

So I rolled my own framework designed for high-traffic websites. It's based off the MVC (Model-View-Controller) architecture and uses the active record pattern for working with the database. Much of this functionality was inspired by Rails. Oh, and unfortunately it's done with PHP.

I drop some of Rails' "convention-over-configuration" idea. This is because when you have a high-traffic site that is heavy on DB usage, you can't just go "set my username to this, set my password to this, and choose this DB". You have to actually set up connections with some logic behind them. My system uses clustered MySQL databases in a master-slave architecture, with vertical partitioning of tables. This is when you split up a table by its columns. Sometimes the tables are put on different database clusters. This means a few things. You can't do writes (UPDATE/INSERT/DELETE) to a slave database, and you can't use JOIN if the two tables are on separate databases. So the framework has to pick a connection based on two things: whether you need a master or a slave, and which table you want to query.
Since choosing between these databases is rather complex, I set it up so that there is an abstract DB class that manages the basic things, and you inherit from it in order to get more specific functionality. I have a default inherited class that uses a table map that maps from a table name to a connection number. Then when the base DB class requests a connection from the derived class (for a query or whatever), the inherited class will figure out which connection matches the table and the need for a master or slave. It uses lazy loading for the connection resources so that a connection to a particular database is not made until the base DB class actually requests it.
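
The connection picking described above can be sketched roughly like this. It is a Ruby sketch of the idea, not the framework's PHP code, and every name in it (BaseDB, TableMapDB, connection_for and so on) is hypothetical. A string label stands in for a real database connection:

```ruby
# Hypothetical names throughout; a sketch of the idea, not the real framework.
class BaseDB
  def initialize
    @connections = {}   # opened lazily, keyed by [cluster, role]
  end

  # Writes must go to the master; reads can go to a slave.
  def connection_for(table, write)
    role = write ? :master : :slave
    key = [cluster_for(table), role]
    @connections[key] ||= open_connection(*key)
  end

  # Subclasses decide which cluster holds a table.
  def cluster_for(table)
    raise NotImplementedError
  end

  # Stand-in for a real connect call; returns a label instead of a socket.
  def open_connection(cluster, role)
    "cluster#{cluster}-#{role}"
  end

  def open_count
    @connections.size
  end
end

# Default subclass: a table map from table name to cluster number.
class TableMapDB < BaseDB
  def initialize(table_map)
    super()
    @table_map = table_map
  end

  def cluster_for(table)
    @table_map.fetch(table)
  end
end

db = TableMapDB.new("users" => 0, "photos" => 1)
read_conn  = db.connection_for("users", false)   # SELECT on users
write_conn = db.connection_for("photos", true)   # UPDATE on photos
again      = db.connection_for("users", false)   # reuses the open connection
```

Only the two connections that were actually asked for get opened; nothing is connected up front.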

The other major change is around using memcached. I set it up so that model objects may automatically use memcached as a data store alongside the database, by inheriting from the CachedModel class instead of the Model class (the CachedModel class inherits from the Model class anyway, so you still get the functionality of Model). Then the rows for this model are stored in memcached when they are requested, and the memcached version gets modified when the row gets modified. In fact, the CachedModel version of save() (which performs an INSERT/UPDATE normally) by default does not save directly to the database unless a particular heuristic decides that it is time to save (this can be overridden by passing true to save() ). This means that very few writes actually go to the database; most go directly to memcached instead. It makes the site far faster, and you notice live updates on the site. For example if you have a profile_viewed field for a user, you can actually sit there refreshing and you'll see that the count increases. Since the object is stored in a central location, any other place that the profile_viewed field is displayed will also increase.
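
A rough Ruby sketch of that save() behaviour, with loud assumptions: plain Hashes stand in for memcached and the database, the class name is hypothetical, and "flush every third save" is an invented placeholder for the heuristic, which isn't spelled out here:

```ruby
# Hypothetical sketch: Hashes stand in for memcached and the database,
# and "flush every third save" stands in for the real heuristic.
class CachedCounter
  FLUSH_EVERY = 3

  attr_reader :db_writes

  def initialize(cache, database, id)
    @cache, @database, @id = cache, database, id
    @pending = 0
    @db_writes = 0
  end

  # Read through the cache, falling back to the database on a miss.
  def value
    @cache[@id] ||= @database.fetch(@id, 0)
  end

  def save(new_value, force = false)
    @cache[@id] = new_value          # the cache is always updated
    @pending += 1
    return unless force || @pending >= FLUSH_EVERY
    @database[@id] = new_value       # the database only occasionally
    @db_writes += 1
    @pending = 0
  end
end

cache, database = {}, { 7 => 10 }
views = CachedCounter.new(cache, database, 7)
5.times { views.save(views.value + 1) }
```

After the five saves the cached value is ahead of the database, which has only been written once; calling save with true as the second argument forces the write through.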
EDIT: As the site grew, an issue with race conditions came up. This is because there is an amount of time between when the object is taken out of memcached, processed by a script and put back in. If the script is being accessed often enough, then you run into problems as data can get clobbered. I'm not working for the same company any more so I'm not sure what they did to fix this, but I assume it had something to do with memcached's atomic increment function. You'd have to slightly tweak the CachedModel object to add this functionality.
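
To make the clobbering concrete, here is a small deterministic Ruby sketch. FakeCache is a hypothetical in-memory stand-in for the cache server, and its incr method only imitates the idea of an atomic increment; how the real fix was done is not something I know:

```ruby
# Hypothetical in-memory stand-in for a cache server with an atomic increment.
class FakeCache
  def initialize
    @data = Hash.new(0)
    @lock = Mutex.new
  end

  def get(key)
    @lock.synchronize { @data[key] }
  end

  def set(key, value)
    @lock.synchronize { @data[key] = value }
  end

  # Read, add and write happen as one step, so no update can be lost.
  def incr(key, by = 1)
    @lock.synchronize { @data[key] += by }
  end
end

cache = FakeCache.new

# Two requests interleave: both read before either writes.
a = cache.get("views")
b = cache.get("views")
cache.set("views", a + 1)
cache.set("views", b + 1)   # clobbers the first write
lost_update_total = cache.get("views")

# The same two page views using the atomic increment.
cache.set("views", 0)
cache.incr("views")
cache.incr("views")
atomic_total = cache.get("views")
```

Two page views come out as a count of one with get-then-set, and as two with the increment.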
Queries can also be stored in memcached. When you call findAll(), you have an option to pass a cache name. If you do, the result is stored in memcached under that name. To save on space, this query only fetches the ids from the database, and then fetches the objects themselves from memcached (if they are there, otherwise it gets them from the database).
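
Roughly, the idea looks like this in Ruby. Hashes stand in for memcached and the database table, and the find_all helper and the "query:" / "row:" key names are made up for the sketch:

```ruby
# Hypothetical sketch: Hashes stand in for memcached and the database table.
def find_all(cache, database, cache_name)
  # Cache only the list of ids under the given name...
  ids = (cache["query:#{cache_name}"] ||= database.keys.sort)
  # ...then load each row from the cache, falling back to the database.
  ids.map { |id| cache["row:#{id}"] ||= database.fetch(id) }
end

database = { 1 => "alice", 2 => "bob", 3 => "carol" }
cache = { "row:2" => "bob (cached copy)" }
rows = find_all(cache, database, "all_users")
```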
Note that all Model/CachedModel objects need an id field as the primary key. This makes the framework much simpler. If you have a table that does not follow this pattern (like a friends table, or favourite photos) then you'll have to use SQL. Fortunately each class inherited from Model gives you a findBySQL() function which returns objects of that type.

The views are very basic. The controller object has a data field, which is an associative array mapping the name of each view variable to its value. When you're in the view, you can access the variable directly by name:
Controller:
$this->data["myObj"] = "Hello";

View:
<?= $myObj ?>
I also add a plugins feature, which is like a mini-controller/view setup. It's used for small bits of functionality that you might use across your site. An example would be Facebook's commenting. You can comment on a profile (the wall), on photos, on videos, etc. This would be wrapped up in a plugin so that in your controller you just go loadPlugin("pluginName", new MyPlugin()) and in the view: $this->plugins["pluginName"]->display(). It's handy.
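
The controller-to-view handoff and the plugin lookup can be sketched in Ruby with the standard ERB library. This is an illustration of the idea rather than the framework's PHP code, and SketchController, CommentsPlugin and load_plugin are hypothetical names:

```ruby
require "erb"

# Hypothetical plugin: a small reusable piece with its own display method.
class CommentsPlugin
  def display
    "<div>comments go here</div>"
  end
end

# Hypothetical mini-controller, in Ruby rather than the framework's PHP.
class SketchController
  attr_reader :data, :plugins

  def initialize
    @data = {}
    @plugins = {}
  end

  def load_plugin(name, plugin)
    @plugins[name] = plugin
  end

  # Each key in data becomes a local variable inside the template,
  # and the loaded plugins are exposed as a local named plugins.
  def render(template)
    ERB.new(template).result_with_hash(@data.merge(:plugins => @plugins))
  end
end

controller = SketchController.new
controller.data[:myObj] = "Hello"
controller.load_plugin("comments", CommentsPlugin.new)
html = controller.render('<p><%= myObj %></p><%= plugins["comments"].display %>')
```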

Finally, one other thing I think is cool about it is that it supports what I call shared applications. This means that you can have an application in one place, another application in another place, and have them share common things like model objects, plugins, etc. but have different controllers/views, connect using different database credentials, etc.

This framework is not really designed to "baby" the programmer. If you're afraid of SQL or command-line, then it's probably not for you. There are many cases in this framework when you'd have to use SQL, and other cases where you might have to go to the command line. However if you're working on a very high traffic site, you should be comfortable with these things anyway.

Unfortunately at the moment I can't release the framework as it is owned by my company and not by me. However I'm planning on rewriting it under an open-source license, probably MIT. If anybody has some suggestions, feel free to let me know.