Jul 9, 2008

Using SVN Effectively

After working a bit on websites where we use SVN as a version control system, I've been thinking that it is actually rather difficult to use it effectively, especially when you're developing on Windows.

Take this example. You have a group of programmers, a web server, and an SVN repository (which may or may not be on the web server). The programmers program away, doing their thing, occasionally updating and committing their changes as they go along. Every so often they upload their changes to the server to see if what they want to do works.

There is a major problem with this approach. Suppose one person uploads a change, sees it doesn't work exactly as planned, makes a tweak to another file, uploads that one and sees things now work. But before they commit, somebody else working on one of those files modifies it and uploads it. Now the main site might not work, because people are overwriting each other's changes. This is incredibly annoying! I'm speaking from experience here. On a small team it isn't so bad because people probably aren't working on the same files, but as your team gets bigger it becomes more and more likely that something like this will occur.
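
To make the overwrite problem concrete, here is a tiny simulation of it. It is only a sketch: the file names, the contents, and the `upload` helper are made up for illustration, and the "server" is just a dictionary.

```python
def upload(server: dict, files: dict) -> None:
    """Copy files onto the shared server, overwriting whatever is there."""
    server.update(files)


# Hypothetical file names and contents, for illustration only.
server = {"page.php": "v1", "lib.php": "v1"}

# Alice's fix needs BOTH of her files to work.
alice = {"page.php": "alice", "lib.php": "alice"}
# Bob started from the old copy and only touched lib.php.
bob = {"lib.php": "bob"}

upload(server, {"page.php": alice["page.php"]})  # Alice uploads; not quite right yet
upload(server, {"lib.php": alice["lib.php"]})    # Alice uploads her tweak; site works
upload(server, bob)                              # Bob uploads before Alice commits

# The live site now runs a combination that nobody ever tested together.
print(server)  # {'page.php': 'alice', 'lib.php': 'bob'}
```

Neither Alice's working copy nor Bob's matches what is on the server at the end, which is exactly why the bug is so confusing to track down.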

This is the wrong way to code. It leads to bugs everywhere that will continue to re-appear. Sometimes, if the programmer hasn't realized that somebody has overwritten their changes and begins wondering why their code doesn't work anymore, they spend time trying to track down the bug. This is a waste of programmer effort that could easily be avoided. Here's how:
- Everyone needs their own development space, with their own database, memcached, etc., completely independent from one another. This is way easier to do with Linux, because you can just install Apache/MySQL/etc. on your local machine and do everything you want there. With Windows you can do this, but I find it a fair bit clunkier. Under Ubuntu you just go "sudo apt-get install apache2 mysql-server ..." and you're set.
- First, you update and commit, so that you start from the most up-to-date version of the repository. Then make your changes and make sure they work. Then update again and resolve any conflicts that appear. Make sure your code still works, then commit. Only then put it on the server, for example by keeping a checked-out working copy on the server and running an update there. Don't upload your changes to the server while you are working on them!
- Even better would be to do this with unit tests. That way, before you commit, you can run the unit tests to see if your code still works.
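
The update, test, commit routine above can be sketched as a small script. This is a rough sketch under a few assumptions, not a definitive tool: the function names are hypothetical, the test command is whatever your project uses, and conflicts are detected by looking for a "C" in the first column of `svn status` output.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# A runner takes a command and returns (exit code, captured stdout).
Runner = Callable[[Sequence[str]], Tuple[int, str]]


def run_command(args: Sequence[str]) -> Tuple[int, str]:
    """Run a real command and capture its standard output."""
    done = subprocess.run(list(args), capture_output=True, text=True)
    return done.returncode, done.stdout


def has_conflicts(status_output: str) -> bool:
    """True if any `svn status` line is flagged 'C' (conflicted) in column one."""
    return any(line[:1] == "C" for line in status_output.splitlines())


def update_test_commit(message: str, test_cmd: Sequence[str],
                       run: Runner = run_command) -> str:
    """Update, check for conflicts, run the tests, and only then commit."""
    code, _ = run(["svn", "update", "--non-interactive"])
    if code != 0:
        return "update failed"
    code, status = run(["svn", "status"])
    if code != 0 or has_conflicts(status):
        return "conflicts"  # resolve by hand, then run this again
    code, _ = run(list(test_cmd))
    if code != 0:
        return "tests failed"
    code, _ = run(["svn", "commit", "-m", message])
    return "committed" if code == 0 else "commit failed"
```

You would call it from inside your working copy with something like `update_test_commit("Fix login form", ["python", "-m", "unittest"])`, where the test command is only an example. The `run` parameter exists so the sequence can be tried out with a fake runner, without touching a real repository.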

If you want to save your changes to the repository, but don't want to merge it with the main version, you can always branch. That's what branching is for.
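
For reference, a branch in SVN is just a cheap copy made with `svn copy`, after which you point your working copy at it with `svn switch`. The helper below builds those two commands. It is a sketch that assumes the conventional `trunk/` and `branches/` layout; the repository URL and branch name are placeholders, and `branch_commands` is a hypothetical name.

```python
from typing import List


def branch_commands(repo_url: str, branch: str, message: str) -> List[List[str]]:
    """Build the commands to create a branch from trunk and switch to it.

    Assumes the repository uses the conventional trunk/ and branches/ layout.
    """
    base = repo_url.rstrip("/")
    trunk = f"{base}/trunk"
    target = f"{base}/branches/{branch}"
    return [
        ["svn", "copy", trunk, target, "-m", message],  # create the branch
        ["svn", "switch", target],                      # move the working copy onto it
    ]


# Placeholder URL and branch name, for illustration only.
for cmd in branch_commands("http://svn.example.com/repo", "my-feature",
                           "Create my-feature branch"):
    print(" ".join(cmd))
```

Commits made after the switch go to the branch, so you can save work in progress without touching the main version.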

I do find some things about SVN annoying. Branching is a bit annoying to do. Having to set up repositories is annoying, especially if you don't have write access to the SVN server: you need to ask a server admin to create a repository for you and set it all up.

I've been looking into using git for my projects instead of SVN. It seems to be much easier to use, and lots of geeks are getting excited about it. This usually means that it is awesome, and I should be checking it out to see if it is indeed awesome.

10 comments:

Mathieu Martin said...

Git is indeed a great tool. If you look at my blog you'll see that I enjoy it enough to work on a small utility to help with its rough edges :-)
With git, branching and merging is so easy that it's the default way to work. And now you can have a source control's big undo button capability on any crazy idea you wanna test in your project. This, of course, without disturbing the more stable development branches.
There are two ways you can go about it.
Each developer can have his branch on the central repo (therefore backed up and observable by colleagues) and merge his progress into master only when necessary.
This gets hard to manage fairly quickly however when things get hectic. So you can also just create a branch per feature (and also have them saved on the central repo). When the feature is finished, merged in and the ticket satisfactorily closed, you just delete the branch. Voilà! The commits are now in master anyway.
There's only one huge caveat I've found, working with Git: I never want to use a centralized VCS again. So I just narrowed down my future potential employers or partners tremendously :-) Well if said person is on SVN there's always git-svn to the rescue. All hope is not lost...

Mathieu Martin said...

Ahhh, how I hate Blogger's handling of identity...
My blog is at http://programblings.com :-)

IllegalCharacter said...

Nice, looks cool.

You've further confirmed that looking into GIT would be a good idea.
My biggest problem with developing as a team using SVN is that it's too easy to ignore SVN. It takes a bit of work to do things with it like branching and all that, so the developers just ignore it, upload their changes to the production site - often without checking if it works - and then test it there. At the end of the day, or if someone asks them, they update and commit their changes.
This is the default way to work with SVN, in my opinion. Not really a good way, and it pretty much defeats the purpose of a lot of the features of a VCS. If what you say about GIT is true, that the default way is branching and merging, then I think it will definitely bring improvements to the professional world. That is, once it is adopted. Give it a few years. Or a decade.

Guillaume Theoret said...

That is so not the default way to work with SVN. When I worked with you (not with you directly, of course, but at the same company) and at my last job, I never SSHed into the production box to edit code on the live server. Well, "never" might be a bit strong: I may have done it once or twice, but it was extremely rare.

And I know I never used FTP to upload changes because I wouldn't even set up the FTP accounts in the editor.

All I ever did when I wanted to put something live was SSH into the production box and type "svn up". That's it. With a dev drive, as long as you test in your sandbox and don't commit broken code you won't have a problem.

Git will not fix the fact that people are bypassing your source control. It's a people problem, not a technology problem. Install Git and people will still FTP up broken files and edit on the live site and break stuff unless there are cultural changes.

Especially at my last job, where it was just me and Tim developing, we almost never had any trouble. We just followed the "commit early, commit often" rule and tested the code before committing it.

Now, none of this means anything in the Git vs SVN debate. I plan on using Git for my next personal project to try it out as well. But saying it's SVN's fault that people are FTPing files and not using source control at all is disingenuous.

IllegalCharacter said...

You're right, I probably should get some experience with GIT before I call it a silver bullet ;)
I wasn't really saying that it would solve the people problem, more that it would make it easier for people to do things like branching and merging and therefore make it more likely that they will actually do it. I still have my doubts though.

When you guys were there, things were a fair bit more organized because most of the people knew what they were doing and since it was small, everyone sorta knew that they had a responsibility to do things right.
Now the company is a fair bit bigger and being a programmer there seems much more like being a number. So, people don't necessarily do things the "right" way anymore.

I'm probably just looking at things from the way I see them at this job, but from what I've seen of the majority of programmers I would think they would do something similar.

dibblego said...

"It seems to be much easier to use, and lots of geeks are getting excited about it. This usually means that it is awesome..."

This logical fallacy (argumentum ad populum) is not even remotely true even without counter-examples; Java, Ruby, Windows (need I go on?).

Michael Mrozek said...

This seems like a fairly bad way to use SVN. I think the standard is to have the webserver use a checked-out copy just like everyone else, and let developers update it when they want. I've also worked on projects where there are just multiple copies of the site running on the server, so developers save changes directly to their own copy and look at them right away before committing.

James H. said...

Take a look at Mercurial ( http://www.selenic.com/mercurial/wiki/ ):
- Does everything that Git does
- A lot simpler than Git
- Works better on Windows

I know quite a few people that first tried Git and then jumped on the Mercurial wagon :D

IllegalCharacter said...

@Michael: I think I mentioned the way you described somewhere in the post. It makes sense too.
I think at the place I work they used to do it that way, but not anymore. No idea why.

@James: I will definitely take a look at it. Simpler is usually better, so that is a plus. I don't really care about it working under Windows or not as I don't do any of my development under Windows, except at work. But there everything is under SVN, and the chances of that changing any time soon are next to nil.

Tony said...

Somebody already mentioned Mercurial so I'll plug the other major DVCS out there: Bazaar.