Jul 29, 2008

The Power of Destruction

I've been working a bit with C++ recently on some personal projects, and it reminded me of some of the C++ nuances: mainly memory management and destructors. Memory management isn't nearly as hard as everybody seems to think. Most of the time you can just use boost::shared_ptr, which does reference counting, and you're set. Where you may have cyclical references, reference counting alone will never free the cycle, so you'll need something like boost::weak_ptr for the back-references. You'll also need to keep exception safety in mind in complex statements, for example when a raw new and another call that can throw sit in the same expression. You can use a garbage collection library if you are really worried.
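
To make the cyclical-reference caveat concrete, here is a minimal sketch. It assumes a C++11 compiler and uses std::shared_ptr and std::weak_ptr, the standard-library counterparts of the Boost pointers mentioned above. Parent, Child, and leaked_after_scope are made-up names for illustration only.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types for illustration; not from any real codebase.
struct Child;

struct Parent {
    std::shared_ptr<Child> child;   // owning reference: parent keeps child alive
    static int live;                // number of Parent objects currently alive
    Parent() { ++live; }
    ~Parent() { --live; }
};

struct Child {
    std::weak_ptr<Parent> parent;   // non-owning back-reference breaks the cycle
    static int live;                // number of Child objects currently alive
    Child() { ++live; }
    ~Child() { --live; }
};

int Parent::live = 0;
int Child::live = 0;

// Builds a parent/child pair that refer to each other, lets the only owning
// pointer go out of scope, and returns how many objects are still alive.
int leaked_after_scope() {
    {
        std::shared_ptr<Parent> p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;       // weak_ptr does not bump the use count
        assert(p.use_count() == 1);
    }
    return Parent::live + Child::live;
}
```

Calling leaked_after_scope() should return 0 here. If Child::parent were a shared_ptr instead, the two objects would keep each other alive and neither destructor would ever run.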

However, I'm remembering how much easier it is to manage non-memory resources with C++. In Java/Ruby/PHP/etc. you can have destructors or things that emulate destructors, but they have an element of non-determinism added: the garbage collector. You don't know when your object will be destroyed. (In the case of PHP, the lifetime of your program is usually very short, so this doesn't matter all that much.)
Why might this be a problem? It means that you either have to manually release any resources that an object may use, or accept that the resources will stop being used at an undefined point in your program.

Here's an example. Suppose you have a mutex. Here is how you would normally use it:
lock(mutex)

code that needs to be locked

unlock(mutex)
What is wrong with this model? First, it requires you to remember to unlock. Not a huge deal, but it lets in a certain degree of human error.
Second, and more important: what if the code between the lock and unlock is not exception safe? What if something in there throws? Will your mutex be unlocked? You now have to handle any exceptions that may be thrown in that area before the unlock() statement. You could put the unlock() in a catch block, but what if lock() itself throws and the mutex is never acquired? Also, what if you don't want to catch the exception here because you want it handled higher up in the code?

These are all things that are completely avoided in C++:
{
    Lock lock(mutex);

    // ... code to be locked
}
In this case, Lock is a class and lock is an object of that class. The constructor locks the mutex and the destructor unlocks it. Since lock is a stack object, it is destroyed automatically (and its destructor called) when an exception propagates out of the block or the end of the block is reached. You no longer have to worry about remembering to unlock, even when the code in the block throws.
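
The post never shows what Lock looks like inside, so the following is only a sketch of one way it could be written, assuming a C++11 compiler. TracedMutex is a hypothetical wrapper around std::mutex that records whether it is held, purely so the effect can be observed. risky_work and mutex_released_after_throw are also made-up names.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical wrapper: a std::mutex that remembers whether it is held.
struct TracedMutex {
    std::mutex m;
    bool held = false;
    void lock()   { m.lock(); held = true; }
    void unlock() { held = false; m.unlock(); }
};

// Minimal scoped lock: acquire in the constructor, release in the destructor.
class Lock {
public:
    explicit Lock(TracedMutex& m) : mutex_(m) { mutex_.lock(); }
    ~Lock() { mutex_.unlock(); }
    Lock(const Lock&) = delete;             // copying would unlock twice
    Lock& operator=(const Lock&) = delete;
private:
    TracedMutex& mutex_;
};

// Hypothetical critical section that always fails part-way through.
void risky_work(TracedMutex& m) {
    Lock lock(m);
    assert(m.held);                         // mutex is held inside the block
    throw std::runtime_error("something went wrong");
}

// Returns true if the mutex is free again after risky_work() has thrown.
bool mutex_released_after_throw() {
    TracedMutex m;
    try {
        risky_work(m);
    } catch (const std::runtime_error&) {
        // The exception is handled "higher up", outside the locked block.
    }
    return !m.held;
}
```

mutex_released_after_throw() should return true: the destructor runs during stack unwinding, so no catch block is needed inside risky_work. In C++11 and later the standard library ships this same idea as std::lock_guard.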

There are all sorts of other resources that are available to objects, like database connections, network connections, and file resources to name a few. Being able to have an automatic, deterministic mechanism for releasing these resources makes your life as a programmer much easier.
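
The same pattern carries over to files. Below is a rough sketch, assuming C stdio underneath. File, write_then_fail, and read_back_after_failure are hypothetical names, and the file path is whatever the caller passes in.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Minimal RAII wrapper around a C FILE*: opened in the constructor,
// closed (and therefore flushed) in the destructor.
class File {
public:
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) throw std::runtime_error("cannot open " + path);
    }
    ~File() { std::fclose(handle_); }
    File(const File&) = delete;             // copying would close twice
    File& operator=(const File&) = delete;

    void write(const std::string& text) {
        if (std::fputs(text.c_str(), handle_) == EOF)
            throw std::runtime_error("write failed");
    }

    // Reads up to one line (at most 255 characters); empty string at EOF.
    std::string read_line() {
        char buf[256];
        return std::fgets(buf, sizeof buf, handle_) ? std::string(buf)
                                                    : std::string();
    }
private:
    std::FILE* handle_;
};

// Writes some text and then fails; the destructor still closes the file.
void write_then_fail(const std::string& path) {
    File out(path, "w");
    out.write("hello");
    throw std::runtime_error("failure after writing");
}

// Triggers the failure, then reopens the file to see what was written.
std::string read_back_after_failure(const std::string& path) {
    try {
        write_then_fail(path);
    } catch (const std::runtime_error&) {
        // Nothing to clean up here; the File object already did it.
    }
    std::string text;
    {
        File in(path, "r");
        text = in.read_line();
    }
    std::remove(path.c_str());              // tidy up the demo file
    return text;
}
```

Even though write_then_fail always throws, the text should reach the file, because the destructor closes the handle on the way out. A database or network connection wrapper would follow the same shape, with connect and disconnect in place of fopen and fclose.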

2 comments:

Matthew Gallant said...

Isn't that why Java encourages the use of "synchronized" blocks? They're a little less flexible, but they solve the problem you mention.

Rob Britton said...

Whoops, forgot about synchronized blocks... Guess it's been a while since I did any Java. You're right, they work well for the example I mentioned.

Since it's a language feature, though, you're dependent on the language implementation for how well it works. In Java this isn't really an issue, since Java - as far as I can tell - has a top-notch implementation from Sun.
Compare this to Ruby though, where you might use ensure for the above functionality. My sources (namely, programmer friends who know Ruby better than I do) tell me that the implementation for ensure is sketchy. Guess Java beats Ruby there - unless of course there is a synchronized thing in Ruby that I don't know about, which is very likely.

Anyway, one point I guess I was trying to make was that destructors help you with exception safety in your program by letting local stack variables clean up the resources they use or modify when there is a failure. It might not necessarily be with locks and mutexes, but maybe with network connections or file input streams or something like that.