Apr 28, 2009

Type Checking in Dynamic Languages

I was having a discussion with a friend of mine not too long ago debating the merits of dynamic languages. My friend believes that type-safety is a top priority and that a language that does not do compile-time (or for an interpreted language, parse-time I suppose) type checking is a disaster waiting to happen.

This is completely correct. Unless you're doing something trivial like a script to resize/rotate the pictures from your digital camera, a dynamically-typed language will have problems when the app gets more complex. For example, suppose while refactoring you want to change the parameter list of a function for some reason. Maybe some parameter is no longer needed, who knows. In a statically-typed language, this is not a huge issue because the compiler will pick up all the places that the function is called and tell you if there are problems. Not so in a dynamically-typed language, where you have to actually execute the code. If your program is highly interactive, this will lead to problems because you will have to manually exercise every code path that calls the function and catch the errors at run-time. And if you forget to check something? Well, your users will find it.
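To make the refactoring scenario concrete, here is a minimal Python sketch. The function `resize`, its parameters, and `rarely_used_menu_action` are hypothetical names made up for illustration. After a fourth parameter is dropped from `resize`, the stale call site still loads without complaint and only fails when that line is actually executed.

```python
# Hypothetical function after refactoring: a fourth `quality` parameter was removed.
def resize(image, width, height):
    return (image, width, height)


def rarely_used_menu_action():
    # Stale call site that still passes the removed argument.
    return resize("photo.jpg", 800, 600, 90)


# Defining both functions raises nothing. The mistake stays invisible
# until the stale call is actually run.
try:
    rarely_used_menu_action()
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

If nobody happens to trigger `rarely_used_menu_action` before release, the first person to see the `TypeError` is a user.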

Here is where the standard response comes in from dynamic-typing enthusiasts: testing suites. All non-trivial programs should have testing suites, regardless of the language. No programming language, even Haskell with its crazy type system, will save you from the requirement of clarifying your ideas - at least you'd better hope so, or you're out of a job. A type-checker can tell you if the types you are throwing around make sense, but it doesn't check whether the code is working the way it is supposed to. For example, one time I wrote a 3D vector class in C++. I made a typo in the cross product function and put a - instead of a +. It passed the type check just fine, but when the shading on objects was looking horribly fucked up, I had absolutely no idea what was going on. This kind of bug can be easily avoided by a simple unit test.
A good unit testing suite will help a lot, and catch many of the errors that would normally be caught by a compiler (like type errors or spelling mistakes) while at the same time verifying the behaviour of your code.
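As a rough illustration of the kind of unit test that would have caught the cross product typo, here is a small Python sketch. It is not the original C++ class. The `cross` function and the test values are my own, chosen so that every term of the formula contributes to the result.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as (x, y, z) tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)


def test_cross():
    a, b = (1, 2, 3), (4, 5, 6)
    # Every product in the formula is non-zero for these inputs,
    # so a flipped sign in any term changes the result.
    assert cross(a, b) == (-3, 6, -3)
    # Swapping the operands must negate the result.
    assert cross(b, a) == (3, -6, 3)


test_cross()
```

Note that testing only with the basis vectors in cyclic order (x cross y, y cross z, z cross x) would not be enough here, because one product in each term is then zero and a wrong sign goes unnoticed.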

Unfortunately, the type-safety then depends on the quality of your unit tests, which are prone to human error. This is why I believe a program written in a dynamically-typed language can never be as type-safe as one written in a statically-typed one.

You could work around this by checking the types of the arguments at the start of every function. However, this is paranoid. If you're doing this, why don't you just switch to a statically-typed language, which will do this automatically, more quickly, and with less work for you?
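For a sense of what this workaround looks like in practice, here is a minimal Python sketch. The `scale` function is a hypothetical example, not code from any real project.

```python
def scale(vector, factor):
    # Hand-written checks, repeated at the top of every function.
    if not isinstance(vector, tuple):
        raise TypeError("vector must be a tuple")
    if not isinstance(factor, (int, float)):
        raise TypeError("factor must be a number")
    return tuple(component * factor for component in vector)


print(scale((1, 2, 3), 2))

try:
    scale((1, 2, 3), "2")
except TypeError as exc:
    print("rejected:", exc)
```

Without the checks, `scale((1, 2, 3), "2")` would not fail at all. Python would happily repeat the string and return `('2', '22', '222')`. The checks do catch this, but only at run-time, and only in functions where somebody remembered to write them.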

So we come to a trade-off. Many programmers discover quite quickly that a dynamically-typed language will let you get code out the door much more quickly than a statically-typed one. I've been mucking around with a gaming library in Python (I'll post about this another time) and I'm finding it is much easier to get simple stuff done than I ever did with C++. In fact it is so simple that there is no need for adding scripting languages on top, or config files, or all that. I can just write the stuff in Python. I've had similar experiences with Ruby.
But to get these benefits, you must accept the fact that you will not have 100% type-safety. You may have to write more tests than you otherwise would have.

Is it worth it? Sacrificing type-safety for faster development times? Well, it depends. In fact, you may not always have faster development times. I believe that there is a threshold at some point where a dynamically-typed language will have slower development times than a statically-typed one. This threshold is dependent on many factors - how good the design is, how good the programmers are, how many programmers there are, how big the project is, how quickly requirements change, etc.

One last thing to note is that statically-typed languages are easier to optimize for speed than dynamically-typed languages. In a statically-typed language, the compiler knows without a doubt the type of a variable, and can sometimes drop the type information from the resulting machine/byte code. The compiler knows at compile-time whether an object has a certain method or not, so there is no need to check at run-time, or even keep track of what methods an object really has. A lot of this is why C and C++ are still faster than, well, anything else (of course there are other factors, like how most dynamically-typed languages these days are interpreted).
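A small Python sketch shows why the run-time bookkeeping is hard to avoid in a dynamic language. The `Sprite` class and its `draw` method are hypothetical names used only for illustration.

```python
class Sprite:
    pass


sprite = Sprite()
print(hasattr(sprite, "draw"))  # False

# A method can be attached after the class and the object already exist...
Sprite.draw = lambda self: "drawn"
print(sprite.draw())  # drawn

# ...and removed again later, so the set of methods is only known at run-time.
del Sprite.draw
print(hasattr(sprite, "draw"))  # False
```

Because the answer to "does this object have a `draw` method?" can change while the program runs, the interpreter cannot resolve the call ahead of time the way a C++ compiler can.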

4 comments:

Isaac Gouy said...

> type-safety

What do you think type-safety means?

"Type safety is the property that no primitive operation ever applies to values of the wrong type." p263

Programming Languages: Application and Interpretation

Under that definition, some languages without static type checking but with run time type checking are type-safe, and some languages with static type checking are not type-safe.

Unknown said...

"I believe that there is a threshold at some point where a dynamically-typed language will have slower development times than a statically-typed one."

In the interest of honest discussion about the benefits and drawbacks of a given programming paradigm, it's best to lay out your assumptions and predispositions early on. It seems like a lot of the article was devoted to the merits of type safety, but your argument as to why type safety is in fact better boils down to the above quote: "I think you don't actually get more productivity from dynamically typed languages".

Jonathan Allen said...

> I'm finding it is much easier to get simple stuff done than I ever did with C++.

Well that's hardly a fair comparison. There are countless reasons Python is more productive than C++ that don't involve dynamic typing.

To be fair you need to compare it to something that is more or less equal in all ways except type checking.

Angel said...

I think the error is in this part:

> You could work around this by checking the types of the arguments at the start of every function. However this is paranoid. If you're doing this, why don't you just switch to a statically-typed language which will do this automatically, more quickly, and with less work for you?

Dynamic typing, with test cases, is a lot less work in terms of both lines of code and complexity of the project, and, as you say, the tests also check the behavior of the code.

I've been writing an MM7 and an SMPP gateway for the company I work for. The original version of those programs was in Delphi, and since most of the new codebase there is in Python I thought it was a good idea to code it in that language. The result was much less code, even with test cases, similar performance (thanks to the magic of multiprocessing and asyncore) and pretty simple code. And the schedule was met. Just to try it, I started coding the same thing in Delphi (a language I know better than Python) and was unable to finish it even in double the time.