Saturday, May 18, 2013

Comparing Python and Haskell

The case for Python

I used to program in Python for a long time. I liked Python a lot, it could encode the abstractions I wanted to implement relatively well. By "relatively", I mean compared to other languages I was deeply familiar with at the time (C, C++, Pascal) and did so in a way I found much more aesthetically pleasing than alternatives (Ruby, Perl).

Python sacrificed a lot in order to well encode these abstractions. But it seemed to be a good trade-off for many programs. After all, look at these pagefuls of Python become monstrous C or C++ modules.  Loss of compile-time type checking meant we had to spend more time testing our programs, but we'd have so much time free, because we could write much smaller programs!


Why I no longer believe in Python

This attitude has all changed, when I discovered that the trade-off made by Python is not a trade-off at all! That is, encoding useful abstractions painlessly with short, concise and DRY programs does not require making the huge sacrifices that Python has made. We can have our cake and eat it too!

The ML family of languages, with useful type inference, can indeed concisely encode the abstractions we need from the Python world while maintaining type safety and as a bonus: better performance.


Some anecdotes

To validate this idea, I used Haskell. I wanted to show that indeed Haskell is capable of encoding Python's useful abstractions -- and what better way is there than transliterating Python programs with roughly the same abstractions directly into Haskell, and seeing how they cope?

I transliterated some Python programs into Haskell to get a feel for whether this was actually true.

Here are some example programs that I transliterated into Haskell.


Gravity simulator

Here is the original Python code, by Thanassis Tsiodras.
Here is my transliteration of this code into Haskell.

HaskellPython
Code Lines179175
Without import lines162170
CPU use12%100%

In short, the code is roughly the same size, but in the Haskell version we get type safety and almost an order of magnitude increase in efficiency.


Ambiguous parser

Tomer Filiba wrote an implementation of the "Tree Insertion Grammar" algorithm in Python (test program).

I transliterated it to Haskell (test program).
HaskellPython
Code Lines514408
Without import lines493407
Without optional type signatures444407
Execution time0.8s2.5s

Haskell's speed improvement here is more modest. Instead of order of magnitude, it is just 212% faster (factor of 3.1).
Also, the Haskell code requires more imports and good style dictates adding optional type signatures, which bloat up the code to 25% more lines than the Python code.
However, I believe that since import lines can harbor no bugs, and optional type signatures can be added automatically by the editor (e.g: in emacs's haskell-mode), it is useful to compare the code without them.  Writing and maintaining these extra lines is of negligible cost.

However, my transliteration doesn't shine as well here, even the stripped down version is 9% longer than the Python version.

Is a factor of 3 improvement in performance and a hell of a lot of extra runtime safety worth an extra 9% of code? I believe so.

Cartesian Tree Product

Again, Tomer Filiba has written an interesting blog post about tree products.
I extracted the code from that blog post into a single Python file.
I transliterated it into Haskell.
HaskellPython
Code Lines4126
Without import lines3825
Without optional type signatures3025
Execution time0.076s3.628s

57% more code at first glance, ouch! However, again, if we ignore import lines and optional type signatures we end up with 20% more code.

This relatively high cost in code, however, may be offset by a factor of 48 performance boost.
Is writing 20% more code for extra safety and a factor of 48 performance boost worth it? Probably.

Summary

Transliterations of Python code into Haskell code show that we can retain roughly the same code size (in the range of 95-120% code size). Note that despite the transliteration's inability to take advantage of a different program structure that fits Haskell better, Haskell fares very well.


Conclusion

This post is not really an attempt to show how great Haskell is. It is more of an attempt to show that Python's huge sacrifices, at least for the goal of making painless abstractions and concise programs are unnecessary.

Just by translating from Python to Haskell, we get huge speed benefits, slight code size increase (if at all) and a hell of a lot more safety. This safety is especially handy when you later want to maintain the code, and you can be far more confident that your change doesn't break anything than if you had a few unit tests for your Python code. If you add a comprehensive test suite to the Python code, confidence levels may become comparable, but then you've spent a lot more code and effort at the problem, and maintenance now needs to include a whole lot of test code!

If you're thinking of using Python for your next project, I'd recommend considering Haskell instead.