Wednesday, January 28, 2009

Happstack: An Interview with Matthew Elder

HAppS development has been all but stopped for almost a year (as have my posts about it). Recently Matthew Elder took up the flag with a new project called Happstack. I thought this would be a good time to catch up with Matthew and find out why he is doing this and where he is going with it. So without further ado, here is Matthew Elder on Happstack.

Is Happstack a fork or a rename?
Happstack is both a fork and a rename :) But it is not a fork in the traditional sense -- in the words of Lemmih (the only active developer left), the original code is orphaned. So even though we are forking the code base to a new repository under the name "Happstack", the original project is not being worked on, so it is more of a direction change than a fork.
Why are you trying to take over? What happened to the previous developers?
The previous developers tapered off over the last year and the only one left is Lemmih. I started out by submitting a couple of patches -- nothing fancy -- just some code to clean up some warnings. The patches weren't being accepted in a timely fashion, so I asked Lemmih, and he basically said that the code is not being actively developed, and that he only applies patches in his free time. His free time isn't in abundance either because he is actively developing the LHC project (a Haskell compiler that focuses on cross-module optimization). I am taking over because the project has lost its momentum and vision. Alex Jacobson got burnt out at one point and just sort of faded away from being involved. I want to renew the spirits of the community, and focus on tangible, achievable goals. I really think that HAppS has a chance because it is so different from any other framework out there, and I don't want that ingenious flame to die.
Who are you, what is your experience, and why should I feel you are the right person to take the HAppS reins?
My name is Matthew Elder. I have dabbled in many different programming languages, but in the past specialized in Ruby. I started in functional programming about a year ago and haven't looked back to the mutable-state world since :) One significant thing I have done in the past is to migrate a small company of PHP programmers in India from using zip files passed around via instant message to using Mercurial across the board for all their projects. For those who aren't familiar with Mercurial, it is a distributed SCM similar to darcs. It was challenging because of the language barriers, the culture, and the learning curve -- but in the end I was successful in teaching pragmatic techniques in version control. I don't claim to be any sort of Haskell wizard, but I am very much a believer in the open source movement and, having studied the culture for a number of years now, feel that I am the right man for this job :)
Some of this information is already out there, but can you give us a high-level summary of what HAppS is in plain English?
Happstack, formerly known as HAppS, is a Haskell application server stack for web apps. Its main killer feature is that it saves your persistent data in its own typesafe and ACID-guaranteed data store. It supports multimaster replication for redundancy and in the near future will support sharding for scalability. The idea is that since you don't have to worry about data marshalling from an SQL system, many parts of the application become much easier to compose and distribute (the data types follow the same rules and style as the Haskell code itself).
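To make the "data types follow the same rules as the code" idea concrete, here is a minimal sketch in plain Haskell. It is not the HAppS or Happstack API: the names `GuestBook`, `Update`, `applyUpdate`, `replay`, and `commit` are hypothetical, and the sketch assumes a design where state lives in memory as an ordinary value and every change is a typed update that could also be logged and replayed. It only uses `base` and `containers`.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef', readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical application state: an ordinary Haskell value, no SQL schema.
newtype GuestBook = GuestBook { entries :: Map.Map Int String }
  deriving (Show, Eq)

-- Hypothetical update events; each one is a plain, typed value.
data Update
  = AddEntry Int String
  | RemoveEntry Int
  deriving (Show, Eq)

-- Pure transition function: the only way the state changes.
applyUpdate :: Update -> GuestBook -> GuestBook
applyUpdate (AddEntry k v)  (GuestBook m) = GuestBook (Map.insert k v m)
applyUpdate (RemoveEntry k) (GuestBook m) = GuestBook (Map.delete k m)

-- Rebuild the state by replaying a list of updates from an empty state.
replay :: [Update] -> GuestBook
replay = foldl (flip applyUpdate) (GuestBook Map.empty)

-- Apply one update atomically to the in-memory state.
commit :: IORef GuestBook -> Update -> IO ()
commit ref u = atomicModifyIORef' ref (\s -> (applyUpdate u s, ()))

main :: IO ()
main = do
  ref <- newIORef (GuestBook Map.empty)
  let updates = [AddEntry 1 "hello", AddEntry 2 "world", RemoveEntry 1]
  mapM_ (commit ref) updates
  live <- readIORef ref
  -- The live state matches what replaying the same updates would produce.
  print (live == replay updates)
  print (Map.toList (entries live))
```

Because updates are plain values and the transition function is pure, the compiler checks every read and write against the same types the rest of the application uses. A real store would additionally need durable logging and replication, which this sketch leaves out.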
The common criticism of using the type system for your data store is that your state size is limited to what you can fit in RAM. Obviously sharding reduces this problem by adding more machines. But what would you say to the critic who argues that you're unnecessarily increasing the cost of scaling your site by limiting yourself to costly RAM and ignoring the much cheaper disk storage?
I would say that the MACID datastore isn't intended to fulfill all kinds of storage needs. It is highly optimized for retrieval performance without the added complexity of systems such as memcached, for instance. Also, I think it is important to note that reduced complexity and less code mean that you will need less labor overall. For the cost of a programmer who makes 2000 dollars per month, I can instead rent 20 dedicated server nodes, which should address any scalability problems that MACID might encounter.
What differentiates HAppS (and Happstack) from other frameworks?
The MACID data system. Your code is your data: no leaky abstractions, no SQL exceptions, everything is handled in the Haskell system.
Are you making any substantive technical changes to the project's direction?
Yes, in the future we plan to support dynamic code loading (using the GHC API), and in general we are moving toward more documentation and tools to make the end user's life easier. The other substantive change is that we plan to have a fully operational sharding implementation in a few releases. This was originally planned in HAppS but was never completed. Sharding allows you to scale your in-memory data beyond a single node's capacity.
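Since the sharding implementation described here does not exist yet, the following is only a sketch of the general technique, not Happstack's design. It assumes the simplest possible scheme, where each key is assigned to one of a fixed number of nodes by hashing the key. The names `shardFor`, `partitionKeys`, and `lookupSharded` are hypothetical, and the hash is a toy one chosen to keep the example within `base` and `containers`.

```haskell
import Data.Char (ord)
import qualified Data.Map.Strict as Map

type NodeId = Int

-- Toy string hash reduced modulo the node count (assumes nodeCount > 0).
-- A real system would use a better hash and likely consistent hashing.
shardFor :: Int -> String -> NodeId
shardFor nodeCount key = foldl step 7 key `mod` nodeCount
  where
    step acc c = (acc * 31 + ord c) `mod` 1000003

-- Split a key/value list into one in-memory map per node.
partitionKeys :: Int -> [(String, v)] -> Map.Map NodeId (Map.Map String v)
partitionKeys n kvs =
  Map.fromListWith Map.union
    [ (shardFor n k, Map.singleton k v) | (k, v) <- kvs ]

-- Route a lookup to the node that owns the key, then look it up there.
lookupSharded :: Int -> Map.Map NodeId (Map.Map String v) -> String -> Maybe v
lookupSharded n shards k = Map.lookup (shardFor n k) shards >>= Map.lookup k

main :: IO ()
main = do
  let nodes  = 3
      users  = [("alice", 1 :: Int), ("bob", 2), ("carol", 3), ("dave", 4)]
      shards = partitionKeys nodes users
  -- Show how many keys each node ended up holding.
  print (Map.toList (Map.map Map.size shards))
  print (lookupSharded nodes shards "bob")
  print (lookupSharded nodes shards "nobody")
```

Each node only holds the keys routed to it, so total capacity grows with the number of nodes. The trade-off of plain modulo hashing is that changing the node count remaps most keys.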
If Amazon can do without MACID, why does HAppS place such emphasis on it?
Amazon can do without MACID, but they also have a large team of engineers responsible for overseeing this architecture. Various engineers have to oversee the database administration, and even more engineers have to write code to take advantage of this massively scalable architecture. Happstack aims to provide scalability benefits similar to Amazon's but with less effort.