Sunday, April 15, 2012

Four Tips for New Haskell Programmers

The Haskell programming language is widely considered to have a fairly steep learning curve--at least compared with mainstream languages.  In my experience with Haskell, and specifically with helping newcomers, I've noticed a few common issues that come up again and again.  Some of these issues might be more avoidable if the Haskell community did a better job of communicating them.  The four points I want to highlight are:
  1. Read Haddock API docs
  2. Pay attention to type class instances
  3. Learn about kinds
  4. Learn monad transformers

Read Haddock API Docs

I mention this point at the risk of stating the obvious.  If you are going to become a proficient Haskell programmer, it's absolutely essential that you get used to reading the API docs for the packages you use.  I often hear newcomers ask for tutorials demonstrating how to use packages.  Our community would definitely be better off with tutorials for every package, but it would also be better if newcomers paid more attention to the API docs.  Now I know you're probably thinking, "yeah, but most packages are poorly documented."  That is true, but Haskell's type signatures tell you a lot more about a function than signatures in other languages do.  For instance, consider the following example:

readChan :: Chan a -> IO a

A Chan is essentially a queue.  You can put values in and get them out in FIFO order.  When you are trying to understand the above function, one of the first things you might wonder about it is whether it blocks or not.  But if you think about it a little more, you'll realize that this function has to block.  If it didn't block, then there might not be a value and there would be no way to return something of type a (because Haskell has no null pointers).  A non-blocking version would have to have a type signature like this:

tryReadChan :: Chan a -> IO (Maybe a)
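To see the same contrast in code that actually compiles, here is a minimal sketch using only the base library.  Note that tryReadChan above is a hypothetical signature, so for the non-blocking half this sketch swaps in MVar, whose takeMVar :: MVar a -> IO a and tryTakeMVar :: MVar a -> IO (Maybe a) have the same two shapes.  The names chanDemo and mvarDemo are made up for illustration.

```haskell
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar, tryTakeMVar)
import Control.Monad (replicateM)

-- readChan :: Chan a -> IO a
-- The result type is a plain 'a', so the call must wait until a value exists.
-- We write three values first so that the three reads return immediately.
chanDemo :: IO [Int]
chanDemo = do
  chan <- newChan
  mapM_ (writeChan chan) [1, 2, 3]
  replicateM 3 (readChan chan) -- values come back in FIFO order

-- tryTakeMVar :: MVar a -> IO (Maybe a)
-- The Maybe in the result type is what makes a non-blocking version possible.
mvarDemo :: IO (Maybe Int, Maybe Int, Int)
mvarDemo = do
  box <- newEmptyMVar
  whenEmpty <- tryTakeMVar box -- box is empty: returns Nothing right away
  putMVar box 42
  whenFull <- tryTakeMVar box  -- box is full: returns Just 42 and empties it
  putMVar box 7
  blocking <- takeMVar box     -- would wait if the box were empty
  return (whenEmpty, whenFull, blocking)

main :: IO ()
main = do
  chanDemo >>= print -- prints [1,2,3]
  mvarDemo >>= print -- prints (Nothing,Just 42,7)
```

Calling readChan on an empty channel with no writer would simply hang, which is exactly what the type told us to expect.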

So even with no prose documentation added at all, you can still learn a lot from the type signatures.  This is more significant in Haskell than in other languages because of Haskell's purity and its strong type system.  Also, I would recommend that you bookmark the URL "http://hackage.haskell.org/package/".  Actually, better yet, type it into your web browser frequently so it is the first thing offered by autocomplete when you start typing "hack...".  Once you auto-complete that URL, you can just type the name of the package you are looking for and you'll immediately get to the most recent API docs for that package.  It's way faster than hitting control-f and searching through the package list on that page.

Also, bookmark http://www.haskell.org/ghc/docs/latest/html/libraries/index.html because it has links to documentation for the core libraries.

Pay Attention to Type Class Instances

I can't emphasize this point enough.  One of the most common questions I get from beginners is how to run an IO action in some other monad.  This information is trivially visible in the API docs, but for some reason beginners never seem to notice.  Take for example the Snap monad.  Go ahead, click that link and look at the documentation.  The clue that you can run IO actions inside that monad is tucked away near the end of the "Instances" block.  It's one little innocuous line: MonadIO Snap.  Newcomers might not be familiar with MonadIO, but if they click the link they'll see that it defines one function, liftIO :: IO a -> m a, that converts any action in the IO monad to an action in the current monad m.  Those instance lines contain a treasure trove of information.  Don't neglect them.
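Here is a small sketch of what that instance line buys you.  To keep it free of a Snap dependency, it uses StateT Int IO as a stand-in monad, which also has a MonadIO instance; this assumes the transformers package is available (it ships with GHC).  The names greet and counter are made up for illustration.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.State (StateT, evalStateT, get, modify)

-- Works in any monad m that has a MonadIO instance, because liftIO
-- turns the IO action into an action in m.
greet :: MonadIO m => String -> m ()
greet name = liftIO (putStrLn ("hello from " ++ name))

-- StateT Int IO has a MonadIO instance, so greet can be used directly
-- in between the state operations.
counter :: StateT Int IO Int
counter = do
  modify (+ 1)
  greet "StateT"
  modify (+ 1)
  get

main :: IO ()
main = do
  greet "IO"                -- IO itself is also an instance of MonadIO
  n <- evalStateT counter 0
  print n                   -- prints 2
```

The same greet would work unchanged in a Snap handler, since the docs list MonadIO Snap among the instances.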

Learn About Kinds

In Haskell all values have a type.  Analogously, all types have a kind.  This topic is often not brought up until later, but I feel that understanding kinds gives beginners a big advantage in understanding type signatures.  I'm not going to go into it in detail here, but I think it's an important concept that is too often ignored.
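As a quick taste, here is a minimal sketch of why kinds matter when reading signatures and instance lists.  The kinds in the comments are what GHCi's :kind command reports (depending on GHC version and flags, * may be printed as Type).  The Pair type and the bump function are made up for illustration.

```haskell
-- Kinds of some familiar types:
--   Int           :: *            a plain type that has values
--   Maybe         :: * -> *       needs one type argument
--   Either        :: * -> * -> *  needs two type arguments
--   Either String :: * -> *       partially applied, needs one more

-- A user-defined type constructor of kind * -> *.
data Pair a = Pair a a
  deriving (Show, Eq)

-- Functor instances are only possible for things of kind * -> *.
instance Functor Pair where
  fmap f (Pair x y) = Pair (f x) (f y)

-- The 'f' in this signature must have kind * -> *, because it is
-- applied to Int.
bump :: Functor f => f Int -> f Int
bump = fmap (+ 1)

main :: IO ()
main = do
  print (bump (Just 1))                       -- prints Just 2
  print (bump (Right 1 :: Either String Int)) -- prints Right 2
  print (bump (Pair 1 2))                     -- prints Pair 2 3
```

This is why the docs list an instance for Either e and not for Either: only the partially applied type has the kind that Functor expects.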

Learn About Monad Transformers

This is another one of those topics that is often put off until later.  When I was learning monads, I distinctly remember getting the impression that monad transformers were a much more complicated beast and that I didn't need to learn about them at that time.  But when I finally did learn about monad transformers, a lot of things became clearer for me.  Also, monad transformers are used all the time in real-world applications.  Transformers are a much simpler concept than monads, especially if you know about kinds.  There's no reason not to learn them at the same time.
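To give a feel for how little is involved, here is a minimal sketch that layers MaybeT (failure) on top of State (a mutable stack).  It assumes the transformers package, which ships with GHC.  The names pop and addTopTwo are made up for illustration.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))
import Control.Monad.Trans.State (State, get, put, runState)

-- MaybeT :: (* -> *) -> * -> *
-- It takes a monad (here State [Int]) and gives back a new monad that
-- can also fail.

-- Pop the top of the stack, failing if the stack is empty.
pop :: MaybeT (State [Int]) Int
pop = do
  stack <- lift get -- lift runs an action of the inner State monad
  case stack of
    [] -> MaybeT (return Nothing) -- fail without touching the state
    (x : xs) -> do
      lift (put xs)
      return x

-- If either pop fails, the whole computation yields Nothing.
addTopTwo :: MaybeT (State [Int]) Int
addTopTwo = do
  a <- pop
  b <- pop
  return (a + b)

main :: IO ()
main = do
  print (runState (runMaybeT addTopTwo) [1, 2, 3]) -- prints (Just 3,[3])
  print (runState (runMaybeT addTopTwo) [1])       -- prints (Nothing,[])
```

Running the stack is just a matter of peeling the layers from the outside in: runMaybeT first, then runState.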

5 comments:

Anonymous said...

I would add: don't hesitate to read the source code for library functions you don't understand. In Java, C++, etc, the source code is of limited use in understanding a library because you typically have to wade through a maze of classes, interfaces, state mutations, etc, to have any idea what's going on in there. Haskell code, by contrast, tends to be quite compact and readable.

This is especially important given the generally poor state of Haskell library docs. I don't understand why the Haskell elite think a haddock file consisting of only type signatures counts as documentation. And not to get too ranty, but until library developers learn to take some care in showing cookbook-style examples showing how to use a library, a lot of people are going to continue giving up in frustration when trying to use Haskell to do e.g. simple signal processing, XML processing, graphics work, and so on.

Orclev said...

Although it's true the type signatures give you a really big clue how individual functions operate, they're next to useless in providing clues on how to assemble the various functions into a working whole. When approaching a new library to attempt to solve some concrete problem (say, load and parse an XML file, and then find a particular node) knowing where to even start can often be a daunting task. To begin with it's often unclear which module to start looking in, particularly with the "main" module often simply re-exporting every other module in the library. When you do finally find a promising looking module you'll often find a function that looks like it will do exactly what you want, but it either takes some argument you're unsure how to construct, or else it operates inside of some monad that it's unclear how to run an instance of (or worse yet, some kind of newtype wrapper around a monad transformer stack that you only find out about after digging through the source code of the library).

TL;DR; version, type signatures are good if you want to know "what does this function do", but proper documentation is necessary to answer "how do I use this library".

mightybyte said...

Orclev,

I both agree and disagree. Individual type signatures obviously don't tell you how to put things together. But collectively they do. Hackage provides a pretty decent interface for hyperlinking around the libraries you would need. Yes, in some cases library authors do need to put in some effort, but even when they don't the type signatures alone are still useable. I will concede that it takes some time to learn how to do this effectively. But in my experience it makes learning new libraries WAY easier than my experience has been with dynamic languages.

Orclev said...

@mightybyte

I can agree that given two libraries of equal documentation quality, one for Haskell, one for some other dynamic language, the Haskell one will tend to be much easier to understand how to use. For simpler libraries (particularly ones without any complicated state machinery) the types alone are sufficient. However for larger libraries type signatures are no replacement for proper documentation. It's extremely difficult to hold an entire library in your head, particularly one you're using for the very first time. At the bare minimum a one line example (preferably from inside of IO showing all the state setup) can go a long way towards making a library more accessible by pointing out at least one of the entry points.

Once you've got that initial function to start building from, following the type signatures can usually get you the rest of the way, but finding that first function shouldn't be a game of where's Waldo, and that's where proper documentation comes in.

Another thing documentation can be a major help with is exploring the more advanced but esoteric aspects of a library. Most libraries will have a handful of functions that exist specifically to handle unusual edge cases, and properly documenting what those are and how to deal with them can be a huge time saver for your users.

So, while it's true that the powerful type system in Haskell allows for much stronger inference about a function given only its type signature (and name), that's still no excuse for the generally shoddy documentation that you see in the Haskell community.

By all means encourage people to learn to "follow the types", but please don't use that as an excuse to forgo proper documentation on libraries. As the first commenter on here pointed out, a lot of users give up in frustration either before learning how to infer properties from type signatures, or else when confronted by a library with a particularly pernicious set of type signatures (I myself have skipped over certain libraries because they required more effort to understand than one of their alternatives).

mightybyte said...

@Orclev

> but please don't use that as an excuse to forgo proper documentation on libraries.

I wasn't.