Monday, December 26, 2016

On Haskell Documentation

The following started out as a response to a Hacker News comment, but got long enough to merit a standalone blog post.

I think the root of the Haskell documentation debate lies in a pretty fundamental difference in how you go about finding, reading, and understanding documentation in Haskell compared to mainstream languages.  Just last week I ran into a situation that really highlighted this difference.

I was working on creating a Haskell wrapper around the ACE editor.  I initially wrote the wrapper some time ago and got it integrated into a small app.  Last week I needed ACE integration in another app I'm working on and came back to the code.  But I ran into a problem...ACE automatically makes AJAX requests for JS files needed for pluggable syntax highlighters and themes.  But it was making the AJAX requests in the wrong place and I needed to tell it to request them from somewhere else.  Depending on how interested you are in this, you might try looking through the ACE documentation on your own before reading on to see if you can find the answer to this problem.

When you go to the ACE home page, the most obvious place to start seems to be the embedding guide.  This kind of guide seems to be what people are talking about when they complain about Haskell's documentation.  But this guide gave me no clues as to how to solve my problem.  The embedding guide then refers you to the how-to guide.  That documentation didn't help me either.  The next place I go is the API reference.  I'm no stranger to API references.  This is exactly what I'm used to from Haskell!  I look at the docs for the top-level Ace module.  There are only three functions here.  None of them what I want.  They do have some type signatures that seem to help a little, but it doesn't tell me the type of the `edit` function, which is the one that seems most likely to be what I want.  At this point I'm dying for a hyperlink to the actual code, but there are none to be found.  To make a long story short, the thing I want is nowhere to be found in the API reference either.

I only solved the problem when a co-worker who has done a lot of JS work found the answer buried in a closed GitHub issue.  There's even a comment on that issue by someone saying he had been looking for it "for days".  The solution was to call ace.config.set('basePath', myPath);.  This illustrates the biggest problem with tutorial/how-to documentation: they're always incomplete.  There will always be use cases that the tutorials didn't think of.  They also take effort to maintain, and can easily get out of sync over time.

I found this whole experience with ACE documentation very frustrating, and came away feeling that I vastly prefer Haskell documentation.  With Haskell, API docs literally give you all the information needed to solve any problem that the API can solve (with very few exceptions).  If the ACE editor was written natively in Haskell, the basePath solution would be pretty much guaranteed to show up somewhere in the API documentation.  In my ACE wrapper you can find it here.

Now I need to be very clear that I am not saying the types are enough and saying the state of Haskell documentation is good enough.  There are definitely plenty of situations where it is not at all obvious how to wrangle the types to accomplish what you want to accomplish.  Haskell definitely needs to improve its documentation.  But this takes time and effort.  The Haskell community is growing but still relatively small, and resources are limited.  Haskell programmers should keep in mind that newcomers will probably be more used to tutorials and how-tos.  And I think newcomers should keep in mind that API docs in Haskell tell you a lot more than in other languages, and be willing to put some effort into learning how to use these resources effectively.


Francesco said...

Ah! I recall having the same experience, you really feel the missing types+purity when browsing documentation in other languages!

ivanmiljenovic said...

I've been having the same issue of late dealing with a Scala + Spark + Cassandra project at work (the first time I've used any of these three). So much of the documentation is just "here's an example of how to do a trivial situation" without explaining what the fundamentals are. Even the API is littered with things like stringly-typed enumerations (to specify the type of join; there's only four accepted options but they're all strings) and auto-magic behaviour hidden behind implicit conversions that aren't documented (e.g. to convert an RDD to a DataFrame they say you just use the handy .df method... without saying that the fundamental type needs to be a Product, so you need to trivially add a useless column to be able to use a single columned RDD as a DataFrame).

mightybyte said...

The situation you describe with enums is one of the first things that comes to mind that makes API docs so much more useful in Haskell. You could still certainly do it the stringly typed way in Haskell, but Haskell syntax for enums is so light that I rarely see people doing that. Haskell's lack of null pointers seems to also push people in the right direction here. If you used a string instead of an enum, you'd have to return a Maybe or have a partial function, both of which are almost universally understood to be something to be avoided when possible.