LTMT Part 3: The Monad Cookbook

Introduction

The previous two posts in my Less Traveled Monad Tutorial series have not had much in the way of directly practical content. In other words, if you only read those posts and nothing else about monads, you probably wouldn't be able to use monads in real code. This was intentional because I felt that the practical stuff (like do notation) had adequate treatment in other resources. In this post I'm still not going to talk about the details of do notation--you should definitely read about that elsewhere--but I am going to talk about some of the most common things I have seen beginners struggle with and give you cookbook-style patterns that you can use to solve these issues.

Problem: Getting at the pure value inside the monad

This is perhaps the most common problem for Haskell newcomers. It usually manifests itself as something like this:

main = do
    lineList <- lines $ readFile "myfile.txt"
    -- ... do something with lineList here

That code generates the following error from GHC:

    Couldn't match type `IO String' with `[Char]'
    Expected type: String
      Actual type: IO String
    In the return type of a call of `readFile'

Many newcomers seem puzzled by this error message, but it tells you EXACTLY what the problem is. readFile returns an IO String, but the thing that is expected in that spot is a String. (Note: String is a synonym for [Char].) The catch is that knowing this isn't very helpful on its own: you could understand the error completely and still not know how to solve the problem. First, let's look at the types involved.

readFile :: FilePath -> IO String
lines :: String -> [String]

Both of these functions are defined in Prelude. These two type signatures show the problem very clearly. readFile returns an IO String, but the lines function is expecting a String as its argument. IO String != String. Somehow we need to extract the String out of the IO in order to pass it to the lines function. This is exactly what do notation was designed to help you with.

Solution #1

main :: IO ()
main = do
    contents <- readFile "myfile.txt"
    let lineList = lines contents
    -- ... do something with lineList here

This solution demonstrates two things about do notation. First, the left arrow lets you pull things out of the monad. Second, if you're not pulling something out of a monad, use "let foo =". One metaphor that might help you remember this is to think of "IO String" as a computation in the IO monad that returns a String. A do block lets you run these computations and assign names to the resulting pure values.

Solution #2

We could also attack the problem a different way. Instead of pulling the result of readFile out of the monad, we can lift the lines function into the monad. The function we use to do that is called liftM.

liftM :: Monad m => (a -> b) -> m a -> m b
liftM :: Monad m => (a -> b) -> (m a -> m b)

The associativity of the -> operator is such that these two type signatures are equivalent. If you've ever heard Haskell people saying that all functions are single argument functions, this is what they are talking about. You can think of liftM as a function that takes one argument, a function (a -> b), and returns another function, a function (m a -> m b). When you think about it this way, you see that the liftM function converts a function of pure values into a function of monadic values. This is exactly what we were looking for.
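Here is a minimal sketch of that lifting on its own, away from files. The names countLines and countLinesM are hypothetical helpers invented for this illustration, and return stands in for readFile so that the example runs without any file on disk.

```haskell
import Control.Monad (liftM)

-- A pure function on Strings (hypothetical helper, not from the post).
countLines :: String -> Int
countLines = length . lines

-- liftM converts it into a function on monadic values, for any monad m.
countLinesM :: Monad m => m String -> m Int
countLinesM = liftM countLines

main :: IO ()
main = do
    -- In the Maybe monad:
    print (countLinesM (Just "a\nb\nc"))           -- Just 3
    print (countLinesM (Nothing :: Maybe String))  -- Nothing
    -- In the IO monad, with return standing in for readFile:
    n <- countLinesM (return "one\ntwo")
    print n                                        -- 2
```

Applied to the file-reading problem from above, the same lifting looks like this: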

main :: IO ()
main = do
    lineList <- liftM lines (readFile "myfile.txt")
    -- ... do something with lineList here

This is more concise than our previous solution, so in this simple example it is probably what we would use. But if we needed to use contents in more than one place, then the first solution would be better.
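As a rough sketch of that second case, here is a variation on the first solution where contents is needed twice. The sample text written to myfile.txt is an assumption added only so that the example runs on its own.

```haskell
main :: IO ()
main = do
    -- Assumption for this sketch: create a small sample file first.
    writeFile "myfile.txt" "first line\nsecond line\n"
    contents <- readFile "myfile.txt"
    let lineList = lines contents
    -- contents is used directly here...
    putStrLn ("characters: " ++ show (length contents))
    -- ...and again, indirectly, through lineList.
    putStrLn ("lines: " ++ show (length lineList))
```

With liftM lines we would only ever get our hands on lineList, so the raw contents would not be available for the character count.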

Problem: Making pure values monadic

Consider the following program:

import Control.Monad
import System.Environment
main :: IO ()
main = do
    args <- getArgs
    output <- case args of
                [] -> "cat: must specify some files"
                fs -> liftM concat (mapM readFile fs)
    putStrLn output

This program also has an error. GHC actually gives you three errors here because there's no way for it to know exactly what you meant. But the first error is the one we're interested in.

    Couldn't match type `[]' with `IO'
    Expected type: IO Char
      Actual type: [Char]
    In the expression: "cat: must specify some files"

Just like before, this error tells us exactly what's wrong. We're supposed to have an IO something, but we only have a String (remember, String is the same as [Char]). It's not convenient for us to get the pure result out of the readFile calls like we did before because of the structure of what we're trying to do. The two branches of the case expression must have the same type, so that means that we need to somehow convert our String into an IO String. This is exactly what the return function is for.

Solution: return

return :: Monad m => a -> m a

This type signature tells us that return takes any type a as input and returns "m a". So all we have to do is use the return function.

import Control.Monad
import System.Environment
main :: IO ()
main = do
    args <- getArgs
    output <- case args of
                [] -> return "cat: must specify some files"
                fs -> liftM concat (mapM readFile fs)
    putStrLn output

The 'm' that the return function wraps its argument in is determined by the context. In this case, main is in the IO monad, so that's what return uses.
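To make the role of context concrete, here is a small sketch in which the same expression, return 'x', is given three different type signatures. The names inMaybe, inList, and inIO are made up for this illustration.

```haskell
-- The same expression at three different monads. Only the type
-- signature changes, and that is what selects the 'm'.
inMaybe :: Maybe Char
inMaybe = return 'x'

inList :: [Char]
inList = return 'x'

inIO :: IO Char
inIO = return 'x'

main :: IO ()
main = do
    print inMaybe   -- Just 'x'
    print inList    -- "x"
    c <- inIO
    print c         -- 'x'
```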

Problem: Chaining multiple monadic operations

import System.Environment
main :: IO ()
main = do
    [from,to] <- getArgs
    writeFile to $ readFile from

As you probably guessed, this program also has an error. Hopefully you have an idea of what it might be. It's the same problem of needing a pure value when we actually have a monadic one. You could solve it like we did in solution #1 on the first problem (you might want to go ahead and give that a try before reading further). But this particular case has a pattern that makes a different solution work nicely. Unlike the first problem, you can't use liftM here.
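A quick sketch of why liftM is the wrong fit here: lifting a function that already returns a monadic value wraps the result twice. The helper halve and the names nested and nestedCopy are hypothetical, chosen only for this illustration.

```haskell
import Control.Monad (liftM)

-- Hypothetical helper whose result is already monadic (here, Maybe).
halve :: Int -> Maybe Int
halve n
    | even n    = Just (n `div` 2)
    | otherwise = Nothing

-- liftM wraps the result again, so we end up with two layers.
nested :: Maybe (Maybe Int)
nested = liftM halve (Just 10)

-- The same thing happens with the file functions: this type checks,
-- but it produces an IO action that returns another IO action, and
-- the inner writeFile is never run.
nestedCopy :: FilePath -> FilePath -> IO (IO ())
nestedCopy from to = liftM (writeFile to) (readFile from)

main :: IO ()
main = print nested   -- Just (Just 5)
```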

Solution: bind

When we used liftM, we had a pure function lines :: String -> [String]. But here we have writeFile :: FilePath -> String -> IO (). We've already supplied the first argument, so what we actually have is writeFile to :: String -> IO (). And again, readFile returns IO String instead of the pure String that we need. To solve this we can use another function that you've probably heard about when people talk about monads...the bind function.

(=<<) :: Monad m => (a -> m b) -> m a -> m b
(=<<) :: Monad m => (a -> m b) -> (m a -> m b)

Notice how the pattern here is different from the first example. In that example we had (a -> b) and we needed to convert it to (m a -> m b). Here we have (a -> m b) and we need to convert it to (m a -> m b). In other words, we're only adding an 'm' onto the 'a', which is exactly the pattern we need here. Here are the two patterns next to each other to show the correspondence.

writeFile to :: String -> IO ()
                     a ->  m b

From this we see that "writeFile to" is the first argument to the =<< function. readFile from :: IO String fits perfectly as the second argument to =<<, and then the return value is the result of the writeFile. It all fits together like this:

import System.Environment
main :: IO ()
main = do
    [from,to] <- getArgs
    writeFile to =<< readFile from
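For comparison, here is a sketch of the same plumbing written three ways: with =<<, with the flipped operator >>=, and with a do block in the style of solution #1. The names viaBind, viaFlipped, and viaDo, along with the file names, are assumptions made up for this example.

```haskell
-- Three equivalent ways to feed a monadic value to a monadic function.
viaBind, viaFlipped, viaDo :: Monad m => (a -> m b) -> m a -> m b
viaBind    f mx = f =<< mx
viaFlipped f mx = mx >>= f
viaDo      f mx = do
    x <- mx
    f x

main :: IO ()
main = do
    -- Assumption for this sketch: create a small source file first.
    writeFile "from.txt" "hello\n"
    viaBind    (writeFile "to1.txt") (readFile "from.txt")
    viaFlipped (writeFile "to2.txt") (readFile "from.txt")
    viaDo      (writeFile "to3.txt") (readFile "from.txt")
    -- Print each copy to confirm that all three forms did the same thing.
    mapM_ (\f -> putStr =<< readFile f) ["to1.txt", "to2.txt", "to3.txt"]
```

Which form to use is mostly a matter of taste. The do version is handy when the intermediate value needs a name, and the operator versions are shorter when it does not.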

Some might point out that this third problem is really the same as the first problem. That is true, but I think it's useful to see the varying patterns laid out in this cookbook style so you can figure out what you need to use when you encounter these patterns as you're writing code. Everything I've said here can be discovered by carefully studying the Control.Monad module. There are lots of other convenience functions there that make working with monads easier. In fact, I already used one of them: mapM.
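As a small taste of mapM outside of IO, here is a sketch that uses it in the Maybe monad. The function parseAll is a hypothetical name, and readMaybe comes from Text.Read in base.

```haskell
import Text.Read (readMaybe)

-- mapM runs a monadic function over every element of a list and
-- collects the results inside the monad. In Maybe, a single failure
-- makes the whole result Nothing.
parseAll :: [String] -> Maybe [Int]
parseAll = mapM readMaybe

main :: IO ()
main = do
    print (parseAll ["1", "2", "3"])   -- Just [1,2,3]
    print (parseAll ["1", "oops"])     -- Nothing
```

In the cat program earlier, mapM readFile fs plays the same role in IO: it reads every file in turn and hands back the list of their contents.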

When you're first learning Haskell, I would recommend that you keep the documentation for Control.Monad close by at all times. Whenever you need to do something new involving monadic values, odds are good that there's a function in there to help you. I would not recommend spending 10 hours studying Control.Monad all at once. You'll probably be better off writing lots of code and referring to it whenever you think there should be an easier way to do what you want to do. Over time the patterns will sink in as you form new connections between different concepts in your brain.

It takes effort. Some people do pick these things up more quickly than others, but I don't know anyone who just read through Control.Monad and then immediately had a working knowledge of everything in there. The patterns you're grappling with here will almost certainly be foreign to you because no other mainstream language enforces this distinction between pure values and side-effecting values. But I think the payoff of being able to separate pure and impure code is well worth the effort.

Comments

Anonymous said…
Why do you use "=<<" instead of the more common ">>="?
mightybyte said…
I like how =<< has a right-to-left flow that fits nicely with do notation's <- as well as the (.) and ($) operators. I discuss this a little in my previous post.
