Thursday, November 8, 2012

Using Cabal With Large Projects

In the last post we talked about basic cabal usage. That all works fine as long as you're working on a single project and all your dependencies are on Hackage. When Cabal is aware of everything that you want to build, it's actually pretty good at dependency resolution. But if you have several packages that depend on each other and you're working on development versions that have not yet been released to Hackage, life becomes more difficult. In this post I'll describe my workflow for developing multiple local packages. I make no claim that this is the best way to do it, but it works pretty well for me, and hopefully others will find it helpful.

Consider a situation where package B depends on package A and both of them depend on bytestring. Package A has wide version bounds for its bytestring dependency while package B has narrower bounds. Because you're working on improving both packages, you can't just do "cabal install" in package B's directory: the correct version of package A isn't on Hackage. But if you install package A first, Cabal might choose a version of bytestring that satisfies A's bounds and falls outside B's. It's a frustrating situation because you end up worrying about dependency issues that Cabal should be handling for you.
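To make the conflict concrete, here is a sketch of the relevant build-depends lines. The package names and version bounds are hypothetical, chosen only to illustrate the shape of the problem.

```cabal
-- A.cabal (hypothetical bounds): wide range for bytestring
library
  build-depends: base, bytestring >= 0.9 && < 0.11

-- B.cabal (hypothetical bounds): narrower range, plus the local package A
library
  build-depends: base, A, bytestring >= 0.9 && < 0.10
```

If A is installed on its own, the solver is free to pick a bytestring in the 0.10 series and build A against it. B then cannot use that build of A, because B requires a bytestring below 0.10. Solving for both packages at once avoids this, since only versions that satisfy both ranges are considered.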

The best solution I've found to the above problem is cabal-meta. It lets you specify a sources.txt file in your project root directory with paths to other projects that you want included in the package's build environment. For example, I maintain the snap package, which depends on several other packages that are part of the Snap Framework. Here's what my sources.txt file looks like for the snap package:

./
../xmlhtml
../heist
../snap-core
../snap-server

My development versions of the other four packages reside in the parent directory on my local machine. When I build the snap package with cabal-meta install, cabal-meta tells Cabal to look in these directories in addition to whatever is on Hackage. If you do this once up front for the top-level package, Cabal will correctly take all your local packages into consideration when resolving dependencies. Once you have all the dependencies installed, you can go back to using Cabal and ghci to build and test your packages. In my experience this takes most of the pain out of building large-scale Haskell applications.
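Before running cabal-meta install, it can help to confirm that every local path in sources.txt really points at a package. The following is a minimal shell sketch under a few assumptions: check_sources is a hypothetical helper (it is not part of cabal-meta), and it only understands entries that are local directories, so any other kind of entry, such as a remote repository, would be reported as missing.

```shell
# check_sources (hypothetical helper, not part of cabal-meta):
# report entries in a sources.txt-style file that are not local
# directories containing a .cabal file. Returns non-zero if any entry fails.
check_sources() {
  sources_file="${1:-sources.txt}"
  status=0
  # Read one entry per line; also handle a final line with no newline.
  while IFS= read -r entry || [ -n "$entry" ]; do
    # Skip blank lines.
    [ -z "$entry" ] && continue
    if [ ! -d "$entry" ]; then
      echo "missing directory: $entry"
      status=1
    elif ! ls "$entry"/*.cabal >/dev/null 2>&1; then
      echo "no .cabal file in: $entry"
      status=1
    fi
  done < "$sources_file"
  return "$status"
}
```

With that function defined, running "check_sources sources.txt && cabal-meta install" from the project root starts the install only when every entry resolves to a directory with a .cabal file.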

Another tool that is frequently recommended for this large-scale package development problem is cabal-dev. cabal-dev lets you sandbox builds so that differing build configurations of libraries can coexist without causing the problems they do with plain Cabal. It also has a mechanism for handling the local package problem described above. I personally tend to avoid cabal-dev because in my experience it hasn't played nicely with ghci. It tries to solve the problem by giving you the cabal-dev ghci command, which runs ghci in the sandboxed environment, but I found that it made my ghci workflow difficult. So I prefer cabal-meta, which doesn't have these problems.

I should note that cabal-dev does solve another problem that cabal-meta does not. There may be cases where two different packages are completely unable to coexist in the same Cabal "sandbox" because their sets of dependencies are not compatible. In that case, you'll need cabal-dev's sandboxes instead of the single user-level package database used by Cabal. I am usually only working on one major project at a time, so this problem has never been an issue for me. My understanding is that people are currently working on adding this kind of local sandboxing to Cabal/cabal-install. Hopefully this will fix my complaints about ghci integration and make cabal-dev unnecessary.

There are definitely things that need to be done to improve the Cabal toolchain. But in my experience working on several different large Haskell projects, both open and proprietary, I have found that the current state of Cabal combined with cabal-meta (and maybe cabal-dev) does a reasonable job of handling large project development within a very fast-moving ecosystem.

2 comments:

Johan Tibell said...

The new cabal sandboxing will support both the use cases of cabal-dev and of cabal-meta. We've implemented a concept called package environments, which is general enough to implement both sandboxing and adding more than one source directory to a build. More concretely, you would do:

cabal sandbox init
cabal sandbox add-source ../xmlhtml
cabal sandbox add-source ../heist
cabal sandbox add-source ../snap-core
cabal sandbox add-source ../snap-server

Note that this add-source, unlike cabal-dev's, creates a link to the package, so it gets rebuilt properly if it changes.

The package environment also lets you lock down specific version constraints for packages.

Dmitry Dzhus said...

> creates a link to the package

Will this feature support remote repositories, like cabal-meta does (I guess not)?