Posts

Setting Up A Private Nix Cache

I recently went through the process of setting up a private Nix binary cache. It was not obvious to me how to go about it at first, so I thought I would document what I did here. There are a few different ways of going about this that might be appropriate in one situation or another, but I’ll just describe the one I ended up using. I need to serve a cache for proprietary code, so I ended up using a cache served via SSH.
Setting up the server
For my cache server I’m using an Amazon EC2 instance with NixOS. It’s pretty easy to create these using the public NixOS 18.03 AMI. I ended up using a t2.medium with 1 TB of storage, but the jury is still out on the ideal tradeoff of specs and cost for our purposes. YMMV.
The NixOS AMI puts the SSH credentials in the root user, so log in like this:
ssh -i /path/to/your/key.pem root@nixcache.example.com
To get your new NixOS machine working as an SSH binary cache there are two things you need to do: generate a signing key and turn on cache serving.
Gener…

Simplicity

I'm reposting this here because it cannot be overstated.  I've been saying roughly the same thing for a while now.  It's even the title of this blog!  Drew DeVault sums it up very nicely in his opening sentences:
The single most important quality in a piece of software is simplicity. It’s more important than doing the task you set out to achieve. It’s more important than performance.
Here's the full post: https://drewdevault.com/2018/07/09/Simple-correct-fast.html

I would also like to add another idea.  It has been my observation that the attributes of "smarter" and "more junior" tend to be more highly correlated with losing focus on simplicity.  Intuitively this makes sense because smarter people will tend to grasp complicated concepts more easily.  Also, smarter people tend to be enamored of clever, complex solutions.  Junior engineers usually don't appreciate how important simplicity is as much as senior engineers do--at least I know I didn't! …

Fake: Generating Realistic Test Data in Haskell

On a number of occasions over the years I've found myself wanting to generate realistic looking values for Haskell data structures.  Perhaps I'm writing a UI and want to fill it in with example data during development so I can see how the UI behaves with large lists.  In this situation you don't want to generate a bunch of completely random Unicode characters.  You want things that look plausible so you can see how it will likely look to the user with realistic word wrapping, etc.  Later, when you build the backend you actually want to populate the database with this data.  Passing around DB dumps to other members of the team so they can test is a pain, so you want this stuff to be auto-generated.  This saves time for your QA people because if you didn't have it, they'd have to manually create it.  Even later you get to performance testing and you find yourself wanting to generate several orders of magnitude more data so you can load test the database, but you stil…

Efficiently Improving Test Coverage with Algebraic Data Types

Think of a time you've written tests for (de)serialization code of some kind, say for a data structure called Foo.  If you were using the lowest level of sophistication you probably defined a few values by hand, serialized them, deserialized that, and verified that you ended up with the same value you started with.  In Haskell nomenclature we'd say that you manually verified that parse . render == id.  If you were a little more sophisticated, you might have used the QuickCheck library (or any of the numerous similar packages it inspired in other languages) to verify the parse . render == id property for a bunch of randomly generated values.  The first level of sophistication is often referred to as unit testing.  The second frequently goes by the term property testing or sometimes fuzz testing.
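To make the two levels concrete, here is a minimal sketch using only base. The Foo type, render, and parse below are hypothetical stand-ins, not code from the post. Because parse can fail, the property is written as parse . render == Just rather than id. QuickCheck would generate random values; to stay dependency-free, this sketch instead enumerates every value of a small type through Enum and Bounded, which is exhaustive rather than random.

```haskell
module Main where

-- Hypothetical stand-in for the post's Foo; not the author's type.
data Foo = Alpha | Beta | Gamma
  deriving (Show, Eq, Enum, Bounded)

-- Serialize a Foo to a (made-up) wire representation.
render :: Foo -> String
render Alpha = "a"
render Beta  = "b"
render Gamma = "g"

-- Parse the wire representation back; Nothing on unknown input.
parse :: String -> Maybe Foo
parse "a" = Just Alpha
parse "b" = Just Beta
parse "g" = Just Gamma
parse _   = Nothing

-- The round-trip property: parsing a rendered value gives the value back.
roundTrips :: Foo -> Bool
roundTrips x = parse (render x) == Just x

-- Unit-test style: a few values listed by hand.
unitCheck :: Bool
unitCheck = all roundTrips [Alpha, Gamma]

-- Property style, approximated here by checking every constructor.
-- A library such as QuickCheck would sample random values instead.
allCheck :: Bool
allCheck = all roundTrips [minBound .. maxBound]

main :: IO ()
main = do
  putStrLn ("hand-picked values round-trip: " ++ show unitCheck)
  putStrLn ("all values round-trip: " ++ show allCheck)
```

Note that the hand-written list skips Beta, so a bug in that one case would slip past unitCheck while allCheck would catch it.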

Both unit testing and property testing have some drawbacks.  With unit testing you have to write fairly tedious boilerplate of listing by hand all the values you want to test with.  Wit…

Armor Your Data Structures Against Backwards-Incompatible Serializations

As almost everyone with significant experience managing production software systems should know, backwards compatibility is incredibly important for any data that is persisted by an application. If you make a change to a data structure that is not backwards compatible with the existing serialized formats, your app will break as soon as it encounters the existing format. Even if you have 100% test coverage, your tests still might not catch this problem. It’s not a problem with your app at any single point in time, but a problem with how your app evolves over time.
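A small sketch of how this failure mode can hide from tests, assuming a hypothetical Config record serialized with derived Show and Read (the post does not say which format it uses). An earlier release is assumed to have had only the name field. The current version round-trips its own output, so a round-trip test passes, yet it cannot read what the earlier release wrote.

```haskell
module Main where

import Text.Read (readMaybe)

-- Hypothetical current version of a persisted record. An earlier release
-- is assumed to have had only the 'name' field.
data Config = Config { name :: String, retries :: Int }
  deriving (Show, Read, Eq)

-- What the assumed earlier release would have written with derived Show.
oldSerialized :: String
oldSerialized = "Config {name = \"prod\"}"

-- A round trip within the current version succeeds, so a test of this
-- shape passes at any single point in time.
currentRoundTrip :: Bool
currentRoundTrip = readMaybe (show cfg) == Just cfg
  where
    cfg = Config { name = "prod", retries = 3 }

-- Reading the old data with the new type fails because the derived
-- Read instance requires every field to be present.
readsOldData :: Maybe Config
readsOldData = readMaybe oldSerialized

main :: IO ()
main = do
  putStrLn ("current version round-trips: " ++ show currentRoundTrip)
  putStrLn ("old data parsed as: " ++ show readsOldData)
```

One way to catch this kind of break is to keep serialized samples from past releases under version control and assert that the current parser still accepts them.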

One might think that wire formats which are only used for communication between components and not persisted in any way would not be susceptible to this problem. But these too can cause issues if a message is generated and a new version of the app is deployed before the message is consumed. The longer the message remains in a queue, redis cache, etc., the higher the chances of this occurring.

More subtly, if you deploy a ba…

Talk: Real World Reflex

I recently gave a talk at BayHac about some of the things I've learned in building production Reflex applications. If you're interested, you can find it here: video, slides, github

On Haskell Documentation

The following started out as a response to a Hacker News comment, but got long enough to merit a standalone blog post.

I think the root of the Haskell documentation debate lies in a pretty fundamental difference in how you go about finding, reading, and understanding documentation in Haskell compared to mainstream languages.  Just last week I ran into a situation that really highlighted this difference.

I was working on creating a Haskell wrapper around the ACE editor.  I initially wrote the wrapper some time ago and got it integrated into a small app.  Last week I needed ACE integration in another app I'm working on and came back to the code.  But I ran into a problem...ACE automatically makes AJAX requests for JS files needed for pluggable syntax highlighters and themes.  But it was making the AJAX requests in the wrong place and I needed to tell it to request them from somewhere else.  Depending on how interested you are in this, you might try looking through the ACE documentat…

How to Get a Haskell Job

Over and over again I have seen people ask how to get a full time job programming in Haskell. So I thought I would write a blog post with tips that have worked for me as well as others I know who write Haskell professionally. For the impatient, here's the tl;dr in order from easiest to hardest:
1. IRC
2. Local meetups
3. Regional gatherings/hackathons
4. Open source contributions
5. Work where Haskell people work
First, you need to at least start learning Haskell on your own time. You had already started learning how to program before you got your first programming job. The same is true of Haskell programming. You have to show some initiative. I understand that for people with families this can be hard. But you at least need to start. After that, far and away the most important thing is to interact with other Haskell developers so you can learn from them. That point is so important it bears repeating: interacting with experienced Haskell programmers is by far the most important thing to do.

Measuring Software Fragility

While writing this comment on reddit I came up with an interesting question that I think might be a useful way of thinking about programming languages. What percentage of single non-whitespace characters in your source code could be changed to a different character such that the change would pass your CI build system but would result in a runtime bug? Let's call this the software fragility number because I think that metric gives a potentially useful measure of how bug prone your software is. At the end of the day software is a mountain of bytes and you're trying to get them into a particular configuration. Whether you're writing a new app from scratch, fixing bugs, or adding new features, the number of bytes of source code you have (similar to LOC, SLOC, or maybe the compressed number of bytes) is a rough indication of the complexity of your project. If we model programmer actions as random byte mutations over all of a project's source and we're trying to predic…

"cabal gen-bounds": easy generation of dependency version bounds

In my last post I showed how release dates are not a good way of inferring version bounds. The package repository should not make assumptions about what versions you have tested against. You need to tell it. But from what I've seen there are two problems with specifying version bounds:
1. Lack of knowledge about how to specify proper bounds
2. Unwillingness to take the time to do so
Early in my Haskell days, the first time I wrote a cabal file I distinctly remember getting to the dependencies section and having no idea what to put for the version bounds. So I just ignored them and moved on. The result of that decision is that I can no longer build that app today. I would really like to, but it's just not worth the effort to try. It wasn't until much later that I learned about the PVP and how to properly set bounds. But even then, there was still an obstacle. It can take some time to add appropriate version bounds to all of a package's dependencies. So even if you k…

Why version bounds cannot be inferred retroactively (using dates)

In past debates about Haskell's Package Versioning Policy (PVP), some have suggested that package developers don't need to put upper bounds on their version constraints because those bounds can be inferred by looking at what versions were available on the date the package was uploaded. This strategy cannot work in practice, and here's why. Imagine someone creates a small new package called foo. It's a simple package, say something along the lines of the formattable package that I recently released. One of the dependencies for foo is errors, a popular package supplying frequently used error handling infrastructure. The developer happens to already have errors-1.4.7 installed on their system, so this new package gets built against that version. The author uploads it to hackage on August 16, 2015 with no upper bounds on its dependencies. Let's for simplicity imagine that errors is the only dependency, so the .cabal file looks like this:
name: foo
build-depend…

The Problem with Curation

Recently I received a question from a user asking about "cabal hell" when installing one of my packages. The scenario in question worked fine for us, but for some reason it wasn't working for the user. When users report problems like this they usually do not provide enough information for us to solve it. So then we begin the sometimes arduous back and forth process of gathering the information we need to diagnose the problem and suggest a workaround or implement a fix.
In this particular case luck was on our side and the user's second message just happened to include the key piece of information. The problem in this case was that they were using stackage instead of the normal hackage build that people usually use. Using stackage locks down your dependency bounds to a single version. The user reporting the problem was trying to add additional dependencies to his project and those dependencies required different versions. Stackage was taking away degrees of freed…

LTMT Part 3: The Monad Cookbook

Introduction
The previous two posts in my Less Traveled Monad Tutorial series have not had much in the way of directly practical content. In other words, if you only read those posts and nothing else about monads, you probably wouldn't be able to use monads in real code. This was intentional because I felt that the practical stuff (like do notation) had adequate treatment in other resources. In this post I'm still not going to talk about the details of do notation--you should definitely read about that elsewhere--but I am going to talk about some of the most common things I have seen beginners struggle with and give you cookbook-style patterns that you can use to solve these issues.
Problem: Getting at the pure value inside the monad
This is perhaps the most common problem for Haskell newcomers. It usually manifests itself as something like this:
main = do
    lineList <- lines $ readFile "myfile.txt"
    -- ... do something with lineList here
That code generates …

Announcing C◦mp◦se :: Conference

Since most of my content is about Haskell, I would like to take this opportunity to inform my readers of a new conference that I and the other co-organizers of the New York Haskell Meetup are hosting at the end of January. It's called C◦mp◦se, and it's a conference for typed functional programmers. Check out the website at http://www.composeconference.org/. We recently issued a call for papers. I know it's short notice, but the deadline is November 30. If you have something that you think would be interesting to typed functional programmers, we'd love to hear from you. Along with the conference we'll also be having one day be a less formal hackathon/unconference. If you would like to give a tutorial/demo at the unconference, email us at info@composeconference.org.

Field Accessors Considered Harmful

It's pretty well known these days that Haskell's field accessors are rather cumbersome syntactically and not composable.  The lens abstraction that has gotten much more popular recently (thanks in part to Edward Kmett's lens package) solves these problems.  But I recently ran into a bigger problem with field accessors that I had not thought about before.  Consider the following scenario.  You have a package with code something like this:

data Config = Config { configFieldA :: [Text] }

So your Config data structure gives your users getters and setters for field A (and any other fields you might have).  Your users are happy and life is good.  Then one day you decide to add a new feature and that feature requires expanding and restructuring Config.  Now you have this:
data MConfig = MConfig { mconfigFieldA :: [Text] }
data Config = Config { configMC :: MConfig , configFieldX :: Text , configFieldY :: Bool }
This is a nice solution beca…

Haskell Best Practices for Avoiding "Cabal Hell"

I posted this as a reddit comment and it was really well received, so I thought I'd post it here so it would be more linkable.  A lot of people complain about "cabal hell" and ask what they can do to solve it.  There are definitely things about the cabal/hackage ecosystem that can be improved, but on the whole it serves me quite well.  I think a significant amount of the difficulty is a result of how fast things move in the Haskell community and how much more reusable Haskell is than other languages.

With that preface, here are my best practices that seem to make Cabal work pretty well for me in my development.

1. I make sure that I have no more than the absolute minimum number of packages installed as --global.  This means that I don't use the Haskell Platform or any OS haskell packages.  I install GHC directly.  Some might think this casts too much of a negative light on the Haskell Platform.  But everyone will agree that having multiple versions of a package insta…

Implicit Blacklisting for Cabal

I've been thinking about all the Haskell PVP discussion that's been going on lately. It should be no secret by now that I am a PVP proponent. I'm not here to debate the PVP in this post, so for this discussion let's assume that the PVP is a good thing and should be adopted by all packages published on Hackage. More specifically, let's assume this to mean that every package should specify upper bounds on all dependencies, and that most of the time these bounds will be of the form "< a.b".
Recently there has been discussion about problems encountered when packages that have not been using upper bounds change and start using them. The recent issue with the HTTP package is a good example of this. Roughly speaking the problem is that if foo-1.2 does not provide upper bounds on its dependency bar, the constraint solver is perpetually "poisoned" because foo-1.2 will always be a candidate even long after bar has become incompatible with foo-…

Ember.js is driving me crazy

For the past few months I've been working on a project with a fairly complex interactive web interface. This required me to venture into the wild and unpredictable jungle of Javascript development. I was totally unprepared for what I would find. Soon after starting the project it became clear that just using JQuery would not be sufficient for my project. I needed a higher level Javascript framework. After doing a little research I settled on Ember.js.
The Zombie Code Apocalypse
Ember was definitely a big improvement over straight JQuery, and allowed me to get some fairly complex UI behavior working very quickly. But recently I've run into some problems. The other day I had a UI widget defined like this:
App.FooController = Ember.ObjectController.extend({
  // ...
});
App.FooView = Ember.View.extend({
  // ...
});
It was used somewhere on the page, but at some point I decided that the widget was no longer needed, so I commented out the widget's markup. I …

Haskell Web Framework Matrix

A comparison of the big three Haskell web frameworks on the most informative two axes I can think of.
[Figure: a matrix placing Snap, Happstack, and Yesod on two axes: DSLs versus combinators, and some things dynamic versus everything type-safe.]
Note that this is not intended to be a definitive statement of what is and isn't possible in each of these frameworks. As I've written elsewhere, most of the features of each of the frameworks are interchangeable and can be mixed and matched. The idea of this matrix is to reflect the general attitude each of the frameworks seems to be taking, because sometimes generalizations are useful.

Using Cabal With Large Projects

In the last post we talked about basic cabal usage. That all works fine as long as you're working on a single project and all your dependencies are in hackage. When Cabal is aware of everything that you want to build, it's actually pretty good at dependency resolution. But if you have several packages that depend on each other and you're working on development versions of these packages that have not yet been released to hackage, then life becomes more difficult. In this post I'll describe my workflow for handling the development of multiple local packages. I make no claim that this is the best way to do it. But it works pretty well for me, and hopefully others will find this information helpful.
Consider a situation where package B depends on package A and both of them depend on bytestring. Package A has wide version bounds for its bytestring dependency while package B has narrower bounds. Because you're working on improving both packages you can't just…