Thursday, February 28, 2008

Transactional Integrity Problem

An astute reader pointed out that there is a transactional integrity problem with the HAppS application built over the last 4 posts. The function checkAndAdd in Finished HAppS Application contains a call to "query $ IsUser" as well as a call to "update $ AddUser". This violates the ACID guarantee that was desired from the checkAndAdd function, because the check and the insert run as two separate transactions. If two people simultaneously try to create the same username, it's possible that both of them could get past the "query" and "if exists" statements before either of the "update $ AddUser" statements is executed. In this case, both of the AddUser updates would succeed and both users would think their account was created. But since they chose the same username, the first account would be overwritten by the second. The second user wouldn't notice a problem, but the first user would not be able to log in to the newly created account because his password would probably be different from the password chosen by the second user. This wouldn't be the end of the world, but it would certainly create frustration for the first user.
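To make the interleaving concrete, here is a minimal sketch that replays it with plain Data.Map values instead of HAppS. The names nameTaken, insertUser and raceDemo are hypothetical stand-ins for the IsUser query and the original AddUser update; they are not part of the application.

```haskell
import qualified Data.Map as M

-- Hypothetical stand-in for the application state: username -> password.
type Users = M.Map String String

-- Stand-in for the "query $ IsUser" step: a read-only check.
nameTaken :: String -> Users -> Bool
nameTaken = M.member

-- Stand-in for the original "update $ AddUser" step: an unconditional insert.
insertUser :: String -> String -> Users -> Users
insertUser = M.insert

-- Two signups for the same name, interleaved so that both checks
-- run against the initial state before either insert happens.
raceDemo :: (Bool, Bool, Users)
raceDemo =
  let s0      = M.empty
      existsA = nameTaken "bob" s0   -- request A checks: name is free
      existsB = nameTaken "bob" s0   -- request B checks: name is still free
      s1      = if existsA then s0 else insertUser "bob" "passA" s0
      s2      = if existsB then s1 else insertUser "bob" "passB" s1
  in (existsA, existsB, s2)
```

Both checks report that the name is free, both inserts run, and the final map holds only the second password for "bob".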

The problem exists because HAppS gives us transactional guarantees at the query and update level only. I just didn't think about it when I originally wrote the code. I could just give the fix, but I'll outline the incorrect attempts I made before I got to it. Hopefully it will be more beneficial to see some wrong solutions and how they got corrected. Since I'm still trying to overcome tendencies learned from years of imperative programming, maybe this will be useful to other people in the same position.

The first step is to modify the addUser function so that it checks whether the user already exists before inserting. What we want is something like the following:

addUser name u = do
                 exists <- isUser name
                 if not exists
                   then modUsers $ M.insert name u

Haskell's if is an expression, so it requires an else clause. The else branch must have the same type as the then branch (here, the type of the modUsers call), so that means it has to be "else return ()". Then I found out that the "unless" function from Control.Monad does exactly this. So you can replace the whole if expression with "unless exists $ modUsers $ M.insert name u".
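As a small sketch of that equivalence, the two functions below do the same thing in a plain State monad from mtl. The State monad and the names addWithIf, addWithUnless and sameResult are assumptions made for illustration; the real addUser runs in the HAppS update monad.

```haskell
import Control.Monad (unless)
import Control.Monad.State (State, execState, gets, modify)
import qualified Data.Map as M

-- Hypothetical stand-in for the application state: username -> password.
type Users = M.Map String String

-- Explicit if/else: the else branch must also have type State Users ().
addWithIf :: String -> String -> State Users ()
addWithIf name pass = do
  exists <- gets (M.member name)
  if not exists
    then modify (M.insert name pass)
    else return ()

-- The same logic with unless, which supplies the "return ()" branch itself.
addWithUnless :: String -> String -> State Users ()
addWithUnless name pass = do
  exists <- gets (M.member name)
  unless exists $ modify (M.insert name pass)

-- Run both versions through two signups for the same name and compare.
sameResult :: Bool
sameResult =
  execState (addWithIf "bob" "a" >> addWithIf "bob" "b") M.empty
    == execState (addWithUnless "bob" "a" >> addWithUnless "bob" "b") M.empty
```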

For those who are still trying to understand monads, I should point out that you can't avoid binding exists by writing "if not (isUser name)". The problem is that "isUser name" has type "m Bool", and the not function needs a plain Bool. The bind operator is the mechanism that lets us effectively pull a result out of the monad and pass it to another function. Information never actually comes out of the monad though, because the function that receives the result has to return a monad-encapsulated value itself.
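The sketch below shows three ways to apply not to a monadic Bool, again using a plain State monad as an assumed stand-in for the HAppS monad. The names isTaken, isFreeDo, isFreeBind and isFreeFmap are hypothetical.

```haskell
import Control.Monad.State (State, evalState, gets)
import qualified Data.Map as M

-- Hypothetical stand-in for the application state: username -> password.
type Users = M.Map String String

-- A monadic check, analogous in shape to isUser: the result is wrapped.
isTaken :: String -> State Users Bool
isTaken name = gets (M.member name)

-- This would not type-check, because not expects a Bool and
-- isTaken name is a State Users Bool:
--   broken name = not (isTaken name)

-- do-notation binds the Bool to a name first.
isFreeDo :: String -> State Users Bool
isFreeDo name = do
  exists <- isTaken name
  return (not exists)

-- The same thing with the bind operator written out.
isFreeBind :: String -> State Users Bool
isFreeBind name = isTaken name >>= \exists -> return (not exists)

-- fmap applies not inside the monad without naming the result.
isFreeFmap :: String -> State Users Bool
isFreeFmap name = fmap not (isTaken name)

-- All three agree when run against an empty user map.
checks :: [Bool]
checks = map (\f -> evalState (f "bob") M.empty) [isFreeDo, isFreeBind, isFreeFmap]
```

In every version the final result is still a State Users Bool; the Bool is only visible to code that itself runs inside the monad.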

The old type signature for addUser was:

addUser :: MonadState State m => String -> User -> m ()

The new one (now inferred by the compiler) is:

addUser :: (MonadReader State m, MonadState State m) =>
            String -> User -> m ()

The reason for the change is that isUser is a MonadReader action and modUsers is a MonadState action. The new type signature says that the monad this function runs in must be an instance of both MonadReader and MonadState. Fortunately, the monad used by update is an instance of both, so we can get away with mixing the two kinds of action.
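To see how one monad can satisfy both constraints, here is a sketch using mtl. The newtype Upd is a hypothetical stand-in for the HAppS update monad, not its actual definition, and the user record is simplified to a password string. Its MonadReader instance simply reads the current state.

```haskell
{-# LANGUAGE FlexibleContexts, FlexibleInstances, MultiParamTypeClasses #-}

import Control.Monad (unless)
import Control.Monad.Reader (MonadReader (..), asks)
import Control.Monad.State (MonadState (..), State, evalState, modify, runState)
import qualified Data.Map as M

-- Hypothetical stand-in for the application state: username -> password.
type Users = M.Map String String

-- Hypothetical update monad: a State wrapper that is also a reader.
newtype Upd s a = Upd { unUpd :: State s a }

instance Functor (Upd s) where
  fmap f (Upd m) = Upd (fmap f m)

instance Applicative (Upd s) where
  pure = Upd . pure
  Upd f <*> Upd x = Upd (f <*> x)

instance Monad (Upd s) where
  Upd m >>= k = Upd (m >>= unUpd . k)

instance MonadState s (Upd s) where
  get = Upd get
  put = Upd . put

-- Reading the environment means reading the current state.
instance MonadReader s (Upd s) where
  ask = Upd get
  -- Run the action against a locally modified copy and discard its writes.
  local f (Upd m) = Upd $ do
    s <- get
    return (evalState m (f s))

-- A read-only check, usable in any reader over Users.
isUser :: MonadReader Users m => String -> m Bool
isUser name = asks (M.member name)

-- A state modification, usable in any state monad over Users.
modUsers :: MonadState Users m => (Users -> Users) -> m ()
modUsers = modify

-- Combining the two requires both constraints, as in the inferred signature.
addUser :: (MonadReader Users m, MonadState Users m) => String -> String -> m ()
addUser name pass = do
  exists <- isUser name
  unless exists $ modUsers (M.insert name pass)

-- Run an Upd action against an initial state.
runUpd :: Upd s a -> s -> (a, s)
runUpd = runState . unUpd
```

With this sketch, running addUser twice for the same name via runUpd leaves the first password in place, because the second call sees the name through isUser and skips the insert.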

There's still one problem with this definition of addUser. There is no way for the caller to find out whether a new user was created or not. The simplest way to communicate this information is to just return exists at the end of the computation. Note that a result of True means the user already existed and nothing was added. So our finished addUser function in Session.hs looks like this:

addUser name u = do
                 exists <- isUser name
                 unless exists $ modUsers $ M.insert name u
                 return exists

And our checkAndAdd function becomes:

checkAndAdd user pass = do
  existed <- update $ AddUser user $ User user pass
  if existed
    then ok $ toResponse $ "User already exists."
    else ok $ toResponse $ "User created."

Now the check and the insert happen inside a single update transaction, and the code is arguably cleaner. What have we learned from this? Your update and query functions need to be carefully designed so that each one covers everything that must happen atomically, because those functions are where the transactional guarantees in your system come from.
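The same idea can be sketched outside HAppS. The example below swaps in an MVar from base as the transaction boundary, which is a different mechanism from HAppS updates. The names addUserAtomic, signupMessage and concurrentSignups are hypothetical. The point is only that check-and-insert happens in one atomic step and the caller reacts to the returned flag.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (MVar, modifyMVar, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import qualified Data.Map as M

-- Hypothetical stand-in for the application state: username -> password.
type Users = M.Map String String

-- Check and insert as one atomic step on the MVar.
-- Returns True if the name was already taken (nothing inserted).
addUserAtomic :: MVar Users -> String -> String -> IO Bool
addUserAtomic store name pass =
  modifyMVar store $ \users ->
    return $ if M.member name users
      then (users, True)
      else (M.insert name pass users, False)

-- Stand-in for the response chosen by checkAndAdd.
signupMessage :: Bool -> String
signupMessage existed
  | existed   = "User already exists."
  | otherwise = "User created."

-- Two threads race to register the same name.
concurrentSignups :: IO ([String], Users)
concurrentSignups = do
  store <- newMVar M.empty
  doneA <- newEmptyMVar
  doneB <- newEmptyMVar
  _ <- forkIO $ addUserAtomic store "bob" "passA" >>= putMVar doneA
  _ <- forkIO $ addUserAtomic store "bob" "passB" >>= putMVar doneB
  a <- takeMVar doneA
  b <- takeMVar doneB
  users <- readMVar store
  return (map signupMessage [a, b], users)
```

Whichever thread wins, exactly one of the two messages is "User created." and the map ends up with a single entry whose password belongs to the winner.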

1 comment:

lf said...

There's a whenM and unlessM (with the obvious definitions) in Control.Monad.Extras or some such, I think.