Aeson Revisited
As many of you know, the documentation situation in Haskell leaves something to be desired. Sure, if you are enlightened, and can read the types, you’re supposedly good. Personally, I prefer a little more documentation than “clearly this type is a monoid in the category of endofunctors”, but them’s the breaks.
Long ago, I wrote about some tricks I found out about using Aeson, and I found myself using Aeson again today, and I’d like to revisit one of my suggestions.
Types With Multiple Constructors
Last time we were here, I wrote about parsing JSON objects into Haskell types with multiple constructors. I proposed a solution that works fine for plain enumerations, but not types with fields.
Today I had parse some JSON into the following type:
data Term b a =
App b [Term b a]
| Var VarId
| UVar a
I thought “I’ve done something like this before!” and pulled up my notes. They weren’t terribly helpful. So I delved into the haddocs for Aeson, and noticed that Aeson's Result type is an instance of MonadPlus
. Could I use mplus
to try all three different constructors, and take whichever one works?
instance (FromJSON b, FromJSON a)
=> FromJSON (Term b a) where
parseJSON (Object v) = parseVar `mplus` parseUVar
`mplus` parseApp
where
parseApp = do
ident <- v .: "id"
terms <- v .: "terms"
return $ App ident terms
parseVar = Var <$> v .: "var"
parseUVar = UVar <$> v .: "uvar"
parseJSON _ = mzero
It turns out that I can.
Of Semicolons and Sadness
Lately, I’ve been writing a lot of Haskell. If you’ve been following the blog, you likely are aware of this. However, I’ve been feeling the itch to try something different. I’ve been dabbling in things here and there, but for some reason I just can’t settle into a groove. It seems that Haskell has ruined programming for me.
Why? Is it the functional purity of Haskell? Is it the way that lazy evaluation changes the way you think about programming? Is it the sublime type system that catches whole classes of bugs at compile time? No, I’m afraid not. My issue is much more basic; semicolons and angle brackets.
Your Mission
…should you choose to accept it: write a function fun
that does the following: takes an integer argument acc
, and two arguments of the same generic type. This function should enforce that the two generic arguments cannot be changed. This function calls another function given
that takes two arguments of the same generic type, that also cannot be changed. given
does something, then returns an integer as a result. given
can fail!
After calling given
, if it failed, return acc - 1
, if it succeeded, return its result + acc
.
In C
C is a pretty simple language. I like simplicity (it’s the whole premise of this post, right?) but C’s simplicity turns into overly verbose code in the face of the requirements of fun
. Here is the signature of given
// please ensure that lhs and rhs have the same type
int given(int * result,
void const * const lhs,
void const * const rhs)
So, what’s going on here? In C, the dominant idiom for indicating success or failure is to return an error code, thus we return an int
. However, this isn’t our return value! We use the out parameter result
instead. Finally, we have lhs
and rhs
, which have the confusing types void const * const
. These read as “constant pointer to constant void”, and ensures that neither the where the pointer points, or the value of what it points to can change. (thus documenting the requirement that it not change its arguments!) Finally, we have our comment asking the caller to pretty please, ensure lhs
and rhs
have the same type. There is no way in C to ensure two void *
are actually the same type, so this is the best we can do… Now, let’s see my implementation of fun
:
// please ensure that lhs and rhs have the same type
int fun(int acc,
void const * const lhs,
void const * const rhs)
{
int res = 0;
if (given(&res, lhs, rhs))
{
return acc + res;
}
else
{
return acc - 1;
}
}
We have the same confusing sort of signature, except here the return value is our actual return value, and acc
is not an out parameter. (which is really obvious, right?)
Inside of this function, we allocate res
on the stack and initialize it to 0. Then we call given
with the address of res
, and lhs
and rhs
. Taking the address of a something requires a special symbol. This function is called within an if block because given
returns 0 on failure and something on success.
Some of this noise can be reduced, but overall it’s pretty minimal. The use of void
pointers and our type checking comment are troubling, but that’s not really the point of this post. However, the many semicolons, curly braces, and commas add a lot of clutter.
Noise level: Moderate
Strangeness: High
Confidence that there are no bugs in my implementation: Very Low
In C++
Next, let’s check out some C++. Here is the signature of given
:
// throws an std::exception on failure!
template <typename T> int given(const T & lhs,
const T & rhs)
To me, this is kind of a wash from the C implementation. On one hand, we have this template <typename T>
silliness, on the other hand we don’t have an out parameter, we only have one const
per argument, and the compiler can actually enforce that lhs
is of the same type as rhs
. Next, the implementation:
template <typename T> int fun(int acc,
const T & lhs,
const T & rhs)
{
try
{
return acc + given<T>(lhs, rhs);
}
catch (std::exception & e)
{
return acc - 1;
}
}
Here we have that same yucky sort of signature, and we have these try
and catch
blocks. Personally, I don’t care for exceptions. However, they are a tried and true strategy accepted by the majority of the programming community, and who am I to dispute that? However, what I don’t like is the fact that we have no indication that given
can throw. We had to read a comment. Additional weirdness here is the call to given
: given<T>(lhs, rhs)
. Why do I have to do <T>
?
The usual complaints about curly braces and semicolons apply here as well.
Noise Level: High
Strangeness: Low
Confidence that there are no bugs in my implementation: Low
In Rust
A newcomer to the scene is Rust. Rust is a low level systems programming language that has a very strong emphasis on safety. Excellent, right? Let’s see the signature of given
:
fn given<T>(lhs: &T, rhs: &T) -> Option<i64>
… eew. We have twice as many angle brackets as the C++ implementation did, which is unfortunate. We’ve also added some colons. On the plus side, we have option types in rust, so no icky exceptions! Let’s see my implementation of fun
:
fn fun<T>(acc: i64, lhs: &T, rhs: &T) -> i64
{
let res = given(&lhs, &rhs);
let ret = match res {
Some(x) => acc + x,
None => acc - 1,
};
ret
}
… and things get strange. Let’s go through this line by line:
fn fun<T>(acc: i64, lhs: &T, rhs: &T) -> i64
: A bit noisier than I’d like, but perfectly reasonable.
let res = given(&lhs, &rhs);
: Here we call given
, and for some reason I have to put an & in front of lhs
and rhs
. At least rustc inferred the type for us!
let ret = match res {};
: Yay, Rust has pattern matching! It’s a bit unfortunate about all those curly braces and semicolons though…
Some(x) => acc + x,
: Nothing wrong here, in a world where commas and such are a thing…
None => acc - 1,
: And here we have the last pattern of this match block. Why is there a comma there?! For reasons that science will never explain, in Rust, you have to put commas after the last entry in things like this! This oddity extends to structs and enums as well…
ret
: Finally, we return. No, ret
isn’t a keyword, it’s a variable that I bound in my pattern match. Notice the lack of a semicolon. In Rust, the last line in a function is the return value, and does not have a semicolon! However, rust has a return
statement for early returns, and this does take a semicolon. Yes, you can use a return
statement here, but according to the docs doing this makes you a bad person.
Noise Level: Very High
Strangeness: High
Confidence that there are no bugs in my implementation: Reasonable
In Haskell
Finally, let’s take a look at what I consider to be a reasonable syntax: Haskell. The signature of given
is as follows:
given :: a -> a -> Maybe Int
Once you learn to read Haskell, you’ll see how simple this is. given
is a function that takes an a, another a, and returns a Maybe Int
. Next, let’s see my implementation of fun
:
fun :: Int -> a -> a -> Int
fun acc lhs rhs = case given lhs rhs of
Just res -> acc + res
_ -> acc - 1
Not counting the optional function signature, this is 3 lines. The first is our pattern match, which calls given
. The second line handles success, and the third handles failure. There is no return statement here, the result of the expression is the return value. And unlike Rust, there’s no special case that breaks the mold.
Noise Level: Low
Strangeness: Low (If your definition of strange isn’t “doesn’t look like C”)
Confidence that there are no bugs in my implementation: High
Which Brings Me To My Point
Honestly, I believe that the reason for all of this is: “We’ll make our language look like C so that we don’t scare people away!” Personally, I believe that this is a sad state of affairs.
Strictly speaking, I do mean to criticize. C, C++, and Rust are all fine languages. They do what they set out to do, and do it well. Many people successfully use these languages every day! They are all languages worth knowing and languages worth using. However, there is no reason for all these glyphs and extra keywords!
My top feature for the new version of C++: Get rid of all uses of the angle bracket!
My top feature for the new version of Rust: Get rid of those trailing commas and the return statement!
Of course this will never happen, and for good reason: It will cause massive breakages in everybody’s codebases. But a man can dream…
Do We Really Need All These Monad Transformers?
Since I first learned about them, I’ve been a fan of Monad Transformers. Just stick them all together and we can pretend we’re writing Java, who doesn’t want that? Mutable State, Logging, Configuration, Exceptions, they’re all there. Heck, stick some IO in that if you like. Apparently, there’s even a pre-built Reader/Writer/State Monad in there. However, lately I’ve been working on a fairly large Haskell project, and this project doesn’t use Monad transformers. And you know what? It’s working out for them just fine.
Lately, I’ve been wondering if all these transformers are really worth the effort? After you’ve chained together some massive runFoo (runBar (runBaz (runMonadRun ))) foobarbaz
function, and got it all working right and not returning some ridiculous nested tuple, what have you gained?
ReaderT
First up is ReaderT
. ReaderT
let’s us carry around a read-only configuration. In an ReaderT
Monad, we can do the following:
foo :: (MonadReader c m) => a -> m b
foo a = do -- stuff
config <- ask
-- more stuff, presumably using config
…without having to have a config parameter on our function. This is nice because it improves readability. Right? Because this is so bad:
foo :: c -> a -> b
foo c a = -- stuff, presumably using config
“But ReaderT
gets you local
!” you say:
foo :: (MonadReader c m) => a -> m b
foo a = do -- stuff
local modifyC bar
where
modifyC c = -- change the config
bar = -- some monad
Nifty, I agree. Or we chould just do:
foo :: c -> a -> b
foo c a = bar (modifyC c)
where
modifyC c = -- change the config
bar c = -- some function of c
…which I believe is much clearer, because I didn’t have to go to Hackage to figure out what local
does.
StateT
Conceptually kind of a combination of ReaderT
and WriterT
(I’m going to skip WriterT
for the sake of brevity), StateT
lets us use mutable state in a Monadic function:
foo :: (MonadState s m) => a -> m b
foo a = do state <- get
-- do stuff, presumably change the state
put state'
-- more stuff
So, what’s the non-monadic alternative? I imagine something like this:
foo :: s -> a -> (s, b)
foo s a = (s', a')
where
s' = -- do something to change the state
a' = -- do something using s'
I suppose that’d be workable, but now we have this tuple to deal with. We have a few options. We can do pattern matching:
bar :: a -> (s, b)
bar a = (s', b')
where
s = -- some initial state
(s', b) = foo s a
b' = -- do something, using b but not s'
…or we can use something like uncurry
:
-- uncurry :: (a -> b -> c) -> (a, b) -> c
baz :: s -> a -> (s, b)
baz a = foo (uncurry $ bar a)
Both of these are much harder to understand than the Monadic version in my opinion. For both, we have to shimmy our data to fit the interface, and these are just contrived examples.
ExceptT
Finally, I’d like to talk about ExceptT
. This monad lets us simulate exceptions. Unlike Haskell’s normal exception system, where exceptions can only be caught in IO
, these exceptions can be caught within the ExceptT
monad:
crud :: (MonadError String m) => a -> m b
crud a = throwError "Oh crud!"
-- doesn't catch
foo :: (MonadError String m) => a -> m c
foo a = do -- stuff
res <- curd a
-- doesn't make it this far
-- catches the exception
bar :: (MonadError String m) => a -> m c
bar a = do -- stuff
res <- catchError handler (crud a)
-- still rolling
where
handler e = -- exception handler
Seems reasonable right? Or, we could just use Either
:
crud :: a -> Either String b
crud a = Left "Oh crud!"
-- must handle the error
foo :: a -> c
foo a = case (crud a) of
Left e -> -- handle error
Right c -> -- all good
Personally, I find these two to be a wash. Both are understandable. The Monadic version has the potential for the programmer to forget to handle the error, but the non-monadic version involved a noisy case statement.
Not As Clear As It Seemed Before
As you can see, these things are hit and miss. Having thought about it while typing this out, if I had some function that just needed to read a config and throw an error, I’d probably define it like this:
foo :: c -> a -> Either String b
…however, if I needed to throw some state in there:
foo :: (MonadReader c m,
MonadState s m,
MonadError e m) =>
a -> m b
Suddenly the ReaderT
and StateT
become more attractive; why not throw them on the stack? I suppose the point is that maybe using a huge transformer isn’t the best way, not that it certainly isn’t the best way. Just some food for thought.
Checking Our Heads With Liquid Haskell
Consider this function:
headInTail :: [a] -> [a]
headInTail l = (tail l) ++ [head l]
Pretty straightforward, right? It takes a list, extracts the head and sticks it in the tail. Surely you’ve written something like this before. It should be fine, right?
*Main> headInTail [1,2,3]
[2,3,1]
…checks out. Let’s try a few more:
*Main> headInTail "hello"
"elloh"
*Main> headInTail ["cat"]
["cat"]
…good. And the moment we’ve all been waiting for:
*Main> headInTail []
*** Exception: Prelude.tail: empty list
Oh yeah, head
and tail
don’t work for empty lists… Normally, we have some choices on how to proceed here. We could wrap the function in a Maybe
:
maybeHeadInTail :: [a] -> Maybe [a]
maybeHeadInTail [] = Nothing
maybeHeadInTail l = Just $ headInTail l
…which introduces an annoying Maybe
to deal with just to stick our heads in our tails. Or, we could just do something with the empty list:
headInTail :: [a] -> [a]
headInTail [] = []
headInTail l = (tail l) ++ [head l]
…but what if returning the empty list isn’t the correct thing to do?
Another choice is to document this behavior (as head
and tail
do), and just never call headInTail []
. But how can we guarantee that we never attempt to call this function on an empty list? Shouldn’t this be the type system’s problem?
Unfortunately, not all is roses and puppies. Sometimes the type system cannot help us. Sometimes somebody thought it’d be a good idea to use Haskell’s Evil exception system. Whatever the case, there are tools to help us.
Liquid Haskell
Liquid Haskell is a static code analysis tool that is used to catch just these sorts of problems. Liquid Haskell allows us to define invariants which will be enforced by the tool. Liquid Haskell is a research project that is still in development. As such, it has some rough spots, however it’s still very much capable of helping us with our problem here. But before we begin, we need to get the tool installed.
To install the tool, execute:
cabal install liquidhaskell
…simple right? Unfortunately, we’re not quite done. We need to install an SMT solver. This tool is used by Liquid Haskell. Currently, the tool defaults to Z3. I’m not sure how to use a different solver (and Z3 works just fine), so I suggest you you Z3. You’ll have to build Z3, and place the binary somewhere on the PATH
. After this is done, and assuming your .cabal/bin
directory is also on the PATH
, testing your source file is a simple as:
liquid [FILE].hs
Let’s Have A Look
Create a haskell source file that contains the following:
headInTail :: [a] -> [a]
headInTail l = (tail l) ++ [head l]
After that’s done, let Liquid Haskell analyze your file:
liquid [YOUR_FILE].hs
A bunch of stuff should scroll by, then in a second you’ll see something similar to the following:
**** UNSAFE *********************************************
Doop.hs:5:22: Error: Liquid Type Mismatch
Inferred type
VV : [a] | VV == l && len VV >= 0
not a subtype of Required type
VV : [a] | len VV > 0
In Context
VV : [a] | VV == l && len VV >= 0
l : [a] | len l >= 0
If you go to the line and column indicated, you’ll find the argument to tail. Conveniently, it seems that Liquid Haskell comes pre-loaded with definitions for some library functions. Normally, you’ll have to define those yourself. In fact, let’s do just that.
The next logical step here is to write a specification for our function. This specification is a statement about what sorts of values the function can take. Add the following to your source file, in the line above the signature for headInTail:
{-@ headInTail :: {l:[a] | len l > 0} -> [a] @-}
If you re-run liquid
on your source file, you’ll see that the warning went away, and the program now indicates that your source is “SAFE”. So, what does this refinement mean?
Basically, these refinements are machine-checked comments. They have no impact on the program, they exist for Liquid Haskell. Think of it as being like an enhanced type signature. Like a normal type signature, we start with the name of the function, then two colons. This part, however:
{l:[a] | len l > 0}
…should be new. Basically, this part says that the list should not be empty. You should read it as “l is a [a] such that len l is greater than zero.” A lot of the notation used by Liquid Haskell comes from formal logic. Let’s break this down some more:
l:[a]
Here we bind a symbol, l, to the first list argument. At any point to the right of this symbol until the end of the scope defined by {}
, we can reference this symbol.
{... | ...}
The pipe symbol indicates that we are going to make some statement about the type on the left hand side.
len l > 0
Here we state that the length of l must be greater than 0. It looks like we are calling a function, and we sort of are; len is a measure which is a special function that is used in specifications. However, the subject of measures is a post for another day.
You may now be thinking: “Well this is all well and good, but what’s to stop me from calling this function on an empty list?” To answer that, let’s implement main:
main =
do i <- getLine
putStrLn $ headInTail i
Add this to your source file, and then run liquid [YOUR_FILE].hs
and you’ll notice that Liquid Haskell has a problem with your attempt to call headInTail
:
**** UNSAFE *********************************************
Doop.hs:3:29: Error: Liquid Type Mismatch
Inferred type
VV : [Char] | VV == i && len VV >= 0
not a subtype of Required type
VV : [Char] | len VV > 0
In Context
VV : [Char] | VV == i && len VV >= 0
i : [Char] | len i >= 0
Liquid Haskell is telling you that it can’t prove that the length of i
is greater than 0. If you execute your main function, you should see that it works as expected. Type in a string, and it’ll do the right thing. Push enter right away and you’ll get an exception.
*Main> main
hello
elloh
*Main> main
*** Exception: Prelude.tail: empty list
…ick… Let’s fix this:
main =
do i <- getLine
case i of
[] -> putStrLn "Get your head checked."
_ -> putStrLn $ headInTail i
Now if you analyze your file, Liquid Haskell should be happy. Honestly, you should be happy too: the tool caught this problem for you. It didn’t go to testing messed up, and the bug certainly didn’t escape testing unfound. Your program is now objectively better:
*Main> main
isn't that neat?
sn't that neat?i
*Main> main
Get your head checked.
Object Oriented Haskell
You may recall, earlier this year I wrote about object orientation in C. The basic Idea being that “Object Oriented Programming” is more a mindset, then a language feature. You can do object orientation and access control in C using free-floating functions and opaque structs. Well, guess what? You can do Object Oriented Programming in Haskell as well!
As a quick recap, if you didn’t read "C With Classes", for our purposes there are three components that need to be present to be considered “an object”: data consolidation, access control, and method calling. Inheritance is also important to real “Object Oriented Programming” (or OOP, as I’ll start calling it), but is really just gravy. Inheritance is largely supported by typeclasses in Haskell, so I won’t be going into it today.
Functional OOP
What would OOP look like in Haskell? Thanks to the magic of higher-order functions, we can actually very closely approximate what you’d see in a traditional OOP language, such as Java or C++.
Data
This part is trivial, there is literally a data
keyword for this purpose. You should have this down by now.
Access Control
Access Control can be accomplished by way of the module
keyword. Simply expose what you’d like to be public, and don’t expose your private members. Obviously, if you have private fields in your “Class”, you should make a factory function for your class instead of exposing its constructor. This is all pretty standard stuff.
Method Calls
This is an area that Haskell shines, and C shows its ugly side. In Haskell, we can actually create a method call operator, and create something that looks very much like the traditional Class.method
calling convention that we’re used to!
The Method Call Operator
First, we need some contrived classes. Let’s go with everybody’s favorite: the Employee
!
data Employee = Employee {name :: String,
employeeId :: Integer,
title :: String}
Nothing fancy or non-standard here. We created an Employee
“Class” in the traditional Haskell way. Due to our use of record syntax, we already have three getters: name
, employeeId
, and title
.
Let’s make another function, so that we can have a method that takes arguments:
isSeniorTo :: Employee
-> Employee
-> Bool
isSeniorTo s f = (employeeId s) > (employeeId f)
There is something extremely important to note about this function: the last argument is the object, not the first as it was in “C With Classes”. The reason for this is to allow us to partially apply this function, this will be crucial.
Now, let’s give it a shot to make sure everything is working:
*ghci> let boss = newEmployee "Chris" 1 "Author"
*ghci> let notBoss = newEmployee "Geoffrey" 2 "Associate"
*ghci> name boss
"Chris"
*ghci> isSeniorTo notBoss boss
True
All looks well, we create two Employee objects. Since Chris
‘s employee ID is lower than Geoffrey
‘s, the isSeniorTo
method returns True
. But this is all very wonky still, let’s create that method call operator already!
(>-) :: a
-> (a -> b)
-> b
(>-) o f = f o
Since .
and ->
are already spoken for, I’ve gone with >-
for my method call operator. The method call operator takes an arbitrary type a
, and a function that takes the same type a
, and returns a type b
. Finally a type b
is returned. The definition of this function couldn’t be simpler: we call the function on the passed-in object! Let’s see this in action:
*ghci> notBoss >- title
"Associate"
*ghci> boss >- isSeniorTo notBoss
True
This is just about as elegant as it gets; using the method call operator we just defined, we call a method on an object! In the case of isSeniorTo
, we partially apply the function with all but the last argument, then the method call operator is able to use it just as any other function.
But something doesn’t sit right. Isn’t the definition of isSeniorTo
a bit smelly? It calls methods on objects, shouldn’t it use the method call operator? Now that we have our method call operator, let’s fix that function:
isSeniorTo :: Employee
-> Employee
-> Bool
isSeniorTo s f = (s >- employeeId) > (f >- employeeId)
There, that’s better. isSeniorTo
now properly calls methods on s
and f
, and all is again well in the world.
Here’s The Kicker
You may be experiencing some deja vu from all of this. It feels like we’ve done this sort of thing before. But where? You may recall this little operator:
*ghci> :t (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
That’s right, the monadic bind operator looks suspiciously like our method call operator. But does it behave the same?
*ghci> 1 >- (+) 1
2
*ghci> Just 1 >>= (\a -> return $ a + 1)
Just 2
Let’s ignore for a second that we had to use a monad for the bind operator, and see the fact that the two basically did the same thing. In fact, to further illustrate the point, let’s make a Class
monad:
data Class a = Class a deriving (Show)
instance Monad Class where
(Class a) >>= f = f a
return = Class
Unlike most monads, which do important work, our Class
monad does nothing but prove a point. However, you should note that the implementation of the bind operator is essentially the same as the implementation of the method call operator. Now, let’s see our monad in action!
let boss = Employee "Chris" 1 "Author"
*ghci> boss >- name
"Chris"
*ghci> Class boss >>= return . name
Class "Chris"
As you can see, they essentially do the same thing. It turns out that Haskell had classes all along!
Of course, this isn’t exactly true. I’d say that the State
monad serves many of the same purposes as objects in OOP languages. However, the semantics of things like Reader
, Maybe
, and IO
have little in common with objects. But much like we implemented Objects in Haskell, the various OOP languages are implementing monads into their standard libraries.
Indeed, OOP and functional programming are not supported by languages, they are supported by programmers. Some languages may make one or the other easier, but the differences are small, and get smaller as time passes.
Again, This Time In Reverse!
Today we’ll be doing Exercise 5 from The C Programming Language. Today’s exercises aren’t terribly challenging, but there is some good stuff in here. The problem is:
Modify the temperature conversion program to print the table in reverse order, that is, from 300 degrees to 0.
I bet you thought we were done with temperature conversion, didn’t you? I’m right there with you, but luckily for us, this is also the section that introduces the for
loop, so this is much less tedious. The new program for modification:
#include <stdio.h>
int main (int argc, char ** argv)
{
for (int fahr = 0; fahr <= 300; fahr = fahr + 20)
{
printf("%3d %6.1f\n", fahr, (5.0/9.0)*(fahr-32));
}
}
I made a few minor modifications. I gave main
the correct signature to keep gcc from complaining, and I also moved the initialization of fahr
to the loop. As you may know this is not valid C, and the compiler will throw an error. You might also know that this was made to be valid C in the C 99 revision. To compile this, you just need to add the -std=c99
flag to your gcc call.
To complete this exercise, you only need to modify the loop declaration. Change this line…
for (int fahr = 0; fahr <= 300; fahr = fahr + 20)
…to this…
for (int fahr = 300; fahr >= 0; fahr = fahr - 20)
Pretty straight forward stuff here. fahr
starts at 300 instead of 0, and the loop continues so long as it remains greater than 0. Instead of adding 20 each iteration, we subtract it. This can be compiled like so:
gcc -std=c99 -Wall ex5.c -o ex5
In Haskell
Unfortunately, the authors of The C Programming Language did not see fit to include a starting Haskell program for us to modify. This seems like a pretty serious oversight to me, but we’ll just have to make due. Luckily for us, we can use the temperature conversion program I wrote for the last entry in this series. This program handles conversions, and we can use it to produce a table of all conversions between 0 and 300 degrees.
While I’ve established that Haskell does in fact have loops, we won’t be using them here. Instead we’ll be using ranges, functors, and sequencing to solve this problem in a more functional way.
First, we need a list of every 20th number between 0 and 300:
[300.0, 280.0 .. 0.0]
The ..
there basically says “You get the idea, do that.” If you put enough information into it so that the compiler can figure out the pattern, and start and end points, the compiler will fill the rest in. In this case, I gave the first two values, telling the compiler I want increments of 20, and I gave it the last value so it knows where to stop.
Unfortunately, as you recall, my conversion program needs members of the Temperature
typeclass. Surely I don’t plan to make Double
a member, right? Right. We need to map a function over this list to produce a list of Temperature
s. But what function should we use?
Something that beginning Haskellers may not realize is that constructors are in fact functions!
Prelude> :t Just
Just :: a -> Maybe a
Prelude> :t Left
Left :: a -> Either a b
…and of course…
*Main> :t Fahrenheit
Fahrenheit :: Double -> Fahrenheit
That’s right, we can map Fahrenheit
over a functor just like any other function!
map Fahrenheit [300.0, 280.0 .. 0.0]
This will produce a list of every 20th Fahrenheit
temperature between 300 and 0. However, we aren’t done yet. We actually need a list of Conversion
s, because this type is required for our insertConv
function. To get this we can map toConversion
over this list:
map toConversion $ map Fahrenheit [300.0, 280.0 .. 0.0]
…but that’s starting to get a bit ugly, there must be a better way. Normally, I’m not terribly fond of function composition, in this case it will make our life easier.
Function Composition
Haskell provides an operator .
that is used for function composition. Let’s take a look at its type:
Prelude> :t (.)
(.) :: (b -> c) -> (a -> b) -> a -> c
The composition operator takes a function that takes some type b
as an argument, and that returns some type c
. It takes a second function that takes some type a
as an argument, and that returns some type b
. Finally, it returns some type c
.
What does this do for us? Basically, it takes a function a -> b
, and a function b -> c
, and turns them into a function a -> c
. Let’s see an example:
Prelude> :t show
show :: Show a => a -> String
The function show
takes a member of the Show
typeclass and returns a String
.
Prelude> :t putStr
putStr :: String -> IO ()
The function putStr
takes a String
, and returns an IO
action.
Prelude> :t putStr . show
putStr . show :: Show a => a -> IO ()
When we compose the two functions, we get a function that takes a member of the Show
typeclass, and returns an IO
action.
Logically, calling putStr . show someShowable
is the same as calling putStr $ show someShowable
or putStr (show someShowable)
. However, putStr . show
is a valid function, where putStr $ show
and putStr (show)
are compiler errors if you don’t give show
an argument.
How does this help us?
*Main> :t Fahrenheit
Fahrenheit :: Double -> Fahrenheit
*Main> :t toConversion
toConversion :: (Temperature a, Temperature b) => a -> (a, b)
*Main> :t toConversion . Fahrenheit
toConversion . Fahrenheit :: Temperature b => Double -> (Fahrenheit, b)
By composing toConversion
and Fahrenheit
, we end up with a function that takes a Double
as an argument and returns a Conversion
from Fahrenheit
. We can then map
this function over our list of Double
s:
map (toConversion . Fahrenheit) [300.0, 280.0 .. 0.0]
Next, we need to turn these Conversion
s into TableBuilder
s. This is a simple matter of composing insertConv
:
map (insertConv . toConversion . Fahrenheit) [300.0, 280.0 .. 0.0]
The type of this new function is:
:t (insertConv . toConversion . Fahrenheit)
(insertConv . toConversion . Fahrenheit) :: Double -> TableBuilder ()
…or at least it would be if the type inferrer could figure out the type of toConversion
. Unfortunately for us, it can’t because the implementation is purposely ambiguous. We need to add a type signature to it:
(insertConv . (toConversion :: Fahrenheit -> (Fahrenheit, Celsius))
. Fahrenheit)
Sequencing
Now we are ready to create our table. This part is actually very easy. Thanks to the library function sequence_
we won’t even need a do
block. What does this function do? Let’s look at its signature:
Prelude> :t sequence_
sequence_ :: Monad m => [m a] -> m ()
Prelude> :t sequence
sequence :: Monad m => [m a] -> m [a]
There are two variations of this function. The first, sequence_
evaluates each monadic action, and ignores the results. Basically, suppose we have some function: func :: a -> Monad b
let mList = map func ["foo", "bar", "baz"]
in sequence_ mList
…is equivalent to…
do func "foo"
func "bar"
func "baz"
return ()
The second variation evaluates each monadic action, and returns a list of all the results. Suppose we have our same function: func :: a -> Monad b
let mList = map func ["foo", "bar", "baz"]
in sequence mList
…is equivalent to…
do foo <- func "foo"
bar <- func "bar"
baz <- func "baz"
return [foo, bar, baz]
If you have a list of monads, you can use sequence
or sequence_
to have them all evaluated. Use sequence
if you care about the resulting values, use sequence_
if you only care about the monad’s side effect. Which do we want? Let’s look at the signature of insertConv
:
*Main> :t insertConv
insertConv
:: (Temperature a, Temperature b) =>
Conversion a b -> TableBuilder ()
If we were to use sequence
, we’d get a resulting value with the type TableBuilder [()]
. Since nobody has any use for a list of empty tuples, we’ll be using sequence_
.
So, what does our main look like?
main = do temps <- return (map (insertConv . (toConversion :: Fahrenheit -> (Fahrenheit, Celsius))
. Fahrenheit)
[300.0, 280.0 .. 0.0])
putStrLn $ prettyPrint $ buildTable $ do sequence_ temps
This produces the following output:
------------------------------
|| Fahrenheit || Celsius ||
------------------------------
|| 300.0 |> 148.9 ||
|| 280.0 |> 137.8 ||
|| 260.0 |> 126.7 ||
|| 240.0 |> 115.6 ||
|| 220.0 |> 104.4 ||
|| 200.0 |> 93.3 ||
|| 180.0 |> 82.2 ||
|| 160.0 |> 71.1 ||
|| 140.0 |> 60.0 ||
|| 120.0 |> 48.9 ||
|| 100.0 |> 37.8 ||
|| 80.0 |> 26.7 ||
|| 60.0 |> 15.6 ||
|| 40.0 |> 4.4 ||
|| 20.0 |> -6.7 ||
|| 0.0 |> -17.8 ||
------------------------------
K&R Challenge 3 and 4: Functional Temperature Conversion
The other day, I implemented the C solution to exercises 3 and 4 in The C Programming Language, today I’ll be implementing the Haskell solution. As a reminder, the requirements are:
Modify the temperature conversion program to print a heading above the table
… and …
Write a program to print the corresponding Celsius to Fahrenheit table.
I could take the easy way out, and implement the Haskell solution almost identically to the C solution, replacing the for
loop with a call to map
, but that’s neither interesting to do, nor is it interesting to read about. I’ll be taking this problem to the next level.
Requirements
For my temperature program, I’d like it to be able to convert between any arbitrary temperature unit. For the purposes of this post, I will be implementing Celsius, Fahrenheit, Kelvin (equivalent to Celsius, except 0 degrees is absolute zero, not the freezing point of water), and Rankine (the Fahrenheit version of Kelvin).
That said, nothing about my solution should rely on the fact that these four temperature scales are being implemented. A programmer should be able to implement a new temperature unit with minimal changes to the program. Ideally, just by implementing a new type.
Additionally, given a bunch of conversions, the program should be able to output a pretty table showing any number of temperature types and indicating what converts to what.
First, Conversion
This problem is clearly broken up into two sub-problems: conversion, and the table. First, we need to handle conversion. This being Haskell, the right answer is likely to start by defining some types. Let’s create types for our units:
newtype Celsius = Celsius Double deriving (Show)
newtype Fahrenheit = Fahrenheit Double deriving (Show)
newtype Kelvin = Kelvin Double deriving (Show)
newtype Rankine = Rankine Double deriving (Show)
I’ve chosen to make a type for each unit, and since they only contain one constructor with one field, I use a newtype
instead of a data
. Now, how to convert these? A straightforward solution to this would be to define functions that convert each type to each other type. Functions that look like:
celsiusToFahrenheit :: Celsius -> Fahrenheit
celsiusToKelvin :: Celsius -> Kelvin
...
rankineToKelvin :: Rankine -> Kelvin
rankineToCelsius :: Rankine -> Celsius
A diagram for these conversions looks like this:
That’s a lot of conversions! One might argue that it’s manageable, but it certainly doesn’t meet requirement #1 that implementing a new unit would require minimal work; to implement a new conversion, you’d need to define many conversion functions as well! There must be a better way.
Let’s think back to chemistry class. You’re tasked with converting litres to hours or somesuch. Did you do that in one operation? No, you used a bunch of intermediate conversion to get to what you needed. If you know that X litres are Y dollars, and Z dollars is one 1 hour, then you know how many litres are in 1 hour! These are called conversion factors.
Luckily for us, our conversions are much simpler. For any temperature unit, if we can convert it to and from celsius, then we can convert it to and from any other unit! Let’s define a typeclass for Temperature
:
class Temperature a where
toCelsius :: a ->
Celsius
fromCelsius :: Celsius ->
a
value :: a ->
Double
scaleName :: a ->
String
convert :: (Temperature b) =>
a ->
b
convert f = fromCelsius $ toCelsius f
Our Temperature
typeclass has five functions: functions to convert to and from celsius, a function to get the value from a unit, a function to get the name of a unit, and a final function convert
. This final function has a default implementation that converts a unit to celsius, then from celsius. Using the type inferrer, this will convert any unit to any other unit!
convert Rankine 811 :: Kelvin
convert Celsius 123 :: Fahrenheit
convert Kelvin 10000 :: RelativeToHeck
Now to implement a new temperature, you only need to implement four functions, as convert
is a sane solution for all cases. This arrangement gives us a conversion diagram that looks like:
Much better. Let’s go through our Temperature
implementations for our types:
instance Temperature Celsius where
toCelsius c = c
fromCelsius c = c
value (Celsius c) = c
scaleName _ = "Celsius"
Of course Celsius
itself has to implement Temperature
. It’s implementation is trivial though; no work needs to be done.
instance Temperature Fahrenheit where
toCelsius (Fahrenheit f) = Celsius ((5 / 9) * (f - 32))
fromCelsius (Celsius c) = Fahrenheit ((9 / 5) * c + 32)
value (Fahrenheit f) = f
scaleName _ = "Fahrenheit"
Now things are heating up. The conversion functions are identical to the C implementation.
instance Temperature Kelvin where
toCelsius (Kelvin k) = Celsius (k - 273.15)
fromCelsius (Celsius c) = Kelvin (c + 273.15)
value (Kelvin k) = k
scaleName _ = "Kelvin"
The Kelvin
implementation looks much like the Fahrenheit
one.
instance Temperature Rankine where
toCelsius (Rankine r) = toCelsius $ Fahrenheit (r - 459.67)
fromCelsius c = Rankine $ 459.67 + value (fromCelsius c :: Fahrenheit)
value (Rankine r) = r
scaleName _ = "Rankine"
The conversion between Fahrenheit and Rankine is much simpler than the conversion between Celsius and Rankine; therefore I will do just that. After converting to and from Fahrenheit, it’s a simple matter of calling toCelsius
and fromCelsius
.
Bringing The Monads
Now that the easy part is done, we get to create the table. Our table should have as many columns as it needs to display an arbitrary number of conversions. To that end, let’s define a data structure or two:
data ConversionTable = ConversionTable [String]
[[TableRowElem]] deriving (Show)
data TableRowElem = From Double | To Double
| NullConv Double
| Empty deriving (Show)
The ConverstionTable
, like the name suggests, is our table. The list of strings is the header, and the list of lists of TableRowElem
are our conversions. Why not just have a list of Double
? We need to have our cells contain information on what they mean.
To that end, I created a TableRowElem
type. From
is an original value, To
is a converted value, NullConv
represents the case were we convert from some type to the same type, and Empty
is an empty cell. The problem of how to place elements into this data structure still remains however. To solve that, things are going to get a bit monadic. Let’s define some intermediate builder types:
type Conversion a b = (a, b)
toConversion :: (Temperature a, Temperature b) =>
a ->
(a, b)
toConversion a = (a, convert a)
First we have Conversion
, and the corresponding toConversion
function. This simply takes a unit, and places it in a tuple with its corresponding conversion. Next, we have the TableBuilder
:
type TableBuilder a = WriterT [[TableRowElem]]
(State [String]) a
Here we have a WriterT
stacked on top of a State
monad. The writer transformer contains the list of table rows, and the state monad contains the header. The idea is that as rows are “logged” into the writer, the header is checked to make sure no new units were introduced. To this end, if only two units are introduced, the table will have two columns. If 100 units are used, then the table will have 100 columns.
NOTE: I realize that WriterT
and State
are not in the standard library. I only promised to limit the usage of libraries for Haskell solutions. This means avoiding the use of things like Parsec or Happstack. Frameworks and libraries that vastly simplify some problem or change the way you approach it. To this end, if I feel a monad transformer or anything along these lines are appropriate to a problem, I will use them. I’ll try to point out when I do though. Besides, I could have just re-implemented these things, but in the interest of not being a bad person and re-inventing the wheel, I’ve decided to use a wheel off the shelf.
So, how do we use this TableBuilder
? I’ve defined a function for use with this monad:
insertConv :: (Temperature a, Temperature b) =>
Conversion a b ->
TableBuilder ()
insertConv (a, b) =
do oldHeader <- lift $ get
workingHeader <- ensureElem a oldHeader
finalHeader <- ensureElem b workingHeader
lift $ put finalHeader
tell [buildRow a b finalHeader []]
where ensureElem a h = return $ case ((scaleName a) `elem` h)
of True -> h
False -> h ++ [(scaleName a)]
buildRow _ _ [] r = r
buildRow a b (h:xs) r
| (scaleName a) == (scaleName b) && (scaleName a) == h = r ++ [NullConv $ value a]
| (scaleName a) == h = buildRow a b xs (r ++ [From $ value a])
| (scaleName b) == h = buildRow a b xs (r ++ [To $ value b])
| otherwise = buildRow a b xs (r ++ [Empty])
Yeah, that one is kind of a doosey. Let me walk you through it. This function takes a Conversion
, and returns a TableBuilder
.
In the first four lines of the do
block, we update the header. We lift
the State
monad, then get
we call ensureElem
with the first and second units, then we put
the new updated header back into the State
monad.
The ensureElem
function checks the header list to see if the current unit is a member. If it is, the header list is returned unchanged, if it’s not the unit is appended to the end and the new list is returned. In this way, whenever a conversion is added to the table, the header is updated.
After updating the header, we call tell
with the result of buildRow
, “writing” the row into the Writer
monad. The buildRow
function recursively adds TableRowElem
s to the result list depending on the current heading. In this way, conversions are placed in the appropriate column.
In addition to that function, I’ve defined a function to simplify working with the TableBuilder
:
buildTable :: TableBuilder a ->
ConversionTable
buildTable b = let result = runState (runWriterT b) []
in ConversionTable (snd result)
(snd $ fst result)
Working with some of these MTL
monads can be confusing for people coming from imperative backgrounds. I’ve been working with Haskell for almost a year now and I still get extremely confused by them. It can take some muddling through haddoc pages to work them out, but the good news is that you mainly just need to define one function that takes a monad (in the form of a do
block), and returns a whatever. The buildTable
function takes a TableBuilder
, and returns a ConversionTable
. It handles calls to runState
and runWriterT
, and then unwraps the resulting tuple and builds the ConversionTable
.
This function can be called like this:
buildTable $ do insertConv someConversion
insertConv someOtherConversion
… and so on. The only thing to remember is that the final value of a
for the do
block must be ()
. Conveniently, insertConv
return a value of type TableBuilder ()
, so if the last call is to this function, then you are good. You can also always end it with return ()
if you like.
Pretty Printing
Finally, we have the matter of printing a nice pretty table. For that, we need yet another function:
prettyPrint :: ConversionTable ->
String
prettyPrint (ConversionTable h r) = let widestCol = last $ sort $ map length h
columnCount = length h
doubleCell = printf ("%-" ++ (show widestCol) ++ ".1f")
stringCell = printf ("| %-" ++ (show widestCol) ++ "s |")
emptyCell = replicate widestCol ' '
horizontalR = (replicate (((widestCol + 4) * columnCount) + 2) '-') ++ "\n"
formatRow row = "|" ++ (concat $ map formatCell row) ++ "|\n"
formatCell (From from) = "| " ++ (doubleCell from) ++ " |"
formatCell (To to) = "> " ++ (doubleCell to) ++ " |"
formatCell Empty = "| " ++ emptyCell ++ " |"
formatCell (NullConv nc) = "| " ++ (doubleCell nc) ++ " |"
in horizontalR
++ ("|" ++(concat $ map stringCell h) ++ "|\n")
++ horizontalR
++ (concat $ map formatRow (normalizeRowLen (columnCount) r))
++ horizontalR
where normalizeRowLen len rows = map (nRL' len) rows
where nRL' len' row
| (length row) < len' = nRL' len' (row ++ [Empty])
| otherwise = row
Yeah… Sometimes the littlest things take the most work. You’d think all this plumbing we’ve been doing would be the most complecated bit, but you’d be wrong. Let’s try to make sense of this mess function by function:
widestCol = last $ sort $ map length h
This function determines the widest column based on the header. Typically, this is going to be “Fahrenheit”, but it doesn’t have to be. It should be noted that if a data cell is wider than this, then the pretty printer will mess up. Like most things in life, there is room for improvement here. That said, unless you’re converting the temperature of the core of the sun, you probably won’t have an issue here.
columnCount = length h
Returns the number of columns in the table. Used by the horizontal rule function.
doubleCell = printf ("%-" ++ (show widestCol) ++ ".1f")
Ahh, our old friend printf
. It exists in Haskell and works in much the same way as it did in C. The doubleCell
function converts a temperature value to a string, left aligns it, pads it by widestCol
, and has it show one decimal place.
stringCell = printf ("| %-" ++ (show widestCol) ++ "s |")
Much like with doubleCell
, this function pads, and left-aligns a string. This is used by the header.
emptyCell = replicate widestCol ' '
This one is pretty self-explanatory. It prints an empty cell of the appropriate width.
horizontalR = (replicate (((widestCol + 4) * columnCount) + 2) '-') ++ "\n"
This function prints a horizontal rule. This will be a solid line of “-” across the width of the table.
formatRow row = "|" ++ (concat $ map formatCell row) ++ "|\n"
This function formats a table data row. It maps formatCell
over the list of cells, flattens it, then adds a pretty border around it.
formatCell (From from) = "| " ++ (doubleCell from) ++ " |"
formatCell (To to) = "> " ++ (doubleCell to) ++ " |"
formatCell Empty = "| " ++ emptyCell ++ " |"
formatCell (NullConv nc) = "| " ++ (doubleCell nc) ++ " |"
In this function, much of the work is done. It formats the cell using doubleCell
or emptyCell
, the applies a border to the cell. It denotes a cell containing a To
by adding a >
on the left.
Now that we’ve covered the let
-bound functions, let’s talk about the actual function body:
horizontalR
concat $ map stringCell h) ++ "|\n")
horizontalR
concat $ map formatRow (normalizeRowLen (columnCount) r))
horizontalR
This bit is prett straightforward. First, it prints a horizontal line. Second, it maps stringCell
over the header list, flattens it, and gives it a border. Third it prints another horizontal line. Fourth is maps formatRow
over the normalized row list, then flattens it. Finally, one last horizontal line. After this is all said and done, it concats it all together.
You may be wondering about that normalizeRowLen
function. If you were paying particularly close attention to the insertConv
function, you may have noticed an issue. Let’s walk through it in ghci:
*Main> let fc = toConversion (Fahrenheit 100) :: (Fahrenheit, Celsius)
*Main> buildTable $ do insertConv fc
ConversionTable ["Fahrenheit","Celsius"] [[From 100.0,To 37.77777777777778]]
We add one conversion, we get two columns. Everything seems to be in order here, but let’s add another conversion and see what happens:
*Main> let fc = toConversion (Fahrenheit 100) :: (Fahrenheit, Celsius)
*Main> let cr = toConversion (Celsius 100) :: (Celsius, Rankine)
*Main> buildTable $ do {insertConv fc; insertConv cr;}
ConversionTable ["Fahrenheit","Celsius","Rankine"] [[From 100.0,To 37.77777777777778],[Empty,From 100.0,To 671.6700000000001]]
See the problem? Let’s add some newlines to make it clearer:
ConversionTable ["Fahrenheit","Celsius","Rankine"]
[[From 100.0,To 37.77777777777778],
[Empty,From 100.0,To 671.6700000000001]]
As we add more columns, the rows with less columns are never updated to have the new column count. Logically, this is fine, since the extra entries would just be Empty
anyways, but our pretty printer would print this table like so:
--------------------------------------------
|| Fahrenheit || Celsius || Rankine ||
--------------------------------------------
|| 100.0 |> 37.8 ||
|| || 100.0 |> 671.7 ||
--------------------------------------------
As you add more and more columns, the problem gets worse and worse. Enter our normalizeRowLen
function:
normalizeRowLen len rows = map (nRL' len) rows
where nRL' len' row
| (length row) < len' = nRL' len' (row ++ [Empty])
| otherwise = row
This is another fairly straightforward function. If the row has the same number of columns as the header, it is returned unchanged. If it doesn’t, Empty
is added to the end until it does.
With that, our program is complete. Let’s try it out:
main = do k <- return (toConversion $ Kelvin 100 :: (Kelvin, Rankine))
f <- return (toConversion $ Fahrenheit 451 :: (Fahrenheit, Kelvin))
r <- return (toConversion $ Rankine 234 :: (Rankine, Celsius))
c <- return (toConversion $ Celsius 9 :: (Celsius, Fahrenheit))
nc <- return (toConversion $ Rankine 123 :: (Rankine, Rankine))
putStrLn $ prettyPrint $ buildTable $ do insertConv k
insertConv f
insertConv r
insertConv c
insertConv nc
In our main
, we create a bunch of conversions. Then we prettyPrint
them and putStrLn
the result. The following will be printed to the console:
----------------------------------------------------------
|| Kelvin || Rankine || Fahrenheit || Celsius ||
----------------------------------------------------------
|| 100.0 |> 180.0 || || ||
|> 505.9 || || 451.0 || ||
|| || 234.0 || |> -143.2 ||
|| || |> 48.2 || 9.0 ||
|| || 123.0 || || ||
----------------------------------------------------------
Any type that implements Temperature
can be put into a table this way. To add a new unit to the program, it’s as easy as implementing four one-line functions!
“Haskell Doesn’t Have Loops”
Among the first thing any prospective Haskeller learns is that “Haskell doesn’t have loops.” You know what I say to that?
Challenge Accepted.
While it’s true that loops are not a primitive control structure in Haskell like they are in imperative languages, that doesn’t mean we can’t add them. But what should a loop look like? In C, a while
loop looks like this:
while (some_predicate() != TRUE)
{
do_stuff();
}
It’s basically the same in any C-like language; first, you type the type of loop (while
, in this case), then you type some predicate that evaluates to a boolean value, then on the next line, an arbitrary block of code.
I feel that it would be in poor taste to just define a function while
that takes another function that represents the block of code to execute. It feels out of line with the spirit of the problem. Luckily for us, Haskell’s do
notation has a very “arbitrary block of code” feel to it. But that requires us to use a monad…
Which brings us to the root of the issue: what exactly do we *do* in loops? Basically, while the predicate is false, we mutate some state, then test the predicate again. If the predicate is still false, we mutate some more state until it evaluates true. The key word there is state.
Take a look at my sample loop above. The loop calls two functions: some_predicate
, and do_stuff()
. Neither function takes any arguments. If this were Haskell, then we’d have a problem, because the loop would never exit. If a function always returns the same result given the same input, then a function with no input will always only return one thing. some_predicate
would always either return True
(resulting the the loop not being entered), or False
, resulting in an infinite loop. However, in C this loop probably works fine. This is because C has state.
To this end, my Haskell loop must operate on a State
monad. Now, with this out of the way, let’s jump to the implementation!
The Implementation
So, how would we implement a loop in Haskell? I think a good source of inspiration here is the Monad
typeclass. Monads have a very “language-feature” feeling to them, but they are not primitives. Monads are a typeclass, some functions, and some data types. Similarly, I’ve created a Loop
typeclass:
class Loop a where
exec :: a -> a
The Loop
typeclass is very simple. It defines one function, exec
that takes a loop, and returns another loop of the same type. Different types of loops behave differently, so it’s up to the implementor to determine how the loop should function. For this example, I’ve implemented the while
loop. Here is my While
type:
data While a = While {whilePred :: (a -> Bool),
whileState :: a,
whileBody :: State a ()}
My While
type has three fields. The whileState field is a state value for use by the loop’s state monad. The whileBody field is the State
monad itself. Finally the whilePred field is a predicate function that is used to determine if the loop should terminate.
Next, we have an implementation for class Loop
:
instance Loop (While a) where
exec w = let predicate = (whilePred w) (whileState w)
s' = execState (whileBody w) (whileState w)
w' = While {whileState = s',
whilePred = (whilePred w),
whileBody = (whileBody w)}
in case (predicate) of True -> w
False -> exec w'
It’s a little verbose, but it’s not too bad when you break it down. The meat of the function is in the case
statement: we test the predicate, by calling whilePred
on whileState
. If the predicate returns True
, we return the current While
object. If it evaluates False
, we iterate once by calling execState
using whileState
as the initial state, and whileBody
as the State monad. When that’s done, we construct a new While
object with the old State
monad, old predicate function, and new state, and then recursively call exec
. This continues until the predicate evaluates True
. Like any good while loop, this could potentially go on forever.
So, where does that leave us? We have a while loop type, and a function to “run” the loop. So, as it stands, we have to build the While
, then call exec
on the While
, then finally call whileState
on the While
to get the result. Doesn’t really feel like a C-like while
loop, does it? Well, we’re not done yet. Let’s define one last function:
while :: a -> (a -> Bool) -> (State a ()) -> a
while s p b = whileState $
exec (While {whileState = s,
whilePred = p,
whileBody = b})
This function is where the magic happens. This function approximates the while
statement from C. By calling this function, we can perform a “while loop” that looks like so:
while initialState
predicateFunction
$ do oldState <- get
newState <- mutate oldState
put newState
The while
function will return the final mutation of initialState
. Here’s an example main
function:
main = do res <- return $
while []
(\a -> (length a) == 10)
$ do old <- get
new <- return $
old ++ [(length old)]
put new
sequence_ $ map (putStrLn . show) res
In this example, we perform a while
. Our initial state is an empty list. Our predicate tests that the length of the list is equal to 10. Then we enter our State
block where we append the length of the list to the end of the list. Finally, after the loop exits we map (putStrLn . show)
over the resulting list to print each element of the list to the console. The following will be displayed:
0
1
2
3
4
5
6
7
8
9
Ladies and Gentlemen, it walks like a duck, and quacks like a duck. Therefore, it must be a duck.
Architecture of Evil: Writing C In Haskell
It’s been a long road, but today our hard work pays off. You can set globals in Haskell! No more fussing about with State
monads, I’m talking big boy non-thread-safe globals. This glorious piece of functionality thanks to the Data.IORef
package.
someGlobal = unsafePerformIO $ newIORef "foo"
This is how you set a global. We create a global called someGlobal
, and give it the initial value of foo
. You’ll notice I wrap that in unsafePerformIO
. Like the name implies, it is perfectly ok* to do this, after all: the IO Monad
exists only to annoy you, and serves no actual purpose.
* Seriously though, don’t ever do this; if I catch you doing this in real code, then we are no longer friends.
This global can be used like so:
main = do g1 <- readIORef someGlobal
putStrLn $ show $ g1
writeIORef someGlobal "bar"
g2 <- readIORef someGlobal
putStrLn $ show $ g2
return ()
Running this code produces the following output:
"foo"
"bar"
Rejoice!
On A Serious Note
Now that we’ve all had a good laugh; please don’t ever do this. I discovered this module while messing around with the haskell bindings to OpenGL. If you’ve ever used OpenGL for even a second, you surely noticed that there is crazy amounts of state being tracked. Given this, and the fact that nobody is just going to make a “Haskell replacement for OpenGL”, an IORef
seems like a reasonable alternative to some monolithic State
monad transformer stack.
However, aside from supporting legacy C libraries, I can see no good reason to ever use IORef
instead of State
Happstack’s “Helpful” Exception Handler: A Better Solution
Recently I wrote about Happstack’s mismanagement of exceptions. I offered up a solution thab basically amounted to swallowing the exception and returning mzero
. As a follow-up to that post, I’d like to offer a better solution.
I believe Happstack is on the right track. An uncaught exception does in fact represent an internal server error
, not a file not found
, however Happstack misses the mark in a few ways. First, it prints the exception message, which for reasons I went into last week, is bad. And second, it doesn’t provide a mechanism to override this behavior.
It was suggested to me that I could place a monad on top of the ServerPartT
transformer stack that re-implements fail
. This represents a few issues for me. If I were to do this, I wouldn’t be able to use the basic ServerPart
monad, greatly complicating my code as this locks me out of all library functions that take this type. Secondly, it’s an unnessesary level of indirection. Why should I add a monad to the stack that does nothing except re-implement fail
? Why can’t I configure Happstack to use a custom exception response?
So, What Are You Going To Do About It?
Good question. I’ve come up with a solution that I’m fairly satisfied with. At the heart of this issue are IO
actions that throw exceptions. To perform an IO
action inside of a ServerPart
, you must lift ServerPart
‘s IO
monad. This is done like so:
foo <- liftIO $ bar
This snippet executes the IO
action bar
and binds its result to foo
. Since, typically speaking, only IO
actions throw, this represents a good place to start working. We can insert a call to try
here, but this can get tedious given enough IO
actions. Why don’t we create a function?
tryIO :: Maybe (SomeException -> ServerPart Response)
-> IO a
-> ServerPart a
tryIO mf io = do result <- liftIO $ try io
case (result) of Right good -> return good
Left exception -> handle exception mf
where handle exception (Just handler) = escape $ handler exception
handle _ Nothing = mzero
Meet my tryIO
function. This function wraps liftIO
and try
into one nice and neat function. It even takes an optional response generator. Let’s take a look at this bit-by-bit.
tryIO :: Maybe (SomeException -> ServerPart Response)
-> IO a
-> ServerPart a
This function takes a function that takes a SomeException
, and returns a ServerPart Response
. This function is wrapped in a Maybe
. The function also takes an IO
action, and returns a ServerPart a
. Basically, this works like a specialized version of liftIO
. Unlike a vanilla liftIO
, it can take an optional exception handler.
tryIO mf io = do result <- liftIO $ try io
case (result) of Right good -> return good
Left exception -> handle exception mf
Here we have the meat of this function. First, the IO
action is wrapped in a call to try
. Next this is promoted to a ServerPart
via the call to liftIO
. The result of this will be Either
a SomeException
, or a valid result. We use a case
statement to examine the result, and either return the good result, or call handle
and pass it the exception and the exception handler.
where handle exception (Just handler) = escape $ handler exception
handle _ Nothing = mzero
Finally, we handle the exception. If an exception handler was passed in, it’s called, and then the response is returned to Happstack via escape
. If no handler was passed in, the function returns mzero
.
The function escape
is very useful here. This function is basically like returning mzero
, only instead of the response handler failing, it returns a ServerPart Response
immediately. This allows you to break out of a deeply nested function easily without having to manually return from each function.
The result
The result of this new function is that I can use tryIO
as I would have used liftIO
in a ServerPart
function. I can either call…
foo <- tryIO Nothing bar
…and have my response handler fail with mzero
if an exception is thrown, or I can call…
foo <- tryIO (Just baz) bar
…and have my response handler fail and return the result of baz
. baz
could even do some cleanup and try to call the initial response handler over again if it wanted to! Currently, when an IO
action fails in my application, users see this lovely error message:
Much better. Unless you’re a hacker!