# Pure Functional Memoization

There are many computation problems that resonate through the ages. Important Problems. Problems that merit a capital P.

The Halting Problem…

P versus NP…

The Fibonacci sequence…

The Fibonacci sequence is a super-important computation Problem that has vexed mathematicians for generations. It’s so simple!

``````fibs 0 = 0
fibs 1 = 1
fibs n = (fibs $ n - 1) + (fibs $ n - 2)``````

Done! Right? Let’s fire up ghci and find out.

``````*Fibs> fibs 10
55``````

… looking good…

``````*Fibs> fibs 40

``````

… still waiting…

``````
102334155
``````

… Well that sure took an unfortunate amount of time. Let’s try 1000!

``````*Fibs> fibs 1000
*** Exception: <<You died before this returned>>``````

To this day, science wonders what `fibs(1000)` is. Well today we solve this!

## Memoization

The traditional way to solve this is to use memoization. In an imperative language, we’d create an array of size n + 1, and prepopulate `arr[0] = 0` and `arr[1] = 1`. Next we’d loop over 2 to n, and for each `i` we’d set `arr[i] = arr[i-1] + arr[i-2]`.

Unfortunately for us, this is Haskell. What to do… Suppose we had a map of the solutions for 0 to i, we could calculate the solution for i + 1 pretty easily right?

``````fibsImpl _ 0 = 0
fibsImpl _ 1 = 1
fibsImpl m i = (mo + mt)
  where
    mo = Map.findWithDefault undefined (i - 1) m
    mt = Map.findWithDefault undefined (i - 2) m``````

We return 0 and 1 for i = 0 and i = 1 as usual. Next we look up i – 1 and i – 2 in the map and return their sum. This is all pretty standard. But where does the map come from?

It turns out that this is one of those times that laziness is our friend. Consider this code:

``````fibs' n =
  let m = fmap (fibsImpl m) (Map.fromList (zip [0..n] [0..n]))
  in Map.findWithDefault undefined n m``````

When I first saw this pattern (which I call the Wizard Pattern, because it was clearly invented by a wizard), I was completely baffled. We pass the thing we’re creating into the function that’s creating it? Unthinkable!

It turns out that this is just what we need. Because of laziness, the fmap returns immediately, and `m` points to an unevaluated thunk. So, for i = 0 and i = 1, fibsImpl will return 0 and 1 respectively, and the map will map 0 -> 0 and 1 -> 1. Next, for i = 2, Haskell will attempt to look up the two previous values in the map. When it does this, it will be forced to evaluate the results for i = 0 and i = 1, and it will add 2 -> 1 to the map. This will continue all the way through i = n. Finally, this function looks up and returns the value of fibs n in linearish time. (As we all know, Map lookup isn’t constant time, but this is a lot better than the exponential time we had before.)
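If you want to poke at the trick in isolation, here is a minimal self-contained sketch of it. The name `fibMemo` and the list-comprehension way of building the table are my own choices for this sketch, not part of the code above; the only thing it relies on is that `Data.Map` is lazy in its values.

```haskell
import qualified Data.Map as Map

-- Self-contained variant of the lazy-map trick (fibMemo is a name made up
-- for this sketch).
fibMemo :: Integer -> Integer
fibMemo n
  | n < 0     = error "fibMemo: negative index"
  | otherwise = table Map.! n
  where
    -- Data.Map is lazy in its values, so every entry starts out as a thunk
    -- that refers back to the very table it lives in.
    table = Map.fromList [(i, go i) | i <- [0 .. n]]
    go 0 = 0
    go 1 = 1
    go i = table Map.! (i - 1) + table Map.! (i - 2)
```

Loaded into ghci, `fibMemo 10` gives `55` and `fibMemo 100` gives `354224848179261915075`, matching the session below. Each table entry is computed at most once, because once a thunk is forced the map holds the finished value.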

So let’s try it out.

``````*Fibs> fibs' 1
1
*Fibs> fibs' 10
55
*Fibs> fibs' 40
102334155``````

… so far so good…

``````*Fibs> fibs' 100
354224848179261915075
*Fibs> fibs' 1000
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
*Fibs> fibs' 10000

*Fibs> fibs' 100000
``````

Neat. Even that last one only took a few seconds to return!

# Making C++ Exceptions Less Terrible

If you’re a fan of doingmyprogramming, you’ve likely heard me opine on the subject of exceptions. If you’re new around here, I’ll spare you a link to some required reading and just let you know: they’re poop. With the exception of Java and its checked exceptions, they just represent secret ways that a function can crash your whole program. Just about nobody gets it right, not C++, not C#, and the worst offender is probably my beloved Haskell where you can’t even catch them unless you’re in the IO monad.

C++ used to have an annotation (`throw()`), where you could specify what exceptions a function throws, and if none are specified then the function cannot throw. This was considered poor form and deprecated long before I came on the scene. It turns out that there is good news though; `throw()` wasn’t forgotten! It was replaced by `noexcept`, and that is what we’ll be talking about today.

## What is noexcept?

noexcept is an annotation that basically says “this function will never throw an exception”. You can think of it as being like const. If you see some class method:

``void myClass::isPure() const;``

… then you know that, whatever it does, the state of the object will not be mutated. Similarly, if you see some function:

``void neverThrows() noexcept;``

… then you know that, whatever it does, the function will never throw an exception. Optimization implications aside, this is great because if you see a function declared `noexcept`, then you know for sure that you don’t have to put it in a try/catch block. If we ever start living in a world where people use this annotation, then a function not marked `noexcept` is a function that likely throws, and we can act accordingly without having to read through mountains of likely incorrect or incomplete documentation!

Specifically, the meaning of noexcept (according to Microsoft) is:

noexcept ( and its synonym noexcept(true)) specify that the function will never throw an exception or allow an exception to be propagated from any other function that it invokes either directly or indirectly. More specifically, noexcept means the function is noexcept only if all the functions that it calls are also noexcept or const, and there are no potentially evaluated dynamic casts that require a run-time check, typeid expressions applied to a glvalue expression whose type is a polymorphic class type, or throw expressions.

## The Burning Question

So I read that blurb, and thought to myself: “…only if all the functions it calls are noexcept? So, I can’t catch all possible exceptions and be noexcept?”. I felt despair wash over me as C++ got something wrong yet again. Then I thought “there’s no way that’s correct… To the Googler!”

Of course, the Googler didn’t help my case. I decided to give it a shot. First, let’s consider two functions: one that throws and one that is `noexcept`:

``````#include <iostream>

static void doesThrow() noexcept(false)
{
    std::cerr << "Entering doesThrow..." << std::endl;
    throw 42;
    // Unreachable: the throw above always fires.
    std::cerr
        << "Still in doesThrow. Up is down, cats and dogs living together!"
        << std::endl;
}

static void catchesAll() noexcept(true)
{
    std::cerr << "About to enter the try/catch..." << std::endl;
    try
    {
        doesThrow();
    }
    catch (...)
    {
        // Swallow everything so nothing escapes this noexcept function.
        std::cerr << "Caught the exception..." << std::endl;
    }
    std::cerr << "Exited the try/catch..." << std::endl;
}``````

`doesThrow` is annotated `noexcept(false)`, which is the same as saying “This function can throw an exception.” And sure enough, it throws in 100% of its code paths.

`catchesAll` is annotated `noexcept(true)`, which is just the long form of `noexcept`. `noexcept` accepts arguments that evaluate to `bool`, and compile-time magic happens to allow conditional `noexcept`-ness. `noexcept(expr)` can also be used in an if/then/else to conditionally do something based on whether some `expr` is `noexcept`.

In `catchesAll`, we have a try/catch block that prevents the exception thrown by `doesThrow()` from propagating out of `catchesAll()`. Does this work? Let’s write a test harness and see:

``````int main(int argc, char ** argv)
{
    std::cerr << "About to call catchesAll..." << std::endl;
    catchesAll();
    std::cerr << "Managed to not die!" << std::endl;
}``````

… if we compile it with gcc (with `-std=c++11`) and run it, we see that all is well!

``````About to call catchesAll...
About to enter the try/catch...
Entering doesThrow...
Caught the exception...
Exited the try/catch...
Managed to not die!``````

You may be wondering what happens if we don’t catch the exception. Let’s find out! If we remove the try/catch, but leave everything else the same:

``````About to call catchesAll...
About to enter the try/catch...
Entering doesThrow...
terminate called after throwing an instance of 'int'
Aborted (core dumped)``````

We see that the exception propagated out of `catchesAll()` and caused a core dump. This is the expected behavior. In fact, there’s no way we could have prevented this. If some function marked `noexcept` throws, then that function has a bug. If you wrote it, you have to fix it. If somebody else wrote it, then they have to fix it. There’s no band-aid that will keep it from crashing your program. If you find this happening often, then you should consider finding an alternative library that does what the broken dependency claims to do, since they clearly don’t know what they’re doing…

# Demystifying Turing Reductions

Among the many computation theory topics that seem completely inexplicable are Turing Reductions. The basic idea here is that there are certain problems that have been proven to be unsolvable. Given that these problems exist, we can prove that other problems are unsolvable by reducing a known unsolvable problem to them: if the new problem were solvable, the known unsolvable one would be too.

## Turing Machines?

A Turing Machine is a theoretical device. It has an infinitely long tape, divided up into cells. Each cell has a symbol written on it. The Turing Machine can do a few things with this tape: it can read from a cell, write to a cell, move its head one cell to the left, or move its head one cell to the right. If the Turing Machine tries to move the head off the left end of the tape, the head won’t move. Additionally, the Turing Machine can halt and accept or reject. A Turing Machine has a starting configuration: the tape head begins on the leftmost end of the tape, and the tape has an input written on it.

What the Turing Machine does is implementation specific; one Turing Machine may accept a different input than another. This implementation has a mathematical definition, but we care about the high level definition, that looks similar to this:

``````M = "On input w:
1. Read the first cell.
2. If the cell is not blank, accept
3. If the cell is blank, output "Was empty"
onto the tape then reject"``````

This Turing Machine will accept if there is something written on the tape. If the tape is initially empty, it will write “Was empty” onto the tape and reject. This Turing Machine is a decider because it always halts (i.e. it has no infinite loop). We say that the Language of M is the set of all non-empty strings, as it accepts all inputs except the empty string. As you can see, this looks very much like a function written in a programming language. I could implement this in Haskell:

``````m :: String -> (String, Bool)
m "" = ("Was empty", False)
m w = (w, True)``````

Consider the following Turing Machine:

``````M = "On input w:
1. Move the head to the leftmost tape cell.
2. Read from the cell
3. If the cell is blank, accept
4. If the cell is not blank, move the head one cell to
the right
5. Go to step 2"``````

This Turing Machine will accept all inputs including the empty string. Its language is the set of all strings. In Haskell, this might look like this:

``````m :: String -> Bool
m "" = True
m (c:w) = m w``````

Spot the bug? What happens if we call this on an infinite list?

``m ['a'..]``

That’s right, infinite recursion. Despite the bad implementation, this is still a valid Turing Machine. We call this a recognizer. It halts and accepts on all inputs that are in the language, and it either halts and rejects, or doesn’t halt, on all inputs not in the language. A language is decidable if there is a decider for it.

## Decidability is Undecidable

Much like higher-order functions in programming, Turing Machines can be used as the input to other Turing Machines. Take the following:

``````A_TM = "On input <M, w>:
1. If M is not the encoding of a Turing Machine, reject
2. Simulate M on w. If M accepts w, accept. If M rejects
   w, reject"``````

This very simple Turing Machine accepts the encoding of a Turing Machine, and a string, and if the types are right, returns the result of running the machine on the string. In Haskell:

``````aTM :: (String -> (String, Bool))
  -> String -> (String, Bool)
aTM m w = m w``````

Seems straight-forward enough; A_TM takes a Turing Machine, and a string and runs them. This is a recognizer because all Turing Machines are at least recognizers, but is it a decider? To figure that out, we need some more tooling.

Much like in programming, if function `a` calls function `b`, and `b` might infinitely loop, then `a` can also infinitely loop. But let’s suppose we have a function that can take a function as an argument, and get its result, that is guaranteed to never infinitely loop:

``````alwaysReturns :: (String -> (String, Bool))
  -> String -> (String, Bool)
alwaysReturns f arg = undefined -- Magic``````

It doesn’t matter how this function is implemented, it just magically can get the result of calling `f` with `arg` and it will return the correct result 100% of the time, without running it. We can think of it as some perfect static analysis tool. What a world, right? Now suppose I wrote the following function:

``````negate :: (String -> (String, Bool)) -> (String, Bool)
negate f = (result, (not accRej))
  where
    (result, accRej) = alwaysReturns f (show f)``````

Ignoring the fact that functions don’t have `Show` instances, let’s see what’s going on there. The function `negate` takes a function as an argument, and calls `alwaysReturns` on the function, with the `show`’d function as its argument. `negate` then negates the accept/reject result and returns the opposite. In other words, if `alwaysReturns f (show f)` returns `(result, True)`, then `negate` returns `(result, False)`. This function will never infinitely loop thanks to `alwaysReturns`. So, what happens if I do this?

``negate negate``

Let’s follow the logic: `negate` delegates to `alwaysReturns`, which does stuff, then returns the result of `negate negate`. Next negate returns the opposite of what `alwaysReturns` said it would. Thus, `alwaysReturns` did not correctly determine the output of `negate`. Because of this example, we can definitively know that a function that always correctly decides another function can’t exist. Thus there is at least one Turing Machine that is not decidable, and A_TM cannot be a decider.

## So, How About Those Reductions?

Now that we have that all out of the way, let’s talk reductions. It was a bit convoluted, but we’ve proven that the decidability of Turing Machines is undecidable. How can we use this? It turns out that we can use this to prove other problems undecidable. Let’s use my Turing Machines as functions idea to talk about some other undecidable problems.

#### If a TM Halts is Undecidable

Suppose we had the following function:

``````alwaysHalts :: (String -> (String, Bool)) -> Bool
alwaysHalts f = undefined -- Magic``````

This function returns True if the passed-in function will never infinitely loop. If this function existed, then we could implement a function to decide any Turing Machine:

``````perfectDecider :: (String -> (String, Bool))
  -> String -> (String, Bool)
perfectDecider f w
  | alwaysHalts f = f w
  | otherwise = (w, False)``````

This function first tests to see if it’s safe to call `f`, and if so, returns `f w`. If it’s not safe to call `f`, it just returns `(w, False)`. However, we already proved that this function couldn’t exist, so the thing that makes it possible must also be impossible.

Suppose we had a Turing Machine `R` that is an implementation of `alwaysHalts`: it accepts every `<M, w>` where `M` halts on `w`, and rejects the rest. The Turing Machine language implementation of `perfectDecider` would look like this:

``````D = "On input <M, w>:
1. Simulate TM R on <M, w>. If R rejects, reject
2. Simulate M on w. If M accepts, accept.
   If M rejects, reject"``````

As you can see, the logic is similar. We run `R` on `<M, w>`. If `R` rejects, that means that `M` didn’t halt, so we reject. Otherwise we return the result of running `M` on `w`.

#### If the Language of a TM is empty is undecidable

Suppose we had the following function:

``````lIsEmpty :: (String -> (String, Bool)) -> Bool
lIsEmpty f = undefined -- Magic``````

This function will return True if the language of the provided function is empty, and False if not. Similar to the halting problem above, we want to implement `perfectDecider`. But how would we do that?

``````perfectDecider :: (String -> (String, Bool))
  -> String -> Bool
perfectDecider f w = not (lIsEmpty emptyAsReject)
  where
    -- Rejects everything except w; on w it defers to f.
    emptyAsReject w' =
      if (w' == w)
        then f w'
        else (w', False)``````

So, what’s going on here? We have a function that can decide if the language of a TM is empty. So if we want to construct a decider using this, we need to exploit this property. We write a closure that has this property. The function `emptyAsReject` automatically rejects any string that does not equal `w`. Otherwise, it runs `f w'`. Thus, if `f` rejects `w`, then the language of `emptyAsReject` is empty. Otherwise it contains the single string `w`. Thus we treat the empty set as False, and anything else as True.
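To make the shape of the reduction a bit more tangible, here is a rough sketch where the oracle is passed in as an argument instead of being magic. Everything named here (`TM`, `decideWith`, `fakeIsEmpty`, `nonEmpty`) is made up for the sketch, and `fakeIsEmpty` is emphatically not a real emptiness decider: it only probes a finite list of strings you hand it, which is exactly the cheat a real one isn’t allowed.

```haskell
-- A Turing Machine modelled as a function, as elsewhere in this post.
type TM = String -> (String, Bool)

-- Machine whose language is {w} when f accepts w, and empty otherwise.
emptyAsReject :: TM -> String -> TM
emptyAsReject f w w'
  | w' == w   = f w'
  | otherwise = (w', False)

-- The reduction itself: any emptiness oracle yields an acceptance decider.
decideWith :: (TM -> Bool) -> TM -> String -> Bool
decideWith lIsEmpty f w = not (lIsEmpty (emptyAsReject f w))

-- Stand-in oracle for experimenting (an assumption of this sketch): it only
-- checks a fixed, finite list of candidate strings.
fakeIsEmpty :: [String] -> TM -> Bool
fakeIsEmpty candidates tm = not (any (snd . tm) candidates)

-- The first machine from this post: accepts any non-empty input.
nonEmpty :: TM
nonEmpty "" = ("Was empty", False)
nonEmpty w  = (w, True)
```

In ghci, `decideWith (fakeIsEmpty ["", "a", "abc"]) nonEmpty "abc"` comes back `True` and the same call with `""` comes back `False`, as long as the string you ask about is in the candidate list. The point is that `decideWith` does no real work of its own; all the impossibility lives in the oracle.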

We can use this method to prove plenty of other problems undecidable.

#### {<M> | M is a Turing Machine, and the language of M is {“happy”, “time”, “gatitos”}} is undecidable

In the culmination of this long-winded post, we’ll prove that it is undecidable if a Turing Machine’s language is {“happy”, “time”, “gatitos”}. As before, we’ll assume we have the following function:

``````lIsHappyTimeGatitos :: (String -> (String, Bool)) -> Bool
lIsHappyTimeGatitos f = undefined -- Magic``````

We’ll use this to solve the ultimate problem in computer science!

``````perfectDecider :: (String -> (String, Bool))
  -> String -> Bool
perfectDecider f w = lIsHappyTimeGatitos hgtAsAccept
  where
    -- Always accepts "happy" and "time"; accepts "gatitos" only if f accepts w.
    hgtAsAccept "happy" = ("happy", True)
    hgtAsAccept "time" = ("time", True)
    hgtAsAccept "gatitos" = f w
    hgtAsAccept s = (s, False)``````

Here, we construct a closure that has the desired language if `f` accepts `w`, and does not have the desired language if `f` rejects `w`. `hgtAsAccept` always accepts “happy” and “time”, but only accepts “gatitos” if `f` accepts `w`. Thus, if `f` does not accept `w`, then `lIsHappyTimeGatitos` will reject `hgtAsAccept`, and vice versa.

## Well, When You Put It That Way…

Math is nice and all, but we’re programmers. I feel that this is a concept that is not that hard, but when math runes get involved, things get difficult. I think you can see how this works, and grasp the concept. Thinking of these reductions as functions helped me, and I hope it helps you too.

# I Wonder… Trinary Operators in Haskell

Often, while out for my daily run, I think about programming. Sometimes I think about my project, sometimes I think about upcoming projects, sometimes I think about the blog. Then there’s the times where I think about completely silly things like “I wonder if you can make an operator that takes three arguments?”

Wouldn’t that be grand.

I suppose I could just google it, but where’s the fun in that? Let’s give it a shot!

``````(+++) :: (Num a)
  => a
  -> a
  -> a
  -> a
(+++) a b c = a + b + c``````

Here I’ve created a fairly straight-forward function: It takes three arguments, and adds them, returning the sum. Let’s load this up in `ghci` and see if it barfs.

``````Prelude> :l SuperSum.hs
[1 of 1] Compiling SuperSum             ( SuperSum.hs, interpreted )
Ok, modules loaded: SuperSum.
*SuperSum>``````

…so far so good, let’s put it through the paces!

``````*SuperSum> :t (+++)
(+++) :: Num a => a -> a -> a -> a``````

…this checks out, let’s call it as prefix…

``````*SuperSum> (+++) 1 2 3
6``````

…check! Let’s partially apply it with two arguments like a regular operator…

``````*SuperSum> :t 1 +++ 2
1 +++ 2 :: Num a => a -> a``````

…makes sense, this returns a function that takes a `Num` and returns a `Num`. Let’s quit beating around the bush and call it!

``````*SuperSum> 1 +++ 2 3

<interactive>:21:3:
No instance for (Num a0) arising from a use of `+++'
The type variable `a0' is ambiguous
Possible fix: add a type signature that fixes these type variable(s)
Note: there are several potential instances:
instance Num Double -- Defined in `GHC.Float'
instance Num Float -- Defined in `GHC.Float'
instance Integral a => Num (GHC.Real.Ratio a)
-- Defined in `GHC.Real'
...plus three others
In the expression: 1 +++ 2 3
In an equation for `it': it = 1 +++ 2 3

<interactive>:21:7:
No instance for (Num (a1 -> a0)) arising from the literal `2'
Possible fix: add an instance declaration for (Num (a1 -> a0))
In the expression: 2
In the second argument of `(+++)', namely `2 3'
In the expression: 1 +++ 2 3

<interactive>:21:9:
No instance for (Num a1) arising from the literal `3'
The type variable `a1' is ambiguous
Possible fix: add a type signature that fixes these type variable(s)
Note: there are several potential instances:
instance Num Double -- Defined in `GHC.Float'
instance Num Float -- Defined in `GHC.Float'
instance Integral a => Num (GHC.Real.Ratio a)
-- Defined in `GHC.Real'
...plus three others
In the first argument of `2', namely `3'
In the second argument of `(+++)', namely `2 3'
In the expression: 1 +++ 2 3``````

Yikes! It’s like we’re doing some Java! Well, that message is certainly unhelpful, let’s see what ghci thinks the type of this is:

``````*SuperSum> :t 1 +++ 2 3
1 +++ 2 3 :: (Num a, Num (a1 -> a), Num a1) => a -> a``````

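Also unhelpful. As far as I can tell, the issue here is precedence. Function application binds tighter than any infix operator, so `1 +++ 2 3` parses as `1 +++ (2 3)`, which tries to apply the literal `2` to `3`. Basically, if we add some parentheses or a dollar sign, this will work just fine: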

``````*SuperSum> (1 +++ 2) 3
6
*SuperSum> :t (1 +++ 2) 3
(1 +++ 2) 3 :: Num a => a
*SuperSum> 1 +++ 2 $ 3
6
*SuperSum> :t 1 +++ 2 $ 3
1 +++ 2 $ 3 :: Num a => a``````

When we are explicit about our precedence, the functions work as expected, and we get an “infix function with more than two arguments”. The moral of the story? You can do it, but you shouldn’t.

Now that we’ve solved this mystery by ourselves, let’s see if there’s any documentation on the issue.

Google turns up very little on the subject, however the first result seems promising. From the bottom of that page:

for a function taking more than two arguments, you can do it but it’s not nearly as nice

Now let us never speak of this again…

# K&R Challenge 3 and 4: Functional Temperature Conversion

The other day, I implemented the C solution to exercises 3 and 4 in The C Programming Language. Today I’ll be implementing the Haskell solution. As a reminder, the requirements are:

Modify the temperature conversion program to print a heading above the table

… and …

Write a program to print the corresponding Celsius to Fahrenheit table.

I could take the easy way out, and implement the Haskell solution almost identically to the C solution, replacing the `for` loop with a call to `map`, but that’s neither interesting to do, nor is it interesting to read about. I’ll be taking this problem to the next level.

## Requirements

For my temperature program, I’d like it to be able to convert between any arbitrary temperature unit. For the purposes of this post, I will be implementing Celsius, Fahrenheit, Kelvin (equivalent to Celsius, except 0 degrees is absolute zero, not the freezing point of water), and Rankine (the Fahrenheit version of Kelvin).

That said, nothing about my solution should rely on the fact that these four temperature scales are being implemented. A programmer should be able to implement a new temperature unit with minimal changes to the program. Ideally, just by implementing a new type.

Additionally, given a bunch of conversions, the program should be able to output a pretty table showing any number of temperature types and indicating what converts to what.

## First, Conversion

This problem is clearly broken up into two sub-problems: conversion, and the table. First, we need to handle conversion. This being Haskell, the right answer is likely to start by defining some types. Let’s create types for our units:

``````newtype Celsius = Celsius Double deriving (Show)
newtype Fahrenheit = Fahrenheit Double deriving (Show)
newtype Kelvin = Kelvin Double deriving (Show)
newtype Rankine = Rankine Double deriving (Show)``````

I’ve chosen to make a type for each unit, and since they only contain one constructor with one field, I use a `newtype` instead of a `data`. Now, how to convert these? A straightforward solution to this would be to define functions that convert each type to each other type. Functions that look like:

``````celsiusToFahrenheit :: Celsius -> Fahrenheit
celsiusToKelvin :: Celsius -> Kelvin

...

rankineToKelvin :: Rankine -> Kelvin
rankineToCelsius :: Rankine -> Celsius``````

A diagram for these conversions has an arrow from every unit to every other unit. That’s a lot of conversions! With four units, a function for every ordered pair comes to twelve of them. One might argue that it’s manageable, but it certainly doesn’t meet the requirement that implementing a new unit should take minimal work; to implement a new unit, you’d need to define many conversion functions as well! There must be a better way.

Let’s think back to chemistry class. You’re tasked with converting litres to hours or somesuch. Did you do that in one operation? No, you used a bunch of intermediate conversions to get to what you needed. If you know that X litres are Y dollars, and Z dollars is 1 hour, then you know how many litres are in 1 hour! These are called conversion factors.

Luckily for us, our conversions are much simpler. For any temperature unit, if we can convert it to and from celsius, then we can convert it to and from any other unit! Let’s define a typeclass for `Temperature`:

``````class Temperature a where
  toCelsius :: a -> Celsius
  fromCelsius :: Celsius -> a
  value :: a -> Double
  scaleName :: a -> String
  convert :: (Temperature b) => a -> b
  convert f = fromCelsius $ toCelsius f``````

Our `Temperature` typeclass has five functions: functions to convert to and from celsius, a function to get the value from a unit, a function to get the name of a unit, and a final function `convert`. This final function has a default implementation that converts a unit to celsius, then from celsius. Using the type inferrer, this will convert any unit to any other unit!

``````convert (Rankine 811) :: Kelvin
convert (Celsius 123) :: Fahrenheit
convert (Kelvin 10000) :: RelativeToHeck``````

Now to implement a new temperature, you only need to implement four functions, as `convert` is a sane solution for all cases. This arrangement gives us a conversion diagram with Celsius in the middle and one pair of arrows to each other unit. Much better. Let’s go through our `Temperature` implementations for our types:

``````instance Temperature Celsius where
  toCelsius c = c
  fromCelsius c = c
  value (Celsius c) = c
  scaleName _ = "Celsius"``````

Of course `Celsius` itself has to implement `Temperature`. Its implementation is trivial though; no work needs to be done.

``````instance Temperature Fahrenheit where
  toCelsius (Fahrenheit f) = Celsius ((5 / 9) * (f - 32))
  fromCelsius (Celsius c) = Fahrenheit ((9 / 5) * c + 32)
  value (Fahrenheit f) = f
  scaleName _ = "Fahrenheit"``````

Now things are heating up. The conversion functions are identical to the C implementation.

``````instance Temperature Kelvin where
  toCelsius (Kelvin k) = Celsius (k - 273.15)
  fromCelsius (Celsius c) = Kelvin (c + 273.15)
  value (Kelvin k) = k
  scaleName _ = "Kelvin"``````

The `Kelvin` implementation looks much like the `Fahrenheit` one.

``````instance Temperature Rankine where
  toCelsius (Rankine r) = toCelsius $ Fahrenheit (r - 459.67)
  fromCelsius c = Rankine $ 459.67 + value (fromCelsius c :: Fahrenheit)
  value (Rankine r) = r
  scaleName _ = "Rankine"``````

The conversion between Fahrenheit and Rankine is much simpler than the conversion between Celsius and Rankine; therefore I will do just that. After converting to and from Fahrenheit, it’s a simple matter of calling `toCelsius` and `fromCelsius`.
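To check the “new unit with minimal changes” claim, here is a small standalone sketch that bolts on a fifth scale. The `Reaumur` type is purely my illustration (the Réaumur scale puts freezing at 0 and boiling at 80, so it is Celsius times 4/5), and I’ve left `scaleName` out of the class to keep the sketch short; it is not part of the program in this post.

```haskell
newtype Celsius    = Celsius Double    deriving (Show)
newtype Fahrenheit = Fahrenheit Double deriving (Show)
-- Hypothetical extra unit for this sketch: 0 at freezing, 80 at boiling.
newtype Reaumur    = Reaumur Double    deriving (Show)

class Temperature a where
  toCelsius   :: a -> Celsius
  fromCelsius :: Celsius -> a
  value       :: a -> Double
  -- Default conversion: hop through Celsius.
  convert     :: (Temperature b) => a -> b
  convert = fromCelsius . toCelsius

instance Temperature Celsius where
  toCelsius   = id
  fromCelsius = id
  value (Celsius c) = c

instance Temperature Fahrenheit where
  toCelsius (Fahrenheit f) = Celsius ((5 / 9) * (f - 32))
  fromCelsius (Celsius c)  = Fahrenheit ((9 / 5) * c + 32)
  value (Fahrenheit f)     = f

-- The only code the new unit needs: to and from Celsius, plus value.
instance Temperature Reaumur where
  toCelsius (Reaumur r)   = Celsius (r * 5 / 4)
  fromCelsius (Celsius c) = Reaumur (c * 4 / 5)
  value (Reaumur r)       = r
```

With that loaded, `convert (Reaumur 80) :: Fahrenheit` should land at about 212 degrees, even though nobody ever wrote a Réaumur-to-Fahrenheit function. None of the existing instances had to change.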

## Bringing The Monads

Now that the easy part is done, we get to create the table. Our table should have as many columns as it needs to display an arbitrary number of conversions. To that end, let’s define a data structure or two:

``````data ConversionTable = ConversionTable [String]
  [[TableRowElem]] deriving (Show)

data TableRowElem = From Double | To Double
  | NullConv Double
  | Empty deriving (Show)``````

The `ConversionTable`, like the name suggests, is our table. The list of strings is the header, and the list of lists of `TableRowElem` are our conversions. Why not just have a list of `Double`? We need to have our cells contain information on what they mean.

To that end, I created a `TableRowElem` type. `From` is an original value, `To` is a converted value, `NullConv` represents the case where we convert from some type to the same type, and `Empty` is an empty cell. The problem of how to place elements into this data structure still remains however. To solve that, things are going to get a bit monadic. Let’s define some intermediate builder types:

``````type Conversion a b = (a, b)

toConversion :: (Temperature a, Temperature b) =>
  a ->
  (a, b)
toConversion a = (a, convert a)``````

First we have `Conversion`, and the corresponding `toConversion` function. This simply takes a unit, and places it in a tuple with its corresponding conversion. Next, we have the `TableBuilder`:

``````type TableBuilder a = WriterT [[TableRowElem]]
  (State [String]) a``````

Here we have a `WriterT` stacked on top of a `State` monad. The writer transformer contains the list of table rows, and the state monad contains the header. The idea is that as rows are “logged” into the writer, the header is checked to make sure no new units were introduced. To this end, if only two units are introduced, the table will have two columns. If 100 units are used, then the table will have 100 columns.
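If the stacking is hard to picture, here is a stripped-down sketch of the same shape, with plain strings standing in for rows. The names `Builder`, `addRow`, and `runBuilder` are invented for this sketch, and I’m importing from the `transformers` modules here; the `mtl` modules `Control.Monad.Writer` and `Control.Monad.State` would serve just as well.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (State, get, put, runState)
import Control.Monad.Trans.Writer (WriterT, runWriterT, tell)

-- Rows are "logged" by the writer; the state underneath holds the header.
type Builder a = WriterT [String] (State [String]) a

-- Record one row and make sure its unit name appears in the header once.
addRow :: String -> String -> Builder ()
addRow unit row = do
  header <- lift get
  lift (put (if unit `elem` header then header else header ++ [unit]))
  tell [row]

-- Run the stack: the writer yields the rows, the state yields the header.
runBuilder :: Builder a -> ([String], [String])
runBuilder b =
  let ((_, rows), header) = runState (runWriterT b) []
  in (header, rows)
```

Running `runBuilder (addRow "Celsius" "100.0" >> addRow "Kelvin" "373.15" >> addRow "Celsius" "0.0")` gives `(["Celsius","Kelvin"],["100.0","373.15","0.0"])`: three rows logged, but only two header entries, because the second Celsius row found its unit already present.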

NOTE: I realize that `WriterT` and `State` are not in the standard library. I only promised to limit the usage of libraries for Haskell solutions. This means avoiding the use of things like Parsec or Happstack: frameworks and libraries that vastly simplify some problem or change the way you approach it. To this end, if I feel a monad transformer or anything along these lines is appropriate to a problem, I will use it. I’ll try to point out when I do though. Besides, I could have just re-implemented these things, but in the interest of not being a bad person and re-inventing the wheel, I’ve decided to use a wheel off the shelf.

So, how do we use this `TableBuilder`? I’ve defined a function for use with this monad:

``````insertConv :: (Temperature a, Temperature b) =>
  Conversion a b ->
  TableBuilder ()
insertConv (a, b) =
  do oldHeader <- lift $ get
     workingHeader <- ensureElem a oldHeader
     finalHeader <- ensureElem b workingHeader
     lift $ put finalHeader
     tell [buildRow a b finalHeader []]
  where ensureElem a h = return $ case ((scaleName a) `elem` h)
                                  of True -> h
                                     False -> h ++ [(scaleName a)]
        buildRow _ _ [] r = r
        buildRow a b (h:xs) r
          | (scaleName a) == (scaleName b) && (scaleName a) == h = r ++ [NullConv $ value a]
          | (scaleName a) == h = buildRow a b xs (r ++ [From $ value a])
          | (scaleName b) == h = buildRow a b xs (r ++ [To $ value b])
          | otherwise = buildRow a b xs (r ++ [Empty])``````

Yeah, that one is kind of a doozy. Let me walk you through it. This function takes a `Conversion`, and returns a `TableBuilder`.

In the first four lines of the `do` block, we update the header. We `lift` into the `State` monad and `get` the old header, call `ensureElem` with the first and second units, then `put` the new updated header back into the `State` monad.

The `ensureElem` function checks the header list to see if the current unit is a member. If it is, the header list is returned unchanged, if it’s not the unit is appended to the end and the new list is returned. In this way, whenever a conversion is added to the table, the header is updated.

After updating the header, we call `tell` with the result of `buildRow`, “writing” the row into the `Writer` monad. The `buildRow` function recursively adds `TableRowElem`s to the result list depending on the current heading. In this way, conversions are placed in the appropriate column.

In addition to that function, I’ve defined a function to simplify working with the `TableBuilder`:

``````buildTable :: TableBuilder a -> ConversionTable
buildTable b =
  let result = runState (runWriterT b) []
  in ConversionTable (snd result) (snd $ fst result)``````

Working with some of these `MTL` monads can be confusing for people coming from imperative backgrounds. I’ve been working with Haskell for almost a year now and I still get extremely confused by them. It can take some muddling through Haddock pages to work them out, but the good news is that you mainly just need to define one function that takes a monad (in the form of a `do` block), and returns a whatever. The `buildTable` function takes a `TableBuilder`, and returns a `ConversionTable`. It handles calls to `runState` and `runWriterT`, and then unwraps the resulting tuple and builds the `ConversionTable`.
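The tuple unwrapping is the confusing part, so here is a small sketch of the shape of that result. It assumes the `transformers` package and a simplified `Builder` type of my own (a `WriterT` of row strings over a `State` of header strings); the real `TableBuilder` may be defined differently.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (State, get, put, runState)
import Control.Monad.Trans.Writer.Strict (WriterT, runWriterT, tell)

-- Hypothetical stand-in for TableBuilder: rows are "written",
-- the header lives in the State underneath.
type Builder = WriterT [String] (State [String])

addRow :: String -> Builder ()
addRow name = do
  header <- lift get
  lift (put (header ++ [name]))
  tell ["row for " ++ name]

-- runWriterT peels off the Writer layer, runState the State layer,
-- leaving ((value, rows), header).
runBuilder :: Builder a -> ([String], [String])
runBuilder b =
  let ((_, rows), header) = runState (runWriterT b) []
  in (header, rows)
```

So `snd result` is the final state (the header), and `snd (fst result)` is everything that was passed to `tell` (the rows), which matches what `buildTable` pulls out.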

This function can be called like this:

``````buildTable \$ do
  insertConv someConversion
  insertConv someOtherConversion``````

… and so on. The only thing to remember is that the final value of `a` for the `do` block must be `()`. Conveniently, `insertConv` returns a value of type `TableBuilder ()`, so if the last call is to this function, then you are good. You can also always end it with `return ()` if you like.

## Pretty Printing

Finally, we have the matter of printing a nice pretty table. For that, we need yet another function:

``````prettyPrint :: ConversionTable ->
               String
prettyPrint (ConversionTable h r) =
  let widestCol = last \$ sort \$ map length h
      columnCount = length h
      doubleCell = printf ("%-" ++ (show widestCol) ++ ".1f")
      stringCell = printf ("| %-" ++ (show widestCol) ++ "s |")
      emptyCell = replicate widestCol ' '
      horizontalR = (replicate (((widestCol + 4) * columnCount) + 2) '-') ++ "\n"
      formatRow row = "|" ++ (concat \$ map formatCell row) ++ "|\n"
      formatCell (From from) = "| " ++ (doubleCell from) ++ " |"
      formatCell (To to) = "> " ++ (doubleCell to) ++ " |"
      formatCell Empty = "| " ++ emptyCell ++ " |"
      formatCell (NullConv nc) = "| " ++ (doubleCell nc) ++ " |"
  in horizontalR
     ++ ("|" ++(concat \$ map stringCell h) ++ "|\n")
     ++ horizontalR
     ++ (concat \$ map formatRow (normalizeRowLen (columnCount) r))
     ++ horizontalR
  where normalizeRowLen len rows = map (nRL' len) rows
          where nRL' len' row
                  | (length row) < len' = nRL' len' (row ++ [Empty])
                  | otherwise = row``````

Yeah… Sometimes the littlest things take the most work. You’d think all this plumbing we’ve been doing would be the most complicated bit, but you’d be wrong. Let’s try to make sense of this mess function by function:

``widestCol = last \$ sort \$ map length h``

This function determines the widest column based on the header. Typically, this is going to be “Fahrenheit”, but it doesn’t have to be. It should be noted that if a data cell is wider than this, then the pretty printer will mess up. Like most things in life, there is room for improvement here. That said, unless you’re converting the temperature of the core of the sun, you probably won’t have an issue here.

``columnCount = length h``

Returns the number of columns in the table. Used by the horizontal rule function.

``doubleCell = printf ("%-" ++ (show widestCol) ++ ".1f")``

Ahh, our old friend `printf`. It exists in Haskell and works in much the same way as it did in C. The `doubleCell` function converts a temperature value to a string, left aligns it, pads it by `widestCol`, and has it show one decimal place.
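Here is a small standalone sketch of the same trick of building the format string at run time. The explicit `width` parameter is my addition so the helpers work outside of `prettyPrint`; inside it, `widestCol` plays that role.

```haskell
import Text.Printf (printf)

-- Left-align a Double in a field of the given width, one decimal place.
doubleCell :: Int -> Double -> String
doubleCell width = printf ("%-" ++ show width ++ ".1f")

-- Left-align a String in a field of the given width.
stringCell :: Int -> String -> String
stringCell width = printf ("%-" ++ show width ++ "s")
```

With a width of 10, `doubleCell 10 37.77777777777778` gives `"37.8"` followed by six spaces, and `stringCell 10 "Celsius"` gives `"Celsius"` followed by three.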

``stringCell = printf ("| %-" ++ (show widestCol) ++ "s |")``

Much like with `doubleCell`, this function pads, and left-aligns a string. This is used by the header.

``emptyCell = replicate widestCol ' '``

This one is pretty self-explanatory. It prints an empty cell of the appropriate width.

``horizontalR = (replicate (((widestCol + 4) * columnCount) + 2) '-') ++ "\n"``

This function prints a horizontal rule. This will be a solid line of “-” across the width of the table.

``formatRow row = "|" ++ (concat \$ map formatCell row) ++ "|\n"``

This function formats a table data row. It maps `formatCell` over the list of cells, flattens it, then adds a pretty border around it.

``````formatCell (From from) = "| " ++ (doubleCell from) ++ " |"
formatCell (To to) = "> " ++ (doubleCell to) ++ " |"
formatCell Empty = "| " ++ emptyCell ++ " |"
formatCell (NullConv nc) = "| " ++ (doubleCell nc) ++ " |"``````

In this function, much of the work is done. It formats the cell using `doubleCell` or `emptyCell`, then applies a border to the cell. It denotes a cell containing a `To` by adding a `>` on the left.

Now that we’ve covered the `let`-bound functions, let’s talk about the actual function body:

``````horizontalR
++ ("|" ++(concat \$ map stringCell h) ++ "|\n")
++ horizontalR
++ (concat \$ map formatRow (normalizeRowLen (columnCount) r))
++ horizontalR``````

This bit is pretty straightforward. First, it prints a horizontal line. Second, it maps `stringCell` over the header list, flattens it, and gives it a border. Third, it prints another horizontal line. Fourth, it maps `formatRow` over the normalized row list, then flattens it. Finally, one last horizontal line. After this is all said and done, it concats it all together.

You may be wondering about that `normalizeRowLen` function. If you were paying particularly close attention to the `insertConv` function, you may have noticed an issue. Let’s walk through it in ghci:

``````*Main> let fc = toConversion (Fahrenheit 100) :: (Fahrenheit, Celsius)
*Main> buildTable \$ do insertConv fc
ConversionTable ["Fahrenheit","Celsius"] [[From 100.0,To 37.77777777777778]]``````

We add one conversion, we get two columns. Everything seems to be in order here, but let’s add another conversion and see what happens:

``````*Main> let fc = toConversion (Fahrenheit 100) :: (Fahrenheit, Celsius)
*Main> let cr = toConversion (Celsius 100) :: (Celsius, Rankine)
*Main> buildTable \$ do {insertConv fc; insertConv cr;}
ConversionTable ["Fahrenheit","Celsius","Rankine"] [[From 100.0,To 37.77777777777778],[Empty,From 100.0,To 671.6700000000001]]``````

See the problem? Let’s add some newlines to make it clearer:

``````ConversionTable ["Fahrenheit","Celsius","Rankine"]
[[From 100.0,To 37.77777777777778],
[Empty,From 100.0,To 671.6700000000001]]``````

As we add more columns, the rows with fewer columns are never updated to have the new column count. Logically, this is fine, since the extra entries would just be `Empty` anyway, but our pretty printer would print this table like so:

``````--------------------------------------------
|| Fahrenheit || Celsius    || Rankine    ||
--------------------------------------------
|| 100.0      |> 37.8       ||
||            || 100.0      |> 671.7      ||
--------------------------------------------``````

As you add more and more columns, the problem gets worse and worse. Enter our `normalizeRowLen` function:

``````normalizeRowLen len rows = map (nRL' len) rows
  where nRL' len' row
          | (length row) < len' = nRL' len' (row ++ [Empty])
          | otherwise = row``````

This is another fairly straightforward function. If the row has the same number of columns as the header, it is returned unchanged. If it doesn’t, `Empty` is added to the end until it does.
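For comparison, the same padding can be done without explicit recursion. This is an alternative sketch, not the version used above; `padTo` is a made-up name, and it works for any element type rather than just `TableRowElem`.

```haskell
-- Pad a row with a filler value up to the given length.
-- Rows that are already long enough come back unchanged, because
-- replicate returns [] for a zero or negative count.
padTo :: Int -> a -> [a] -> [a]
padTo len filler row = row ++ replicate (len - length row) filler

-- Pad every row in a table to the same length.
padAll :: Int -> a -> [[a]] -> [[a]]
padAll len filler = map (padTo len filler)
```

Under that assumption, `normalizeRowLen len rows` would correspond to `padAll len Empty rows`.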

With that, our program is complete. Let’s try it out:

``````main = do
  k <- return (toConversion \$ Kelvin 100 :: (Kelvin, Rankine))
  f <- return (toConversion \$ Fahrenheit 451 :: (Fahrenheit, Kelvin))
  r <- return (toConversion \$ Rankine 234 :: (Rankine, Celsius))
  c <- return (toConversion \$ Celsius 9 :: (Celsius, Fahrenheit))
  nc <- return (toConversion \$ Rankine 123 :: (Rankine, Rankine))

  putStrLn \$ prettyPrint \$ buildTable \$ do
    insertConv k
    insertConv f
    insertConv r
    insertConv c
    insertConv nc``````

In our `main`, we create a bunch of conversions. Then we `prettyPrint` them and `putStrLn` the result. The following will be printed to the console:

``````----------------------------------------------------------
|| Kelvin     || Rankine    || Fahrenheit || Celsius    ||
----------------------------------------------------------
|| 100.0      |> 180.0      ||            ||            ||
|> 505.9      ||            || 451.0      ||            ||
||            || 234.0      ||            |> -143.2     ||
||            ||            |> 48.2       || 9.0        ||
||            || 123.0      ||            ||            ||
----------------------------------------------------------``````

Any type that implements `Temperature` can be put into a table this way. To add a new unit to the program, it’s as easy as implementing four one-line functions!

# Maybe I Should Be In The Maybe Monad

If you’ve spent any time with Haskell, then you’ve surely encountered `Maybe`. `Maybe` is Haskell’s version of testing your pointer for `NULL`, only it’s better because it’s impossible to accidentally dereference a `Nothing`.

You’ve also probably thought it was just so annoying. You test your `Maybe a` to ensure it’s not `Nothing`, but you still have to go about getting the value out of the `Maybe` so you can actually do something with it.

## The Problem

Let’s forgo the usual contrived examples and look at an actual problem I faced. While working on the Server Console, I was faced with dealing with a query string. The end of a query string contains key/value pairs. Happstack conveniently decodes these into this type:

``[(String, Input)]``

I needed to write a function to look up a key within this list and return its value. As we all know, there’s no way to guarantee that a given key is in the list, so the function must be able to handle this. There are a few ways we could go about this, but this seems to me to be an ideal place to use a `Maybe`. Suppose we write our lookup function like so:

``lookup :: Request -> String -> Maybe String``

This is logically sound, but now we have an annoying `Maybe` to work with. Suppose we’re working in a `ServerPart Response`. We might write a response function like so:

``````handler :: ServerPart Response
handler = do req <- askRq
             paths <- return \$ rqPaths req
             page <- return \$ lookup req "page_number"
             case page of Nothing -> mzero
                          (Just a) -> do items <- return \$ lookup req "items_per_page"
                                         case items of Nothing -> mzero
                                                       (Just b) -> h' paths a b``````

Yucky! After each call to lookup, we check to see if the call succeeded. This gives us a giant tree that’s surely pushing off the right side of my blog page. There must be a better way.

## Doing It Wrong

Shockingly, this is not the best way to do this. It turns out that writing our functions in the `Maybe` monad is the answer. Take the following function:

``````hTrpl :: Request -> Maybe ([String], String, String)
hTrpl r = do paths <- return \$ rqPaths r
             page <- lookup r "page_number"
             items <- lookup r "items_per_page"
             return (paths, page, items)``````

… now we can re-write `handler` like so:

``````handler :: ServerPart Response
handler = do req <- askRq
             triple <- return \$ hTrpl req
             case triple of Nothing -> mzero
                            (Just (a, b, c)) -> h' a b c``````

Much better, right? But why don’t we have to test the return values of `lookup`? The answer to that question lies in the implementation of `Maybe`’s `>>=` operator:

``````instance  Monad Maybe  where
    (Just x) >>= k = k x
    Nothing  >>= _ = Nothing``````

Recall that `do` notation is just syntactic sugar around `>>=` and a whole bunch of lambdas. With that in mind, you can see that we are actually binding functions together. Per `Maybe`’s bind implementation, if you bind `Just a` to a function, it calls the function on `a`. If you bind `Nothing` to a function, it ignores the function, and just returns `Nothing`.
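To make the sugar concrete, here is a sketch of the same idea written both ways. It assumes a plain list of key/value pairs and the Prelude’s `lookup` rather than a Happstack `Request`; the `Query` type and both function names are invented for the example.

```haskell
-- Hypothetical query: plain key/value pairs instead of a Request.
type Query = [(String, String)]

-- Written with do notation.
pagingDo :: Query -> Maybe (String, String)
pagingDo q = do
  page  <- lookup "page_number" q
  items <- lookup "items_per_page" q
  return (page, items)

-- The same function with the do block written out as explicit binds.
pagingBind :: Query -> Maybe (String, String)
pagingBind q =
  lookup "page_number" q >>= \page ->
  lookup "items_per_page" q >>= \items ->
  return (page, items)
```

If either key is missing, the corresponding `lookup` yields `Nothing`, the lambdas after it are never called, and the whole expression is `Nothing`.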

What this means for us is that so long as we’re inside the `Maybe` monad, we can pretend all functions return successful values. `Maybe` allows us to defer testing for failure! The first time a `Nothing` is returned, functions stop getting called, so we don’t even have to worry about performance losses from not immediately returning from the function! So long as we’re inside of `Maybe`, there will be Peace On Earth. We code our successful code branch, and then when all is said and done and the dust has settled, we can see if it all worked out.

Next time you find yourself testing a `Maybe` more than once in a function, ask yourself: should I be in the `Maybe` monad right now?

# An Intro To Parsec

Taking a break from the Photo Booth, I’ve been working on a re-write of dmp_helper. If you’ve been following me for a while, you may remember that as a quick, dirty hack I did in Perl one day to avoid manually writing out HTML snippets. While it mostly works, it’s objectively terrible. I guess if my goal is to never get a job doing Perl (probably not the worst goal to have) then it might serve me well.

But I digress. `DmpHelper`, the successor to `dmp_helper`, is going to be written in Haskell, and make extensive use of the Parsec parser library. The old dmp_helper makes extensive use of regular expressions to convert my custom markup to HTML, but I’d prefer to avoid them here. I find regular expressions to be hard to use and hard to read.

So I set off to work. And not 3 lines into `main :: IO ()`, I came across a need to parse something! This is something that has been parsed many times before, but it’s good practice so it’s a wheel I’ll be re-inventing. I’m talking of course about parsing the ArgV!

## Some Choices To Make

But before I get to that, I need to make some choices about how my arguments will look. I’ve decided to use the “double dash” method. Therefore, all flags will take the form of `--X`, where X is the flag. Any flag can take an optional argument, such as `--i foo`. All text following a flag is considered to belong to the preceding flag until another flag is encountered. Some flags are optional, and some are mandatory. The order flags appear is not important.

So far, I have 3 flags implemented: `--i [INPUT_FILE]`, the file to be converted, `--o [OUTPUT_FILE]`, the location to save the resulting HTML, and `--v`, which toggles on verbose mode. Other flags may be added in as development progresses.

## Data ArgV

Next, I’ll create a type for my ArgV.

``````data ArgV = ArgV {inputFile :: FilePath,
                  outputFile :: FilePath,
                  isVerbose :: Bool} deriving (Show)``````

As you can see, my type has three fields, which line up with my requirements above.

## Introducing Parsec

Parsec is a fairly straight-forward library to use. It has a lot of operators and functions, which require some thought. However they all amount to basic functions that do a specific parsing task. Your job is to tell these functions what to do. So let’s get right down to it.

``````parseArgV :: [String] -> Either ParseError ArgV
parseArgV i = parse pArgV "" \$ foldArgV i

foldArgV :: [String] -> String
foldArgV i = foldl' (++) [] i``````

Here we have two functions to get us started. The function `foldArgV` collapses the list of strings into a single string, to be parsed. Because of the way that arguments are parsed by the operating system, the argument string `--i foo --o bar --v` will be collapsed to `--ifoo--obar--v`. This is good for us because this means we don’t have to worry about parsing white space.
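A quick sketch of the collapsing step on its own, using the example arguments from the paragraph above. The `exampleArgs` value is only for illustration.

```haskell
import Data.List (foldl')

foldArgV :: [String] -> String
foldArgV i = foldl' (++) [] i

-- What the program receives for: --i foo --o bar --v
exampleArgs :: [String]
exampleArgs = ["--i", "foo", "--o", "bar", "--v"]
```

Here `foldArgV exampleArgs` is `"--ifoo--obar--v"`, the same result the Prelude’s `concat` would give.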

The second function, `parseArgV`, is our entry point into Parsec. All of Parsec’s combinator functions return a `Parser` monad. The function `parse` parses the data in the parser monad and returns either an error, or a whatever. In our case, it’s parsing to an `ArgV`.

Parse takes 3 arguments: a parser, a source name, and a string to parse. The first and third are pretty self explanatory, but the second looks like a mystery at first. It turns out to be just a label, usually a file name, that Parsec uses when it reports the position of a parse error. That is why none of the tutorials make a fuss about it, and why passing an empty string is fine. It is not the “user state”; that is the `st` in `CharParser st`, and it is supplied through `runParser` rather than `parse`.
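A small sketch of what the second argument to `parse` actually does: in Parsec 3 it is a source name that ends up in the position attached to a parse error. The parser `pFlag`, the helper `failingSource`, and the label `"argv"` are invented for this example.

```haskell
import Text.Parsec (parse, string)
import Text.Parsec.Error (ParseError, errorPos)
import Text.Parsec.Pos (sourceName)
import Text.Parsec.String (Parser)

-- A tiny parser that only accepts the literal flag "--i".
pFlag :: Parser String
pFlag = string "--i"

-- Pull the source name back out of a failed parse, if there is one.
failingSource :: Either ParseError String -> Maybe String
failingSource (Left err) = Just (sourceName (errorPos err))
failingSource (Right _)  = Nothing
```

Running `failingSource (parse pFlag "argv" "oops")` returns `Just "argv"`, so the label only matters when something goes wrong.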

``````pArgV :: CharParser st ArgV
pArgV = do
  permute \$ ArgV <\$\$> pInputFile
                 <||> pOutputFile
                 <|?> (False, pIsVerbose)``````

This function is the true heart of my parser. I’d like to draw your attention to the invocation of `permute`. This function implements the requirement that the order of arguments shouldn’t matter. Ordinarily, Parsec will linearly go through some input and test it. But what do you do if you need to parse something that doesn’t have a set order? The library `Text.Parsec.Perm` solves this problem for us.

The function permute has a difficult type signature to follow, so a plain-English explanation is in order. Permute is used in a manner that resembles an applicative functor. Permute takes a function, and then using its operators, it takes functions that return a parser for the type of each field in the function in a pseudo-applicative style. These functions don’t need to come in the order that they’re parsed in, but in the order that the initial function expects them to be. Permute will magically parse them in the correct order and return a parser for your type. Let’s break this down line-by-line:

``````...
permute \$ ArgV <\$\$> pInputFile
...``````

Here we call permute on the constructor for `ArgV`. We feed it the first function, `pInputFile`, using the `<\$\$>` operator. This operator is superficially and logically similar to the applicative functor’s `<\$>` operator. It creates a new permutation parser for the type on the left using the function on the right.

``````...
<||> pOutputFile
...``````

Here we feed the second parser function, `pOutputFile`, to the permutation parser using the `<||>` operator. This operator is superficially and logically similar to applicative’s `<*>` operator.

``````...
<|?> (False, pIsVerbose)
...``````

Finally, we feed it a parser for `isVerbose`. The operator `<|?>` works the same as the `<||>` operator, except that this operator is allowed to fail. If `<||>` fails, the whole parser fails. If `<|?>` fails, then the default value (False, in this case) is used. This allows us to have optional parameters.
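To try the three operators without the rest of the program, here is a self-contained sketch. The `Opts` type and the `--w`, `--h`, and `--v` flags are invented for the example and are not the flags used by `DmpHelper`; it assumes Parsec 3 with `Text.Parsec.Perm`.

```haskell
import Text.Parsec (parse, string, try, many1, digit)
import Text.Parsec.Perm (permute, (<$$>), (<||>), (<|?>))
import Text.Parsec.String (Parser)

-- Hypothetical options record: two mandatory numbers, one optional switch.
data Opts = Opts { width :: Int, height :: Int, verbose :: Bool }
  deriving (Show, Eq)

pWidth :: Parser Int
pWidth = try (string "--w") *> (read <$> many1 digit)

pHeight :: Parser Int
pHeight = try (string "--h") *> (read <$> many1 digit)

pVerbose :: Parser Bool
pVerbose = True <$ try (string "--v")

-- Fields are listed in constructor order; the input may use any order.
pOpts :: Parser Opts
pOpts = permute (Opts <$$> pWidth
                      <||> pHeight
                      <|?> (False, pVerbose))
```

Parsing `"--h3--v--w10"` with `pOpts` should yield `Opts 10 3 True`, while `"--w10--h3"` falls back to the default and yields `Opts 10 3 False`. Leaving out `--w` or `--h` makes the whole parse fail.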

Moving along from here, here are some extra parsing functions we need.

``````pInputFile :: CharParser st FilePath
pInputFile = do
  try \$ string "--i"
  manyTill anyChar pEndOfArg``````

This parser parses the `--i` parameter. The first line of the do expression parses the input to see if it is `--i`. Because it is wrapped in a call to `try`, it will only consume input if it succeeds. Since we don’t actually need this text, we don’t bind it to a name. If the parser had failed, it would exit the function and not evaluate the second line. On the second line, we take all the text until `pEndOfArg` succeeds. This text is what is returned from the function.

``````pEndOfArg :: CharParser st ()
pEndOfArg = nextArg <|> eof
  where nextArg = do
          lookAhead \$ try \$ string "--"
          return ()``````

This function detects if the parser is at the end of an argument. Notice the `<|>` operator. This is the choice operator. Basically, if the parser on the left fails, it runs the parser on the right. You can think of it like a boolean OR.

The parser `eof` is provided by Parsec, and succeeds if there is no more input. The function `lookAhead` is the inverse of `try`: it never consumes input when the parser succeeds, and only consumes input if the parser fails. By wrapping a `try` inside of a `lookAhead`, we create a parser that never consumes input.
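Here is a standalone sketch that shows the "never consumes input" part in action. The names `endOfArg`, `argText`, and `argThenRest` are made up for the example; it uses the `Parser` type from `Text.Parsec.String` rather than `CharParser st`.

```haskell
import Text.Parsec (parse, string, try, lookAhead, eof, anyChar, manyTill, many, (<|>))
import Text.Parsec.String (Parser)

-- Succeeds at the start of the next flag or at end of input,
-- without consuming anything in either case.
endOfArg :: Parser ()
endOfArg = (() <$ lookAhead (try (string "--"))) <|> eof

-- Reads everything up to, but not including, the next "--".
argText :: Parser String
argText = manyTill anyChar endOfArg

-- Returns the argument text plus whatever input is left over,
-- to show that the "--" is still there afterwards.
argThenRest :: Parser (String, String)
argThenRest = do
  arg  <- argText
  rest <- many anyChar
  return (arg, rest)
```

Parsing `"foo--obar"` with `argThenRest` gives `("foo", "--obar")`: the argument stops at the next flag, and the flag itself is left for the next parser to pick up.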

The parser functions `pOutputFile` and `pIsVerbose` are almost identical to `pInputFile`, so I’m not going to bother typing them out here.