Archive | Undefined Behavior RSS for this section

K&R Challenge 2: The Specter of Undefined Behavior

Continuing with the K&R Challenge, today we’ll be doing exercise 2:

Experiment to find out what happens when printf‘s argument string contains \c, where c is some character not listed above.

Basically, we’ll be trying to print “\c” to the screen, and seeing what C does.

In C

The code for this is very simple:

int main (int argc, char ** argv) { printf("Trying to print: \c\n"); return 0; }

Like last time, we’ll compile it using:

gcc -Wall ex2.c -o ex2

Right away, you should get a compiler warning:

ex2.c: In function ‘main’: ex2.c:10:11: warning: unknown escape sequence: '\c' [enabled by default] printf("Trying to print: \c\n");

Notice that it is a warning, not an error, and that the compilation still succeeded and produced an executable. My friends, you have just witnessed Undefined Behavior. Undefined behavior is one of the defining aspects of C. Technically, it’s a feature, not a bug, and it is both a blessing and a curse.

It’s a blessing in that it’s one of the things that enables C’s unmatched performance. It’s a curse in that the compiler is allowed by the spec to do whatever it wants when it encounters undefined behavior: it could do something really helpful, it could segfault, or it could delete your hard drive. It could even unleash lions in your house and there’s nothing the police or congress could do about it.

Moral of the story: don’t use undefined behavior unless you really know what you’re doing. Even then, I’d argue it’s a code smell.

So, what happens when you run the compiled executable? Always fearless in the pursuit of truth, I ran it. The following was printed to the console:

Trying to print: c

The GNU C Compiler decided to be nice and just strip the “\”, then print the message. It’s a good thing I have managed to avoid angering Richard Stallman, because gcc could very well have kidnapped my wife. That would have been a problem.

And In Haskell

Nothing about this exercise relies on the behavior of printf, therefore even though Haskell has it, I’ve elected not to use it. The implementation of this exercise in Haskell looks like this:

main :: IO () main = putStrLn "Trying to print: \c\n"

As before, this can be run using:

runhaskell ex2.hs

…but when you try it, the following error is displayed, and compilation fails:

ex2.hs:2:62: lexical error in string/character literal at character 'c'

Where C allowed the undefined escape sequence as undefined behavior, Haskell throws a compilation error. This is a Good Thing. Undefined behavior is a prolific source of bugs, and while I understand why it’s allowed, I can’t say that I agree that the tradeoff is worth it. Luckily for us, this isn’t a feature unique to Haskell. Most high level languages don’t have undefined behavior. Unfortunately, this is one of the reasons that most programming languages are orders of magnitude slower than C.

Regardless, undefined behavior is a fact of life for the C programmer. It’s one of the tradeoffs they make to use the language. These design decisions were made decades ago, and are unlikely to change. For this reason, this is a good exercise.

The families of those who were turned to stone by their compilers after doing this exercise are in our prayers…