10 things I learnt diving in the functional programming deep end – with Haskell



Introduction

Having recently joined Pusher as a Junior Platform Engineer, I had my hands full from day one: cleaning up infrastructure code, writing integration tests, implementing client certificates and a full-blown Haskell project. Yes, Haskell! My colleague, Will, and I were tasked with replacing our existing integration test framework written in Ruby with a more composable and extensible one written in Haskell. While I was excited at the prospect of learning a new paradigm of programming, I did shudder a bit.

Coming from an imperative background, there was nothing to be done except to jump in and get my hands dirty. Here is a list of things (with a few examples) that I found awesome, and that will hopefully provide a jumping-off point for your own exploration!

Perspective

Haskell is a unique language. Compared to the set of tools that are usually a part of a programmer’s arsenal, Haskell offers new ways to think when writing software. We renounce ideas that might seem fundamental and ingrained. For example, we abandon ideas such as having a for loop. There is a stark contrast in the way OO languages and Haskell should be looked at. In OO languages, we try to answer the question “What can we do with data?” whereas in functional languages we try to answer the question “How is data constructed?”. It is tough adapting to this, but learning Haskell is a hugely rewarding experience! The next few sections shed some light on concepts that are unique to Haskell and other functional languages.

Think more and type less

Haskell is an exhaustive and occasionally exhausting language. It is elegant and has a very terse syntax, largely thanks to its high-level concepts. It forces you to think intensively before writing code. The ability to add clean and simple abstractions makes Haskell very declarative and comprehensible.

It takes some getting used to the fact that the compiler insists on resolving every little thing which doesn’t make sense. Since Haskell is lazy, and has a strict type system, it is tough to drop in debug statements or assert the order of their evaluation. But once compilation errors have been fixed, the program will usually work on the first try!

Modularity and Robustness

While I still need a few more projects under my belt to completely vouch for this, I can see how Haskell leads to modularity in code. Haskell’s strong and flexible type system encourages writing components that can be used in several places without duplication. Function purity makes testing code easier. Functions can be tested in isolation since they do not have access to the global state. Furthermore, it is easy to state what a function is expected to do, because its only observable effect is the value it returns. The type system makes sure that the code is safe and tightly knit. Haskell programs tend to be shorter than their imperative counterparts, which makes them easier to maintain and leaves less room for bugs. Our integration test framework was built keeping modularity in mind, and it was the biggest reason to consider a rewrite in its entirety.

Laziness, Immutability and Purity

Haskell is lazy. There is no particular order in which expressions will be evaluated; they are evaluated only when their result is required. Variables in Haskell are immutable, unlike those in imperative languages. In imperative languages, we can set a variable x to 5, do something with it and update its value. In Haskell however, that is not the case. Functions are limited only to taking some input values and producing an output, which might seem strange and limiting at first. A function called twice with the same arguments will always produce the same result. This is called referential transparency. It makes it easy to assert that a function is correct, and consequently to build several small functions and glue them together. Pure functions do not talk to the external world. There is a clear distinction between code that is pure and code that performs IO operations. Pure functions cannot corrupt the state of the system when evaluated. That is, they do not have any side effects. They are consistent and profoundly affect the way in which we write programs.
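To make laziness and purity a little more concrete, here is a minimal sketch. The names square, squares, firstThree and lazyFst are made up for illustration and are not part of our test framework.

```haskell
-- A pure function: the result depends only on the argument.
square :: Int -> Int
square n = n * n

-- An infinite list of squares. Thanks to laziness, nothing is computed
-- until some consumer asks for elements.
squares :: [Int]
squares = map square [1 ..]

-- Only the first three squares are ever evaluated.
firstThree :: [Int]
firstThree = take 3 squares

-- The second component of the pair is never demanded,
-- so the call to error is never evaluated.
lazyFst :: Int
lazyFst = fst (42, error "never evaluated")
```

Here firstThree evaluates to [1,4,9] and lazyFst to 42. Because square is pure, square 3 can always be replaced by 9 anywhere in a program without changing its meaning.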

Type System

Haskell types are introduced fairly quickly in most tutorials. A type can be thought of as a category which each expression fits into. For example, True is a boolean and "hey" is a string. Haskell’s type system is extremely powerful. I found that it leads to writing code that has fewer errors since type errors surface at compile time. After passing the initial hurdle of getting used to the type system, it is brilliant to work with. The type system is strict and quite punishing at times about what it expects.

printToScreen :: String -> IO ()
printToScreen word = putStrLn word

On calling the above function with printToScreen 5, Haskell will reply with an error:

<interactive>:3:15:
    No instance for (Num String) arising from the literal ‘5’
    In the first argument of ‘printToScreen’, namely ‘5’
    In the expression: printToScreen 5
    In an equation for ‘it’: it = printToScreen 5

This is because printToScreen expects a String, but the literal 5 is a number, and String has no Num instance. In a language like Ruby, for example, the equivalent call would have run just fine. Thanks to this strictness about types, writing error-free code becomes easier!

It is also easy to use generics, where you can define one function that works for several types. The compiler will infer the actual types using type inference. There is no need to explicitly label code with a type because the type system is smart enough to figure it out. A good example of this would be for data structures where the type of elements does not matter. This allows for code reusability while maintaining the strong type safety, something that might be an issue in other languages. No one likes duplicated code!

pair :: [a] -> [b] -> [(a, b)]
pair l1 l2 = [(x, y) | x <- l1, y <- l2]

This function takes two lists of any element types and pairs every element of the first with every element of the second. The definition uses a list comprehension to accomplish this: the generators x <- l1 and y <- l2 draw x from l1 and y from l2 in turn, producing every combination. Haskell’s type inference will infer the type of the elements in each list automatically. As an example, we could call this function with

ghci> pair [1, 2, 3, 4] ["a", "b", "c"]
[(1,"a"),(1,"b"),(1,"c"),(2,"a"),(2,"b"),(2,"c"),(3,"a"),(3,"b"),(3,"c"),(4,"a"),(4,"b"),(4,"c")]

Haskell will infer that the first list has type Num t => [t] and the second has type [String], and so produces a list of type Num t => [(t, [Char])]. [Char] is a list of characters; String is simply a type synonym for it.

Function currying

My eyes lit up when I learnt about currying functions, because it allows partial application. You can have a function that takes four parameters, call it with just two and get back a new function that takes the remaining two arguments, which you can then pass around to another function. While this is used behind the scenes in Haskell extensively, knowing when to use it helps you write better code. The notion behind this comes from the fact that every function in Haskell takes exactly one argument; a function of several parameters is really a function that returns another function. The example below might further illustrate this.

multThree :: (Num a) => a -> a -> a -> a
multThree x y z = x * y * z

ghci> let multWithNine = multThree 9
ghci> multWithNine 2 3
54

In the example above, the multThree function multiplies three numbers. We have an intermediate binding multWithNine where we apply one argument. We then apply the remaining two arguments to it to give us the final result. Now, if we want another result that is multiplied by 9, we can just use multWithNine and apply another set of arguments. Partial application goes hand in hand with higher-order functions: functions that take other functions as parameters or return functions as results. It is extremely easy to use functions by applying them partially and passing them around!
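As a small sketch of how partial application and higher-order functions fit together, the snippet below passes a partially applied multThree to map from the Prelude. The names scaled, applyTwice and twiceDoubled are hypothetical helpers introduced only for this illustration.

```haskell
multThree :: Num a => a -> a -> a -> a
multThree x y z = x * y * z

-- map is a higher-order function: its first argument is a function.
-- multThree 9 2 is a partially applied function still waiting for one number.
scaled :: [Int]
scaled = map (multThree 9 2) [1, 2, 3]

-- A higher-order function of our own: it takes a function and
-- returns a new function that applies it twice.
applyTwice :: (a -> a) -> a -> a
applyTwice f = f . f

-- multThree 1 2 doubles its argument, so applying it twice to 5 gives 20.
twiceDoubled :: Int
twiceDoubled = applyTwice (multThree 1 2) 5
```

Here scaled evaluates to [18,36,54] and twiceDoubled to 20.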

Recursion

There are no loop constructs in Haskell. Recursion is the way to iterate, either directly or through functions such as map that are built on it, and it is awesome! I found myself thinking recursively quite often to accomplish something. While this takes getting used to, it is an extremely powerful feature in Haskell. Recursion makes operations on lists and tuples a breeze. Combined with pattern matching, the possibilities are endless.

repeat' :: a -> [a]
repeat' x = x:repeat' x

The function above takes an element and repeats it infinitely, since Haskell supports infinite lists. The : operator (cons) prepends an element to a list, so x is the head of the result and the tail is an infinite list of x produced by recursively calling repeat'. If we call repeat' 5, it evaluates as 5 : repeat' 5, which is then 5 : (5 : repeat' 5), and so on. Evaluated in full, this would never finish. However, thanks to Haskell’s laziness, if we apply take 5 to it, we get the first 5 elements of the list and the rest is never evaluated.

ghci> take 5 (repeat' 5)
[5,5,5,5,5]
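Recursion over a finite list follows the same shape: one clause for the empty list and one that splits the list into head and tail. A minimal sketch, with sumList and length' as made-up names:

```haskell
-- Sum a list: the empty list sums to 0; otherwise add the head
-- to the sum of the tail.
sumList :: Num a => [a] -> a
sumList []       = 0
sumList (x : xs) = x + sumList xs

-- Count the elements of a list using the same two-clause structure.
length' :: [a] -> Int
length' []       = 0
length' (_ : xs) = 1 + length' xs
```

For example, sumList [1, 2, 3, 4] evaluates to 10 and length' "abc" to 3.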

Pattern Matching

Pattern matching is used everywhere and there are several ways to do it. There is significant flexibility using case expressions, guards and other constructs to pattern match on just about anything. Haskell depends on pattern matching heavily, and it is one of the most important features of the language.

factorial :: (Integral a) => a -> a
factorial 0 = 1
factorial n = n * factorial (n - 1)

ghci> factorial 0
1

ghci> factorial 5
120

Pattern matching makes handling cases like the one above extremely simple. The function returns the factorial of a number. There are two definitions of factorial. The first one deals with the case where the argument is 0 and the second with any other number. The clauses are tried from top to bottom, so their order matters. This is the most basic way of pattern matching.
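Guards and case expressions were mentioned above but not shown, so here is a rough sketch of both. safeFactorial and describe are hypothetical names; safeFactorial is a variation on factorial that returns Nothing for negative input, where the version above would recurse forever.

```haskell
-- Guards: each condition is tried in order and the first True one wins.
safeFactorial :: Integer -> Maybe Integer
safeFactorial n
  | n < 0     = Nothing
  | n == 0    = Just 1
  | otherwise = fmap (n *) (safeFactorial (n - 1))

-- A case expression matches on the shape of a value inside an expression.
describe :: [a] -> String
describe xs = case xs of
  []  -> "empty"
  [_] -> "one element"
  _   -> "several elements"
```

For example, safeFactorial 5 evaluates to Just 120, safeFactorial (-1) to Nothing, and describe "abc" to "several elements".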

In most other languages the _ is more of a convention than an actual language feature; in Haskell, however, it is a wildcard pattern that lets you ignore anything that is unwanted. I thought it was a subtle way of making the language more readable and meaningful.

The use of _ can be illustrated as below:

first :: (a, b, c) -> a
first (x, _, _) = x

second :: (a, b, c) -> b
second (_, y, _) = y

third :: (a, b, c) -> c
third (_, _, z) = z

ghci> first (1, 2, 3)
1

ghci> second ("a", "b", "c")
"b"

first returns the first element of a triple. We use _ for the rest of the elements because our interest lies only in the first element. The others do not matter. second and third are similar in that they return the second and third elements of a triple respectively.

Monads hit you hard at first

Monads allow the order in which operations happen to be defined. This is done using the do syntax. Monads are your best friends! Code written with do bears a resemblance to imperative code. While it might seem imperative, all a Monad does is chain operations in a specific manner. The do syntax is only syntactic sugar for the bind (>>=) operator. Monads also provide a way to introduce side effects into the language. The language itself remains pure, since an effectful computation is just a value of a monadic type such as IO; it is the implementation (the runtime) that knows how to carry the effects out. Monads were tricky to grasp, since they are tough to explain without expanding on the implementation, which delves into more advanced features of the language. The way to truly understand them is by writing code and playing around until the bulb lights up.

import System.Directory (doesFileExist)

readConfigFile :: String -> IO ()
readConfigFile filePath = do
    configFileExists <- doesFileExist filePath
    if configFileExists then do
        fileContents <- readFile filePath
        putStrLn fileContents
    else
        putStrLn "File does not exist."

The do notation makes this snippet of code very readable, even for someone not used to the Haskell syntax. We first check to see if the file exists, using doesFileExist, which returns an IO Bool. But if expects a Bool, so we use <- to grab the Bool out of the IO Bool. If the file exists, we read it using readFile, which returns an IO String. We then grab the String out of the IO String and finally print it to the screen.
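Since do is only sugar for the bind operator, it can help to see roughly what the same function looks like written with >>= directly. This is a hand-written sketch of the desugaring, not the exact output of the compiler, and readConfigFileBind is a name made up for this example.

```haskell
import System.Directory (doesFileExist)

-- Same behaviour as the do-notation version, written with >>= by hand.
readConfigFileBind :: FilePath -> IO ()
readConfigFileBind filePath =
  -- Run doesFileExist, then feed its Bool result to the lambda.
  doesFileExist filePath >>= \configFileExists ->
    if configFileExists
      -- Run readFile, then feed the resulting String to putStrLn.
      then readFile filePath >>= putStrLn
      else putStrLn "File does not exist."
```

Each <- in the do block becomes a >>= followed by a lambda that names the result, which is all the "grabbing a value out of IO" amounts to.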

Time and resources

There is a good list of Haskell resources here but for a gentle start:

Make sure you give yourself plenty of time to bash your head against it. Haskell can be quite tricky and intricate to understand. It will drag you back to square one, which makes it important to embrace the basics of Haskell and develop a solid grounding. But the effort is truly worth it: it’ll make you a much better developer, even if you decide not to use it.