Okay, but what ARE Monads?

A monad is just a monoid in the category of endofunctors, what’s the problem?
Philip Wadler, fictional attribution by James Iry in A Brief, Incomplete, and Mostly Wrong History of Programming Languages

Monads are kinda hard to wrap your head around in functional programming. The fact that their mathematical basis is complicated as hell doesn’t help either.

So for this blog post I figured I’d share how I think about monads, and which mental model helped me understand their applications.

Thinking Inside The Box

In my opinion, the easiest way to think about monads is to see them as fancy, generic unit-containers. These containers abstract away certain aspects of the data they contain. For example, this data might not (yet) be available, or it might be contaminated with state, side-effects or non-determinism (which is especially relevant for purely functional languages like Haskell).

An important property of monads is that, while they might not expose the data directly, they must provide a way to interact with it. How this works will become clear once we look at which operations a monad usually supports.

(Bear with me here, the next chapter is a bit technical. We’ll get to some examples after that.)

Methods To The Madness

To see what kind of operations a monad needs to provide, let’s take a look at the monad class¹ in Haskell²:

class Monad m where
   return :: a   ->               m a
   (>>=)  :: m a -> (a -> m b) -> m b

So we have two important methods:

return takes a value of any type (a in this case) as an argument and returns it wrapped in the monadic type m a. Following the analogy from above, return basically wraps our data in the container type. This method also has a function associated with it (which acts as a kind of default implementation) that will just call the monad constructor.

(>>=) – also called the bind-operator – takes a monadic value m a and a function (a -> m b) whose result is another value of the same monad. The overall result has the same type as the result of that function, m b.

This sounds complicated, but if you look at it like a container it becomes much clearer: The bind-operator can be thought of as applying a function to the data in the container, but the result is also a container. By doing that we can make sure that the raw data is never leaked outside the monadic context.

To illustrate the bind-method a bit further, it makes sense to define it in a different language as well. Let’s see what a monad interface might look like in Java³:

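import java.util.function.Function;

public interface Monad<T> {
   // Wraps a plain value in the monad.
   Monad<T> return_(T t);
   // Applies f to the wrapped value; the result stays inside a Monad.
   <S> Monad<S> bind(Function<T, Monad<S>> f);
}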

We can transform the data using .bind(), but we cannot get rid of the container/monad.

Sometimes different names are used: unit – similar to return – converts the argument into a monadic type, and flatMap – like the bind-operator – allows us to apply functions to the content of the monad⁴. We will later see some examples of this.
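
To make the unit/flatMap naming a bit more concrete, here is a minimal JavaScript sketch of a unit container. The class name Box and its methods are made up for this illustration and are not part of any library; the private field is just one way to show that the content is only reachable through flatMap.

```javascript
// Hypothetical unit container: holds exactly one value.
class Box {
  #value; // private, so the raw data cannot be read from outside

  constructor(value) {
    this.#value = value;
  }

  // unit / return: wrap a plain value in the container.
  static unit(value) {
    return new Box(value);
  }

  // flatMap / bind: hand the content to f, which must return a Box again.
  flatMap(f) {
    return f(this.#value);
  }

  // Only for printing; the result is a string, not the raw value.
  toString() {
    return `Box(${this.#value})`;
  }
}

const doubled = Box.unit(21).flatMap(x => Box.unit(x * 2));
console.log(String(doubled)); // Box(42)

// Arrays ship with flatMap; a one-element array behaves the same way.
console.log([21].flatMap(x => [x * 2])); // [ 42 ]
```

Whatever we do inside flatMap, the result is a Box again, which is the "no leaking" property described above.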

Maybe It’s Just Nothing

Okay, so how exactly is this useful? Probably the most famous example is the Maybe monad in Haskell⁵.

data Maybe a = Nothing | Just a

instance Monad Maybe where
   (Just x) >>= k = k x
   Nothing  >>= _ = Nothing

Maybe has two constructors: Nothing and Just. The latter has a generic parameter a. When thinking in containers, Nothing would be an empty container, while Just is a container that has an element.

We can also see the function definition for the bind-operator. There are two patterns: If the first argument is Just, then apply the function argument k to the element in the container and return the result. If the first argument is Nothing – i.e. there is no element in the container – again return Nothing.

Let’s look at the following example on how we could apply the bind-operator:

Just 42 >>= return . (* 2)
-- will result in Just 84

Nothing >>= return . (* 2)
-- will result in Nothing

I hope you can see now how this can be useful: We are able to apply our calculation (doubling in this case) without having to think about whether the element is actually present or not.

As already discussed in the previous chapter, the return function just re-packages its argument into the monad. We could also instead use the Just constructor⁶:

Just 42 >>= return . (* 2)
-- is the same as
Just 42 >>= Just . (* 2)
-- is the same as
Just 42 >>= \x -> Just (2 * x)
-- is the same as
Just 42 >>= \x -> return (2 * x)

(Yeah, yeah, I know. This is the last time we will look at Haskell code, I swear.)

As a side-note: Java actually has something very similar to this: The Optional class⁷.

Optional.ofNullable((Integer) 42)
   .flatMap(x -> Optional.of(2 * x));
// will result in Optional[84]

Optional.ofNullable((Integer) null)
   .flatMap(x -> Optional.of(2 * x));
// will result in Optional.empty

(Notice how Java uses the flatMap name for the bind-operation.)
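
The same idea can be sketched in a few lines of JavaScript. Just, Nothing and halve are hypothetical names for this example (JavaScript has no built-in Maybe), and the sketch only aims to show how a chain of binds stops doing work once the container is empty.

```javascript
// Hypothetical Maybe: Nothing is an empty container, Just(x) holds one value.
const Nothing = {
  flatMap: _f => Nothing, // nothing inside, so the function is never called
  toString: () => "Nothing",
};

const Just = value => ({
  flatMap: f => f(value), // hand the content to f, which returns a Maybe
  toString: () => `Just(${value})`,
});

// A computation that can fail: only even numbers can be halved.
const halve = n => (n % 2 === 0 ? Just(n / 2) : Nothing);

const once = Just(84).flatMap(halve);
const twice = once.flatMap(halve);
const thrice = twice.flatMap(halve);         // 21 is odd, so this fails
const afterFailure = thrice.flatMap(halve);  // stays Nothing, halve is skipped

console.log(String(once));         // Just(42)
console.log(String(twice));        // Just(21)
console.log(String(thrice));       // Nothing
console.log(String(afterFailure)); // Nothing
```

None of the calls has to check whether the previous step succeeded; the container takes care of that.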

Mission: Impossible

Maybe you are still not convinced at this point and think that monads are just a convoluted way to avoid null values. That’s not completely wrong⁸, but because of how general monads are, they can actually do much, much more. In some instances they can even do the impossible.

Haskell is a purely functional language. Every expression is always referentially transparent, meaning that there are no side-effects whatsoever. This is of course not very useful, because this also disallows any form of input and output, any way of storing and retrieving state, any probabilistic calculations, and so on.

But monads can help here. We can encapsulate these impurities within them and then rely on bind-operations to do the actual useful calculations.

Let’s look at input/output as an example. The IO monad marks our contaminated code. Input values are packaged directly into the IO monad. That way, we cannot accidentally get non-deterministic values into any pure parts of the program⁹. When outputting something, we get an empty IO value back that we cannot get rid of either (since in Haskell we cannot discard return values – that would make the function call pointless after all; lazy evaluation would just ignore it). So, as soon as we contaminate parts of the program with IO, everything that depends on those parts must be modified as well (by adding the IO type) to reflect that¹⁰.

Let’s look at an example. (Okay, I lied. We need a bit more Haskell code. Sorry.) We’ll use the getLine “function”¹¹ to read some input, then bind the resulting IO String to a function that parses the string as a number, doubles it, converts it back to a string, and outputs it with putStrLn¹². The function we bind needs to return an IO type, but as already discussed, all output operations return an empty IO () value – which is also the type of the complete expression.

getLine >>= putStrLn . show . (* 2) . read

If we don’t want to do everything in one big function composition we can also use multiple smaller binds:

getLine >>= return . read
        >>= return . (* 2)
        >>= return . show
        >>= putStrLn

Here we can also see how we need to carry the IO monad from one function to the next (using the return method).

Because this is rather cumbersome to write, Haskell provides us with some syntactic sugar in the form of the do-notation¹³.

do
   s <- getLine
   putStrLn s
   let i = read s
       j = i * 2
       o = show j
   putStrLn o

The do-notation does two things:

It allows us to work with monads transparently without having to think about which return values we need for each bind.
It also allows us to think in an imperative manner, which is especially useful for user interaction, sequential network communication, …
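
One way to picture what the IO container does is to model an action as a description of an effect that only runs when someone explicitly asks for it. The JavaScript sketch below does that under a few assumptions: IO, unit, flatMap and run are names chosen for this illustration, and getLine and putStrLn are hard-coded stand-ins so the example runs without a console. It is meant as a mental model, not as a description of how Haskell implements IO.

```javascript
// Hypothetical model: an IO value wraps a zero-argument function (the effect).
const IO = effect => ({
  // bind: build a new action that runs this one, then the one f returns.
  flatMap: f => IO(() => f(effect()).run()),
  // Only the outermost caller (think: Haskell's main) runs the action.
  run: () => effect(),
});
IO.unit = value => IO(() => value);

// Stand-ins for getLine and putStrLn; "output" replaces the real console.
const output = [];
const getLine = IO(() => "21");
const putStrLn = s => IO(() => { output.push(s); });

// Same shape as the chain of smaller binds shown earlier.
const program = getLine
  .flatMap(s => IO.unit(Number(s)))
  .flatMap(n => IO.unit(n * 2))
  .flatMap(n => IO.unit(String(n)))
  .flatMap(putStrLn);

const outputsBeforeRun = output.length; // building the program ran no effects
program.run();

console.log(outputsBeforeRun); // 0
console.log(output);           // [ '42' ]
```

Every intermediate step has to hand back an IO value, which mirrors how the IO type spreads through everything that touches input or output.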

It’s A Promise, Not A Thread

We’ve had a lot of Haskell code so far, so let’s talk about JavaScript.

One of the things that make JavaScript (especially in the browser) unique is that everything runs in one single thread¹⁴, including the UI. So you effectively cannot have the main process wait or do long calculations at all¹⁵.

The way to deal with that is to use callbacks, or better yet: the Promise-API¹⁶. Let’s look at how this works.

const promise = new Promise(
   (resolve, reject) => resolve(42)
);

A Promise can either resolve and provide a value, or reject and provide some error information.

Because getting the value out of the Promise would mean waiting for the result (the Promise might, for example, talk to a server before resolving), there is no way of doing that. The only thing we can do is use callbacks. The .then() callback¹⁷ is called if the Promise resolves, and .then() returns a new Promise that resolves to the result of the callback. If the callback itself returns a Promise, the Promise returned from .then() resolves when that Promise resolves. Sounds complicated, but it isn’t. Let’s do an example.

const wait = (ms, msg) => new Promise(
   res => setTimeout(() => res(msg), ms)
);

wait(1000, "foo")
  .then(console.log)  // print "foo" after 1s
  .then(() => wait(1000, "bar"))
  .then(console.log); // print "bar" after 2s

Déjà vu. It turns out that Promises behave like monads: the Promise constructor acts similarly to return, and .then() can behave like the bind-operator¹⁸. There is also no easy way of getting rid of the container, just as there is no way of getting rid of the IO monad in Haskell when reading input.
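
The analogy can be spelled out in a few lines. In this sketch Promise.resolve (a shortcut for a constructor call that resolves immediately) plays the role of return, and .then() plays the role of the bind-operator; the helper name double is made up for the example. One caveat: .then() also accepts callbacks that return plain values, so the match with bind is loose rather than exact.

```javascript
// A function of shape a -> Promise b, like the function argument of bind.
const double = x => Promise.resolve(x * 2);

// "return" followed by "bind": wrap 21, then feed it to double.
const result = Promise.resolve(21).then(double);

// The result is still a Promise, never the raw number.
console.log(result instanceof Promise); // true

// The value is only reachable inside yet another callback.
result.then(value => console.log(value)); // 42
```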

And the similarities don’t end here. JavaScript also has something similar to the do-notation in Haskell: async/await is syntactic sugar¹⁹ for the Promise API.

(async () => {
    const a = await wait(1000, 42);
    console.log(a);
    const b = await wait(1000, a * 2);
    console.log(b);
})();

Of course, a lot of other languages also provide an async/await API (seems to be kind of a trend right now²⁰), but I chose JavaScript for this example because it does not provide an alternative, synchronous API – similar to how Haskell does not provide an alternative way of dealing with IO/state/…

Conclusion

I hope that after reading this you, too, think that monads are not quite as much black magic as they may seem. In fact, I find them to be an intriguing pattern. Even in languages that are less constrained than Haskell or JavaScript, there are definitely applications for them.

Until next time,
Sigma

Footnotes

1
Haskell calls them classes; in the object-oriented world these would be called interfaces instead. A type of the class must support all its methods. Different names, but essentially the same concept.
2
I simplified it a bit for readability reasons. You can find the original code here.
3
Unfortunately, the return method cannot be represented properly in Java, since it should be a static abstract interface method, which is not supported by the language.
4
This is again an example of how monads are basically unit containers. flatMap usually is used to map each container element to a container, which are then “flattened” into one. But what if the container, as well as the map result are unit containers? Now it behaves like the bind-operator. Take a look at this JavaScript expression for example: [42].flatMap(x => [x*2])
5
You can find the Maybe data type and constructor definitions here. The monad instance definition can be found here.
6
It should be noted that this does not work for all monads. Any constructor can also be used as a destructor in pattern matching, so in most cases (we will see an example of this later on) the constructor is not exported. However, the return function always exists for any monad.
7
Here you can find the JavaDoc for the Optional class.
8
Theoretically Maybe/Optional/… would also work if they are not monads, but they provide much more functionality this way.
9
In fact, since the IO constructors are not actually exported, we cannot even deliberately get rid of the monadic type.
10
A side note for the real h4x0rs: You could actually get rid of IO by using something like unsafePerformIO from the System.IO.Unsafe module. But don’t tell anyone!
11
Since all functions in Haskell need to have a parameter type, getLine is not strictly speaking a function, but a non-deterministic value of type IO String. See the Prelude module.
12
putStrLn :: String -> IO (). See the Prelude module.
13
More information about the do-notation.
14
Let’s ignore Workers for now.
15
Here you can read more about JavaScript’s event loop runtime model.
16
Read more about the Promise API.
17
The .catch() and .finally() callbacks can be used if the Promise is rejected. But for my point here, these are not relevant.
18
Or the fmap method of the Functor class (fmap :: (a -> b) -> m a -> m b), that we didn’t talk about.
19
More about async functions.
20
C#, F#, Java (kinda), Python, C++, Rust, …

Sigma's Blog