orbifold think. visualize. understand.

Categories, monads and F# for dummies

This is an article to stir your curiosity about monads, categories and functional programming (and even more in 'A kaleidoscope from F# and monads' ). I didn't try to make it mathematically correct and it's not a precise treatment of monads from the functional programming point of view. It's only to help you understand and appreciate the unifying power of monads as they appear in the context of F# and Haskell. I also think there is a widely spread confusion around monads and the fact that diverse concepts are mixed up:
  • is a computational expression the same as a workflow, is it a monad?
  • how do math articles about monads map to articles about functional programming?
  • are the methods defined by F# as part of a computation expression dictated by the idea of a monad? How to translate the concepts from categories to functional programming and back?
  • why should I care about monads, they never appear in C# or JavaScript...!?
  • how can the Haskell or OCaml stuff be translated to F#? Though Haskell has a whole lot of info regarding monads, it's not always easy to see how it maps to the F# syntax or way of thinking.
Let me say straight away that understanding monads will not directly solve your marketing challenges, nor will it make writing software easier, not is it a magical recipe to simplify the data access layer of your website. However, it does
  • make you a better programmer in the same way that learning design patterns (anti-pattern etc.) teaches you to recognize structures and solutions on a higher level of abstraction
  • shed light on the relationship between domains which seem unrelated (exception handling, state management, graphs and networks, asynchronous calls, domain specific languages and so on)
  • suggest you to rethink software structures and how typical programming challenges are solved
  • open up a large field of ideas in the overlap of software development and maths which are fun to explore.
I don't expect to see monadic stuff appear in Silverlight animations or in CSS style sheets but for sure you'll see it appear in
  • the NoSQL movement and graph analysis algorithms
  • the financial and insurance business where F# is at this moment a big hit
  • DSL developments and meta-programming
and who knows what the future brings?

A simple idea

We are surrounded by data collections and structures, sets of interrelated data and hierarchies:
  • a family and its members where relationships are defined by concepts like father, mother, sister, uncle and so on
  • company hierarchies with employees and relationships defined by team structures, divisions and so on
  • modules of code and calling/instantiation relations where exceptions are bubbling from down the stack to the upper (UI level if not properly handled) parts
  • algorithmic financial transactions (flows) based on some intricate logic
and on a more scientific level even more where one notices certain similarities and intriguing analogies:
  • genes and memes: the formation of genetic structures vs. the formation of thoughts and ideas
  • the spreading of viral diseases vs. the spreading of computer viruses
  • the six-degrees of freedom appearing on the internet vs. the plex of actors in the IMDB database
In a lot of cases similarities are just accidental (it's because a banana and a lemon are yellow that there is deep connection, except the shared pigment) but there are equally well amazing symmetries which can be made precise. Category theory is a theory which makes this idea of 'being similar' exact and at the heart of it is really a very simple notion:
  • you have collection of objects
  • a collection has a certain structure
  • you can map objects from one collection to another and the relationship can also be mapped.
Category Idea
Example
Take the members of a family (with their relationships; mother, father etc.) and take the employees in a company with its hierarchy. Can you create a mapping from one collection to the other? If your thinking is more in terms of pictures and visuals you'll probably starts drawing a graph and map the nodes and edges from one graph to the other. This is no accident, graphs and mappings from one graph onto another is a good playground to study categories.
 
Example
Take some business logic and imagine that at some point an exception is thrown; it propagates through the logic to end up being caught (or not). Take, on the other hand, the stateful (call it state-machine if you prefer) flow of an online ordering system where you are guided through a wizard to check out and buy some goods (like you'd do on Amazon). Is there some underlying structure governing these two systems? Probably this would take you somewhat longer than the previous example and it also requires a bit more abstraction, but the answer is 'yes' there is a mapping (i.e. a mathematically exact symmetry) between the two.State machines (automata) and the way exceptions or infinities can be handled in code have a common mathematical underpinning.
  From this we take a bold step a define a (pedestrian) category as follows (more precise definition below):
Definition
A category is a collection of objects with some relationships. The objects in the collection are called (duh!) objects and the relationships are called morphisms. A mapping between two categories is called a functor.
  Maybe it should be called at this point a pedestrian category or something because this is of course not very precise; this definition would turn pretty much everything into categories and functors! At this point I only want to help you understand where the real maths and the hard F# stuff is coming from and make it more acceptable. A morphism is something which morphs one thing into another, something which transfers one thing into another. You will also encounter in the literature similar concepts like
  • homomorphism: it emphasizes even more the idea of morphing. The prefix 'homo' meaning 'the same' and 'morph' means 'changing'. Technically speaking the term homomorphism comes from group theory and morphism refers to the more general idea which can be applied outside group theory and Lie groups.
  • endomorphism: 'endo' means 'in itself' and this emphasizes that a morphism is not going somewhere else but stays in the same package. So, an endomorphism of a family would map one person onto another inside the same family. A morphism would in general morph one family onto another.
  • epimorphism: here the emphasis is on the fact that the morphism reaches the whole target collection.
  • isomorphism: in this case the morphism is mapping elements one-to-one and onto the target. The notion is a very important one and appears pretty much everywhere in maths and science. In essence it says that if two things are isomorphic they are indistinguishable from one another and anything.everything you say about one is valid for the other.

Some more simple examples

In order to move away from the heuristics above towards some more serious definitions I need to highlight more (simple) examples.
Example
Take the collection of sets and the mappings between sets. This is a category where the mappings are the morphisms (morphings) and it's often denoted with Set (in bold). An endofunctor F here is, for example, a mapping into itself which maps a set, say S, to the set of subsets: F: S \rightarrow 2^S
 
Example
Take the real number (floats, if you prefer) with an operation called 'addition'. Take the same real number with an operation called 'multiplication'. We call them objects in the category of 'real numbers with an operation'. A morphism f between the two objects is f: \mathbb{R},+ \rightarrow \mathbb{R}, x \mapsto e^x. This morphism has the amazing property that: f(x+y) = f(x).f(y). and this property is shared with many other types of collections. In addition, you should also notice that zero is mapped to one and that zero is the identity (null element) for the addition while the number one is the identity for the multiplication.
 
Example
Take the collection of lists of strings (List in C# and 'string List' in F#) and consider the morphism called 'concatenation' between lists. Take the collection of lists of integers on the other hand and with the same concatenation of of lists as with the string collection. A functor here is the List.map operation of F# and you should note that it doesn't matter whether you concatenate things first and then apply the map or first apply the map and then concatenate things. In a mathematical fashion you can write that if F is the List.map action and f, g are morphisms then F(f.g) = F(f).F(g)
 
Example
The collection of data types in .Net is a category so is the collection of data types in Haskell (usually denoted by Hask). Let's call this category Net. There is a useful lifting procedure or amplifier inside the collection; take any type and map it to a collection of this same type: Amp: T \mapsto List< T> Note that you can iterate this, even though in practice you won't often see a List<List<List<T>>> type defined in an application. The things is, this is an endofunctor of the category Net into Net.

Some simple list manipulations

There are some well-know examples of categories and monads like the so-called maybe monad, the identity monad and the state monad but I think the easiest way to explore or check the ideas is by means of the concat monad over the category of lists of integers, called from here on List (in bold). It consists of the following:
  • sets and subsets of integers: [], [1], [1;2;3], [[1], [5;6]]...are examples of objects in this category
  • the return operation which returns a singleton set from any given element return: x \mapsto [x]
  • the identity operator which returns whatever is given: id: x \mapsto x \newline  id: [x]\mapsto [x]
  • the concatenation operator which concatenates list: concat: [[a],[b]] \mapsto [a,b]
Now, you can play with these operations and discover some interesting properties like the fact that:
Property
concat . concat = concat . (List.map)(concat)
  Indeed if you take an arbitrary set, say [[[a];[b]];[[c;d];[[f]]], an apply the left hand side you get: concat(concat([[[a];[b]];[[c;d];[[f]]]))\newline  = concat( [[a];[b];[c;d];[f]])\newline  = [a;b;c;d;f]  The other side of the equation tells you that: concat . (List.map)(concat)([[[a];[b]];[[c;d];[[f]]])\newline  = concat( [[a;b];[c;d;f]])\newline  = [a;b;c;d;f]  In a similar fashion you can show that
Property
concat. return = id= concat.(List.map)(return)
  Now, let's introduce a *-operator which acts on morphisms f in the List category: f^* = concat.(List.map)(f) This star-operator has also some interesting features which immediately generalize to arbitrary monads later on, namely:
Property
return^*= id\newline  f^* . return = f\newline  g^*.f^*=(g^*.f)^*
  All these things are really easy to show inside this List category but have a deeper meaning on a general level.

Some simple maths

At this point you are ready to receive the general definition of a category and a monad.
Definition
A category consists of three things (aka triplet):
  • a collection of objects
  • a collection of morphisms f between the objects
  • a binary operation (which tunrs the structure in a so-called monoid) inside the collection such that f.(g.h) = (f.g).h and an identity which keeps things identitical id.f = f.id = f
  The only difference here with the pedestrian definition above is that morphism need to satisfy the associative constraint and the existence of an identity.
Definition
A functor is a map F from one category to another which maps both objects and morphisms in such a way that F . id = id.F\newline  F(f).F(g) = F(f.g)
  At this point you should see how this abstract definition corresponds to the various examples above, in particular the List category in the previous section. In the literature you will also see that the collection of morphisms between two objects A and B is denoted by Hom(A,B), which is an abbreviation for 'Hom-omorphisms'. In the previous section you was that the List category of integers has some interesting features when you combine the (List.map), the return, the id and the concat operations. The general definition of a monad is simply expressing the fact that many structures have the same set of properties. Let me first rephrase it in this specific category; The List category has
  • an operation (actually an endofunctor) (List.map) which maps lists onto lists
  • the return operation creates a singleton from an element
  • the concat operation together with the return operation has the following properties: concat. return = id= concat.(List.map)(return) \newline  concat . concat = concat . (List.map)(concat)
and this turns the triple (List.map, return, concat) into a monad. In general, a monad is a triple which satisfies the same set of properties and you'll see instead of concat and return the following:
Definition
A monad is a category C together with 
  • an endofunctor T from C to C
  • a transformation \eta: C\rightarrow T
  • a transformation \mu: T^2\mapsto T
satisfying \mu . \mu = \mu . T (\mu)\newline  \mu . \eta = id = \mu . T(\eta)
Don't be scared away by all this, it's just an abstract reformulation of the simple concatenation and mappings of lists above! I do need however to push things a bit further because the way a monad is defined in maths is in general not the way you'll recognize it in F# or Haskell. Usually you'll see the equivalent formulation by Heinrich Kleisli who invented the Kleisli triple and which turned out to be an equivalent definition for monads. That is, it can be shown that there is a one-to-one correspondence between a Kleisli structure and a monad.
Definition
In this equivalent formulation a monad (or Kleisli triple if you prefer) is defined as a triple over a category consisting of 
  • an endofunctor T
  • a return operation \eta: C\rightarrow T
  • a *-operator on morphisms which lifts a morphism
satisfying: \eta^* = id\newline  f^* .\eta = f\newline  g^* .f^* = (g^* . f)^*
and you should immediately recognize the properties we mentioned above in the context of the List category. The proof that the two definitions are leading to the same structure is not difficult but it wouldn't be useful at this point (unless you are math oriented and you need to see it to understand it).

Some simple F#

The prototypical example in the context of F# workflows is the 'dividing by zero' problem; one cannot divide by zero. If you have a computation where at some point an exception is raised (either explicitly or by the runtime) it doesn't make sense to continue a computation but maybe you want to handle the exception. The way to handle this in F# is by using a label attached to a number; if the label says 'failure' it doesn't matter what the value is, if the label says 'success' it makes sense to read the value. It's a way to include infinity in the constraint range of values the runtime can handle:
1
type Result = Success of float | Infinite
Next, instead of the usual division you define a divide function like so
1
2
3
4
let divide x y =
  match y with
  | 0.0 -> Infinite
  | _ -> Success(x/y)
and plug it into a class with two members:
1
2
3
4
5
6
type SaveDivision() =
 member this Bind ((x:Results), (res: float-&gt;Result))
  match x with
  |Success(x) -> rest x
  | Infinite -> Infinite
 member this.Return x = x
At this point you can look at this as a class with two methods but what you really should see is that
  • the map from a float number to the Result type is an amplifier or lift map as we described in the example above; it's the endofunctor on the Net category of data types
  • the Bind operation corresponds to the *-operation of a Kleisli triple
  • the return corresponds to the eta-mapping of the Kleisli triple
and it turns this triple into a monad! You should also reflect on the fact that the Bind member correspond to the concat operation on the List category (and similarly the return corresponds to the return member). The way you use the SaveDivision monad in a concrete computation is by instantiating it and do something inside as follows:
1
2
3
4
5
6
7
let safe = SaveDivision()
let SomeCalculation x1 x2 =
 safe {
    let! x = divide 1 x1
    let! y = divide 1 x2
    return divide x y
 }
and as soon as you try this on zero you get the Infinite as an answer. Without this construction you would get an exception, obviously.The 'bang' or exclamation mark after the let keywords is actually a way to tell F# that the Bind operation should be used instead of the normal assignment (or type inference). So, this very simple example highlights the following:
  • monads in F# allows you to do 'stuff' under the hood and redefine how things are being assigned or returned
  • a monad in F# is really a Kleisli triple but because it's mathematically equivalent to a monad there is no need to differentiate. I does confuse however when you try to compare the mathematical textbooks with the functional programming textbooks
  • monads are often called workflows because it allows you do embed a whole logic or workflow inside the instantiated class (monad) and redefine how things like the assignment (let!) should function. You will also see that F# in fact has extended the notion of monad to include other monad members and in this fashion allowing you to redefine how for-next, try, yield and other statements are acting inside the monad.
  • monads are sometimes called computation expression since it redefines how assignments and expressions should act on (category) objects
In Haskell you will not encounter the alternative names and the term monad is used uniformly. The suspicious mind should by now worry about a lack of consistency and feel the need for more formal proofs that for example the constraints (or defining rules) of a Kleisli triple (aka monad) are satisfied. Indeed, is the SafeDivision monad with the specified operations satisfying the constraints? Well, yes, it does. But I will refrain to reproduce the proofs and am happy to forward you to someone you tediously wrote it all down; see Paul Abraham's blog and this document in particular wherein you'll find the F# proofs.

A simple dictionary

In Haskell the following type class is used to implement monads:
1
2
3
class Monad m where
   return :: a →m a
   (>>=) :: m a →(a →m b) →m b
You can immediately recognize the return and the bind methods but the symbol '>>=' is usually used instead of 'bind'. Purists will tell you that the discussion around monads in Haskell is not complete without the topic of 'comonads' and 'cofunctors' but it will take a while before these discussions appear in the F# community. See this article in case you want to know more about this. In order to use monads in F# you only need to implement an (implicit) class which implements the Bind and Return member. The language extends the purely mathematical definition with eight extra methods:
Member Description
member Bind : M<'a> * ('a -> M<'b>) -> M<'b> Required member. Converts let! and do! within the monad.
member Return : 'a -> M<'a> Required member. Converts the return within the monad.
member Delay : (unit -> M<'a>) -> M<'a> Optional. Ensures that side effects within the monad are performed when expected.
member Yield : 'a -> M<'a> Optional. Converts the yield within the monad.
member For : seq<'a> * ('a -> M<'b>) -> M<'b> Optional. Converts the for ... do ... within the monad.  M<'b> can optionally be M<unit>
member While : (unit -> bool) * M<'a> -> M<'a> Optional. Converts the while ... do ... within the monad. M<'b> can optionally be M<unit>
member Using : 'a * ('a -> M<'b>) -> M<'b> when 'a :> IDisposable Optional. Converts the use bindings within the monad.
member Combine : M<'a> -> M<'a> -> M<'a> Optional. Used to convert sequencing within the monad. The first M<'a> can optionally be M<unit>
member Zero : unit -> M<'a> Optional. Converts the empty else branches of if/then in the monad.
member TryWith : M<'a> -> M<'a> -> M<'a> Optional. Converts the try/with bindings of the monad.
member TryFinally : M<'a> -> M<'a> -> M<'a> Optional. Converts the try/finally bindings of the monad.
  Finally, the following table should be helpful in case you wish to move back and forth between F#, Haskell and the math literature.
Haskell F# Math
return return \eta
>>== bind * operator
instantiate the Monad type class implement at least the Return and Bind members NA
Monad Monad, Monad Expression, Workflow, Computation Expression Monad with an equivalent definition through Kleisli triple. The defining structure in F# and Haskell is really a Kleisli triple but no distinction is made there between the two types.
NA NA \mu
functor through a type class definition usually not mentioned functor
function function ('fun') morphism
Hask category of Haskell data types Net category of data types a wild zoo of categories; groups, topologies, graphs, differential geometries, quantum field theories and so on
composable functions composable functions binary operation and monoids
 

For more enchanting monadic stuff, see 'A kaleidoscope from F# and monads' on this site.

 

References