While popular in Haskell, monads can be encoded effectively in OCaml too. This post describes one such encoding, as well as the process of developing a modern OCaml program.

I assume that you know what monads are and why you should care about them. If this is not the case, then there are some excellent books 1 out there.

Introduction

Haskell is famous for the integration of monads in its core language. Not only are functions like >>= supported syntactically through the do syntax 2, but monads are commonly used for program flow and used by necessity for managing and tracking effects. One of the reasons that it is so nice to use monads (and, for that matter, other abstractions like functors and monoids) in Haskell is its type-classes. Type-classes are a means of achieving so-called “ad-hoc polymorphism”: common interfaces that are shared between things that may share no other relation.

One example of a type-class defined in standard Haskell (the “Prelude”) is Show, which defines how to convert a value into a human-readable string. It makes sense that we would want to be able to show many different kinds of things. Show has the following definition (with an example instance for Bool):

There are some important characteristics of type-classes:

  • Type-class functions can be applied to any type for which an instance is defined in scope somewhere. For example, as long as the Show Int instance is in scope, show 23 can be invoked without additional annotation.

  • There can only be a single global instance in scope of a type-class for a particular type. This means that, for instance, that without language extensions, Show String cannot be defined differently than Show a => Show [a] since String is a type alias for [Char].

While Haskell popularized type-classes through its first-class support for the idiom, it’s absolutely possible to encode these concepts in other programming languages.

Enter OCaml

I’m getting interested in OCaml again. It was actually the first functional programming language I learned back in 2007 or so (shortly afterwards, I learned Haskell), and there are several recent developments in the language and the ecosystem:

  • The opam package manager makes it easier than ever to install OCaml and its libraries on your system, including seamless switching between versions.
  • The freely-available online “Real World OCaml” book is very well-written and accessible.
  • The MirageOS project for building so-called “unikernels” in OCaml: programs that are written in OCaml and that run on “bare metal” on the Xen hypervisor.
  • There are rumors that OCaml will soon be multi-core ready. Currently, threads are multiplexed on a single CPU core.
  • I hear that the new optimization engine, flambda, will be merged soon. It performs whole-program optimization a la MLton and apparently significantly improves performance.

I won’t attempt to make the case of OCaml versus Haskell, but it is worth noting OCaml is strict by default with strong support for lazy evaluation (unlike Haskell, in which the opposite is true).

Another aspect of OCaml that I find extremely compelling is its strong support for modular design through interfaces and its support for functors: modules that are functions of other modules. I was curious about how monads could be encoded in this system, and in particular, if monads in OCaml were viable for everyday use.

Running these examples

I’m going to briefly outline how to get set-up with OCaml since there are still not a tonne of resources out there:

  1. Install opam. Then set-up your environment via opam init --comp=4.02.2 (where OCaml 4.02.2 is the latest version as of this writing).

  2. Install the utop interactive top-level with opam install utop. This is a more modern replacement for the default top-level with readline key-bindings and autocompletion. I’m not a huge fan of the curses “bling”, but it does the job.

  3. (Optional) Install tools for developing OCaml programs: opam install merlin ocp-index ocp-indent. These tools (with the corresponding configuration in Emacs or Vim) make it possible to automatically indent OCaml code, jump to definitions, and dynamically inspect the types of values.

You can also find all of the OCaml source code in this post at the corresponding GitHub repository.

Starting with a functor

One of the most simple higher-order structures in Haskell is the functor (this unfortunately shares a name with the language construct in OCaml). A functor is a structure that we can map over.

Note the file name of the snippet: in OCaml, the name of the file dictates the name of the module to which the contents belong. Therefore, we have just defined the Functor_class module, which consists of a module signature called Functor_class.S.

We can play with this definition with utop:

(Note the #mod_use directive. This loads the file by wrapping it in a module just like the OCaml compiler would. If we had typed #use instead, the definitions would be available as if we were inside the module at the top-level.)

Let’s look at a simple functor instance:

and how it could be used:

Monads and OCaml functors

With the definition of functors in mind, look at the following definition of a monad structure:

In Haskell, type-classes can be defined in terms of a minimal set of functions that can be used to implement the other functions. In the case of the monad definition, a number of a functions are usually expressed in terms of two functions: >>= and return.

OCaml doesn’t have the notion of partially-implemented signatures in modules: either all values are abstract (module signatures) or all are concrete (modules). Thus, to mirror this notion of functions being defined in terms of others, we use a OCaml functor to “extend” the basic type-class functions of pure and bind into the rest of the monad functions. There are some more details of this technique in the functors chapter of “Real World OCaml”.

Expressing relationships between higher-order structures

Any monad instance is a functor instance “for free”. We can express this idea in the revised definition of the functor class:

We’ve expressed that given any monad instance, we can obtain a functor instance by implementing map in terms of bind and pure.

Looking at the option monad

The Maybe monad instance in Haskell is (in my opinion) the easiest to understand. Let’s see how it can be expressed in OCaml. Since types are lowercase in OCaml, we can create option.ml (yielding an Option module) for our definitions without any conflict with the language:

Let’s compile this module (along with its dependent modules) to see how it works using the ocamlbuild tool:

ocamlbuild option.cmo

Ocamlbuild creates compiled objects into the _build directory, so we should start-up utop with an amended include path: utop -I _build. Inside utop, we use the #load_rec directive to load the Option implementation and any modules that it depends on. We can use the #show directive to view a module’s signature from within the top-level, too.

Monads with multiple type parameters: state

Some monads, like Haskell’s Maybe monad and IO monad, have only a single type parameter. However, many have more than one. It’s not immediately clear how to encode this into OCaml’s type system.

One solution is to use another functor. Let’s see how this works in an implementation in OCaml of the State monad in Haskell. In this case, the state type is an any module that defines a type t, and our module is parameterized on this state:

As an example, consider an example directly from “Functional Programming in Scala”: implementing a simple candy dispenser state machine. First, we’ll show the interface (which doesn’t reveal any state-monad implementation):

and then the implementation, using our state monad parameterized over Machine_state:

As a quick example of the candy dispenser:

You can compile this example with ocamlbuild main.native which will produce an executable main.native.

Conclusions and next steps

It is undeniably more syntactically messy to define these structures in OCaml over Haskell. Nonetheless, I think it’s compelling that OCaml’s module system, which is so useful for structuring programs with interfaces, can define such high-level abstractions without breaking a sweat.

In a paradoxical way, I actually find that sometimes implementing these and other structures in OCaml (and similarly, sometimes Scala) offers more educational insight into how they all work, since there’s less magic with laziness and instances have to be brought directly into scope. I suspect it also makes readability clearer for heavily monadic code. A block that begins with

let open Option.Monad in

makes it abundantly clear that the monadic operations that follow are in the context of the option monad.

I’d like to fiddle in the future with monad transformers in OCaml, and maybe try writing a larger OCaml program that actually makes use of monads internally.

Implicits

What’s very interesting is that there is a branch of OCaml that supports a sort-of hybrid of the existing module system and Haskell’s type-classes called implicit modules. This language construct is virtually identical to Scala’s implicit mechanism, and more details can be found here. The paper has lots of nifty examples.

Other references

Updates

  • 26 February 2017: Removed unnecessary laziness from the definition of the monad signature.

Footnotes