Banging My Head Against the Wall With Haskell and C++ FFI

Posted on 18th of December 2022 | 1747 words

As some of you may know, I have a soft spot for functional programming. So most of the code I write outside my professional work tends to be written in Haskell!

Unfortunately, we still live in a world of imperative languages, and due to the sheer amount of that code, we most likely will be living in that world for years to come; I can’t say forever since the planet is not here forever, but we’re talking about a long, long time. The same goes for my own professional life.

I’m not very fanatical when it comes to tooling as long as it gets the job done. But still, the interaction between the imperative and functional worlds is fascinating, and that is where the foreign function interface (FFI) comes in.

Lately, I’ve been banging my head against the wall with Haskell’s FFI since I wanted to write a particular piece of code mainly in Haskell, but I needed something from the world of C++. Could I have written this in just Haskell? Probably, but despite enjoying functional programming more than imperative code, certain data structures, algorithms and libraries work better in the C++ world, and sometimes you might need that “extra oomph” when it comes to performance.

Okay, “better” is a strong word for this. Maybe it’s more that they fit the imperative world well but don’t translate one to one to the fully functional and pure world of Haskell, where there are better-suited options for data structures and algorithms. And sure, some libraries are only written in C++, so there is nothing that can be done about that. For functional data structures, I definitely recommend Purely Functional Data Structures by Chris Okasaki.

Generally speaking, when it comes to FFI, interacting with C is pretty straightforward, but when C++ comes into play, some extra ceremony is required due to the nature of the language: an unstable ABI, name mangling, and a larger language with more complex features. Thankfully, C++ offers an easy way to write C ABI compatible code with the extern "C" linkage specification, which makes a function callable from C. But there is a caveat. While you can use C++ features inside the function body, the type signature of an extern "C" function needs to be compatible with C. So no fancy STL types, references, templates or overloads in the signature.
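To make that caveat concrete, here is a minimal sketch. The function sum_ints is a hypothetical helper that is not part of the library built later in this post; it only shows that the body can use the STL freely as long as the signature sticks to types that C understands.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of the arith library. The signature uses only
// C-compatible types (a pointer and a length), so C or Haskell could call it.
extern "C" int sum_ints(const int *xs, std::size_t n) {
  assert(xs != nullptr || n == 0);
  // Inside the body we are free to use the STL.
  std::vector<int> v(xs, xs + n);
  return std::accumulate(v.begin(), v.end(), 0);
}

// A signature like this one would not be callable from C, since std::vector
// and references have no C equivalent:
// extern "C" int sum_vec(const std::vector<int> &v);
```

The pointer-plus-length pair is the usual way to pass a sequence across a C boundary; the copy into a std::vector is only there to show STL use inside the body.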

Also, when we talk about C and/or C++, memory management comes into question. So if you want to bind those languages to some memory-managed language like Haskell, you need to ensure that the memory gets handled correctly. C++ offers fancy features like RAII, smart pointers and stuff for making memory management a little bit easier, but that’s not the case in C.

Let’s start by creating a small Cabal project and some sample C++ library that we would like to interact with from Haskell.

common base
  ghc-options: -Wall -Wextra -Wno-orphans -Wno-name-shadowing
  default-language: Haskell2010
  build-depends: base ^>=4.16.3.0

executable arith
  import: base
  main-is: Main.hs
  hs-source-dirs: app

  -- C++ bits
  cxx-options: -std=c++20 -Wall -Werror -Wextra
  cxx-sources: cbits/arith_capi.cc cbits/arith.cc
  include-dirs: cbits
  extra-libraries: stdc++

While it’s otherwise reasonably standard Cabal boilerplate, the interesting bits are the lines relating to C++. Basically, what we do here is define some options for the C++ compiler, list the C++ source files, tell Cabal where the relevant header files are located, and link against the C++ standard library. Commonly, in Haskell, if your library/application has anything related to C or C++, the relevant files reside in a directory called cbits, but nothing forces you to follow this convention. When that’s done, we can proceed to write some “earth-shattering” C++ library for our application.

// arith.h

#pragma once

struct arith {
  arith() noexcept;

  int add(int x, int y) noexcept;
  int sub(int x, int y) noexcept;
  int mult(int x, int y) noexcept;
  int div(int x, int y) noexcept;
};

// arith.cc

#include "arith.h"

arith::arith() noexcept {}

int arith::add(int x, int y) noexcept { return x + y; }
int arith::sub(int x, int y) noexcept { return x - y; }
int arith::mult(int x, int y) noexcept { return x * y; }
int arith::div(int x, int y) noexcept { return x / y; }

We’ll just define a simple arith struct/class with some elementary arithmetic operations. Nothing too fancy. This will work as our C++ library that we’ll interact with via Haskell. After that’s done, we need to provide some simple C API for this library so that we can interact with the library via the stable C ABI.

// arith_capi.h

#pragma once

#ifdef __cplusplus
extern "C" {
#endif

typedef struct arith arith;

extern arith *arith_new(void);

extern void arith_delete(arith *p);

extern int arith_add(arith *p, int x, int y);
extern int arith_sub(arith *p, int x, int y);
extern int arith_mult(arith *p, int x, int y);
extern int arith_div(arith *p, int x, int y);

#ifdef __cplusplus
}
#endif

// arith_capi.cc

#include "arith_capi.h"
#include "arith.h"

extern "C" {
  struct arith *arith_new() { return new arith(); }

  void arith_delete(arith *p) { delete p; }

  int arith_add(arith *p, int x, int y) { return p->add(x, y); }

  int arith_sub(arith *p, int x, int y) { return p->sub(x, y); }

  int arith_mult(arith *p, int x, int y) { return p->mult(x, y); }

  int arith_div(arith *p, int x, int y) { return p->div(x, y); }
}

In our C API, you’ll notice that we wrap our functions inside extern "C" to ensure that they’re compatible with the C ABI. Since extern "C" only exists in C++, the header wraps it in an #ifdef __cplusplus directive so that it is only seen when the header is compiled as C++, and a plain C compiler can still read the declarations. On the implementation side, you can notice that we use new and delete to do the memory management. Using those keywords directly in “modern C++” is very much frowned upon since the language offers better ways to do that management with features like RAII and smart pointers, which let the programmer leave the cleanup to the compiler instead of doing it explicitly like we do here. We, on the other hand, need them because we do a similar kind of management from Haskell with its foreign pointers, which lets us leave the memory management to Haskell’s runtime and garbage collector.
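For comparison, this is roughly what the same ownership looks like when it stays on the C++ side. It is only a sketch: the arith class and the C functions are cut down and repeated so that it compiles on its own, and arith_handle and add_via_handle are hypothetical names that don’t appear anywhere else in the project.

```cpp
#include <cassert>
#include <memory>

// Cut-down stand-in for the arith class and its C API, repeated here so the
// sketch is self-contained.
struct arith {
  int add(int x, int y) noexcept { return x + y; }
};

extern "C" {
arith *arith_new() { return new arith(); }
void arith_delete(arith *p) { delete p; }
int arith_add(arith *p, int x, int y) { return p->add(x, y); }
}

// Hypothetical alias: a unique_ptr that frees the handle with arith_delete.
using arith_handle = std::unique_ptr<arith, decltype(&arith_delete)>;

int add_via_handle(int x, int y) {
  arith_handle h(arith_new(), &arith_delete);
  assert(h != nullptr);
  // When h goes out of scope at the end of this function, arith_delete runs
  // automatically.
  return arith_add(h.get(), x, y);
}
```

A smart pointer with a custom deleter pairs an allocation with the function that releases it, and that pairing is the same idea the Haskell side relies on when it attaches arith_delete to a foreign pointer.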

Now that we have the C++ bits done, we can look at how to interact with them from Haskell. Haskell’s FFI features are enabled by GHC’s {-# LANGUAGE ForeignFunctionInterface #-} language extension. The FFI is part of the Haskell 2010 standard, so with default-language: Haskell2010 it is already on, but spelling it out (either on top of the file or in the project’s Cabal file) does no harm and documents the intent. With that in place, we can import some of the needed modules.

{-# LANGUAGE ForeignFunctionInterface #-}

module Main where

import Control.Exception ( mask_ )
import Foreign.Ptr ( FunPtr, Ptr )
import Foreign.C.Types ( CInt(..) )
import Foreign.ForeignPtr ( ForeignPtr, newForeignPtr, withForeignPtr )

What are we importing here?

- mask_ is needed to avoid leaking the pointer in case an async exception occurs between allocating it and wrapping it in a foreign pointer.
- Foreign.Ptr is a module for pointers to foreign data, from which we’re using only FunPtr, a pointer to a function, and Ptr, a general pointer to an object.
- Foreign.C.Types holds the Haskell counterparts of C types. Our C API only uses int, which corresponds to the Haskell type CInt.
- Foreign.ForeignPtr, as mentioned above, represents a reference to an object that is maintained in a foreign language rather than by Haskell’s own storage manager. It differs from the plain pointer type, Ptr, in that we can attach finalizers to it. Haskell’s storage manager invokes them once no references to the ForeignPtr remain, which is how the foreign object gets cleaned up.

After the importing shenanigans, we can add the foreign imports to our code so that we can actually call the C++ code we just wrote.

data Arith

foreign import ccall unsafe "arith_capi.h arith_new" c_arithNew :: IO (Ptr Arith)
foreign import ccall unsafe "arith_capi.h &arith_delete" c_arithDelete :: FunPtr (Ptr Arith -> IO ())
foreign import ccall unsafe "arith_capi.h arith_add" c_arithAdd :: Ptr Arith -> CInt -> CInt -> IO CInt
foreign import ccall unsafe "arith_capi.h arith_sub" c_arithSub :: Ptr Arith -> CInt -> CInt -> IO CInt
foreign import ccall unsafe "arith_capi.h arith_mult" c_arithMult :: Ptr Arith -> CInt -> CInt -> IO CInt
foreign import ccall unsafe "arith_capi.h arith_div" c_arithDiv :: Ptr Arith -> CInt -> CInt -> IO CInt

So what’s happening here:

First, we declare an empty data type, Arith, which we can pass as a tag to the Ptr type. Then, taking the first import as an example:
foreign import ccall unsafe "arith_capi.h arith_new" c_arithNew :: IO (Ptr Arith)
- foreign import ccall tells Haskell that we’re importing a C call.
- unsafe tells Haskell that the C call won’t be calling back into Haskell, which lets GHC skip some bookkeeping and produces a little less overhead when crossing languages. It is best kept for short calls that don’t block.
- "arith_capi.h arith_new" tells what we’re actually importing to Haskell.
- c_arithNew :: IO (Ptr Arith) defines the name with what we can call the function and its type signature.
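The import of arith_delete is the odd one out: the & in front of the name imports the address of the function instead of a callable wrapper, which is why its Haskell type is FunPtr (Ptr Arith -> IO ()). On the C++ side that is an ordinary function pointer. The sketch below is only an illustration: arith is reduced to an empty stand-in, and live_objects, finalizer_t and live_after_finalizing are hypothetical names used to show what a runtime does with such an address.

```cpp
#include <cassert>

struct arith {};  // empty stand-in for the real class

static int live_objects = 0;  // hypothetical counter, only for this sketch

extern "C" {
arith *arith_new() {
  ++live_objects;
  return new arith();
}
void arith_delete(arith *p) {
  delete p;
  --live_objects;
}
}

// The C-level shape that FunPtr (Ptr Arith -> IO ()) corresponds to.
using finalizer_t = void (*)(arith *);

// Hypothetical stand-in for a runtime: it only stores the address of the
// finalizer and calls through it once the object is no longer needed.
int live_after_finalizing() {
  finalizer_t fin = &arith_delete;  // the address that "&arith_delete" imports
  arith *p = arith_new();
  assert(live_objects == 1);
  fin(p);
  return live_objects;
}
```

Haskell’s garbage collector decides when to make that call, so the program never invokes arith_delete by hand.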

Finally, you’re actually able to call these functions.

-- | Create a new foreign object whose finalizer frees it once it is no longer
-- in use. The allocation and the wrapping both run inside mask_, so an async
-- exception cannot arrive between them and leak the raw pointer.
newArith :: IO (ForeignPtr Arith)
newArith = mask_ (c_arithNew >>= newForeignPtr c_arithDelete)

main :: IO ()
main = newArith >>= \arith -> withForeignPtr arith $ \ptr -> do
  -- Foreign object is now unwrapped to a foreign pointer which you can use in
  -- any FFI function you described above.
  c_arithAdd ptr 1 1 >>= print
  c_arithSub ptr 1 1 >>= print
  c_arithMult ptr 2 2 >>= print
  c_arithDiv ptr 2 2 >>= print

Now you should be able to run your program and interact with C++ in a safe manner from the comfortable world of Haskell.

$ cabal run
# ... cabal stuff ...
2
0
4
1

So you can see that while calling C++ from a foreign language is definitely possible, it just requires a bit more ceremony than calling C from these kinds of languages. I initially started to bang my head against the wall with these FFI shenanigans just because I needed to use some C++ interfaces that didn’t offer a C API, so hopefully, this proves to be beneficial to some. At the very least, I can use this to refresh my memory in the future since I can guarantee that I’ll forget about it.