Banging My Head Against the Wall With Haskell and C++ FFI
Posted on 18th of December 2022 | 1747 words

As some of you may know, I have a soft spot for functional programming, so most of the code I write outside my professional work ends up being written in Haskell!
Unfortunately, we still live in a world of imperative languages, and given the sheer amount of that code, we most likely will for years to come; I can’t say forever since the planet is not here forever, but we’re talking about a long, long time. The same goes for my own professional life.
I’m not very fanatical when it comes to tooling as long as it gets the job done. But still, the interaction between the imperative and functional worlds is fascinating, and that interaction is what the foreign function interface (FFI) is for.
Lately, I’ve been banging my head against the wall with Haskell’s FFI, since I wanted to write a particular piece of code mainly in Haskell but needed something from the world of C++. Could I have written it in just Haskell? Probably, but even though I enjoy functional programming more than imperative code, certain data structures, algorithms and libraries work better in the C++ world, and sometimes you might need that “extra oomph” when it comes to performance.
Okay, “better” is a strong word for this. Maybe it’s fairer to say that they work well in the imperative world but don’t translate one-to-one to the pure, functional world of Haskell, which has its own, better-suited options for data structures and algorithms. And sure, some libraries are only written in C++, so there is nothing that can be done about that. For functional data structures, I definitely recommend Purely Functional Data Structures by Chris Okasaki.
Generally speaking, when it comes to FFI, interacting with C is pretty straightforward, but when C++ comes into play, some extra ceremony is required due to the nature of the language: no standardized ABI, name mangling, a much larger language and more complex features. Thankfully, C++ offers an easy way to write C ABI compatible code with the extern "C" linkage specification, which makes it so that the code can be called from C. But there is a caveat. While you can use any C++ feature inside the function body, the type signature of an extern "C" function needs to be compatible with C. So no fancy STL types etc. in the parameters or the return type.
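To make that caveat a bit more concrete, here is a minimal sketch. The function sum_ints is a hypothetical helper I made up for illustration and is not part of the library built later in this post. Its signature sticks to a pointer and a length, which C understands, while the body happily uses the STL.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of the arith library in this post.
// The signature uses only C-compatible types (pointer + length),
// while the body is free to use C++ features such as the STL.
extern "C" int sum_ints(const int *xs, std::size_t n) noexcept {
  if (xs == nullptr || n == 0) {
    return 0;
  }
  // Copying into a std::vector is not needed for a sum; it is here only to
  // show that STL containers are fine inside the function body.
  const std::vector<int> values(xs, xs + n);
  return std::accumulate(values.begin(), values.end(), 0);
}
```

Taking a std::vector<int> as the parameter instead would still compile as C++, but a C caller (or Haskell’s FFI) would have no way to construct one. Note also that the function is marked noexcept, so a failed allocation inside it terminates the program instead of throwing across the language boundary.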
Also, when we talk about C and/or C++, memory management comes into question. So if you want to bind those languages to some memory-managed language like Haskell, you need to ensure that the memory gets handled correctly. C++ offers fancy features like RAII, smart pointers and stuff for making memory management a little bit easier, but that’s not the case in C.
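As a rough illustration of the difference, the sketch below does the same allocation twice, once with manual new and delete and once with std::unique_ptr. The tracked type and the live_count counter are made up purely to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter and type, only here so that cleanup can be observed.
int live_count = 0;

struct tracked {
  tracked() noexcept { ++live_count; }
  ~tracked() { --live_count; }
  tracked(const tracked &) = delete;
  tracked &operator=(const tracked &) = delete;
};

// Manual style: every allocation needs a matching, hand-written release.
int live_during_manual() {
  tracked *p = new tracked();
  const int seen = live_count;
  delete p; // forgetting this line leaks the object
  return seen;
}

// RAII style: std::unique_ptr's destructor releases the object when the
// owner goes out of scope, also on an early return or an exception.
int live_during_raii() {
  const auto p = std::make_unique<tracked>();
  return live_count; // evaluated before p is destroyed
}
```

Both functions see exactly one live object while they run and leave none behind, but only the first one depends on the programmer remembering the delete.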
Let’s start by creating a small Cabal project and some sample C++ library that we would like to interact with from Haskell.
    common base
      ghc-options:      -Wall -Wextra -Wno-orphans -Wno-name-shadowing
      default-language: Haskell2010
      build-depends:    base ^>=4.16.3.0

    executable arith
      import:         base
      main-is:        Main.hs
      hs-source-dirs: app

      -- C++ bits
      cxx-options:     -std=c++20 -Wall -Werror -Wextra
      cxx-sources:     cbits/arith_capi.cc
                       cbits/arith.cc
      include-dirs:    cbits
      extra-libraries: stdc++
While otherwise it’s reasonably standard Cabal boilerplate, the interesting bits are the lines relating to C++. Basically, what we do here is define some options for the C++ compiler, list the C++ source files, and tell Cabal where the relevant header files are located. We also link against the C++ standard library with extra-libraries. Commonly, if a Haskell library or application has had any C/C++ files, those have resided in a directory called cbits, but nothing forces you to follow this convention. When that’s done, we can proceed to write some “earth-shattering” C++ library for our application.
    // arith.h
    #pragma once

    struct arith {
      arith() noexcept;

      int add(int x, int y) noexcept;
      int sub(int x, int y) noexcept;
      int mult(int x, int y) noexcept;
      int div(int x, int y) noexcept;
    };

    // arith.cc
    #include "arith.h"

    arith::arith() noexcept {}

    int arith::add(int x, int y) noexcept { return x + y; }
    int arith::sub(int x, int y) noexcept { return x - y; }
    int arith::mult(int x, int y) noexcept { return x * y; }
    int arith::div(int x, int y) noexcept { return x / y; }
We just define a simple arith struct/class with some elementary arithmetic operations. Nothing too fancy. This will work as the C++ library that we’ll interact with from Haskell. After that’s done, we need to provide a simple C API for this library so that we can interact with it via the stable C ABI.
    // arith_capi.h
    #pragma once

    #ifdef __cplusplus
    extern "C" {
    #endif

    typedef struct arith arith;

    extern arith *arith_new();
    extern void arith_delete(arith *p);

    extern int arith_add(arith *p, int x, int y);
    extern int arith_sub(arith *p, int x, int y);
    extern int arith_mult(arith *p, int x, int y);
    extern int arith_div(arith *p, int x, int y);

    #ifdef __cplusplus
    }
    #endif

    // arith_capi.cc
    #include "arith_capi.h"
    #include "arith.h"

    extern "C" {

    struct arith *arith_new() { return new arith(); }
    void arith_delete(arith *p) { delete p; }

    int arith_add(arith *p, int x, int y) { return p->add(x, y); }
    int arith_sub(arith *p, int x, int y) { return p->sub(x, y); }
    int arith_mult(arith *p, int x, int y) { return p->mult(x, y); }
    int arith_div(arith *p, int x, int y) { return p->div(x, y); }

    }
In our C API, you’ll notice that we need to wrap our functions inside extern "C" to ensure that they’re compatible with the C ABI. Also, since extern "C" only exists in C++, we wrap it in an #ifdef __cplusplus directive so that it’s only seen when the header is compiled as C++; a C compiler just sees the plain declarations. On the implementation side, you can notice that we use new and delete to do the memory management. The thing to note here is that using those keywords in “modern C++” is very much frowned upon, since the language offers better ways to do that management with features like RAII, smart pointers etc., which basically make it so that programmers don’t need to manage memory explicitly like we do here but can instead let the compiler do it for them. We, on the other hand, need those keywords since we do a similar kind of management but from Haskell with its foreign pointers, which lets us leave the memory management to Haskell’s runtime and garbage collector.
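One thing the wrapper above glosses over is that a plain new can throw std::bad_alloc, and a C++ exception must not unwind into a C or Haskell caller. Below is a hedged sketch of how the allocation could be hardened. The name arith_new_nothrow is hypothetical, and the arith struct is a cut-down stand-in repeated only so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <new>

// Stand-in for the arith class from arith.h, trimmed to one operation so
// the sketch is self-contained.
struct arith {
  int add(int x, int y) noexcept { return x + y; }
};

extern "C" {

// Hypothetical variant of arith_new: returns a null pointer instead of
// throwing std::bad_alloc, so no exception can cross the C boundary.
arith *arith_new_nothrow() noexcept { return new (std::nothrow) arith(); }

// delete on a null pointer is a no-op, so this is safe to call on the
// result of a failed allocation as well.
void arith_delete(arith *p) noexcept { delete p; }

int arith_add(arith *p, int x, int y) noexcept { return p->add(x, y); }

} // extern "C"
```

With a variant like this, the calling side would have to check the returned pointer for null before using it, so it trades one failure mode for an explicit check.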
Now that we have the C++ bits done, we can proceed to how to interact with them from Haskell. Haskell’s FFI features are controlled by GHC’s {-# LANGUAGE ForeignFunctionInterface #-} language extension. GHC enables FFI support by default, but I like to state it explicitly (either on top of the file or in the project’s Cabal file). With that in place, we can already import some of the needed modules.
    {-# LANGUAGE ForeignFunctionInterface #-}

    module Main where

    import Control.Exception  ( mask_ )
    import Foreign.Ptr        ( FunPtr, Ptr )
    import Foreign.C.Types    ( CInt(..) )
    import Foreign.ForeignPtr ( ForeignPtr, newForeignPtr, withForeignPtr )
What are we importing here?

- mask_ is needed to avoid leaking the pointer in case an async exception occurs between allocating it and wrapping it in a foreign pointer.
- Foreign.Ptr is a module for pointers to foreign data, from which we’re using only FunPtr, a pointer to a function, and Ptr, a general pointer to an object.
- Foreign.C.Types holds the mappings of C types to Haskell types. We’re only using int in C++, and the Haskell type CInt corresponds to that.
- Foreign.ForeignPtr, as mentioned above, is used for representing an object that is maintained in a foreign language and not by Haskell’s own storage manager. The way it differs from the plain pointer type, Ptr, is that we can associate finalizers with it, which Haskell’s storage manager invokes once there are no more references to the foreign pointer, essentially cleaning up the object.
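If you come from the C++ side, a foreign pointer with a finalizer may feel familiar: it is loosely comparable to a std::shared_ptr with a custom deleter. The sketch below is only an analogy and not how GHC implements anything. The arith struct is again a trimmed stand-in, and new_arith_handle and deleted_count are hypothetical names used to make the release observable.

```cpp
#include <cassert>
#include <memory>

// Trimmed stand-in for the arith class so the sketch is self-contained.
struct arith {
  int add(int x, int y) noexcept { return x + y; }
};

int deleted_count = 0; // hypothetical, only makes the release observable

extern "C" {
arith *arith_new() { return new arith(); }
void arith_delete(arith *p) {
  delete p;
  ++deleted_count;
}
}

// Pair the raw handle with the function that must eventually release it,
// which is roughly the role newForeignPtr plays on the Haskell side.
std::shared_ptr<arith> new_arith_handle() {
  return std::shared_ptr<arith>(arith_new(), &arith_delete);
}
```

The deleter runs once the last owner goes away. The big difference is timing: std::shared_ptr releases the object deterministically at that moment, while a Haskell finalizer runs at some point after the garbage collector notices that the foreign pointer is unreachable.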
After the importing shenanigans, we can proceed on to make foreign imports to our code so that we can actually call the C++ code we just wrote.
    data Arith

    foreign import ccall unsafe "arith_capi.h arith_new"
      c_arithNew :: IO (Ptr Arith)

    foreign import ccall unsafe "arith_capi.h &arith_delete"
      c_arithDelete :: FunPtr (Ptr Arith -> IO ())

    foreign import ccall unsafe "arith_capi.h arith_add"
      c_arithAdd :: Ptr Arith -> CInt -> CInt -> IO CInt

    foreign import ccall unsafe "arith_capi.h arith_sub"
      c_arithSub :: Ptr Arith -> CInt -> CInt -> IO CInt

    foreign import ccall unsafe "arith_capi.h arith_mult"
      c_arithMult :: Ptr Arith -> CInt -> CInt -> IO CInt

    foreign import ccall unsafe "arith_capi.h arith_div"
      c_arithDiv :: Ptr Arith -> CInt -> CInt -> IO CInt
So what’s happening here:

- data Arith creates an empty data type which we can pass in as a tag to the Ptr type.
- Every import has the same shape. Taking foreign import ccall unsafe "arith_capi.h arith_new" c_arithNew :: IO (Ptr Arith) as the example:
  - foreign import ccall tells Haskell that we’re importing a C call.
  - unsafe tells Haskell that the C call won’t be calling back into Haskell, which essentially makes it produce a little bit less overhead when crossing languages.
  - "arith_capi.h arith_new" tells what we’re actually importing to Haskell.
  - c_arithNew :: IO (Ptr Arith) defines the name with which we can call the function and its type signature.
- The & in "arith_capi.h &arith_delete" imports the address of the function instead of calling it, which is why c_arithDelete is a FunPtr that we can hand over as a finalizer.
Finally, you’re actually able to call these functions.
    -- | Create a new foreign object that will be cleaned up once it's no longer
    -- in use. Both the allocation and the wrapping run inside mask_, so that an
    -- async exception can't arrive in between and leak the pointer.
    newArith :: IO (ForeignPtr Arith)
    newArith = mask_ (c_arithNew >>= newForeignPtr c_arithDelete)

    main :: IO ()
    main = newArith >>= \arith -> withForeignPtr arith $ \ptr -> do
      -- The foreign object is now unwrapped to a plain pointer which you can
      -- use in any of the FFI functions imported above.
      c_arithAdd ptr 1 1 >>= print
      c_arithSub ptr 1 1 >>= print
      c_arithMult ptr 2 2 >>= print
      c_arithDiv ptr 2 2 >>= print
Now you should be able to run your program and interact with C++ in a safe manner from the comfortable world of Haskell.
    $ cabal run
    # ... cabal stuff ...
    2
    0
    4
    1
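One caveat on “safe”: the example only divides 2 by 2. In C++, integer division by zero (and INT_MIN divided by -1) is undefined behaviour, and in practice it tends to take the whole process down, Haskell runtime included. Here is a minimal sketch of a checked variant, assuming we’re free to extend the C API. The name arith_div_checked and its status-code convention are my own invention for illustration.

```cpp
#include <cassert>
#include <climits>

// Hypothetical checked variant of arith_div; not part of the post's API.
// Returns 0 on success and writes the quotient to *out. Returns -1 when the
// division would be undefined behaviour in C++ or when out is null.
extern "C" int arith_div_checked(int x, int y, int *out) noexcept {
  if (out == nullptr || y == 0 || (x == INT_MIN && y == -1)) {
    return -1;
  }
  *out = x / y;
  return 0;
}
```

On the Haskell side, the out-parameter could be filled in through a temporary allocated with alloca from Foreign.Marshal.Alloc, with the status code turned into a Maybe or an exception, depending on taste.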
So you can see that while calling C++ from a foreign language is definitely possible, it just requires a bit more ceremony than calling C from these kinds of languages. I initially started to bang my head against the wall with these FFI shenanigans because I needed to use some C++ interfaces that didn’t offer a C API, so hopefully this proves to be beneficial to some. At the very least, I can use this to refresh my memory in the future, since I can guarantee that I’ll forget about it.