Stumbled upon an interesting video by Andrej Karpathy talking about
building Generatively Pretrained Transformer (GPT). I definitely need
to take some time to start looking through the papers linked in the
video on wonder how things work out.
As some of you may know, I have a soft spot for functional
programming. So most of the code I tend to write outside my
professional work tends to be written in Haskell. Sometimes some other
functional languages too, but Haskell is definitely the one that I
tend to default if I’m able to have a choice. Unfortunately, the world
we live in is still living in a world of imperative languages, and due
to the sheer amount of that code, we most likely will be living in
that world for years to come; I can’t say forever since the planet is
not here forever, but we’re talking about a long, long time. It’s also
the same for my own professional life. Even though I work as a DevOps
consultant, I’m mainly working on projects where I’m contributing
code, mainly in Go and C++ and, to a lesser extent, in JavaScript and
Python. Of course, I would like to work more with something like
Haskell for a living, but at the same time, I’m not very fanatical
when it comes to tooling as long as it gets the job done. But still,
the interaction between imperative and functional worlds is
fascinating. So we are talking about foreign function interface – FFI
– here.
Most code I tend to write, at least considering lines of code, is
definitely, Haskell and C++, and I’m quite comfortable with both of
those languages, although the latter one may be a little more
reluctantly. Lately, I’ve been banging my head against the wall with
the FFI of Haskell since I wanted to write a particular piece of code
in mainly Haskell, but I needed something from the world of C++. Could
have I written this in just Haskell? Probably, but despite enjoying
functional programming more over imperative code, certain data
structures, algorithms and libraries work better in the C++ world and
sometimes you might need to “extra oomph” when it comes to
performance.
Okay, better is a strong word for this, maybe better on how they
work in the imperative world and but they don’t translate one to one
to fully functional and pure world of Haskell where there are better
options for data structures and algorithms. Sure some libraries are
only written in C++, so there is nothing that can be done with
it. For functional data structures, I definitely recommend Purely
Functional Data Structures by Chris
Okasaki.
Generally speaking, when it comes to FFI, interacting with C is pretty
straightforward, but when C++ comes into play, some extra ceremony is
required due to the nature of the language with things like unstable
ABI, more extensive language and more complex features. Thankfully,
C++ offers an easy way to write C ABI compatible code with the extern "C"
keyword, which makes it so that the code can be called from
C. But there is a caveat in this. While you can use C++ features
inside the function body, when you use extern "C"
functions type
signature needs to be compatible with C. So no fancy STL features
etc., here.
Also, when we talk about C and/or C++, memory management comes into
question. So if you want to bind those languages to some
memory-managed language like Haskell, you need to ensure that the
memory gets handled correctly. C++ offers fancy features like RAII,
smart pointers and stuff for making memory management a little bit
easier, but that’s not the case in C.
Let’s start by creating a small Cabal project and some sample C++
library that we would like to interact with from Haskell.
common base
ghc-options: -Wall -Wextra -Wno-orphans -Wno-name-shadowing
default-language: Haskell2010
build-depends: base ^>=4.16.3.0
executable arith
import: base
main-is: Main.hs
hs-source-dirs: app
-- C++ bits
cxx-options: -std=c++20 -Wall -Werror -Wextra
cxx-sources: cbits/arith_capi.cc cbits/arith.cc
include-dirs: cbits
extra-libraries: stdc++
While otherwise, it’s a reasonably standard Cabal boilerplate, the
interesting bits are the lines relating to C++. Basically, what we do
here is define some compiler options for the C++ compiler, where the
C++ source files are located, and the relevant header files. Commonly,
in Haskell, if you’re library/application has had anything related to
C/C++ files relevant for those have usually resided in a directory
called cbits
, but nothing forces you to follow this convention. When
that’s done, we can proceed to write some “earth-shattering” C++
library for our application.
// arith.h
#pragma once
struct arith {
arith() noexcept;
int add(int x, int y) noexcept;
int sub(int x, int y) noexcept;
int mult(int x, int y) noexcept;
int div(int x, int y) noexcept;
};
// arith.cc
#include "arith.h"
arith::arith() noexcept {}
int arith::add(int x, int y) noexcept { return x + y; }
int arith::sub(int x, int y) noexcept { return x - y; }
int arith::mult(int x, int y) noexcept { return x * y; }
int arith::div(int x, int y) noexcept { return x / y; }
We’ll just define a simple arith
struct/class with some elementary
arithmetic operations. Nothing too fancy. This will work as our C++
library that we’ll interact with via Haskell. After that’s done, we
need to provide some simple C API for this library so that we can
interact with the library via the stable C ABI.
// arith_capi.h
#pragma once
#ifdef __cplusplus
extern "C" {
#endif
typedef struct arith arith;
extern arith *arith_new();
extern void arith_delete(arith *p);
extern int arith_add(arith *p, int x, int y);
extern int arith_sub(arith *p, int x, int y);
extern int arith_mult(arith *p, int x, int y);
extern int arith_div(arith *p, int x, int y);
#ifdef __cplusplus
}
#endif
// arith_capi.cc
#include "arith_capi.h"
#include "arith.h"
extern "C" {
struct arith *arith_new() { return new arith(); }
void arith_delete(arith *p) { delete p; }
int arith_add(arith *p, int x, int y) { return p->add(x, y); }
int arith_sub(arith *p, int x, int y) { return p->sub(x, y); }
int arith_mult(arith *p, int x, int y) { return p->mult(x, y); }
int arith_div(arith *p, int x, int y) { return p->div(x, y); }
}
In our C API, you’ll notice that we need to wrap our functions inside
extern "C"
to ensure that they’re compatible with C ABI. Also since
extern "C"
is a C++ keyword, we’ll wrap it in #ifdef __cplusplus
directive to ensure that it gets only used if we happen to call this
via C++. In the actual implementation side, you can notice that we use
new
and delete
to do the memory management. The thing to note here
is that using those keywords in “modern C++” is very much frowned upon
since the language offers better ways to do that management with
features like RAII, smart pointers etc., which basically makes it so
that programmer don’t need memory management explicitly like we do
here, but instead, they can let the compiler do it for you. We on the
other need to use those keywords since we are similar management but
from Haskell with its foreign pointers, which makes it so that we are
able to leave the memory management to Haskell’s runtime and garbage
collector.
Now we have the C++ bits done, we can proceed on how to interact with
that via Haskell. All Haskell’s FFI features reside behind GHC’s {-# LANGUAGE ForeignFunctionInterface #-}
language extension. So first,
we need to include that in our Haskell files (either on top of the
file or in the project’s Cabal file), and we can already import some
of the needed modules.
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where
import Control.Exception ( mask_ )
import Foreign.Ptr ( FunPtr, Ptr )
import Foreign.C.Types ( CInt(..) )
import Foreign.ForeignPtr ( ForeignPtr, newForeignPtr, withForeignPtr )
What are we importing here?
mask_
is needed to avoid leaking the pointer in case an async
exception occurs between allocation and wrapping it in a foreign
pointer.
Foreign.Ptr
is a module holding pointers to foreign data from
which we’re using only FunPtr
, a pointer to a function, and Ptr
,
general pointer to an object.
Foreign.C.Types
holds the mappings of C types to Haskell types,
we’re now only using only int
in C++, so Haskell type CInt
corresponds to that.
Foreign.ForeignPtr
, as mentioned above, is used for representing
an object that is maintained in a foreign language and not Haskell’s
own storage manager. The way it differs from vanilla memory type,
Ptr
, is that we can associate finalizers to it that can be invoked
by Haskell’s storage manager, essentially cleaning it from all the
references.
After the importing shenanigans, we can proceed on to make foreign
imports to our code so that we can actually call the C++ code we just
wrote.
data Arith
foreign import ccall unsafe "arith_capi.h arith_new" c_arithNew :: IO (Ptr Arith)
foreign import ccall unsafe "arith_capi.h &arith_delete" c_arithDelete :: FunPtr (Ptr Arith -> IO ())
foreign import ccall unsafe "arith_capi.h arith_add" c_arithAdd :: Ptr Arith -> CInt -> CInt -> IO CInt
foreign import ccall unsafe "arith_capi.h arith_sub" c_arithSub :: Ptr Arith -> CInt -> CInt -> IO CInt
foreign import ccall unsafe "arith_capi.h arith_mult" c_arithMult :: Ptr Arith -> CInt -> CInt -> IO CInt
foreign import ccall unsafe "arith_capi.h arith_div" c_arithDiv :: Ptr Arith -> CInt -> CInt -> IO CInt
So what’s happening here:
Finally, you’re actually able to call these functions.
-- | Create a new foreign object that will be cleaned after it's not in use
-- anymore. It also uses mask_ in case the pointer leaks if an exception happens.
newArith :: IO (ForeignPtr Arith)
newArith = mask_ c_arithNew >>= newForeignPtr c_arithDelete
main :: IO ()
main = newArith >>= \arith -> withForeignPtr arith $ \ptr -> do
-- Foreign object is now unwrapped to a foreign pointer which you can use in
-- any FFI function you described above.
c_arithAdd ptr 1 1 >>= print
c_arithSub ptr 1 1 >>= print
c_arithMult ptr 2 2 >>= print
c_arithDiv ptr 2 2 >>= print
Now you should be able to run your program and interact with C++ in
safe manner from the comfortable world of Haskell.
$ cabal run
... cabal stuff ...
2
0
4
1
So you can see that while calling C++ from a foreign language is
definitely possible, it just requires a bit more ceremony than calling
C from these kinds of languages. I initially started to bang my head
against the wall with these FFI shenanigans just for the need to use
some C++ interfaces that didn’t offer C API, so hopefully, this proves
to be beneficial to some. At least, I can use this to freshen my
memory in the future since I can guarantee that I’ll forget about it.
I found this great
quote from
Richard P. Gabriel, Lisp hacker and a
poet, which summarizes nicely my feelings towards programming and how
I see it as a creative field more so than technical:
“Writing code certainly feels very similar to writing poetry. When
I’m writing poetry, it feels like the center of my thinking is in a
particular place, and when I’m writing code the center of my thinking
feels in the same kind of place. It’s the same kind of
concentration. So, I’m thinking up possibilities, I’m thinking about,
well, so how do I reinvent the code, gee, you know, what’s the
simplest way to do this.”
– Richard P. Gabriel
I’ve always found tremendous value in testing my software. Especially
what might be closest to home for developers - or at least should be -
are unit tests. While unit tests are not necessarily the best way of
making safe working code (this often requires a little bit more
exhaustive testing) but at least they’re very beneficial for your
future self and/or co-workers who might be working with your code
since with them you can quickly see any new errors that might’ve come
from regression.
That being said, often, writing unit tests can be quite cumbersome. I
would love to see some mature tooling for randomized testing like
QuickCheck in Haskell (and later some other languages
too) that would “just
work”, but often something like that just isn’t possible, especially
when the project reaches a certain degree of complexity. Tests and
test suites should be designed on their own as well as your code
itself. Unfortunately, people tend to forget this. In these kinds of
cases, quite simple table-driven test design can come to help!
I first stumbled upon table-driven test design when I was working with
Go, since in there, this seems to be a quite popular way of doing unit
tests, and at least, in my opinion, it works quite nicely!
Often while writing unit tests, you would want to write various
failing and passing test cases, which often leads to quite a bit of
duplication. For example:
TEST(TwoSumTests, PassingTest) {
std::vector<int> nums{2, 7, 11, 15};
auto got = twoSum(nums, 9);
std::vector<int> expected{0, 1};
EXPECT_EQ(got, expected);
}
TEST(TwoSumTests, FailingTest) {
std::vector<int> nums{2, 7, 11, 15};
auto got = twoSum(nums, 9);
std::vector<int> expected{0, 123};
EXPECT_NEQ(got, expected);
}
So even with this elementary example, we can see that most of the code
in the test case is duplicated and/or boilerplate. So we can do
better. For example, with quite a simple table for tests, we can loop
through multiple tests without duplication and easily add new tests.
Regarding testing functions, we often care about what is going in and
what should go out. Everything else in unit tests is often
boilerplate. So where table-driven design help in setting up these
input and expected outputs.
typedef struct {
std::vector<int> nums;
int target;
std::vector<int> expected;
} twoSumTestCase;
TEST(TwoSumTests, BasicAssertions) {
twoSumTestCase tests[] = {
{
std::vector<int>{2, 7, 11, 15},
9,
std::vector<int>{0, 1},
},
{
std::vector<int>{3, 2, 4},
6,
std::vector<int>{1, 2},
},
{
std::vector<int>{3, 3},
6,
std::vector<int>{0, 1},
},
};
for (auto t : tests) {
auto got = twoSum(t.nums, t.target);
EXPECT_EQ(got, t.expected);
}
}
So when we run this we can easily run all the tests at once:
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from TwoSumTests
[ RUN ] TwoSumTests.BasicAssertions
[ OK ] TwoSumTests.BasicAssertions (0 ms)
[----------] 1 test from TwoSumTests (0 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[ PASSED ] 1 test.
To demonstrate failing test case, let’s add new test there:
{
std::vector<int>{3, 3},
6,
std::vector<int>{0, 2},
},
We get the following output:
Expected equality of these values:
got
Which is: { 0, 1 }
t.expected
Which is: { 0, 2 }
[ FAILED ] TwoSumTests.BasicAssertions (0 ms)
[----------] 1 test from TwoSumTests (0 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] TwoSumTests.BasicAssertions
1 FAILED TEST
Extending test cases
Of course, with that information, test logs can be pretty misleading.
Thankfully, we can just change the table to our liking. For example,
we could add names to the tests:
typedef struct {
std::string name;
std::vector<int> nums;
int target;
std::vector<int> expected;
} twoSumTestCases;
That we could then use on diagnostic messages in GTest’s macros:
EXPECT_TRUE(false) << "diagnostic message"; // format to your liking
With this kind of formatting, we easily extend these test cases with
just playing around a little bit with your test struct, so it could
involve enumeration, subtests and much more. Which could help you
making your tests/code easier to fix, but also easier for adding new
useful and good tests.
Lately, I have dedicated a large part of my free time to audio
software. I have done this mainly out of interest in the subject due
to my history in music. But at the same time, I also thought writing
audio software could be a fun passion project or even a small business
that I could work on alongside my day job. I don’t see myself
replacing my current job with this, but maybe I could dedicate 20% of
my work time to it.
The world of audio software is a pretty exciting place. It involves a
lot of low-level systems like signals and real-time operations,
complex math at times, and something you can feel or at least hear.
And what’s great, I don’t have any background in this stuff!
Now I have programmed most of my life and played around with RTOS, but
when it comes to writing algorithms for manipulating digital signals,
that’s new stuff for me. However, I have experience with the topic
from the user point of view since I have been making music for almost
as long as I have programmed. This experience involves playing
instruments, how effects affect the sound, how mixing and mastering
works etc. But what do linear types have to do with any of this?
Signals in the Wild
Like I said earlier, signal processing (not necessarily just audio) is
very low-level stuff. So when working with signals in software, you
often need to work with C or C++. This is mainly due to the performant
and close-to-hardware nature of the languages required to handle and
manipulate signals optimally and efficiently.
Digital signal processing is also full of algorithms. The standard
workflow for people in this industry seems to be that these
applications are prototyped on some high-level language before being
produced. Often in languages/tools like MATLAB, Octave, Mathematica,
and similar, very heavily math-oriented languages and tools. Julia has
appeared to grow in popularity also in this world. These high-level
languages are mainly used due to the speed of development.
It is also not uncommon to see FPGA being used in these applications.
For several reasons, they are reconfigurable hardware, so you can
tailor and deploy computation units and data buses specifically
designed for your particular needs. So if you’re working with digital
hardware, you can’t go wrong with FPGAs. In this world, VHDL or
Verilog comes in handy.
As you can see, overall, the applications tend to involve a lot of
different low-level concepts but, at the same time, high-level topics
in terms of prototyping. But, as the post’s header might hint, I’m not
interested in the prototyping aspects of signal processing since I
think those are all well and good. Instead, I’m interested in having a
small thought experiment on whether the low-level elements could be
improved somehow.
I would consider myself a functional programmer first and foremost,
even though I mainly write imperative and/or object-oriented code, at
least professionally. Now in my free time and in non-trivial side
projects (that are not signal processing related), I like to work with
weird languages like Haskell or Common Lisp. Unfortunately, as I
mentioned above, almost all the work in this signal-processing world
is written in C or C++, emphasizing the latter. However, I completely
understand why these languages are used since we talk about real-time
programming, so latency needs to be minimized.
“Real-time” can be understood that the program has to produce the
correct result but also on a certain amount of time (which varies
between systems).
If we use audio processing as an example, typically, you would have
some sort of processing function in your code that would work in the
audio callback:
process :: BufferRef -> ()
This function would get its callback from a sound card or some input
device, e.g. microphone. After it has received its callback, this
block of code (whatever might be inside it) would write the
corresponding audio data into the given buffer. Which would then be
played by the speakers or vice-versa when recording.
This procedure is basically what should happen in real-time lots of
times when we are doing audio processing. Audio software is often set
up to send these audio callbacks from a high-priority “real-time”
thread with a very short latency between the callbacks, ~1-10ms
(varies between systems).
To achieve this minimal latency between callbacks, you often can’t
rely on garbage collection since you can’t be sure when your program
launches it. I dare to say that most of the software benefits from GC
significantly, but in the audio, making GC right is very hard. What
makes it hard is that if GC launches at the wrong time or the latency
between callbacks gets too large, garbage data will leak into the
buffer, causing unwanted sounds.
Most other software might only see a slight latency in their
computations if they do profiling, so that might not be the end of the
world, of course, depending on your context. But in audio, you cannot
let that happen since you can literally hear that glitch, which is
unforgivable.
When it comes to C or C++, I think everyone knows their foot guns that
involve memory management. Thankfully in modern C++, it’s not that bad
(as long as you follow core
guidelines),
but there is still a lot of unnecessary baggage when it comes to safe
code in these languages.
Could there be any way we could use garbage-collected language while
doing “real-time” operations and how that could be achieved?
Linear Types
GHC 9.0
introduced support for Linear Haskell, which can be enabled with
-XLinearTypes
. One of the significant use cases for linear types is
implementing latency-sensitive real-time services and analytics
jobs. As I mentioned earlier a major issue in this use case is GC
pauses, which can happen at arbitrary points for reasonably long
periods. The problem is exacerbated as the size of the working set
increases. The goal is to partially or entirely eliminate garbage
collection by controlling aliasing and making memory management
explicit but safe.
So what, then, are linear types. Henry Baker described linear types
and their benefits in his paper Lively Linear Lisp — ‘Look Ma, No
Garbage!’
and also on “Use-once” variables and linear objects: storage
management, reflection and
multi-threading. As you
can see, we are not talking about a new topic. We are talking about
types whose instances must be held in a linear variable. A variable is
linear if it’s accessed exactly once in its scope. Same for a linear
object, their reference count is always 1. When we have this safety
guarantee on the type-level, we can avoid synchronization and GC and
also, we could update linear objects in place since that would be
referentially transparent.
Avoiding Garbage Collection
So why can we avoid synchronization and GC with linear types? If we
would consider the following function as an example:
linearFunc :: a %1-> b
On their own, linear types only gives a type to functions that consume
their argument exactly once when their result is consumed precisely
once. So, alone, they don’t make your programs any faster or safer for
resources. Still, they allow you to write many optimizations and
resource-safe abstractions that weren’t possible before.
First, since linear values can only be used once, these values cannot
be shared. This means that, in principle, they shouldn’t be subject to
GC. But this depends on the value consumer since that may very well do
some sort of de-allocation on the spot. One way to mitigate this could
be to store these values to heap outside of GC’s control.
While utilizing heap for these values alone would diminish the GC, it
would introduce some overhead to your program, which could increase
the total running time of your application. But if we continue using
real-time systems as an example, this isn’t necessarily bad.
In real-time systems, optimizations often happen only to the
worst-case scenarios. This is because you don’t really care about your
latencies as long as they stay within the particular window. But you
watch that those latencies should never go above your maximum limit,
and this is primarily where optimisation utilizing linear types could
come in handy.
Practical Linear Types
Linear types are a blessing in GC languages if you intend to do
anything safely in the low-level world. I would like to continue this
post with some practical examples of how Haskell utilizes these types
and how they can make low-level optimizations and resources safer in
your Haskell code, but that deserves its own post.