9th of January, 2022
Lately, I have dedicated a large part of my free time to audio software. I
have done this mainly out of interest in the subject due to my history in
music. But at the same time, I also thought writing audio software could be a
fun passion project or even a small business that I could work on alongside my
day job. I don't see myself replacing my current job with this, but maybe I
could dedicate 20% of my work time to it.
The world of audio software is a pretty exciting place. It involves a lot of
low-level systems stuff like signals and real-time operations, complex math at
times, and something that you can feel or at least hear. And what's great, I
don't have any background in this stuff!
Now I have programmed most of my life and played around with RTOS, but when it
comes to writing algorithms for manipulating digital signals, that's new stuff
for me. However, I have experience with the topic from the user point of view
since I have been making music for almost as long as I have programmed. This
experience involves playing instruments, how effects affect the sound, how
mixing and mastering works etc. But what do linear types have to do with any
of this?
Signals in the wild
Like I said earlier, signal processing (not necessarily just audio) is very
low-level stuff. So when it comes to working with signals in software, you
often need to work with C or C++. This is mainly due to the performant and
close to hardware nature of the languages required to handle and manipulate
signals optimally and efficiently.
Digital signal processing is also full of algorithms. Standard workflow for
people working in this industry seems to be that these applications are
prototyped on some high-level language before being produced. Often in
languages/tools like MATLAB, Octave, Mathematica, and similar very heavily
math-oriented languages and tools. Julia has appeared to grow in popularity
also in this world. These high-level languages are mainly used due to the
speed of development.
It is also not uncommon to see FPGA being used in these applications. For
several reasons: they are reconfigurable hardware, so you can tailor and
deploy on them computation units and data buses specifically designed to your
particular needs. So if you're working with digital hardware, you can't go
wrong with FPGAs. In this world, VHDL or Verilog comes in handy.
As you can see, overall, the applications tend to involve a lot of different
low-level concepts, but at the same time, high-level topics in terms of
prototyping. But, as the post's header might hint, I'm not interested in the
prototyping aspects of signal processing since I think those are all well and
good. Instead, I'm interested in having a small thought experiment on whether
the low-level elements could be improved somehow.
I would consider myself a functional programmer first and foremost, even
though I mainly write imperative and/or object-oriented code, at least
professionally. Now in my free time and in non-trivial side projects (that are
not signal processing related), I like to work with weird languages like
Haskell or Common Lisp. Unfortunately, as I mentioned above, almost all the
work in this signal processing world is written in C or C++, emphasizing the
latter. However, I completely understand why these languages are used since we
talk about real-time programming, so latency needs to be minimized.
"Real-time" can be understood that the program has to produce the correct
result but also on a certain amount of time (which varies between systems).
If we use audio processing as an example, typically, you would have some sort
of processing function in your code that would work in the audio callback:
process :: BufferRef -> ()
This function would get its callback from either a sound card or either some
input device, e.g. microphone. After it has received its callback, this block
of code (whatever might be inside it) would write the corresponding audio data
into the given buffer. Which would then be played at the speakers or
vice-versa when recording.
This procedure is basically what should happen in real-time lots of times when
we are doing audio processing. Audio software is often set up to send these
audio callbacks from a high priority "real-time" thread with a very short
latency between the callbacks, ~1-10ms (varies between systems).
To achieve this minimal latency between callbacks, you often can't rely on
stuff like garbage collection since you can't be sure when your program
launches it. I dare to say that most of the software benefits from GC
significantly, but in the audio making GC right is very hard. What makes it
hard is that if GC launches at the wrong time or the latency between callbacks
gets too large, garbage data will leak into the buffer, causing unwanted
sounds.
Most other software might only see a slight latency in their computations if
they do profiling, so that might not be the end of the world, of course,
depending on your context. But in audio, you cannot let that happen since you
can literally hear that glitch, which is unforgivable.
When it comes to C or C++, I think everyone knows their foot guns that involve
memory management. Thankfully in modern C++, it's not that bad (as long as you
follow core guidelines), but there is still a lot of unnecessary baggage when
it comes to safe code in these languages.
Could there be any way we could use garbage collected language while doing
"real-time" operations and how that could be achieved?
Linear types
GHC 9.0 introduced support for Linear Haskell, which can be enabled with
-XLinearTypes
. One of the significant use cases for linear types is
implementing latency-sensitive real-time services and analytics jobs. As I
mentioned earlier, a major issue in this use case is GC pauses, which can
happen at arbitrary points for reasonably long periods. The problem is
exacerbated as the size of the working set increases. The goal here is to
partially or entirely eliminate garbage collection by controlling aliasing and
making memory management explicit but safe.
So what then are linear types. Henry Baker described linear types and their
benefits in his paper Lively Linear Lisp — 'Look Ma, No Garbage!' and also on
"Use-once" variables and linear objects: storage management, reflection and
multi-threading. As you can see, we are not talking about a new
topic. Basically, we are talking about types whose instances must be held in a
linear variable. A variable is linear if it's accessed exactly once in its
scope. Same for a linear object, their reference count is always 1. When we
have this safety guarantee on type-level, we can avoid synchronization and GC
and also, we could update linear objects in place since that would be
referentially transparent.
Avoiding garbage collection
So why we can avoid synchronization and GC with linear types? If we would
consider the following function as an example:
linearFunc :: a %1-> b
On their own, linear types only gives a type to functions that consume their
argument exactly once when their result is consumed precisely once. So alone,
they don't make your programs any faster or more safe for resources. Still,
they allow you to write many different optimizations and resource-safe
abstractions that weren't possible before.
First, since linear values can only be used once, these values cannot be
shared. This means that, in principle, they shouldn't be a subject to GC. But
this is very dependant on the consumer of the values since that may very well
do some sort of de-allocation on the spot. One way to mitigate this could be
to store these values to heap outside of GC's control.
While utilizing heap for these values alone would diminish the GC, it would
introduce some overhead to your program, which could increase the total
running time of your application. But if we continue to use real-time systems
as an example, this isn't necessarily a bad thing.
In real-time systems, optimizations often happen only to the worst-case
scenarios. This is because you don't really care about your latencies as long
as they stay within the particular window. But you do care that those
latencies should never go above your maximum limit, and this is primarily
where optimizations utilizing linear types could come in handy.
Practical linear types
Linear types are a blessing in GC languages if you intend to do anything
safely in the low-level world. I would like to continue this post with some
practical examples of how Haskell utilizes these types and how they can make
low-level optimizations and resources safer in your Haskell code, but that
deserves their own post.