Posted on 6th of February 2022
| 600 wordsFirst, I would like to say I think Kubernetes is an excellent platform
for its intended purposes. It provides excellent fault tolerance all
over the cluster, a fast and easy way to run updates on your
deployments, great tools for managing services, volumes, metrics, and
more, each has its own lifecycle to manage. Also, implementing your
tooling by extending the Kubernetes’ API is a trivial task, so you can
easily leverage the great tooling to make your own for whatever you
might need.
Today it’s also effortless to spin up a Kubernetes cluster with
various installers and managed options. While complex, it’s still a
step closer to the idea of “just run my code and make it work”. Also,
with containers in the picture, we are pretty close to the magical
situation where we can run the same application similarly on the
laptop and in one cloud.
For me, issues start rising when we use Kubernetes for something other
than its intended purposes. While I don’t have any statistics on this,
I have a pretty strong gut feeling that most of the people running
Kubernetes are using it as a glorified scheduler for placing
containers on nodes as fast as possible. While it’s an excellent and
pretty easy tool to use for orchestrating containers. Its fundamental
purpose is to orchestrate anything crucial to your infrastructure,
like network, storage, and other dependencies.
Kubernetes allows the complete user freedom to run your infrastructure
as you see fit. Despite sounding like a cliche, this kind of freedom
can bear huge responsibilities. I would dare to say that most
developers and system administrators don’t want to make these
decisions. What if, at some point in the development, you wish to
change your networking interface or maybe some dynamic storage
provider? Can you even do such a thing in that stage of development if
the decision was made before you even had anything running in
Kubernetes?
Kelsey Hightower put it nicely a while back when he described
Kubernetes isn’t meant for being a developer platform but a framework
for creating platforms. So while it can work as a developer platform,
and overall it’s pretty easy to get started, kubectl run
and
kubectl expose
, and you’re good to go. That being said, all the API
designs in Kubernetes are created for clusters and how to manage
these. So while containers are part of this, much more must be
leveraged. So should application developers, startups or small
businesses use something like this? Probably not unless they are
developing a platform product.
When we get into cluster management, we need to start thinking about
managing the lifecycle of everything running inside the cluster.
Unfortunately, this is also when things start to get hard. For
example, what to do if something inside the cluster dies? What if I
need to provision something dynamically? Kubernetes is pretty good at
simplifying many of these topics, but all this complexity cannot be
simplified away due to the complexity of things happening behind the
scenes.
Kubernetes has a high entry threshold, and it’s a very complex
project, but still, way too often, I see it marketed as a simple
solution for many problems. While you can effortlessly use Kubernetes
and get lots of stuff done, eventually, you will hit a wall. Deploying
fault-tolerant distributed applications that are scalable against a
pool of machines with dependencies in networking, storage, and more
that’s a complicated problem.
Kubernetes is built for production workloads and running
infrastructure beyond your demo application. For this reason, the
complexity of Kubernetes is justified and should be approached with
that mindset.
Posted on 27th of January 2022
| 474 wordsSo Spotify seems to be a shit show at the moment, at least when
sharing misinformation on their platform. At the centre of it all
seems to be Joe Rogan. Spotify made a widely reported deal with Joe
Rogan for exclusive rights to his podcast in 2020. Weirdly enough,
Rogan has also lately been under heavy scrutiny for his idiotic
statements about COVID.
In January 2022, 270 medical experts submitted an open
letter
for Spotify to moderate this misinformation on their platform. Letter
was largely prompted because of a controversial physician who openly
promoted preposterous nonsense on Joe Rogan Experience (JRE).
A few days ago, Neil Young started a protest against
Spotify
giving them an ultimatum for choosing either Young’s music or JRE. And
well… neither Young nor 270 medical experts were enough to change
Spotify’s mind about the subject since they decided that they would
rather leave JRE on their platform rather than Young.
This is a weird point of view that Spotify has taken in this case
since they have already participated in self-regulation before by
removing harmful content from their platform. This kind of content has
involved music connected one way or another to white supremacy or
neo-nazi movements and much more (very much understandably
so). Spotify also joined many other streaming platforms by removing
another nutjob Alex Jones’ podcast InfoWars from their platform for
spreading misinformation. But guess misinformation from dear Joe Rogan
is not considered harmful by their standards.
Neil Young also wrote:
I sincerely hope that other artists and record companies will move
off the SPOTIFY platform and stop supporting SPOTIFY’s deadly
misinformation about COVID.
Unfortunately, while writing this, not many artists have joined him in
this endeavour. But this also brings a little unfortunate truth about
the situation, Neil Young can leave Spotify without noticing it too
much. Financially, the revenue he gets from Spotify would be a very
minor source of income, especially because last year, he sold 50% of
his publishing rights to the investment fund Hipgnosis, netting him
$150m.
Also, because his fan base is definitely - in general - on the older
end of the spectrum. Yours truly is probably on the younger end. So
most likely, a more significant portion of his revenue comes from
physical products and live shows (although the latter is probably true
for many artists).
Newer artists and bands can’t really take this kind of stand against
large streaming platform since so much of their audience accesses
their music via these means. In most cases, they would like to please
Spotify, hoping to get in one of their playlists and so promoting
their music to a vast audience.
In any case, if more big artists join Young, it definitely can impact
Spotify since these kinds of heavyweight artists can pressure these
big companies. But only time will tell.
Posted on 19th of January 2022
| 310 wordsSome time ago, I wrote a short post about my feelings towards web
analytics
which were sparked due to a spike in
visitors on my site (mainly coming from Hacker News). Due to that
surge, I decided to part ways completely from any sort of tracking
since, for me, it was mainly an unnecessary dopamine fix rather than
anything useful.
Today I stumbled upon big news on the front of the legitimacy of web
analytics from the point of view of privacy. Turns out, as most
suspected, it’s not so good, at least according to Austria’s data
protection
authority
.
Basically, this case dates back to the invalidation of Privacy Shield
data sharing system between the EU and the US because of overreaching
US surveillance. Turns out that many companies in the US have largely
ignored this invalidation happened in 2020, and despite this, they
have still continued to transfer data from the EU to the US. The
Austrian DPA held that the use of Google Analytics by an Austrian
website provider led to transfers of personal data to Google LLC in
the US in violation of Chapter V. of the
GDPR
.
Future of Google Analytics in EU
In the long run, there will be two options: Either the US changes its
surveillance laws to strengthen their tech businesses, or US providers
will have to host the data of European users in Europe. This kind of
transcontinental transfer is currently (as of the time of writing
this) only illegal Austria, but Dutch’s DPA (data protection
authority) has stated that Google Analytics “may soon no longer be
allowed”.
In any case, this is a great thing for privacy in the EU, and
hopefully, many more countries will join Austria in this effort. You
can follow what countries have started to follow this at Is Google
Analytics ILLEGAL in your
country?
Posted on 9th of January 2022
| 1425 wordsLately, I have dedicated a large part of my free time to audio
software. I have done this mainly out of interest in the subject due
to my history in music. But at the same time, I also thought writing
audio software could be a fun passion project or even a small business
that I could work on alongside my day job. I don’t see myself
replacing my current job with this, but maybe I could dedicate 20% of
my work time to it.
The world of audio software is a pretty exciting place. It involves a
lot of low-level systems like signals and real-time operations,
complex math at times, and something you can feel or at least hear.
And what’s great, I don’t have any background in this stuff!
Now I have programmed most of my life and played around with RTOS, but
when it comes to writing algorithms for manipulating digital signals,
that’s new stuff for me. However, I have experience with the topic
from the user point of view since I have been making music for almost
as long as I have programmed. This experience involves playing
instruments, how effects affect the sound, how mixing and mastering
works etc. But what do linear types have to do with any of this?
Signals in the Wild
Like I said earlier, signal processing (not necessarily just audio) is
very low-level stuff. So when working with signals in software, you
often need to work with C or C++. This is mainly due to the performant
and close-to-hardware nature of the languages required to handle and
manipulate signals optimally and efficiently.
Digital signal processing is also full of algorithms. The standard
workflow for people in this industry seems to be that these
applications are prototyped on some high-level language before being
produced. Often in languages/tools like MATLAB, Octave, Mathematica,
and similar, very heavily math-oriented languages and tools. Julia has
appeared to grow in popularity also in this world. These high-level
languages are mainly used due to the speed of development.
It is also not uncommon to see FPGA being used in these applications.
For several reasons, they are reconfigurable hardware, so you can
tailor and deploy computation units and data buses specifically
designed for your particular needs. So if you’re working with digital
hardware, you can’t go wrong with FPGAs. In this world, VHDL or
Verilog comes in handy.
As you can see, overall, the applications tend to involve a lot of
different low-level concepts but, at the same time, high-level topics
in terms of prototyping. But, as the post’s header might hint, I’m not
interested in the prototyping aspects of signal processing since I
think those are all well and good. Instead, I’m interested in having a
small thought experiment on whether the low-level elements could be
improved somehow.
I would consider myself a functional programmer first and foremost,
even though I mainly write imperative and/or object-oriented code, at
least professionally. Now in my free time and in non-trivial side
projects (that are not signal processing related), I like to work with
weird languages like Haskell or Common Lisp. Unfortunately, as I
mentioned above, almost all the work in this signal-processing world
is written in C or C++, emphasizing the latter. However, I completely
understand why these languages are used since we talk about real-time
programming, so latency needs to be minimized.
“Real-time” can be understood that the program has to produce the
correct result but also on a certain amount of time (which varies
between systems).
If we use audio processing as an example, typically, you would have
some sort of processing function in your code that would work in the
audio callback:
process :: BufferRef -> ()
This function would get its callback from a sound card or some input
device, e.g. microphone. After it has received its callback, this
block of code (whatever might be inside it) would write the
corresponding audio data into the given buffer. Which would then be
played by the speakers or vice-versa when recording.
This procedure is basically what should happen in real-time lots of
times when we are doing audio processing. Audio software is often set
up to send these audio callbacks from a high-priority “real-time”
thread with a very short latency between the callbacks, ~1-10ms
(varies between systems).
To achieve this minimal latency between callbacks, you often can’t
rely on garbage collection since you can’t be sure when your program
launches it. I dare to say that most of the software benefits from GC
significantly, but in the audio, making GC right is very hard. What
makes it hard is that if GC launches at the wrong time or the latency
between callbacks gets too large, garbage data will leak into the
buffer, causing unwanted sounds.
Most other software might only see a slight latency in their
computations if they do profiling, so that might not be the end of the
world, of course, depending on your context. But in audio, you cannot
let that happen since you can literally hear that glitch, which is
unforgivable.
When it comes to C or C++, I think everyone knows their foot guns that
involve memory management. Thankfully in modern C++, it’s not that bad
(as long as you follow core
guidelines,
)
but there is still a lot of unnecessary baggage when it comes to safe
code in these languages.
Could there be any way we could use garbage-collected language while
doing “real-time” operations and how that could be achieved?
Linear Types
GHC 9.0
introduced support for Linear Haskell, which can be enabled with
-XLinearTypes
. One of the significant use cases for linear types is
implementing latency-sensitive real-time services and analytics
jobs. As I mentioned earlier a major issue in this use case is GC
pauses, which can happen at arbitrary points for reasonably long
periods. The problem is exacerbated as the size of the working set
increases. The goal is to partially or entirely eliminate garbage
collection by controlling aliasing and making memory management
explicit but safe.
So what, then, are linear types. Henry Baker described linear types
and their benefits in his paper Lively Linear Lisp — ‘Look Ma, No
Garbage!’
and also on “Use-once” variables and linear objects: storage
management, reflection and
multi-threading
. As you
can see, we are not talking about a new topic. We are talking about
types whose instances must be held in a linear variable. A variable is
linear if it’s accessed exactly once in its scope. Same for a linear
object, their reference count is always 1. When we have this safety
guarantee on the type-level, we can avoid synchronization and GC and
also, we could update linear objects in place since that would be
referentially transparent.
Avoiding Garbage Collection
So why can we avoid synchronization and GC with linear types? If we
would consider the following function as an example:
linearFunc :: a %1-> b
On their own, linear types only gives a type to functions that consume
their argument exactly once when their result is consumed precisely
once. So, alone, they don’t make your programs any faster or safer for
resources. Still, they allow you to write many optimizations and
resource-safe abstractions that weren’t possible before.
First, since linear values can only be used once, these values cannot
be shared. This means that, in principle, they shouldn’t be subject to
GC. But this depends on the value consumer since that may very well do
some sort of de-allocation on the spot. One way to mitigate this could
be to store these values to heap outside of GC’s control.
While utilizing heap for these values alone would diminish the GC, it
would introduce some overhead to your program, which could increase
the total running time of your application. But if we continue using
real-time systems as an example, this isn’t necessarily bad.
In real-time systems, optimizations often happen only to the
worst-case scenarios. This is because you don’t really care about your
latencies as long as they stay within the particular window. But you
watch that those latencies should never go above your maximum limit,
and this is primarily where optimisation utilizing linear types could
come in handy.
Practical Linear Types
Linear types are a blessing in GC languages if you intend to do
anything safely in the low-level world. I would like to continue this
post with some practical examples of how Haskell utilizes these types
and how they can make low-level optimizations and resources safer in
your Haskell code, but that deserves its own post.