Why are modern programming languages like this?

20th of April, 2022

For some weird reason I've always enjoyed the topic of performance and optimization tremendously, especially trying to understand why a compiler does various instruction-level optimizations for the hardware it targets. It's quite a trip to see years of expertise in hardware design show up in modern computing. But recently that got me wondering: is there really a point to all of that?

Now don't get me wrong, fewer instructions usually mean slightly faster computation, and that's a good thing, right? But considering modern hardware, is that necessary? Should we be concerned that our compilers work that hard to minimize the number of instructions in the compiled output? Of course that would make sense if we were still living in a world where computation was slow (I don't know, the 50s? The 70s?).

This kind of effort to minimize the number of instructions can easily lead to funky situations where familiar operations, like '+' or '<', start behaving unintuitively. And in these situations, if the program happens to behave incorrectly, it's often considered the programmer's fault.
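To make the '+' and '<' point concrete, here's a minimal C++ sketch of my own (the post doesn't name a specific case): signed integer overflow is undefined behaviour, so an optimizing compiler is allowed to fold the comparison below to true, even though the wrapped value at run time would say otherwise.

```cpp
#include <climits>
#include <iostream>

// Signed overflow is undefined behaviour in C++, so the compiler may
// assume x + 1 never overflows and fold this whole comparison to 'true'.
bool always_greater(int x) {
    return x + 1 > x;  // often compiled to 'return true;' at -O2
}

int main() {
    // With optimizations on, this typically prints 1 even for INT_MAX,
    // where two's-complement wrapping would make x + 1 equal to INT_MIN.
    std::cout << always_greater(INT_MAX) << '\n';
}
```

Whether this prints 1 or 0 depends entirely on what the optimizer chose to do, which is exactly the kind of unintuitive behavior that ends up being blamed on the programmer.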

On modern hardware, computation is more or less free, and we almost flirt with the idea of a concrete Turing machine with an infinite amount of memory. Shouldn't the fact that we mostly run on this kind of hardware be reflected in our programming languages as well? Especially when a single cache miss can easily cost more run time than hundreds of add instructions. If extra instructions don't increase the size of the data or the program itself, what's wrong with them? We could add quite a bit of run-time computation to a program without affecting its total running time all that much.
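As a rough illustration of that cache-miss claim (my own sketch, not a benchmark from this post), consider summing the same numbers from a std::list and from a std::vector: the arithmetic is identical, but the list pays for a potential cache miss on every node it chases.

```cpp
#include <iostream>
#include <list>
#include <numeric>
#include <vector>

// Both functions perform the same additions, but the list walks scattered
// heap nodes (pointer chasing, poor locality), while the vector walks
// contiguous memory that the cache and prefetcher handle well.
long sum_list(const std::list<long>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0L);   // pointer chasing
}

long sum_vector(const std::vector<long>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0L);   // sequential access
}

int main() {
    std::vector<long> v(1'000'000, 1);
    std::list<long> l(v.begin(), v.end());
    std::cout << sum_vector(v) + sum_list(l) << '\n';
}
```

On typical hardware the contiguous version tends to win by a wide margin, even though both loops execute roughly the same instructions.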

So instead of focusing on minimizing the instructions in a language's output, we could focus on improving the semantics of the language and pretty much completely remove these common, hard-to-find errors from our software. This matters especially in the many languages that have multiple features doing more or less the same thing, differing only slightly in performance.

When we have several of these features that work pretty much the same way as each other, languages easily end up with an excess of features. Using a large number of them in one code base can easily lead to complex, hard-to-understand programs. This in turn often means the features used in a code base are restricted, so that programmers on the project stick to a common subset of the language.

A great example of this in "modern" programming languages is C++'s regular vs. virtual functions. These kinds of features easily lead programmers to waste their precious time on micro-optimizations that, in the grand scheme of things, usually aren't worthwhile. And when we focus on these kinds of optimizations, we can easily lose sight of what really matters: the large-scale behavior of the program.
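For reference, here's a small hand-written example of that split (mine, not from the original post): the regular member function is resolved statically and can be inlined, while the virtual one goes through the vtable, and that single extra indirection is what the micro-optimization debates tend to revolve around.

```cpp
#include <iostream>

struct Shape {
    virtual ~Shape() = default;
    // Regular member function: resolved at compile time, easy to inline.
    double area_regular() const { return 1.0; }
    // Virtual member function: dispatched through the vtable at run time.
    virtual double area_virtual() const { return 1.0; }
};

struct Circle : Shape {
    double area_virtual() const override { return 3.14159; }
};

int main() {
    Circle c;
    const Shape& s = c;
    std::cout << s.area_regular() << ' '    // Shape::area_regular, static call
              << s.area_virtual() << '\n';  // Circle::area_virtual, dynamic dispatch
}
```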

Can we fix this at all? Probably not, since we are already so invested in these kinds of languages. We can point fingers at various places and blame them for the situation we're in. A new programming language doesn't really solve the issue, since we just can't rewrite everything in it, and the migration would be a really slow process. Can we fix existing languages? Probably not, which is why we rely on various external tools to analyze and check our programs, and on various conventions to follow, so that we can write the best code possible in these languages.

So modern computing is very exciting, but it can also be a mess…

If you have any questions or suggestions, write to topi at topikettunen dot com.

Tags: computers, programming

Now playing: Gillian Welch - The Way It Will Be