Some time ago, I wrote a short post about my feelings towards web
analytics, sparked by a spike in visitors on my site (mainly coming
from Hacker News). Because of that surge, I decided to part ways
completely with any sort of tracking since, for me, it was mainly an
unnecessary dopamine fix rather than anything useful.
Today I stumbled upon big news about the legitimacy of web analytics
from a privacy point of view. Turns out, as most suspected, it’s not
so good, at least according to Austria’s data protection authority.
Basically, this case dates back to the invalidation of the Privacy
Shield data-sharing system between the EU and the US because of
overreaching US surveillance. Turns out that many companies in the US
have largely ignored this invalidation, which happened in 2020, and
have continued to transfer data from the EU to the US. The Austrian
DPA held that the use of Google Analytics by an Austrian website
provider led to transfers of personal data to Google LLC in the US in
violation of Chapter V of the GDPR.
Future of Google Analytics in EU
In the long run, there will be two options: either the US changes its
surveillance laws to strengthen its tech businesses, or US providers
will have to host the data of European users in Europe. This kind of
transcontinental transfer is currently (as of the time of writing
this) only illegal in Austria, but the Dutch DPA (data protection
authority) has stated that Google Analytics “may soon no longer be
allowed”.
In any case, this is a great thing for privacy in the EU, and
hopefully, many more countries will join Austria in this effort. You
can track which countries have followed suit at “Is Google Analytics
ILLEGAL in your country?”
I started to rekindle my, unfortunately, lost writing habit a couple
of weeks ago. I set up Google Analytics for this page mainly because
it makes it easy to see simple analytics. I was only interested in
the visitor count and possibly where my readers were coming from.
Google Analytics is a massive tool with massive amounts of data going
into it. I tried to restrict this collection as much as possible to
suit my personal blog’s needs.
Then my page rose to the front page of Hacker News, and it started to
get a lot of traction. Suddenly, thousands of readers came every day
to my pesky little page with just a few posts, and I watched the
visitor counts rise in my Google Analytics view. That got me thinking
about the ethics of this kind of tracking, which ended with me
deleting my account and data from it.
Discomfort With Tracking
Before I deleted my data and account from Google Analytics, I looked
for alternatives. I stumbled upon many other privacy-oriented and
GDPR-compliant analytics platforms, which at first seemed promising.
Also, having good alternatives to the ever-prevalent Google Analytics
is a great thing. But despite these features, they don’t remove the
uneasiness that mining your users’ data causes. Of course, we are talking
about spying here. Thankfully there are now some restrictions
regarding personally identifiable information (PII), at least in the
GDPR, limiting the shadiness quite a lot. But that brings new issues
in handling this kind of information since you need to be sure that
your software doesn’t leak this information. Thankfully, opting out
entirely from collecting PII in your software is an option.
I understand why people might want to add at least simplistic tracking
to their sites since it can provide helpful information about your
content, companies can see how users use their site, and the list goes
on. Especially when you combine Google Analytics, or a similar
analytics tool, with ads, companies can reap significant benefits from
this kind of tracking. But 9 out of 10 sites shouldn’t need this. You
could argue that most administrators use this tracking only for
dopamine fixes and don’t utilize the tracked data. And even if they do
use it somehow, how do they inform the user? I dare say that
information about data usage is almost always buried in some shallow
boilerplate text or not provided at all.
GDPR highlights mainly four things about data usage:
- It gives EU citizens the final say on how their data is used.
- If your company handles PII, there are tighter restrictions on
  handling it.
- Companies can store/use data only if the person consents to it.
- Users have rights to their data.
Consent is the crucial part here since many sites fall short on this
front. There has been a lot of discussion about what should be
considered consent. GDPR Art. 6(1)(f) also allows processing without
consent when “processing is necessary for the purposes of the
legitimate interests pursued by the controller or by a third party”.
Now, “legitimate interest” is a relatively vague concept, and quite a
few authorities in Germany, for example, consider that third-party
analytics do not fall under “legitimate interest”.
You can utilize consent management platforms to ensure the user’s
consent before dropping the tracking code on your page. But this then
raises the question of what can be considered consent.
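To make the mechanics concrete, here is a minimal Python sketch of gating the tracking code on recorded consent. It assumes a server-rendered page and a consent cookie; the cookie name `analytics_consent`, the value `granted`, and the snippet path are hypothetical names picked for illustration, not part of any particular consent management platform.

```python
from http.cookies import SimpleCookie

# Hypothetical names, used only for illustration.
CONSENT_COOKIE = "analytics_consent"
TRACKING_SNIPPET = '<script src="/analytics.js" defer></script>'


def has_analytics_consent(cookie_header: str) -> bool:
    """Return True only if the visitor explicitly opted in."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get(CONSENT_COOKIE)
    # A missing cookie or any value other than "granted" means no tracking.
    return morsel is not None and morsel.value == "granted"


def render_page(body: str, cookie_header: str) -> str:
    """Append the tracking snippet only when consent has been recorded."""
    if has_analytics_consent(cookie_header):
        return body + TRACKING_SNIPPET
    return body
```

The point of the sketch is the default: with no cookie at all, nothing is tracked, so the visitor has to opt in rather than opt out.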
Drew DeVault wrote a great post about web analytics and informed
consent.
Informed consent is a principle from healthcare, but it can still
offer significant elements to be utilized, especially in technology
and privacy. Drew split up the essential elements of informed consent
in tracking into these three points:
- Disclosure of the nature and purpose of the research and its
  implications (risks and benefits) for the participant and the
  confidentiality of the collected information.
- An adequate understanding of these facts on the part of the
  participant, requiring an accessible explanation in lay terms and an
  assessment of understanding.
- The participant must exercise voluntary agreement, without coercion
  or fear of repercussions (e.g. not being allowed to use your
  website).
Considering these essential elements of informed consent, we can agree
that most sites that track their users don’t follow these guidelines.
Thankfully, basic tracker blocking is already supported in many
browsers, which makes this issue slightly more bearable, and you can
also install external tools to do it. But still, this kind of approach
is pretty upside down: the user has to opt out of tracking instead of
opting in.
All Kinds of Cookies
Unfortunately, ad-tech companies have tried to make blocking trackers
harder and harder by constantly evolving cookies into evercookies,
supercookies, etc.
The way these have worked is that trackers have stored these
harder-to-detect and harder-to-delete cookies in different obscure
places in the browser, like Flash storage or HSTS flags. Evercookies
were a big thing in the early 2010s since many sites were using Flash
and Silverlight, and those were very exploitable. Today those
technologies aren’t used anymore, but that doesn’t mean the evolution
of cookies has stopped. Supercookies, on the other hand, work on the
network level of your service provider.
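The reason evercookies are so hard to delete can be shown with a toy model in Python. This is only a simulation under simplifying assumptions: each dict stands in for one storage location, and the store names and the `visitor_id` key are illustrative, not a real browser API (HSTS flags, for instance, do not literally hold key-value pairs).

```python
from typing import Dict, Optional

# Each inner dict stands in for one browser storage location.
Stores = Dict[str, Dict[str, str]]
KEY = "visitor_id"


def read_id(stores: Stores) -> Optional[str]:
    """Return the identifier from the first store that still holds it."""
    for store in stores.values():
        if KEY in store:
            return store[KEY]
    return None


def respawn(stores: Stores, new_id: str) -> str:
    """Reuse a surviving identifier if any, then write it to every store."""
    visitor_id = read_id(stores) or new_id
    for store in stores.values():
        store[KEY] = visitor_id
    return visitor_id


stores: Stores = {"http_cookie": {}, "flash_storage": {}, "hsts_flags": {}}
respawn(stores, "abc123")          # first visit: the ID is written everywhere
stores["http_cookie"].clear()      # the user deletes ordinary cookies
restored = respawn(stores, "zzz")  # next visit: the old ID is restored
```

As long as a single copy survives, the identifier is written back everywhere on the next visit, so clearing only the ordinary cookies achieves nothing.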
Thankfully, browsers have lately been able to start tackling these,
Firefox being one example. In a post about it, the Firefox team
discloses what they had to do to take some action against this, and it
is wild. They had to re-architect the whole connection handling in the
browser, which was originally designed to improve user experience by
reducing overhead, in order to eliminate these pesky cache-based
cookies.
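A rough way to picture the fix is cache partitioning: instead of one cache shared by every site, entries are keyed by the top-level site being visited. The Python sketch below is a simplified model of that idea, not Firefox’s actual implementation; the class names and the `tracker.example` URL are assumptions made for illustration.

```python
from typing import Dict, Tuple


class SharedCache:
    """One cache for all sites: a tracker can detect its own earlier writes."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}

    def put(self, top_level_site: str, url: str, body: bytes) -> None:
        # The visited site is ignored, so every site sees the same entry.
        self._entries[url] = body

    def has(self, top_level_site: str, url: str) -> bool:
        return url in self._entries


class PartitionedCache:
    """Entries are keyed by (top-level site, URL), so sites cannot share them."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], bytes] = {}

    def put(self, top_level_site: str, url: str, body: bytes) -> None:
        self._entries[(top_level_site, url)] = body

    def has(self, top_level_site: str, url: str) -> bool:
        return (top_level_site, url) in self._entries


TRACKER_URL = "https://tracker.example/id.png"  # hypothetical resource

shared, partitioned = SharedCache(), PartitionedCache()
for cache in (shared, partitioned):
    cache.put("news.example", TRACKER_URL, b"id=42")

leaks_shared = shared.has("shop.example", TRACKER_URL)
leaks_partitioned = partitioned.has("shop.example", TRACKER_URL)
```

With the shared cache, the entry written while visiting `news.example` is visible from `shop.example`, which is what lets a cached resource act as a cross-site identifier. With the partitioned cache it is not, at the cost of fetching the same resource once per site.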
Still, browser fingerprinting could be considered the evilest cookie
of them all. Browser fingerprinting collects every detail it can about
your system and combines them to identify you. Like some cookies, this
has real use cases, e.g., preventing fraud in financial institutions.
Still, in principle, this is just another intrusive way to track
people. Thankfully some modern browsers offer at least some ways to
avoid this, but not a full-fledged solution (other than disposable
systems).
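As a minimal sketch of the principle, the Python snippet below hashes a handful of browser attributes into one identifier. The attribute names and values are made up for illustration, and real fingerprinting scripts collect far more signals than this.

```python
import hashlib
import json
from typing import Mapping


def fingerprint(attributes: Mapping[str, str]) -> str:
    """Hash a set of browser attributes into a stable identifier."""
    # Sorting keys makes the result independent of attribute order.
    canonical = json.dumps(dict(attributes), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative attribute values, not taken from any real visitor.
visitor = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "screen": "2560x1440",
    "timezone": "Europe/Helsinki",
    "fonts": "DejaVu Sans,Noto Serif",
}

# Same attributes reported in a different order on a later visit.
same_visitor_later = dict(reversed(list(visitor.items())))
# A visitor who differs in a single attribute.
other_visitor = {**visitor, "screen": "1920x1080"}
```

Here `fingerprint(visitor)` and `fingerprint(same_visitor_later)` are identical, while `fingerprint(other_visitor)` differs. Nothing is stored on the visitor’s machine, which is why deleting cookies does not help against it.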
Future of Cookies
Lately, there has been some news about privacy-friendly substitutes
for cookies from tech giants.
Cookies have been a relatively significant privacy issue for decades,
and since the ad industry is so large, finding a replacement for them
has been hard. So only time will tell. We cannot get rid of cookies
entirely in the near future. They might change into something else,
maybe some kind of API utilizing machine learning to analyze user
data. Whether that is better or worse, I don’t know. So, cannot wait!
tin-foil hat tightens
Conclusion
So what is the conclusion here? Probably nothing. A small-time blogger
who had recently started writing got scared by the big numbers coming
into his site and all kinds of data being collected, which ended with
him stopping this kind of tracking, at least on his site. For most
users and sites, this kind of tracking is just a silly
monkey-get-banana dopamine fix. Don’t track unless you need to; if you
do, inform your users thoroughly.