The High Privacy Cost of a “Free” Website

kiyanwang | 281 points

Is the problem actually "free"?

I mean, if Disqus or Facebook had been offering these as paid services, would we expect them to be respectful of privacy? When we buy an Android phone, an Amazon Echo or whatnot... When we order stuff on Amazon, subscribe to Spotify, download a paid app or even purchase a car... do these come with the expectation of privacy?

Increasingly, data gathering is just built into everything, free, paid or foisted. I think the "if you aren't paying, you're the product" trope is trite at this point. We're "the product" regardless.

Framing this in terms of "the problem with free" is off, IMO. It doesn't point to a solution. A paid Facebook or Google is neither realistic nor likely to solve anything. All it does is point the finger in the wrong direction, as if we, the ignorant masses, are selling ourselves willingly.

Corporate (and state) espionage is simply the default. Avoiding it takes effort, and compromise. Part of that compromise may be financial, or it may not be.

netcan | 4 years ago

A lot of this comes about, not because of "free," but because of that ol' "solved problem" canard.

As I have stated before, if I had a dollar for every time I've been told "XXX is a solved problem," I'd be rich. Since this phrase is usually bundled into a package, denigrating my own choice to "roll my own," in some effort, I get just a wee bit peeved, when I hear it. I know that I can come across as a cranky bastard that don't trust anyone (possibly because I am), but that doesn't make me wrong. Sometimes, it does mean that I ignore some very good solutions, because I can't bring myself to trust them.

Finding a dependency that does most of what we want isn't hard. Vetting that dependency, especially with regard to embedded dependencies, is not as easy. This can be made more difficult by downstream dependencies burying ToS statements from upstream dependencies, requiring anyone who includes them to recurse the chain and study each ToS.

Nowadays, data is currency. Every application seems to be some form of PID miner. This is why I get a solitaire app that requires me to sign up for an account.

I think there's a lot of legitimately well-done, truly free stuff. A lot of it isn't popular or flashy. Maybe there's an opportunity for someone to create a dependency index that rates things like privacy of dependencies, and unwinds dependency hierarchies, sort of like GitHub does with security issues.

ChrisMarshallNY | 4 years ago

> One of the websites doing this, SunTrust Bank, sent the user name and password we entered to a third party, Jornaya, which says it encrypts and discards the data it collects.

Wow. Deep down the page is a nugget about SunTrust just giving away your username and password. Big reminder to use a unique password for every site.

PrettyPastry | 4 years ago

Creator of the spartapride.org website here! Crazy to see my name pop up in a Hacker News post!

This was actually a case of being too tech-savvy. When I set up spartapride I had all my tracker blockers on, and because of that Disqus was not loading (hence the 30 or so other trackers not loading either).

Needless to say, the nice people at The Markup contacted me and we had a nice interview/talk.

I allowed the Disqus tracker through, saw the other 30 being loaded, and was just like, "Yep.. that's not worth it," and I ended up removing Disqus from the site. :)

Lesson learned.

throwaway12757 | 4 years ago

I just ran Blacklight against my own site and got a perfect score.

My site costs me essentially nothing* to host (Netlify and AWS serverless technologies that are mostly under the free tier).

*My highest cost so far was when I was debugging serverless websockets and had a bug in my code that caused constant messages between the browser clients I was testing (which I left open for a day when I started work). That cost me $7.

I have my own service-hosted playground using little more than git and a few cli tools.

We need to rebuild the ad-free web.

ricksharp | 4 years ago

> website operators are often effectively as blind to exactly what information advertising companies and marketers are collecting from their website visitors—and what they’re doing with the data—as the people browsing the internet are

This is part of why I've taken to blocking things at the network level (with Pi-hole at the moment). Ignorance does not absolve them of blame, but I can't expect every site to know (or even care) about the issue, so I have to take measures myself (or decide not to care).

dspillett | 4 years ago

Perhaps there is an opportunity to create a way to self-host static websites via people's own smartphones. Bandwidth costs for most sites must be tiny.

The business models of website/page-builder platforms are based on gathering a long tail of small businesses, and then selling the audience attention via ad networks and data mining.

Looking at the ecosystem from a distance, it's creating financial value, but there are technical means to provide the same actual business value in cheaper and more privacy-preserving ways.

(By "actual business value" I mean that a person can view the menu at a local cafe in their area -- a valuable interaction for both parties -- without that lookup being intermediated by a platform and N different third parties where a kind of "shadow market value" is extracted)

jka | 4 years ago

> She said she only allowed three trackers on spartapride.org: cookies from Twitter and Facebook that accompany their “like” buttons on the site, and one from Disqus,

There's no need to get tracked just for the like/share buttons. These don't need any JS or third party cookies, just a specially formatted link to the "social" website.
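As a rough illustration of the plain-link approach, here is a minimal sketch that builds share links as ordinary anchors. The `share_links` helper and the example page URL are hypothetical; the Twitter and Facebook share endpoints are the publicly documented ones at the time of writing and may change, so treat them as assumptions to verify.

```python
from html import escape
from urllib.parse import urlencode


def share_links(page_url: str, title: str) -> dict[str, str]:
    """Build plain share URLs (hypothetical helper).

    Nothing is fetched from the social sites when the page renders;
    a request only happens if the reader clicks the link.
    """
    return {
        "twitter": "https://twitter.com/intent/tweet?"
        + urlencode({"url": page_url, "text": title}),
        "facebook": "https://www.facebook.com/sharer/sharer.php?"
        + urlencode({"u": page_url}),
    }


links = share_links("https://example.org/post", "Hello world")
for name, href in links.items():
    # Escape the URL for use inside an HTML attribute ("&" becomes "&amp;").
    print(f'<a href="{escape(href)}" rel="noopener noreferrer">Share on {name}</a>')
```

The output is static HTML, so it can be pasted into a template or generated at build time with no script tag or third-party cookie involved.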

megous | 4 years ago

We're building open-source privacy solutions for websites, and we also regularly scan the web for tracking technologies. As the article says, Google Analytics is by far the most popular technology we find, followed by Facebook tracking pixels, Adobe Tag Manager and other analytics services. That said, the amount of tracking on a site depends greatly on how it is funded. Websites that generate their revenue from advertisements tend to have 5-10 times more trackers installed than, e.g., e-commerce websites.

So right now it's pretty much a "Win-Lose" situation, as website publishers try to force privacy-invading trackers on users, who clearly have no interest in being tracked. Current consent management solutions are (IMHO) not a great answer to this problem: consent managers that use dark patterns (e.g. each tracker needs to be disabled by hand, decline buttons are hidden two or three layers deep, ...) are not compliant, and consent managers that give users a free choice have bad opt-in rates (which is not surprising). On our own site, for example, about 50% of users decline consent (live stats here: https://kiprotect.com/klaro/demo), based on consent requested via our own, user-friendly consent manager (it's open-source btw: https://github.com/kiprotect/klaro).

Another dilemma is that most publishers don't want to force invasive tracking on users, but they don't have much choice as the tracking/analytics market is highly concentrated and there aren't many privacy-friendly options that can deliver the required functionality and visibility.

ThePhysicist | 4 years ago

This is all a form of privacy arbitrage. Users are trading their privacy in return for access to content at no $$ cost. The problem is that the deal is not fully transparent to the end user. The amount of privacy they are actually trading is far higher than they imagine.

The Blacklight tool is great in that it gives a user visibility into the actual privacy cost of a given website. Whereas they maybe thought their general interest in a topic was the only thing traded, they can now see in how much detail the tracking goes. Far beyond what most likely anticipated.

nathanyz | 4 years ago

The title is quite misleading: the privacy problem is not related to whether the website is free or not.

p4bl0 | 4 years ago

I mean, yeah, free Disqus is bad. I don't need a "one-of-a-kind free public tool that can be used to inspect websites for potential privacy violations in real time" to tell me that. And that "one-of-a-kind" claim is just laughable; I can name like a dozen number-of-trackers-on-a-page projects.

The only mildly entertaining thing is that the website reporting on this isn't packed with trackers of its own (just BlueKai, which unsurprisingly isn't reported by its "one-of-a-kind" tracker-detecting tool).

input_sh | 4 years ago

This reminds me of a strange ad I got from Facebook back in the early days (2009 or so). I had dated a Korean girl for a brief period of time, and the moment we broke up (via Facebook), Facebook started spamming me with dating ads. All the dating ads had Asian women in them.

I met this girl in real life, so it's not like they tracked my web activity. I've always felt like they read your messages, look at the people you interact with, and then build a model of you.

offtop5 | 4 years ago

That title is super misleading. Hosting a free website does not have a high privacy cost. Using "free" third-party services to build your website potentially has a high privacy cost.

CivBase | 4 years ago

It blows my mind how much of the internet economy is built on sucking up user data. Some of it makes sense, but I'd imagine the majority of third-party tracker data is used for recommending ads, which a solid constituent of users ignore entirely. I have personally clicked on an ad in earnest maybe once or twice in my entire life. I know a lot of this data is poured into recommendation engines, but the ones I use seem to aggregate results mainly based on the product seller's contract with the site and anonymous user data. If you could observe the gross cost of aggregating all of my data, would the net of purchases swayed by that aggregation exceed the cost, once you exclude purchases I would have made anyway? I just don't understand the economics of "data is the new oil," because it seems like the number of profitable conclusions that can be drawn from mass aggregation is limited to a small handful of huge companies, and those selling data snake oil to political campaigns or whatnot.

ManBlanket | 4 years ago

Alright, that settles it. This weekend I'm removing all the third-party JS I have on my website (either dropping the functionality entirely or replacing it with OSS alternatives I'll manage myself, or that I can trust enough to pay for), starting with Disqus.

I had been thinking of getting rid of that for a while, but this is the last push I needed.

luord | 4 years ago

Jaron Lanier gave a good TED talk about the problem with free. I'm not sure it would have changed things, as greed always seems to win.

https://www.ted.com/talks/jaron_lanier_how_we_need_to_remake...

dangerboysteve | 4 years ago

Many of the comments, while meaning well, are way off the mark in my opinion. This isn't about coming up with another solution to make friendlier and less privacy-destructive alternatives: People want zero-friction and these companies give it to them.

These anti-privacy organizations (I include ALL ad companies and the like) do not give a fuck what you want... it's all about them and their customers.

The only solution is war. By this I mean, countering all their attacks by disabling all their weapons.

Ad blockers, tracking blockers, disabling javascript, bypassing paywalls, whatever it takes.

You are just $$$ to them and they will take from you whatever they can.

_Understated_ | 4 years ago

In 1998, we had big banner ads that were relevant to the content on the page, instead of the viewer's personal profile. They were generating revenue without invading anyone's privacy.

Companies buy ads that are not personalized all the time: on live TV, in stadiums, on billboards, on highways and so on.

So it is possible to have an ad-supported business without invading privacy. But invading privacy brings in a lot more revenue. So I don't agree with the argument that free services have to invade privacy.

The report also says that many paid services, like banks and others, were invading privacy too.

sally1620 | 4 years ago

They say "To investigate the pervasiveness of online tracking, The Markup spent 18 months building a one-of-a-kind free public tool that can be used to inspect websites for potential privacy violations in real time. Blacklight reveals the trackers loading on any site—including methods created to thwart privacy-protection tools or watch your every scroll and click."

But the EFF's Privacy Badger does exactly this. It's possible that I'm missing something, but how is Blacklight "one of a kind"?

not2b | 4 years ago

> To avoid giving website analytics market leader Google data about every visitor to his website, Butler said Protonmail built proprietary analytics software. Most websites can set up Google Analytics in an hour, he said, but ProtonMail’s system took years to build, cost half a million dollars in server hardware costs alone and requires a permanent full-time staff to continue to maintain it.

Sounds like an opportunity to sell privacy-respecting analytics software?

BlueTemplar | 4 years ago

Is there an npm package or a "component" you can drop into a website's source with some config that would allow you to block off anything that's not on the configured "allow-list"? Sounds like a "Privacy Badger" or "uBlock" for the web app itself could be a neat thing.
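One browser-native mechanism that comes close to this is the Content-Security-Policy response header, which tells the browser to refuse loads from any origin not listed. A minimal sketch of assembling such a header value follows; the `build_csp` helper and the `cdn.example.com` origin are hypothetical, and a real site would need to tune the directives to what it actually loads.

```python
def build_csp(allowed: dict[str, list[str]]) -> str:
    """Join directive -> allowed sources into a CSP header value (hypothetical helper)."""
    parts = []
    for directive, sources in allowed.items():
        parts.append(" ".join([directive, *sources]))
    return "; ".join(parts)


# Assumed allow-list: everything defaults to same-origin only,
# scripts may also come from one CDN, images may also be data: URIs.
policy = build_csp({
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://cdn.example.com"],
    "img-src": ["'self'", "data:"],
})

print("Content-Security-Policy:", policy)
```

The printed line is what the web server would send with each page. Note the limit of this approach: it blocks third-party origins outright, but a script that is allowed can still do whatever it wants within the policy.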

fataliss | 4 years ago

See the tracker feature of the latest Safari: it's hard to find any familiar site that isn't horrific. I know my employer's websites are atrocious with all the connections they make, but of course I can't do squat about it.

coldcode | 4 years ago

What I wish I could be surprised about is that people are still confused by this.

There ain’t no such thing as a free lunch. TANSTAAFL

These services don’t operate out of thin air - they have overhead that has to be paid for - one way or another.

EricE | 4 years ago

The issue with services like Disqus is that even without the advertising (which adds their trackers), Disqus itself is a tracker, the same way Google Analytics is.

You can see more details here: https://data.disqus.com/

One alternative to Disqus, Commento, suggests that Disqus isn't even GDPR compliant.

whatyesaid | 4 years ago

I run a website with no account or login. Is there any way to get 1-2 USD/CPM in revenue without using a privacy destroying ad network?

doctorfoo | 4 years ago

I'm reading Shoshana Zuboff's The Age of Surveillance Capitalism at the moment and the thing I realized is that I don't fully grasp what privacy is. I mean I know abstractly what it is, but I don't feel it strongly. Or perhaps what I mean is that it's so deep down, so complete a part of me that I don't recognize it as a distinct thing.

She tells a story about a small town in England that tried to stop the Google Street View car coming through and I thought to myself "what were they actually feeling"? Because I doubt I would have felt that as pertaining to my privacy and yet of course it is.

I think a lot of people like me find it hard to quantify what that thing is and where their boundaries for it are and as a result we are obviously wide open to having it exploited.

The Wikipedia opening line is interesting:

"Privacy is the ability of an individual or group to seclude themselves or information about themselves, and thereby express themselves selectively." [1]

Other dictionary definitions:

"someone's right to keep their personal matters and relationships secret" [2]

Now personally I think privacy != secrecy. It's more like obscurity or vagueness or indeterminacy. Like not being pinned down, rather than having any specific information hidden.

"A state in which one is not observed or disturbed by other people" [3]

I think that's a bit closer but it lacks the idea from Wikipedia of "being able to express themselves selectively"...

[1] https://en.wikipedia.org/wiki/Privacy

[2] https://dictionary.cambridge.org/dictionary/english/privacy

[3] https://www.lexico.com/definition/privacy

scandox | 4 years ago

neocities.org, an attempt to replace GeoCities, places no trackers or advertisements on your website. And it’s free.

TedDoesntTalk | 4 years ago

So... this is an ad for Blacklight?

josefresco | 4 years ago

A couple of months ago a big German publisher asked us to provide information on our service (regarding GDPR). I asked them who is our mutual client (advertiser), that is using our services on their network. The answer was: The tracker was added programmatically, so they cannot say who's behind it.

And recently, a couple of days ago, a bunch of German publishers started a concerted action. They are asking every known martech company (I guess from the official TCF vendor list) to give detailed information about tracking cookies, tracking URLs and so on.

It's pretty clear that publishers don't really have an idea what they are implementing on their sites when, e.g., adding the GA containers or selling their assets. The martech universe is huge and complex, consisting of hundreds and hundreds of different companies, using tracking technologies, piggy-backing pixels, transferring data between each other.

In the beginning there were three main stakeholders: Advertisers, Publishers and Users. When martech companies arrived, the number of participants increased tremendously. Just to improve the delivery of ads.

And GDPR? Either you're facing dark patterns or you have to choose between "ad supported" and "paid content". GDPR will not reduce the complexity of the system. The two best pieces of evidence:

The TCF. An industry standard from the industry for the industry. It's not designed to be understood by end users. The user is facing martech buzzwords, dozens of purposes and hundreds of vendors. How does that help?

And plugins that hide / skip consent banners automatically, because people hate them.

y42 | 4 years ago

Would steps like blocking third-party cookies, social media trackers, cross-site tracking cookies and fingerprinters help to thwart most of this tracking?

tsjq | 4 years ago

BTW what is a good free CMS now? I looked and WordPress/Joomla/Drupal were most popular. What year is it?

x87678r | 4 years ago
