The Web in the age of Surveillance Capitalism

Although early web standards forewarned of the privacy risk of technologies like cookies, they never envisioned that the Web Platform would be coopted for global-scale mass surveillance. In response, browser vendors have been working together to clamp down on the most egregious privacy abuses.

In this talk, Marcos will discuss breaking changes and new APIs that will help make the web platform more private and secure, and what these changes will mean for you as a developer and user going forward.



Marcos Caceres, Standards Engineer, Mozilla

So what is “surveillance capitalism”? As it became cheap to process and store data, personal data became commodified. The term was coined by Shoshana Zuboff in the title of her book The Age of Surveillance Capitalism, to describe this new method of profit-making. She also observed that personal information had become the most valuable resource.

https://www.goodreads.com/book/show/26195941-the-age-of-surveillance-capitalism

This data is captured through tracking – the collection and retention of data about a user as they use websites, which may be shared without their consent. Sometimes there are good reasons for tracking – measuring sales conversion for your business is not unreasonable, and tracking can help you improve the user experience of your product or website.

So why are companies tracking users? The data is useful for marketing, but also for less comfortable things like measuring social or political views. It gets even less comfortable when you see entire user sessions being recreated.

If you can link this data to one person across multiple sites and interactions, that data gets deeper and more valuable – particularly to people who want to sell you something; or sell the information itself.

It’s particularly problematic that there are hundreds of trackers trying to do this. Who are we talking about? The biggest are companies you would recognise like Google and Facebook.

In addition to collecting data, some tracking methods include surprisingly large forced downloads – imagine the impact of pushing 1.5 MB to someone in a developing nation on an expensive data connection.

Tracking techniques use lots of methods including cookies, URLs, ‘supercookies’, fingerprinting (to identify the user by their very specific device profile) and dual purpose apps.

Cookies are simple – key/value pairs. They do lots of useful things like maintaining state, keeping your session active, remembering your login on frequently-used systems. So there’s plenty to like about them, they’re not all bad.
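To make the "key/value pairs" point concrete, here is a minimal sketch of how a Cookie request header could be split into pairs. The function name and the sample header are assumptions for illustration, and real parsers handle more edge cases than this.

```typescript
// Parse a Cookie request header such as "session=abc123; theme=dark"
// into key/value pairs. Illustrative only; not a full RFC 6265 parser.
function parseCookieHeader(header: string): Map<string, string> {
  const cookies = new Map<string, string>();
  for (const part of header.split(";")) {
    const index = part.indexOf("=");
    if (index === -1) continue; // skip fragments with no "="
    const name = part.slice(0, index).trim();
    const value = part.slice(index + 1).trim();
    if (name) cookies.set(name, value);
  }
  return cookies;
}

const parsed = parseCookieHeader("session=abc123; theme=dark");
console.log(parsed.get("session")); // abc123
```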

Where things get muddy is when we do something like read a news website: the embedded ads are loaded in iframes (different origins), so third parties gain the ability to serve things to the same person across multiple sites. This is where you get the sensation that an ad is ‘following you’ or ‘knows things about you’.
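A rough simulation of that cross-site linking, from the ad server's point of view. The AdServer class, its methods and the site names are all hypothetical; the only point is that one cookie ID ends up attached to visits on unrelated sites.

```typescript
// Hypothetical ad server: every name here is illustrative, not a real tracker's API.
class AdServer {
  private nextId = 1;
  private sitesSeen = new Map<string, string[]>();

  // Called each time the ad iframe loads. cookieId is the value of the ad
  // server's own cookie if the browser already has one, otherwise undefined.
  serveAd(cookieId: string | undefined, embeddingSite: string): string {
    const id = cookieId !== undefined ? cookieId : `visitor-${this.nextId++}`;
    const sites = this.sitesSeen.get(id) || [];
    sites.push(embeddingSite);
    this.sitesSeen.set(id, sites);
    return id; // sent back to the browser as a cookie
  }

  // Everything the ad server has linked to one cookie ID.
  profile(id: string): string[] {
    return this.sitesSeen.get(id) || [];
  }
}

const adServer = new AdServer();
const cookieValue = adServer.serveAd(undefined, "news.example");
adServer.serveAd(cookieValue, "shop.example");
console.log(adServer.profile(cookieValue)); // lists news.example, then shop.example
```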

Supercookies come from the law of unintended consequences. They can’t be cleared and they persist in private browsing. They exploit browser features that make them resistant to erasure – it’s essentially a hack. The user does not control them and that makes them much more scary. They are really only used for dodgy purposes.

A supercookie example: an HSTS attack builds up a binary data store, exploiting a trick on each of a set of sites to set one bit on or off. Read back together, those bits form an identifier.
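The bit-per-site idea can be sketched as a simulation. This assumes the browser's HSTS state is modelled as a set of host names and that the tracker controls one host per bit; the host names and the 8-bit width are made up for illustration, and no real network requests are involved.

```typescript
// Simulated browser HSTS state: hosts the browser will always upgrade to HTTPS.
type HstsStore = Set<string>;

const ID_BITS = 8; // hypothetical identifier width
const hostForBit = (bit: number): string => `bit${bit}.tracker.example`; // hypothetical hosts

// "Write": loading a host over HTTPS lets it send an HSTS header,
// which pins that host in the browser's HSTS state (a 1 bit).
function writeId(store: HstsStore, id: number): void {
  for (let bit = 0; bit < ID_BITS; bit++) {
    if ((id >> bit) & 1) store.add(hostForBit(bit));
  }
}

// "Read": request every host over plain HTTP. A request that the browser
// upgrades to HTTPS reveals a pinned host, i.e. a 1 bit.
function readId(store: HstsStore): number {
  let id = 0;
  for (let bit = 0; bit < ID_BITS; bit++) {
    if (store.has(hostForBit(bit))) id |= 1 << bit;
  }
  return id;
}

const browserState: HstsStore = new Set<string>();
writeId(browserState, 178);
console.log(readId(browserState)); // 178
```

Because the state lives in the browser's security settings rather than in cookie storage, clearing cookies alone would not reset it in this model.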

Fingerprinting example: this exploits capabilities of the browser to create a unique, identifying profile of that browser.

It’s hard to defeat fingerprinting as the sheer number of data points is so high: the browser’s settings, the extensions installed, system fonts, viewport size, language, whether the Do Not Track header is enabled, device hardware, etc.
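As a sketch of how those data points combine into one identifier: the signal set below is a small hypothetical subset, and the FNV-1a style hash is chosen only because it is short, not because any particular fingerprinting script uses it.

```typescript
// Hypothetical subset of the signals a fingerprinting script might read.
interface BrowserSignals {
  userAgent: string;
  language: string;
  viewport: string;
  fonts: string[];
  doNotTrack: boolean;
}

// FNV-1a style 32-bit hash over UTF-16 code units, for illustration only.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Join the signals in a fixed order so the same browser always hashes the same way.
function fingerprint(signals: BrowserSignals): string {
  const canonical = [
    signals.userAgent,
    signals.language,
    signals.viewport,
    signals.fonts.slice().sort().join(","),
    String(signals.doNotTrack),
  ].join("|");
  return fnv1a(canonical);
}

const sample: BrowserSignals = {
  userAgent: "ExampleBrowser/1.0",
  language: "en-AU",
  viewport: "1440x900",
  fonts: ["Helvetica", "Arial"],
  doNotTrack: true,
};
console.log(fingerprint(sample)); // an 8-character hex string
```

Note that in this sketch even the Do Not Track setting feeds the hash, so a privacy preference becomes one more distinguishing signal.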

You can test your vulnerability using EFF’s Panopticlick (https://panopticlick.eff.org).

This obviously has big ramifications, with incidents like the Facebook/Cambridge Analytica scandal – where personal data was used in political campaigns. The data was harvested when people did Facebook quizzes, but it was also gathered from people who had not used those quizzes or given consent to share the data.

(Video of the BBC’s Katie Hile, talking about the personal information she found in the Cambridge Analytica data, and what that identified about her and her life. Ultimately she decided she had to take her information down, to essentially censor her online life.)

So how do we fix this? It’s hard and requires standards, education, governments, law enforcement and industry/NGO engagement. That is a lot of moving parts, and web developers play a role as well.

So what’s the role of educators? Mostly to teach people more about what’s going on; so people can make better choices.

(Video about Mozilla’s Lightbeam, designed to demonstrate the data being gathered; so people can understand what’s going on.)

An important aspect of industry activity is browser manufacturers engaging with this issue… with one key exception…

…Google has a vested interest in tracking, so Chrome does not include blocking options. So even though we love Chrome, if privacy is important it isn’t the best option.

(Demo of what Firefox blocks out of the box just visiting one website. Edge has similar controls built in.)

Where this gets tricky again is dual-purpose applications. Strict blocking will break sites like YouTube, which both provide useful content AND track the user. So there is always a balance to be found between privacy and functionality.

Firefox uses disconnect.me as the basis for tracker data; Safari uses an algorithmic approach; Edge will be doing its own version as well.
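A minimal sketch of the list-based approach, assuming the list is just a set of domain names and that a request matches if its host is a listed domain or a subdomain of one. The entries and the function name are hypothetical; this is not the actual disconnect.me format or Firefox's matching logic.

```typescript
// Hypothetical blocklist entries; a real tracker list is far larger.
const TRACKER_DOMAINS = new Set<string>(["tracker.example", "ads.example"]);

// Returns true when the request host is a listed domain or a subdomain of one.
function isTracker(requestUrl: string, blocklist: Set<string>): boolean {
  const labels = new URL(requestUrl).hostname.split(".");
  // Check the full host, then each parent domain, stopping before the bare last label.
  for (let i = 0; i < labels.length - 1; i++) {
    if (blocklist.has(labels.slice(i).join("."))) return true;
  }
  return false;
}

console.log(isTracker("https://cdn.tracker.example/pixel.gif", TRACKER_DOMAINS)); // true
console.log(isTracker("https://news.example/story", TRACKER_DOMAINS)); // false
```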

So what happens to you (or your code) if you are identified as a tracker? Browsers will partition your code, block your cookies (but tell you they were set), block storage, block sensitive APIs. The browser tries to close the gaps they’re sneaking in through, while smokescreening that fact. This can obviously lead to breakage if not applied perfectly.
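Partitioning can be pictured as keying storage by the top-level site as well as by the embedded origin. The class below is a simplified model under that assumption; the names are hypothetical and real browsers differ in the details.

```typescript
// Storage keyed by (top-level site, embedded origin) instead of embedded origin alone.
class PartitionedStorage {
  private partitions = new Map<string, Map<string, string>>();

  private bucket(topLevelSite: string, embeddedOrigin: string): Map<string, string> {
    const key = topLevelSite + " " + embeddedOrigin;
    let bucket = this.partitions.get(key);
    if (bucket === undefined) {
      bucket = new Map<string, string>();
      this.partitions.set(key, bucket);
    }
    return bucket;
  }

  setItem(topLevelSite: string, embeddedOrigin: string, name: string, value: string): void {
    this.bucket(topLevelSite, embeddedOrigin).set(name, value);
  }

  getItem(topLevelSite: string, embeddedOrigin: string, name: string): string | undefined {
    return this.bucket(topLevelSite, embeddedOrigin).get(name);
  }
}

const storage = new PartitionedStorage();
// The tracker stores an ID while embedded on a news site...
storage.setItem("news.example", "tracker.example", "uid", "abc123");
// ...but sees an empty partition when embedded on a different site.
console.log(storage.getItem("shop.example", "tracker.example", "uid")); // undefined
```

In this model the write appears to succeed from the tracker's point of view, yet the value is never visible outside the site where it was set.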

Other key industry initiatives include Let’s Encrypt, safe browsing list, secure DNS, disconnect.me and so on.

So what role do standards play? Sometimes standards have unintentional consequences, even when they are good attempts (like the failed Do Not Track)… and even when standards (like the cookie standard) include dire warnings about the risks, they can’t stop bad things happening!

But standards evolve and over time the list of questions and requirements gets better at heading off negative consequences. People who make standards are highly aware of all of this. The Payment Request API has lots of examples of privacy mechanisms built in (e.g. truncating postcodes).
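To give a feel for what truncating a postcode might mean, here is a hypothetical sketch. The rule used (keep the part before the first space, or the first three characters if there is no space) is an assumption for illustration and is not taken from the Payment Request specification or any browser.

```typescript
// Hypothetical truncation rule: reveal only the coarse part of a postcode,
// enough to estimate shipping but not enough to pinpoint an address.
function truncatePostcode(postcode: string): string {
  const trimmed = postcode.trim();
  const parts = trimmed.split(/\s+/);
  if (parts.length > 1) return parts[0]; // e.g. keep the first half of "SW1A 1AA"
  return trimmed.slice(0, 3); // no space: keep only the first three characters
}

console.log(truncatePostcode("SW1A 1AA")); // SW1A
console.log(truncatePostcode("90210")); // 902
```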

Of course many of these things will require user-disruptive dialogs in order to give control to the user. Again this walks a difficult line between privacy and usability.

So what about developers? We may not have even realised we were complicit in tracking – just by using Google Fonts or a CDN, we will have contributed to user tracking. These dependencies are hard to remove, but you can start with other choices like deciding if you really need to include social media widgets, Google Analytics and so on.

Ask yourself if you can work around including third party code – whether you need it at all, or you can write your own solutions instead. There’s a lot to think about!
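One way to start is simply listing which third-party hosts a page pulls resources from. The sketch below assumes you already have the page URL and a list of resource URLs (for example, copied from the browser's network panel); the function name is made up, and comparing host names is cruder than the "same site" comparison browsers actually use.

```typescript
// List the distinct hosts, other than the page's own, that resources load from.
function thirdPartyHosts(pageUrl: string, resourceUrls: string[]): string[] {
  const pageHost = new URL(pageUrl).hostname;
  const hosts = new Set<string>();
  for (const resource of resourceUrls) {
    // Relative URLs resolve against the page, so they count as first party.
    const host = new URL(resource, pageUrl).hostname;
    if (host !== pageHost) hosts.add(host);
  }
  return Array.from(hosts).sort();
}

console.log(
  thirdPartyHosts("https://myshop.example/", [
    "/css/site.css",
    "https://fonts.googleapis.com/css?family=Roboto",
    "https://www.google-analytics.com/analytics.js",
  ])
); // lists fonts.googleapis.com and www.google-analytics.com
```

Each host in that list is a candidate for self-hosting, replacing or dropping.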

@marcosc