Most of us have, just about, come to terms with the idea that what we do online is not a secret. That said, most people still expect a modicum of privacy, and they certainly don’t expect every keystroke they type to be logged by the websites they visit.
But, say researchers at Princeton University, this is exactly what is happening. Hundreds of the most popular websites are using “session replay scripts” that record every single thing a visitor does. The scripts are meant to show how visitors interact with a site so that page design can be improved, and the extensive data they collect is sent off to a third party for analysis.
What is particularly concerning is that these scripts record more than the information you deliberately give to a website. You would expect that when you fill in a form and send it off, that information will be used in some way. But session replay scripts record everything, including text you type out and then delete before hitting Submit.
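To make the difference between "submitted" and "recorded" concrete, here is a minimal simulation in Python. It is only a sketch: `FormField`, `HypotheticalRecorder` and `simulate_visit` are invented names for illustration, not code from any real session replay product, and a real script would run in the browser and transmit what it captures rather than keep it in a list.

```python
from dataclasses import dataclass, field


@dataclass
class FormField:
    """Stand-in for a text box on a web page (hypothetical, for illustration)."""
    value: str = ""
    listeners: list = field(default_factory=list)

    def set_value(self, new_value: str) -> None:
        # Every change notifies the listeners, much as a browser fires an
        # event each time the contents of a text box change.
        self.value = new_value
        for listener in self.listeners:
            listener(new_value)


@dataclass
class HypotheticalRecorder:
    """Assumed recorder: keeps every intermediate value it observes."""
    captured: list = field(default_factory=list)

    def on_input(self, value: str) -> None:
        # A real script would send this to a third-party server;
        # this sketch only stores it locally.
        self.captured.append(value)


def simulate_visit() -> tuple:
    box = FormField()
    recorder = HypotheticalRecorder()
    box.listeners.append(recorder.on_input)

    # The visitor types something, thinks better of it, and deletes it.
    for text in ["m", "my", "my secret", ""]:
        box.set_value(text)

    # Then they type what they actually intend to send.
    box.set_value("hello")

    submitted = box.value  # only this final value goes out via Submit
    return submitted, recorder.captured


submitted, captured = simulate_visit()
print("submitted:", submitted)
print("captured:", captured)
```

In this run the site receives only the final value, `hello`, while the recorder's log still contains `my secret`, the text that was typed and then deleted.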
The Princeton researchers are starting a new series of posts under the No Boundaries banner. The first post explores session replay scripts and exposes just how invasive they can be. Steven Englehardt writes:
Lately, more and more sites use “session replay” scripts. These scripts record your keystrokes, mouse movements, and scrolling behavior, along with the entire contents of the pages you visit, and send them to third-party servers. Unlike typical analytics services that provide aggregate statistics, these scripts are intended for the recording and playback of individual browsing sessions, as if someone is looking over your shoulder.
The stated purpose of this data collection includes gathering insights into how users interact with websites and discovering broken or confusing pages. However the extent of data collected by these services far exceeds user expectations; text typed into forms is collected before the user submits the form, and precise mouse movements are saved, all without any visual indication to the user. This data can’t reasonably be expected to be kept anonymous. In fact, some companies allow publishers to explicitly link recordings to a user’s real identity.
The researchers found that such scripts were used on 482 of the Alexa top 50,000 sites, and the suggestion that “this data can’t reasonably be expected to be kept anonymous” will set alarm bells ringing.
A video shared by the researchers shows just how much detail these session recording scripts can gather.
This recording happens without most people’s knowledge, and the fact that even data never explicitly shared with a site is captured is undoubtedly a cause for concern. Remember, everything that’s typed is recorded and shared with a third party, and that can include passwords. Plenty of big names make use of session replay scripts, among them Adobe, Microsoft and WordPress, and while the vast majority are gathering data with the best of intentions, that does not make the matter any less concerning. There is always the potential for that data to be intercepted or to fall into the wrong hands.
You can read through the researchers’ findings over on Freedom to Tinker.