The Internet and the Great Data Deletion Debate

Posted on August 15th, 2013 by



Can your data, once uploaded publicly onto the Web, ever realistically be forgotten?  This was the debate I was having with a friend from the IAPP last night.  Much has been said about the EU’s proposals for a ‘right to be forgotten’ but, rather than arguing points of law, we were simply debating whether it is even possible to purge all copies of an individual’s data from the Web.

The answer, I think, is both yes and no: yes, it’s technically possible, and no, it’s very unlikely ever to happen.  Here’s why:

1. To purge all copies of an individual’s data from the Web, either (a) you’d need to know where all copies of that data exist on the Web, or (b) the data would need some kind of built-in ‘self-destruct’ mechanism so that it purges itself after a set period of time.

2.  Solution (a) creates as many privacy issues as it solves.  You’d need either to create some kind of massive database tracking where all copies of data go on the Web, or each copy of the data would need, somehow, to be ‘linked’ directly or indirectly to all other copies.  Even assuming it were technically feasible, it would have a chilling effect on freedom of speech – consider how likely a whistleblower would be to post content knowing that every copy of that content could be traced back to its original source.  In fact, how would anyone feel about posting content to the Internet knowing that every single subsequent copy could easily be traced back to their original post and, ultimately, back to them?

3.  That leaves solution (b).  It is wholly possible to create files with built-in self-destruct mechanisms, but they would no longer be pure ‘data’ files.  Instead, they would be executable files – i.e. files that can be run as software on the systems on which they’re hosted.  But allowing executable data files to be imported and run on Web-connected IT systems creates huge security exposure – the potential for exploitation by viruses and malicious software would be enormous.  The other possibility would be that the data file contains a separate data field instructing the system on which it is hosted when to delete it – much like a cookie has an expiry date.  That would be fine for proprietary data formats on closed IT systems, but is unlikely to catch on across existing, well-established and standardised data formats like .jpgs, .mpgs etc. across the global Web.  So the prospects for solution (b) catching on also appear slim.
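
For the technically minded, here is a rough sketch of the two options in Python.  It is a toy illustration only, not a description of any real system: the names CopyRegistry, StoredFile, CooperatingHost and the expires_at field are all hypothetical, invented for this example.  The first class stands in for the tracking database of solution (a); the other two stand in for the cookie-style expiry field of solution (b).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


class CopyRegistry:
    """Solution (a): a hypothetical central record of where every copy lives."""

    def __init__(self) -> None:
        self._source: Dict[str, Optional[str]] = {}  # URL -> URL it was copied from
        self._poster: Dict[str, str] = {}            # original URL -> who posted it

    def register_original(self, url: str, poster: str) -> None:
        self._source[url] = None
        self._poster[url] = poster

    def register_copy(self, url: str, copied_from: str) -> None:
        if copied_from not in self._source:
            raise KeyError(f"unknown source: {copied_from}")
        self._source[url] = copied_from

    def origin_of(self, url: str) -> str:
        # Walk the chain of copies back to the original post.
        while self._source[url] is not None:
            url = self._source[url]
        return url

    def poster_of(self, url: str) -> str:
        # The privacy cost: any copy leads straight back to the poster.
        return self._poster[self.origin_of(url)]

    def copies_of(self, original: str) -> List[str]:
        # Everything that would have to be purged, including the original.
        return sorted(u for u in self._source if self.origin_of(u) == original)


@dataclass
class StoredFile:
    payload: bytes
    expires_at: Optional[datetime] = None  # hypothetical expiry field, like a cookie's


class CooperatingHost:
    """Solution (b): a host that chooses to honour the expiry field."""

    def __init__(self) -> None:
        self.files: Dict[str, StoredFile] = {}

    def purge_expired(self, now: datetime) -> List[str]:
        expired = [name for name, f in self.files.items()
                   if f.expires_at is not None and f.expires_at <= now]
        for name in expired:
            del self.files[name]
        return expired


if __name__ == "__main__":
    registry = CopyRegistry()
    registry.register_original("https://example.org/post/1", "whistleblower")
    registry.register_copy("https://mirror.example/1", "https://example.org/post/1")
    print(registry.poster_of("https://mirror.example/1"))

    now = datetime.now(timezone.utc)
    host = CooperatingHost()
    host.files["photo.jpg"] = StoredFile(b"...", expires_at=now - timedelta(days=1))
    print(host.purge_expired(now))
```

Both weaknesses show up even at this scale.  The registry only works if every copy is registered, and once it is, poster_of leads from any copy back to the person who posted the original.  The expiry field only works on hosts that run something like purge_expired; a system that copies the payload and ignores or drops the field keeps the data indefinitely.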

What are the consequences of this?  If we can’t purge copies of an individual’s data spread across the Internet, where does that leave us?  Likely the only realistic solution is to control the propagation of the data at source in the first place.  Achieving that requires a combination of:

(a)  Awareness and education – informing individuals through privacy statements and contextual notices how their data may be shared, and educating them not to upload content they (or others) wouldn’t want to share;

(b)  Product design – utilising privacy impact assessments and privacy by design methodologies to assess product / service intrusiveness at the outset and then designing systems that don’t allow illegitimate data propagation; and

(c)  Regulation and sanctions – we need proportionate regulation backed by appropriate sanctions to incentivise realistic protections and discourage illegitimate data trading.  

No one doubts that privacy on the Internet is a challenge, and nowhere does it become more challenging than with the speedy and uncontrolled copying of data.   But let’s not focus on how we stop data once it’s ‘out there’ – however hard we try, that’s likely to remain an unrealistic goal.  Let’s focus instead on source-based controls – this is achievable and, ultimately, will best protect individuals and their data.