Archive for the ‘Data minimisation’ Category

The Internet and the Great Data Deletion Debate

Posted on August 15th, 2013



Can your data, once uploaded publicly onto the Web, ever realistically be forgotten?  This was the debate I was having with a friend from the IAPP last night.  Much has been said about the EU’s proposals for a ‘right to be forgotten’ but, rather than arguing points of law, we were simply debating whether it is even possible to purge all copies of an individual’s data from the Web.

The answer, I think, is both yes and no: yes, it’s technically possible, and no, it’s very unlikely ever to happen.  Here’s why:

1. To purge all copies of an individual’s data from the Web, you’d need either (a) to know where all copies of those data exist on the Web, or (b) the data to have some kind of built-in ‘self-destruct’ mechanism so that it knows to purge itself after a set period of time.

2.  Solution (a) creates as many privacy issues as it solves.  You’d need either to create some kind of massive database tracking where all copies of data go on the Web, or each copy of the data would need, somehow, to be ‘linked’ directly or indirectly to all other copies.  Even assuming it was technically feasible, it would have a chilling effect on freedom of speech – consider how likely a whistleblower would be to post content knowing that every copy of that content could be traced back to its original source.  In fact, how would anyone feel about posting content to the Internet knowing that every single subsequent copy could easily be traced back to their original post and, ultimately, back to them?

3.  That leaves solution (b).  It is wholly possible to create files with built-in self-destruct mechanisms, but they would no longer be pure ‘data’ files.  Instead, they would be executable files – i.e. files that can be run as software on the systems on which they’re hosted.  But allowing executable data files to be imported and run on Web-connected IT systems creates huge security exposure – the potential for exploitation by viruses and malicious software would be enormous.  The other possibility would be that the data file contains a separate data field instructing the system on which it is hosted when to delete it – much like a cookie has an expiry date.  That would be fine for proprietary data formats on closed IT systems, but is unlikely to catch on across existing, well-established and standardised data formats like .jpgs, .mpgs etc. across the global Web.  So the prospects for solution (b) catching on also appear slim.
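To make the second variant of solution (b) concrete, here is a minimal sketch of a cookie-style expiry field travelling alongside a payload. The wrapper format and function names are hypothetical, invented purely for illustration – no such standard exists, which is precisely the post’s point. Note that the record is plain data throughout: it only gets deleted if a cooperating host chooses to inspect the field.

```python
import json
import time
from typing import Optional

def wrap_with_expiry(payload: bytes, ttl_seconds: int) -> str:
    """Bundle a payload with an 'expires_at' field, much like a cookie's
    expiry date. The result is plain data, not executable content."""
    return json.dumps({
        "expires_at": time.time() + ttl_seconds,
        "payload": payload.decode("utf-8"),
    })

def purge_if_expired(wrapped: str) -> Optional[str]:
    """A cooperating host runs this periodically: returns the record while
    it is still live, or None once the host should delete the file."""
    record = json.loads(wrapped)
    if time.time() >= record["expires_at"]:
        return None
    return wrapped

# A record with an hour left to live survives the purge check.
record = wrap_with_expiry(b"profile photo bytes", ttl_seconds=3600)
assert purge_if_expired(record) is not None
```

The sketch also exposes the weakness the paragraph identifies: enforcement depends entirely on every host voluntarily running the purge check, which standardised formats and systems across the global Web do not do.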

What are the consequences of this?  If we can’t purge the copies of individuals’ data spread across the Internet, where does that leave us?  Likely the only realistic solution is to control the propagation of the data at source in the first place.  Achieving that is a combination of:

(a)  Awareness and education – informing individuals through privacy statements and contextual notices how their data may be shared, and educating them not to upload content they (or others) wouldn’t want to share;

(b)  Product design – utilising privacy impact assessments and privacy by design methodologies to assess product / service intrusiveness at the outset and then designing systems that don’t allow illegitimate data propagation; and

(c)  Regulation and sanctions – we need proportionate regulation backed by appropriate sanctions to incentivise realistic protections and discourage illegitimate data trading.  

No one doubts that privacy on the Internet is a challenge, and nowhere does it become more challenging than with the speedy and uncontrolled copying of data.   But let’s not focus on how we stop data once it’s ‘out there’ – however hard we try, that’s likely to remain an unrealistic goal.  Let’s focus instead on source-based controls – this is achievable and, ultimately, will best protect individuals and their data.

Global protection through mutual recognition

Posted on July 23rd, 2013



At present, there is a visible mismatch between the globalisation of data and the multinational approach to privacy regulation. Data is global by nature as, regulatory limits aside, it runs unconstrained through wired and wireless networks across countries and continents. Put in a more poetic way, a digital torrent of information flows freely in all possible directions every second of the day without regard for borders, geographical distance or indeed legal regimes and cultures. Data legislation on the other hand is typically attached to a particular jurisdiction – normally a country, sometimes a specific territory within a country and occasionally a selected group of countries. As a result, today, there is no such thing as a single global data protection law that follows the data as it makes its way around the world.

However, there is light at the end of the tunnel. Despite the current trend of new laws in different shapes and flavours emerging from all corners of the planet, there is still a tendency amongst legislators to rely on a principles-based approach, even if that translates into extremely prescriptive obligations in some cases – such as Spain’s applicable data security measures depending on the category of data or Germany’s rules to include certain language in contracts for data processing services. Whether it is lack of imagination or testimony to the sharp brains behind the original attempts to regulate privacy, it is possible to spot a common pedigree in most laws, which is even more visible in the case of any international attempts to frame privacy rules.

When analysed in practice and through the filter of distant geographical locations and moments in time, it is definitely possible to appreciate the similarities in the way privacy principles have been implemented by fairly diverse regulatory frameworks. Take ‘openness’ in the context of transparency, for example. The words may be slightly different and in the EU directive, it may not be expressly named as a principle, but it is consistently everywhere – from the 1980 OECD Guidelines to Safe Harbor and the APEC Privacy Framework. The same applies to the idea of data being collected for specified purposes, being accurate, complete and up to date, and people having access to their own data. Seeing the similarities or the differences between all of these international instruments is a matter of mindset. If one looks at the words, they are not exactly the same. If one looks at the intention, it does not take much effort to see how they all relate.

Being a lawyer, I am well aware of the importance of each and every word and its correct interpretation, so this is not an attempt to brush away the nuances of each regime. But in the context of something like data and the protection of all individuals throughout the world to whom the data relates, achieving some global consistency is vital. The most obvious approach to resolving the data globalisation conundrum would be to identify and put in place a set of global standards that apply on a worldwide basis. That is exactly what a number of privacy regulators backed by a few influential thinkers tried to do with the Madrid Resolution on International Standards on the Protection of Personal Data and Privacy of 2009. Unfortunately, the Madrid Resolution never became a truly influential framework. Perhaps it was a little too European. Perhaps the regulators ran out of steam to press on with the document. Perhaps the right policy makers and stakeholders were not involved. Whatever it was, the reality is that today there is no recognised set of global standards that can be referred to as the one to follow.

So until businesses, politicians and regulators manage to crack a truly viable set of global privacy standards, there is still an urgent need to address the privacy issues raised by data globalisation. As always, the answer is dialogue. Dialogue and a sense of common purpose. The USA and the EU in particular have some important work to do in the context of their trade discussions and review of Safe Harbor. First they must both acknowledge the differences and recognise that an area like privacy is full of historical connotations and fears. But most important of all, they must accept that principles-based frameworks can deliver a universal baseline of privacy protection. This means that efforts must be made by all involved to see what Safe Harbor and EU privacy law have in common – not what they lack. It is through those efforts that we will be able to create an environment of mutual recognition of approaches and ultimately, a global mechanism for protecting personal information.

This article was first published in Data Protection Law & Policy in July 2013.

The familiar perils of the mobile ecosystem

Posted on March 18th, 2013



I had not heard the word ‘ecosystem’ since school biology lessons.  But all of a sudden, someone at a networking event dropped the ‘e’ word and these days, no discussion about mobile communications takes place without the word ‘ecosystem’ being uttered in almost every sentence.   An ecosystem is normally defined as a community of living things helping each other out (some more willingly than others) in a relatively contained environment.  The point of an ecosystem is that completely different organisms – each with different purposes and priorities – are able to co-exist in a more or less harmonious but eclectic way.  The parallel between that description and what is happening in the mobile space is evident.  Mobile communications have evolved around us, taking on a life of their own, separate from traditional desktop-based computing and web browsing.  Through the interaction of very different players, our experience of communications on the go via smart devices has become an intrinsic part of our everyday lives.

Mobile apps in particular have penetrated our devices and lifestyles in the most natural of ways.  Studies suggest that the average smartphone user downloads 37 apps.  The fact that the term ‘app’ was listed as Word of the Year in 2010 by the American Dialect Society is quite telling.  Originally conceived to provide practical functions like calendars, calculators and ring tones, mobile apps bring us anything that can be digitised and has a role to play in our lives.  In other words, our use of technology has never been as close and personal.  Our mobile devices are an extension of ourselves and mobile apps are an accurate tool to record our every move (in some cases, literally!).  As a result, the way in which we use mobile devices tells a very accurate story of who we are, what we do and what we are about.  Conspiracy theories aside, it is a fact that smartphones are the perfect surveillance device and most of us don’t even know it!

Policy makers and regulators throughout the world are quickly becoming very sensitive to the privacy risks of mobile apps.  Enforcement is the loudest mechanism to show that nervousness but the proliferation of guidance around compliance with the law in relation to the development, provision and operation of apps has been a clear sign of the level of concern.  Regulators in Canada, the USA and more recently in Europe have voiced sombre concerns about such risks.  The close and intimate relationship between the (almost always on) devices and their users is widely seen as an aggravating factor of the potential for snooping, data collection and profiling.  Canadian regulators are particularly concerned about the seeming lightning speed of the app development cycle and the ability to reach hundreds of thousands of users within a very short period of time.  Another generally shared concern is the fragmentation between the many players in the mobile ecosystem – telcos, handset manufacturers, operating system providers, app stores, app developers, app operators and of course anybody else who wants a piece of the rich mobile cake – and the complexity that this adds to it.

All of that appears to compromise undisputed traditional principles of privacy and data protection: transparency, individuals’ control over their data and purpose limitation.  It is easy to see why that is the case.  How can we even attempt to understand – let alone control – all of the ways in which the information generated by our non-stop use of apps may potentially be used when all such uses are not yet known, the communication device is undersized and our eagerness to start using the app acts as a blindfold?  No matter how well intended the regulators’ guidance may be, it is always going to be a tall order to follow, particularly when the expectations of those regulators in terms of the quality of the notice and consent are understandably high.  In addition, the bulk of the guidance has been targeted at app developers, a key but in many cases insignificant player in the whole ecosystem.  Why is the enthusiastic but humble app developer the focus of the compliance guidelines when some of the other parties – led by the operator of the app, which is probably the most visible party to the user – play a much greater role in determining which data will be used and by whom?

Thanks to their ubiquity, physical proximity to the user and personal nature, mobile communications and apps pose a massive regulatory challenge to those who make and interpret privacy rules, and an even harder compliance conundrum to those who have to observe them.  That is obviously not a reason to give up and efforts must be made by anyone who plays a part to contribute to the solution.  People are entitled to use mobile technology in a private, productive and safe way.  But we must acknowledge that this new ecosystem is so complex that granting people full control of the data generated by such use is unlikely to be viable.  As with any other rapidly evolving technology, the privacy perils are genuine but attention must be given to all players and, more importantly, to any mechanisms that allow us to distinguish between legitimate and inappropriate uses of data.  Compliance with data protection in relation to apps should be about giving people what they want whilst avoiding what they would not want.

This article was first published in Data Protection Law & Policy in March 2013.

What to do when you can’t delete data?

Posted on October 2nd, 2012



How many lawyers have written terms into data processing contracts along the following lines:  “Upon termination or expiry of this Agreement, the data processor shall delete any and all copies of the Personal Data in its possession or control“?

It’s a classic example of a legal clause that’s ever so easy to draft but, in this day and age, almost impossible to implement in practice.  In most data processing ecosystems, the reality is that there seldom exists just a single copy of our data; instead, our data is distributed, backed up and archived across multiple systems, drives and tapes, and often across different geographic locations.  Far from being a bad thing, data distribution, archival and back-up better preserves the availability and integrity of our records.  But the quid pro quo of greater data resilience is that commitments to comprehensively wipe every last trace of our data are simply unrealistic and unachievable.

Nevertheless, once data has fulfilled its purpose, deletion is seemingly what the law requires.  The fifth principle of the Data Protection Act 1998 (implementing Article 6(e) of Directive 95/46/EC) says that: “Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or those purposes.“  So how to reconcile this black and white approach to data deletion with the reality of modern day data processing systems?

Thankfully, the ICO has the answer, which it provides in a recently-published guidance note on “Deleting personal data” (available here).  The ICO starts off by acknowledging the difficulties outlined above, commenting that “In the days of paper records it was relatively easy to say whether information had been deleted or not, for example through incineration. The situation can be less certain with electronic storage, where information that has been ‘deleted’ may still exist, in some form or another, within an organisation’s systems.”

The sensible answer it arrives at is to say that, if data cannot be deleted for technical or other reasons, then it should instead be put ‘beyond use’.   Putting data ‘beyond use’ has four components, namely:

  1. ensuring that the organisation will not and cannot use the personal data to inform any decision in respect of any individual or in a manner that affects the underlying individuals in any way;
  2. not giving any other organisation access to the personal data;
  3. at all times protecting the personal data with appropriate technical and organisational security; and
  4. committing to delete the personal data if or when this becomes possible.

Broadly speaking, you can condense the four components above into: “Delete it if you can and, if you can’t, make sure it’s stored securely and don’t let anyone use it”. Which is, of course, entirely sensible advice.
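The four ‘beyond use’ components lend themselves to a simple sketch. The record structure and method names below are my own invention, not anything taken from the ICO guidance: the essential point is that beyond-use records are excluded from every query (so they can never inform a decision or be disclosed), and are physically deleted as soon as that becomes possible.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Record:
    subject_id: str
    data: dict
    beyond_use: bool = False  # set when deletion is not yet technically possible

class DataStore:
    def __init__(self) -> None:
        self._records: List[Record] = []

    def add(self, record: Record) -> None:
        self._records.append(record)

    def query(self, subject_id: str) -> List[Record]:
        # Components 1 and 2: beyond-use data never informs any decision
        # and is never disclosed -- it simply never leaves storage.
        return [r for r in self._records
                if r.subject_id == subject_id and not r.beyond_use]

    def put_beyond_use(self, subject_id: str) -> None:
        # Component 3 (appropriate security) is assumed to apply to the
        # underlying storage; here we only flag the records as inert.
        for r in self._records:
            if r.subject_id == subject_id:
                r.beyond_use = True

    def purge_when_possible(self) -> None:
        # Component 4: delete for real once, say, the backup cycle allows it.
        self._records = [r for r in self._records if not r.beyond_use]
```

On this model a subject access request is served by `query`, which is exactly why beyond-use data is treated as if deleted: the search simply never surfaces it.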

It does raise one interesting problem though:  what to do when the individual data subject requests access to his or her data that has been put beyond use?  Here, the ICO again takes a business-friendly view saying simply that “We will not require data controllers to grant individuals subject access to the personal data provided that all four safeguards above are in place.“  In other words, the business does not need to instigate extensive (and expensive) searches of records that have been put beyond use just because an individual requests access to his or her data – for the purposes of subject access, this inert data is treated as if it had been deleted.

But the ICO does issue a warning: “It is bad practice to give a user the impression that a deletion is absolute, when in fact it is not.” So the message to take away is this: make sure you do not commit yourself to data deletion standards that you know, in all likelihood, you can’t and won’t meet.   And, by the same token, don’t let your lawyers commit you to these either!

Why the Big Buzz about Big Data?

Posted on June 29th, 2012



Another year, another buzz word, and this time around it’s “Big Data” that’s getting everyone’s attention. But what exactly is Big Data, and why is everyone – commercial organisations, regulators and lawyers – so excited about it?

Put simply, the term Big Data refers to datasets that are very, very large – so large that, traditionally, supercomputers would ordinarily have been required to process them. But, with the irrepressible evolution of technology, falling computing costs, and scalable, distributed data processing models (think cloud computing) Big Data processing is increasingly within the capability of most commercial and research organisations.

In its oft-quoted article “The Data Deluge”, the Economist reports that “Everywhere you look, the quantity of information in the world is soaring. According to one estimate, mankind created 150 exabytes (billion gigabytes) of data in 2005. [In 2010], it will create 1,200 exabytes.“  Let’s put that in perspective – 1,200 exabytes is 1,200,000,000,000 gigabytes of data. A typical Blu-Ray disc can hold 25 gigabytes – so 1,200 exabytes is the equivalent of about 48 billion Blu-Ray discs. Estimating your typical Blu-Ray movie at about 2 hours long (excluding special features and the like), then there’s at least 96 billion hours of viewing time there, or about 146,000 human lifetimes.  OK, this is a slightly fatuous example, but you get my point – and bear in mind that global data is growing year-on-year at an exponential rate so these figures are already well out of date.

Much of this Big Data will be highly personal to us: think about the value of the data we all put “out there” when we shop online or post status updates, photos and other content through our various social networking accounts (I have at least 5). And don’t forget the search terms we post when we use our favourite search engines, or the data we generate when using mobile – particularly location-enabled – services. Imagine how organisations, if they had access to all this information, could use it to better advertise their products and services, roadmap product development to take account of shifting consumer patterns, spot and respond to potentially-brand damaging viral complaints – ultimately, keep their customers happier and improve their revenues.

The potential benefits of Big Data are vast and, as yet, still largely unrealised. It goes against the grain of any privacy professional to admit that there are societal advantages to data maximisation, but it would be disingenuous to deny this. Peter Fleischer, Google’s Privacy Counsel, expressed it very eloquently on his blog when he wrote “I’m sure that more and more data will be shared and published, sometimes openly to the Web, and sometimes privately to a community of friends or family. But the trend is clear. Most of the sharing will be utterly boring: nope, I don’t care what you had for breakfast today. But what is boring individually can be fascinating in crowd-sourcing terms, as big data analysis discovers ever more insights into human nature, health, and economics from mountains of seemingly banal data bits. We already know that some data sets hold vast information, but we’ve barely begun to know how to read them yet, like genomes. Data holds massive knowledge and value, even, perhaps especially, when we do not yet know how to read it. Maybe it’s a mistake to try to minimize data generation and retention. Maybe the privacy community’s shibboleth of data deletion is a crime against science, in ways that we don’t even understand yet.” (You can access Peter’s blog “Privacy…?” here.)

This quote raises the interesting question of whether the compilation and analysis of Big Data sets should really be considered personal data processing. Of course, many of the individual records within commercial Big Data sets will be personal – but the true value of Big Data processing is often (though not always) in the aggregate trends and patterns they reveal – less about predicting any one individual’s behaviours, reactions and preferences, and more about understanding the global picture. Perhaps it’s time that we stop thinking of privacy in terms of merely collecting data, and look more to the intrusiveness (or otherwise) of the purposes to which our data are put?

This is perhaps something for a wider, philosophical debate about the pros and cons of Big Data, and I wouldn’t claim to have the answers. What I can say, though, is that Big Data faces some big issues under data protection law as it stands today, not least in terms of data protection principles that mandate user notice and choice, purpose limitation, data minimisation, data retention and – of course – data exports. These are not issues that will go away under the new General Data Protection Regulation which, as if to gear itself up for a fight with Big Data proponents, further bolsters transparency, consent and data minimisation principles, while also proposing a new, highly controversial ‘right to be forgotten’.

So what can and should Big Data collectors do for now? Fundamentally, accountability for the data you collect and process will be key. Your data subjects need to understand how their data will be used, both at the individual and the Big Data level, to feel in control of this and to be comforted that their data won’t be used in ways that sit outside their reasonable expectations of privacy. This is not just a matter of external facing privacy policies, but also a matter of carefully-constructed internal policies that impose sensible checks and balances on the organisation’s use of data. It’s also about adopting Privacy Impact Assessments as a matter of organisational culture to identify and address risks whenever using Big Data analysis for new or exciting reasons.

Big Data is, and should be, the future of data processing, and our laws should not prevent this. But, equally, organisations need to be careful that they do not see the Big Data age as a free-for-all hunting season on user data that invades personal privacy and control. Big issues for Big Data indeed.

Deconstructing the privacy macaron

Posted on December 7th, 2011



Compact.  Self-contained.  Multi-layered.  Hard to penetrate and rich inside with a mix of flavours and tones.  Judging by the commentary surrounding the forthcoming EU data protection framework circulating in the corridors of the IAPP European Data Protection Congress that took place in Paris at the end of November, we could have been describing a typical Parisian macaron instead of a new law.  But if the indications of what we are about to see in the regulation being proposed by the European Commission are true, complying with the future European privacy regime is going to require fine confectionery skills.

So what are the likely ingredients of this extremely elaborate piece of legislation and how will they blend together?

*   A Regulation – It is widely accepted that a regulation, rather than another directive, will be the best recipe for a harmonised regime that delivers a consistent level of protection across the EU.

*   Two-fold objective – Like the original directive, the new regulation will most certainly have a dual aim: protecting personal data and facilitating the intra-EU movement of that data.

*   Applicability based on establishment and targeting of European residents – The novelty being that the use of equipment in the EU will be replaced by data processing directed at those individuals who live in the EU.

*   Privacy principles – Transparency, finality, proportionality and data quality – they are all likely to be there but for added flavour, expect some new ones like data minimisation and accountability.

*   Consent – Individuals’ consent will remain a cornerstone of European data protection law but the standard for valid consent will be higher than ever before, with a greater emphasis on the individual’s freedom of choice.

*   Big rights – Some rather radical changes are likely to come in the shape of new or strengthened individuals’ rights.  Top of the list will be the much publicised right to be forgotten followed closely by data portability rights.  No doubt the Commission will want to give people as much control as possible over their data, particularly in relation to profiling activities.

*   Controller’s responsibilities – As a flipside of the increased rights of individuals, controllers are bound to face very specific responsibilities ranging from the adoption of policies and principles such as privacy by design and privacy by default to the training of staff and the appointment of data protection officers.

*   Data breach notification – As is already the case for providers of communications services, an obligation to notify security breaches to data protection authorities (and in some cases to the individuals affected) will now apply to all controllers.

*   International data transfers – Greater flexibility is expected on this issue alongside an express recognition for binding corporate rules, which will be available to both controllers and processors.  An area of concern however is the potential conflict between data requests by non-EU authorities and the limitations on data disclosures, which will probably require the involvement of data protection authorities in determining how to resolve such conflict.

*   Role of data protection authorities – The main novelty on this front is bound to be in relation to their geographical competence.  In all likelihood, the data protection authority of the Member State where the main establishment of a data processing organisation is based will be responsible for supervising that organisation across the whole of the EU.  We can also assume that greater international coordination mechanisms will be in place.

*   Enforcement powers – The promise by the Commission of stronger enforcement powers for the data protection authorities is bound to bring harmonised and succulent monetary fines, which can only be more substantial than what most Member States have at the moment.

All in all, it is beyond doubt that the Commission has been working very hard to craft a framework that fits the regulatory requirements of today’s and tomorrow’s data protection.  Whether the result will suit everyone’s taste is a different matter.

This article was first published in Data Protection Law & Policy in November 2011.