Aug 20 2015

Today I am back for another talk which forms part of the IFIP Summer School on Privacy and Identity Management hosted in Informatics at the University of Edinburgh.

Today’s talk is from Angela Sasse, Professor of Human Centred Technology at University College London, where she also oversees the Computer Security group (her presentation will include work by Anthony Morton). She is also head of the first research group in the UK researching the science of cyber security. She authored a seminal paper in the ’90s entitled “Users are not the enemy”, which addressed mismatches between perceptions and behaviours. That motif – that users are not the enemy – is still something which has not quite been learned by those designing and implementing systems even now.

I think my title gives you a good idea of what I will be talking about: I will be starting with talking about how people reason about privacy. That is something which is often not accounted for properly, but is important in understanding behaviours. Then I will be talking about why current technologies do not meet their preferences. Then I will look to the future – both some dystopian and utopian scenarios there.

So, how do people reason about privacy? Some work with Adams (2001) looked at this, and we used the crucial subtitle “protecting users, not just data”. There we pointed out that there is a real difference between how the law treats privacy and how people understand it. Individuals are pragmatic in their choices: they think about the risks and the benefits, and they trade those off. Some of this work came out of early internet networking, video calls, etc., but it has stood the test of time as these things have become commonplace.

There has been a raft of research over the last 15 years, not just by computer scientists but also by social scientists, ethicists and economists. We have come to a place where we understand that people do trade risks for benefits, but not always efficiently in an economic sense, and not always logically. There are a number of reasons for this: they may not be aware of all risks and consequences – around secondary-level information, and around secondary and tertiary usage and aggregation with other data sources; their perception may be skewed by hyperbolic discounting – entirely dismissing things seen as low risk; and there is a paradox here, as people do believe in privacy and security but their actions do not always reflect this.

So, why don’t people act in line with their own preferences? Well, there is “confusology” (Odlyzko), which I’ll come back to. Hyperbolic discounting is about risks that are in the future and potential, versus rewards that are immediate and (sometimes) tangible. Sometimes users say “they know this anyway” – there is no point obfuscating information as “they” know this stuff already, they are just testing honesty or willingness. When you have done a lot of work on financial disclosure this argument comes up a lot, and it also comes up with ISPs and perceptions of surveillance. Sometimes this reaction is plausible and logical, but sometimes it is much more of a cognitive dissonance defence, something of an excuse to minimise workload. That is also why we really do need to work on the public discourse: the more false information there is in the public discourse, the more this encourages individuals to make choices in that way, and the more we allow that kind of nonsense to be out there, the more it undermines important discussions of privacy. The final reason is that technology does not offer the protection people want – but they still want the benefits.
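Hyperbolic discounting can be sketched numerically. A minimal illustration, assuming the standard one-parameter form V = A / (1 + kD) and an arbitrary discount rate k (the numbers here are invented for illustration):

```python
# Hypothetical illustration of hyperbolic discounting: the perceived value
# of an outcome D days in the future is V = A / (1 + k * D), where k is a
# discount parameter (k = 0.1 here is an arbitrary assumption).

def hyperbolic_value(amount, delay_days, k=0.1):
    """Perceived present value of `amount` arriving after `delay_days`."""
    return amount / (1 + k * delay_days)

# An immediate, tangible benefit of disclosing data (worth 10 units now)...
benefit_now = hyperbolic_value(10, delay_days=0)

# ...versus a larger privacy harm that might materialise a year later.
harm_later = hyperbolic_value(50, delay_days=365)

# Even though the future harm is five times larger, discounting makes the
# immediate benefit look bigger, so disclosure "wins".
print(benefit_now)  # 10.0
print(harm_later)   # ~1.33
```

Under this model the future harm would need to be enormous before it outweighed even a modest immediate reward, which matches the talk’s point that future, potential risks are systematically discounted.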

Back to confusology (Odlyzko 2014) – I really recommend Odlyzko’s work here. He talks about several factors. Inadvertent disclosure: complex tools make the consequences of actions hard to predict. Too much work: rules and legal jargon make privacy too much work, and people are loath to expend effort on tasks they see as secondary to their goal – legal jargon is practically an orchestrated campaign, and “I agree with the terms and conditions…” is the biggest lie on the internet! Lack of choice, so consent is not meaningful: I challenge you to find a provider who offers genuinely meaningful terms of consent. And the hidden persuaders: temptation, nudging, exploiting cognitive biases, encouraging users to think that sharing more is the preferred option. I have seen Google encouraging researchers in privacy to work on “opinionated design”, because they have tried everything to get people to click through in the right way – they make warnings different every time, hide other options, etc. I think this is a slippery slope. In the privacy area we see this choice as pretty fake, particularly if you hide and obscure other options.

The inadvertent disclosure stuff is still happening. Many users do not understand how technology works, and that can catch users out – a key example is peer-to-peer file sharing, but we also see this with apps and the requests they make of your device (use of contacts, data, etc.), and there will be lots more inadvertent disclosures associated with that coming out.

Too much work leads to over-disclosure. Once you are in the habit of doing something, you don’t have to think about it too much. It is less work to fill in a form disclosing information you have given before than to stop and think about what the implications of sharing that data actually are.

We also see successfully adopted technologies that fail on privacy. The Platform for Privacy Preferences (P3P) was far too much work to be useful to many people. Only IE implemented it, and they did so in a way that let websites systematically escape cookie blocking. It was too complex and too ambiguous for browser vendors. And there is absolutely no means to verify that websites do what they say – 5% of TRUSTe “verified” websites had implementation errors in 2010. This is a place where cognitive dissonance kicks in again – people fixate on something they see as helping with one form of security and don’t necessarily look at other risks. Meanwhile, users of Do Not Track are identified more quickly than non-users through web fingerprinting, and advertisers circumvent it with supercookies.
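As a rough illustration of why Do Not Track users can be singled out: a browser fingerprint is just a stable hash over attributes the browser exposes anyway, and the DNT header is itself one more distinguishing attribute. This is a hypothetical sketch – the attribute names and values are invented, not any real browser’s output:

```python
import hashlib

# Hypothetical sketch of browser fingerprinting: combining attributes the
# browser reveals anyway into one identifying hash. Attribute names and
# values are illustrative only.

def fingerprint(attrs: dict) -> str:
    """Hash a canonical ordering of browser attributes into a short ID."""
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

browser = {
    "user_agent": "ExampleBrowser/42.0",
    "screen": "1920x1080",
    "timezone": "UTC+0",
    "fonts": "Arial,Helvetica,Times",
    "do_not_track": "1",  # ironically, the DNT header itself is one more
                          # distinguishing attribute for trackers to use
}

# The same attributes always yield the same ID -> trackable without cookies.
print(fingerprint(browser))
```

The point of the sketch is that no cookie is needed: as long as the attribute set is stable across visits, the hash is a persistent identifier, and opting into DNT changes the hash rather than hiding it.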

So, it really isn’t clear what you need to do to ensure that the privacy people want is enabled in websites and tools.

To change tack slightly, it is worth reflecting on the fact that privacy preferences vary. It can be useful to frame this with a technology adoption model – TAM offers a useful framework – but privacy needs vary across cultures, and they vary between people. You need to speak to different people in different ways to get the message across. The Westin scale is a three-point categorisation of privacy attitudes that you could use, but it is too coarse-grained, since it basically only differentiates between privacy fundamentalists, pragmatists, and the unconcerned.

However there have been various studies with the Westin Scale (see Berkeley Survey 2009; Harris Poll 2003; Harris Poll 1999) and most users fall into the Privacy Pragmatists category. But behaviours, when studied, consistently DO NOT match their preferences! So we need something better.

There have been attempts to improve the Westin scale, but alternative measures of privacy concern – e.g. IUIPC (Malhotra et al 2005) and CFIP (Smith et al 1996) – have had limited scope. And people engage in information-seeking behaviours (Beldad et al 2011): people seek trust signals, both trust symbols and trust symptoms (Riegelsberger et al 2005). Asking people about the provider of a service, and their trust in that provider, is important in terms of understanding their behaviour and their preferences.

So my PhD student (Morton) has worked on developing the Westin scale to better align preferences and behaviours, using a mixture of qualitative and quantitative methods to investigate subjective viewpoints. He has been interviewing people, analysing their statements, and ordering those statements, asking research participants how well those statements reflected their views. The number of participants (31 offline, 27 online) is relatively small, but the number of statements they generated ran into the thousands – so this is a really complex picture. Participants ranked statements as important or unimportant through a Q-sort process (a version of a card-sorting task).
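The comparison step behind this can be sketched as follows. In Q-methodology, participants’ rankings of the same statements are correlated to find shared viewpoints (real studies, presumably including Morton’s, use factor analysis over many sorts; the statements and rankings below are invented for illustration):

```python
from statistics import mean

# Minimal sketch of the comparison step in Q-methodology: participants rank
# the same statements from least to most important, and the sorts are
# correlated to find shared viewpoints. Statements/rankings are invented.

def pearson(xs, ys):
    """Pearson correlation between two equal-length rankings."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

statements = ["control over my data", "provider's security record",
              "what my friends use", "official privacy seals"]

# Rankings: higher = more important to that participant.
alice = [4, 3, 1, 2]   # prioritises control: "Information Controller"-like
bob   = [4, 2, 1, 3]   # broadly similar viewpoint
carol = [1, 2, 4, 3]   # decides via others' use: "Crowd Follower"-like

print(round(pearson(alice, bob), 2))    # 0.8  -> shared viewpoint
print(round(pearson(alice, carol), 2))  # -1.0 -> opposing viewpoint
```

Clusters of highly correlated sorts are what surface as the categories described below; with thousands of statements and dozens of participants, factor analysis does this grouping at scale.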

Morton has found that people sort into five categories:

  • Information Controllers – those really aware of the data, looking at the data and what it says about them. These are skeptical people and do not have a high trust in the cloud and want control over the collection, use and dissemination of personal information. For them things that are not important include: organisational assurances; others’ use of the technology service.
  • Security Concerned – their principal focus is on the security of the technology platform, the providing organisation’s security processes, and the potential impact on personal security and finances. They are trading off the benefits and risks here. They are less interested in the technology in the abstract.
  • Benefit Seekers – those happy to trade away risks for the benefits
  • Crowd Followers – trust in others’ use to make decisions about privacy and security
  • Organisational Assurance Seekers – they look for the organisation to say the right things, disclaimers etc. They expect bad things to happen, and want assurance against that.

Now I think that work is quite interesting. And we are now undertaking a large scale study with 1000 participants in the UK and US with all participants sorted into one of these categories, and several scenarios to assess. The first 300 participants’ contributions already suggest that this is a better model for connecting preference with behaviour.

I did want to talk about why we need to make privacy more salient. Ultimately privacy is about relationships. People manage relationships with other people through selective disclosure of information – that is a fundamental part of how we engage, how we present different personas. The more information is disclosed, the more that is undermined. That is most obviously taking place in university admissions or potential employers’ searches for individuals. The inability to make selective disclosures can undermine relationships.

For example, a chocolate biscuit purchase: someone buys their main shop on card, then buys chocolate biscuits in cash. It turns out this person’s partner is a health food nut who manages the finances tightly. So that person and their child agree to the healthy food rules at home, but then have access to chocolate biscuits elsewhere. This is how people manage relationships. That sort of selective non-disclosure means you do not need to revisit the same argument time and again, and it helps illustrate why privacy is so fundamental to the fabric of society.

We do have ways of making the cost of privacy more salient. There is this trade-off around privacy – we are often told these things are “for your own good”, and without a significant push for evidence that is hard to counter. We don’t force accountability for promised or stated benefits. CCTV in the UK is a great example. It took almost two decades for any investigation into that investment, and when there was research it was all pretty damning (Gill and Spriggs 2005; Metropolitan Police Review 2008 – CCTV contributes to the prevention or resolution of only 3% of crime, it is costly, and there is only 1 crime solved per 100 cameras). And we have had misuse of CCTV coming through the courts too: investigations into inappropriate behaviour by the London Met Police over a year showed that inappropriate disclosure – as in the CCTV case – made up a huge percentage of the issues.

We have the extension of the state into something of military surveillance. We see the rise of drones, robots and autonomous vehicles. There is an increasing number of networks and devices, and we see mission creep in this “deeply technophilic” industry. We also see machine learning and big data being advertised as the solve-all solution here… But as Stephen Graham notes, “emerging security policies are founded on… profiling” of individuals – a Minority Report state. David Murakami Wood of the Surveillance Studies Network talks about automatic classification and risk-based profiling adding up to “social sorting”, and we see this with tools like Experian MOSAIC and ACLU Pizza. We must not let this happen without debate, push back, and a proper understanding of the implications.

Odlyzko raised the issue of who controls the information – it is often big global megacorporations. The decline of privacy actually undermines the fundamentals of capitalism and the dynamic nature of the market system – a truly dystopian scenario.

So, do people really not care? Post-Snowden it can seem that way, but there are signs to the contrary: the UK Investigatory Powers Tribunal ruled GCHQ surveillance to be illegal; major tech companies are distancing themselves from government, putting up legal resistance and deploying better security (encryption); and we see talk of a Digital Charter from Tim Berners-Lee, progressing this debate. Privacy protection behaviours are not always obvious though.

We also see the idea that “Digital Natives Don’t Care” – now that is not true, they just care about different things, they engage in “social steganography” hiding in plain sight (boyd 2014).

So, in conclusion: technology has a profound impact on privacy, in many ways that people don’t understand – at least not immediately; people often eagerly assume and overestimate benefits, and underestimate and discount risks; we need to counter this by better communication about risks and benefits; and that communication needs to relate to what matters to people with different preferences.


Q1) It seems to me that some of the classical social science sources about relationships – what information to ignore and which to note – can be updated and adapted to the modern world, and that you can analogise up to a point.

A1) Yes, you look at this area and there are really three people I always go back to from the 1960s: Goffman, Luhmann and Giddens.

Q1) And more recently Henry Jenkins too.

Q2) From your presentation, many people make poor decisions around privacy, but those are pragmatic choices. I really do think we don’t see people understanding the impact of surveillance – there is a lack of understanding that not only might they look for terrorists, but of the other implications of machine learning and other uses of data, and that this is a level of data use that is not proportionate to the problem.

A2) That is the debate we need to see in the public discourse so urgently. There is a pushing out of tools without any consideration of those implications. Using the language of cost and waste around data can be useful here, but some want a story of the negative consequences in order to make sense of this – for instance someone being denied a job because of errors or disclosure.

Q3) Do you think that education institutions in the United Kingdom have any role in setting an example, for themselves or others, by practising what academics would advise?

A3) Online privacy protection is part of the national curriculum now. If I was running a school I wouldn’t want to turn it into a prison – metal detectors etc. But there is also the tracking of learning behaviours and activities, data mining to identify individual learning paths – risks there are also something to think about. It is often the most mundane and banal stories that often hit home: what if someone is worried to search for treatment for a disease, lest their own status be disclosed by that? Being tracked changes behaviour.

Q4) The detection rate of terrorism is so low that it is not just a waste of money, it is also an ineffective method.

A4) But then it is more convenient to sit behind a computer than to actually be out on the street facing direct human interaction and risk, that may also be part of it.

Q5) Going back to the topic of education: there are quite a lot of primary schools in the UK where they are using apps, ebooks etc. Is there…

A5) There are three technologists who did a fantastic study. They found it makes kids more obedient, and they start to behave like people in prison which is damaging to individuals as well as to society. This will foster rather than discourage criminal activity.

Comment) Emmeline Taylor, in Australia, has written a book on how kids respond to technology in schools.

And with that we close a really interesting talk with clear relevance for some of the findings and recommendations coming out of our Managing Your Digital Footprint research work.

Aug 18 2015

All of this week, whilst I am mainly working on Managing Your Digital Footprint research work, there is a summer school taking place at the University of Edinburgh School of Informatics on Security and Privacy, with several talks on social media. This afternoon I’ll be blogging one of these: “Policing and Social Media Surveillance: Should We Have any Privacy in Public?” from the wonderful Professor Lilian Edwards of the University of Strathclyde, Deputy Director of CREATe.

I come to you as a lawyer. I often say what I do is translate law to geek, and vice versa. How many here would identify themselves as from a legal discipline (about 10 are), I know most of you are from a computer science or HCI area. What I will talk about is an overlap between law and computer science.

So, a nice way to start is probably David Cameron saying: “In extremis, it has been possible to read someone’s letter, to listen to someone’s call, to listen in on mobile communications,” he said. “The question remains: are we going to allow a means of communications where it simply is not possible to do that? My answer to that question is: no, we must not.”

I’m going to argue that encryption, privacy, etc. is a good thing and that there should be some aspect of privacy around all of those social media posts we make etc. Now, what if you didn’t have to listen to secret conversations? Well right now the security services kind of don’t… they can use Tumblr, Facebook, Twitter etc..

So, a quick note on the structure of this talk. I will set some context on open source intelligence (OSINT), and Social Media Intelligence (SOCMINT). Then I will talk about legal issues and societal implications.

So, SOCMINT and OSINT. In the last 5-7 years we’ve seen the rise of something called “intelligence-led” policing – some talk about this as the Minority Report world, trying to detect crime before it takes place. We have general risk aversion, predictive profiles, and we see big data. And we see “assemblages” of data via private intermediaries – so not only the use of policing and intelligence data, but also a wide range of publicly available data.

There has been growth in open source intelligence, the kind of stuff that is easy to get for free, including SOCMINT – the stuff people share on social media. You can often learn a great deal from friends graphs, the social graph – even with good privacy settings that can be exposed (it used to always be open), and it is used in friend-of-friends analysis etc. The appeal of this is obvious – there is a lot of it and it is very cheap to get hold of (RUSI and Anderson Report 2015): 95% of intelligence gathered is from this sort of “open source” origin, the stuff that is out there (ISC 2015). There have been a number of reports in the last year with incredibly interesting information included. Another report stated that 90% of what you need to know is from this sort of open source, and it’s great because it is cheap.
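The friend-of-friends point can be made concrete with a toy example: even if someone locks down their own friend list, every contact whose list is public leaks the connection. A minimal sketch with invented names:

```python
# Hypothetical sketch of friend-of-friends analysis: even if "target" keeps
# their own friend list private, anyone whose public list includes them
# leaks the connection. The names and graph here are invented.

public_friend_lists = {
    "ann":   ["target", "bob", "dan"],
    "bob":   ["ann", "eve"],
    "carol": ["target", "eve"],
    # "target" has strict privacy settings, so their own list is absent
}

def inferred_friends(person, public_lists):
    """Reconstruct a person's connections from other users' public lists."""
    return sorted(u for u, friends in public_lists.items() if person in friends)

print(inferred_friends("target", public_friend_lists))  # ['ann', 'carol']
```

The target’s privacy settings are effectively only as strong as the weakest settings among their contacts, which is why social graphs are such rich OSINT even when individual profiles are locked down.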

The uses (Bartlett and Miller 2013) are various, but worth noting are things like sentiment analysis – e.g. to predict a riot, apparently very useful. Acquiring information from the public – have you seen this robber, etc. – is very useful. Horizon scanning is about predicting disturbances, riots etc. We are also seeing predictive analytics (e.g. IBM with Memphis P.D.; PredPol in Kent), which is very popular in the US and increasingly in the UK too – towards that Minority Report. Now, in all of these reports there is talk of prediction and monitoring, but little mention of monitoring individuals – yet clearly that is one of the things this data usage enables.

These practices raise policy challenges (Omand 2012) of public trust, legitimacy, necessity and transparency. And there is the issue of the European Convention on Human Rights: article 8 gives us the right to a private life, which this sort of practice may breach. Under that article you can only invade privacy for legitimate reasons, only when necessary, and the level of invasion of privacy must be proportionate to the need in society.

So, looking at what else is taking place here in contemporary practice: we had the summer riots in 2011, where the security services used tweets, BB texts etc., and post-riot reports really capture some of the practice and issues there; a Flickr stream of suspect photos led to 770 arrests and 167 charges, plus the Facewatch mobile app. During the 2012 Olympics the police wanted to use social media data, but basically did not know how. Issues here include police managerial capacity; sampling bias (see “Reading the Riots”) – Twitter is a very partial view of what is occurring; and human error – e.g. in the crowdsourced attempts to identify and locate the Boston bombers.

So I want to talk about the possibility of using public social media posts and question whether they have any protection as private material.

An individual tweets something and says she didn’t intend for it to be seen by the police; commentators online say “What planet is this individual on? Her tweets are public domain”, and that is the attitude one tends to see, including in the law courts – e.g. “a person who walks down the street will inevitably be visible” (PG v UK 2008 ECtHR). In the UK that seems to be the standard perspective: that there is no reasonable expectation of privacy when expressing yourself in public.

In the US there is even less privacy for social media posts – see e.g. Bartow (2011), who says “Facebook is a giant surveillance tool, no warrant required, which the government can use… with almost no practical constraints from existing laws”. There is effectively no idea of privacy in the US constitution.

You’d think that the EU would be better, but where are our traditional concepts of when a “reasonable expectation of privacy” arises? Is it in our body, our home (Rynes ECJ 2013), our car? What about data “relating to you” versus the “public sphere” (cf. Koops)?

So, what are the legal controls? Well the Data Protection law seems obvious but there are strong UK exemptions around detection and prevention of crime – so there is no need for consent.

How about the European Convention on Human Rights article 8, the right to a “private life”? The start of my argument is Von Hannover ECtHR (2004), about intrusion by the press rather than the police – Princess Caroline of Monaco was being followed by the press in all of her activities. The Court said, seminally, that this was absolutely an invasion of her private life – even though she is a public figure in a public sphere. So we have a concept of privacy extending beyond the bounds of your home, of being able to have a right to privacy when out in public.

Now, that was an important case… but it hasn’t had that much impact. So you have cases where the police take photos of people (Wood v Metropolitan Police 2008) or use CCTV (re application by JR38 for Judicial Review 2015). In the case of Wood, a serial activist was going to a corporate AGM, expected to cause trouble, so police followed him and photographed him. The judge said that he was an activist and well known, and could expect to be followed. The argument was that the image was a one-off – not part of an ongoing profile.

The most recent case, from Northern Ireland, concerned someone caught on CCTV during the NI equivalent of the London riots. The person in question was 14 years old, and images were circulated widely, possibly including to the Derry Journal. Again he loses, but in an interesting way. There are at least three judgements.

Lord Kerr says “The fact that the activity… is suspected to be criminal… will not alone be sufficient to remove it from… application of article 8”. That’s a big deal – suspicion of criminal activity isn’t enough for your rights to be exempt. However, in this case the second test, whether the intrusion was justified, was found to be satisfied, and they took very little time to decide it was a justified act. Weighing the rights of the individual against the rights of the community to protect itself, they felt this intrusion was justified. They say that he’d benefit too – that the 14-year-old might be diverted from a life of crime. They lay it on a bit, but they are under pressure to justify why they have not stigmatised this youth by sharing his image. So, an interesting case.

So, there is some expectation of privacy in public, but even so interference can be justified. Interference must be justified as necessary, proportionate and according to law – but security usually seems to win in the UK (Wood, JR38). Even where there is no reasonable expectation of privacy, material may still be part of your “private life”. But all of this assumes that you know you are being surveilled, that you know your information is being accessed. You may not know if your data is being used to build up profiles, to build up an airport stop list, etc.

Now, in response to Snowden, we have something called RIPA – an envisioned “digital” scheme to cover surveillance of personal data. The scheme covers real-time interception of emails, with a warrant from the Secretary of State needed. But social media isn’t part of this – they just seem to be making up how they manage that data.

Now I want to argue that the use of SOCMINT shouldn’t have any special exemption…

Demos in 2013 asserted that “open” SOCMINT collection (and processing) needs no authorisation of any kind. Why? They argued that there is no expectation of privacy so long as the user knew from the T&Cs that public data might be collected, especially via the API. I think that is just egregiously stupid… Even if you believed it, that would apply to the platform – not to the police, the rest of the world, etc.

The other argument is the detailed profile argument: even if we admit that this material is “public”, there is still a part of ECHR jurisprudence which says that detailed profiles of this sort need to be treated with respect – that comes from the practices of the Stasi and concerns around the possibility of a secret police state, and the case law (Rotaru v Romania) covers this.

So, my perspective is that there is a real difference between structured and unstructured data. Even if it is in public, is SOCMINT an automatic dossier? With Google, most of the internet is a structured dossier. With that in mind, ECtHR case law has seen structured dossiers maintained over time as a key threat – Rotaru v Romania’s dictum: “public information can fall within the scope of private life where it is systematically collected and stored in files held by authorities”. So does the Rotaru distinction between structured data in files held by police, and unstructured data, hold up in the age of Google and data mining (e.g. Google Spain (ECJ 2014), the UK RIPA case (2015))?

As we move to the internet as the main site for key publishing of data, and as the internet of things and smart cities come online, these questions will only become more pressing.


Q1) Should we be able to do data mining on large sets of social data?

A1) Big data, data mining and the internet of things can be seen as the three horsemen of the apocalypse in a way. And that’s the other talk I could have given. The police, using this sort of data are using data in a different context, and that isn’t ok under ECHR art 8.

Q2) I remember a paper from about a year ago about the distinction between what an individual can do in terms of asking about others etc. They have more rights than the police in some contexts.

A2) There is this weird thing where, if you are not looking at specific people, you aren’t as restrained. That’s because it used to be the case that you could find out very little without investigating an individual. That has changed considerably, but the law hasn’t been updated to reflect it.

Q3) A lot about us is public, so don’t we just have to deal with this. I see the concerns of a police state, but I don’t understand where you are drawing the line on legal controls on policing. If they can only do the same as a member of the public then there shouldn’t be an issue there…

A3) You’ve given that answer yourself – the power dynamic is asymmetrical. They have capacity to join data up to their own databases – which may include your being a witness or victim of crime, not always suspect or perpetrator. There is a lot of black boxing of data here…

Q3) What controls are you proposing?

A3) Honestly, I don’t have a quick answer. But if we look at the requirements: those for intercepting letters, email and telephone are strict; searching homes, pretending to be a friend etc. are less strict… But that scooping up of mass data is something different in terms of implications, and we need some form of safeguarding around that, even if less strict than some other approaches/interceptions.

There is overwhelming evidence that young people don’t realise the potential implications of their sharing of data, and see these spaces as a private space away from other areas of their life in which they find themselves surveilled. So there is a reasonable presumption of privacy there.

Q3) I think there is a need for appropriate controls on police activities, I agree with that. If I share things only with friends on Facebook and the police look at that, that is an investigation. But if I tweet something, it is public…

A3) This is the classic liberal argument, and I don’t agree with it. Tweeting is a bit different. Facebook is the new mall, the new social space; people use its openness to serve them socially, believing it will only be read by peers. So they have a reasonable expectation of privacy. Part of Bartlett and Miller’s work is about the use of the word “rape” – in gaming culture it is used to mean beating someone at a game. Imagine that being crunched. That’s the sort of issue that can arise in big data. I’m not saying the police need a warrant for all Twitter data capture; I’m saying we need to think about what is appropriate.

Q4) There is a perspective that taking the UK out of the EU Human Rights Act is a red herring to distract from other legislation.

A4) Even if we left the EU Human Rights Act, the UK Government would find many of its protections are embedded in other parts of EU law, so it would still require appropriate respect of individual rights to privacy. But that’s a political conversation really.

Q5) So, in terms of the issues you have raised, how do we understand what is private and what is public data?

A5) I think essentially that we need to safeguard certain points in what has become a continuum in privacy around human rights, something that will set some barriers about the types of interventions that can occur, and what kind of oversight they require.

And with that Lilian’s excellent and information-packed talk is done. Really interesting and there were clearly plenty more questions arising. Particularly interesting for me thinking about the Digital Footprints work, and the legislative context for the research we have been undertaking on student expectations, experiences, practices.