Jul 04 2017
 

Today I am again at the Mykolo Romerio Universitetas in Vilnius, Lithuania, for the European Conference on Social Media 2017. As usual this is a liveblog so additions, corrections etc. all welcome… 

Keynote presentation: Daiva Lialytė, Integrity PR, Lithuania: Practical point of view: push or pull strategy works on social media 

I attended your presentations yesterday, and you are going so far into detail in social media. I am a practitioner and we can’t go into that same sort of depth because things are changing so fast. I have to confess that a colleague, a few years ago, suggested using social media and I thought “Oh, it’s all just cats” and I wasn’t sure. But it was a big success, we have six people working in this area now. And I’m now addicted to social media. In fact, how many times do you check your phone per day? (various guesses)…

Well, we are checking our smartphones 100-150 times per day. And some people would rather give up sex than smartphones! And we have this constant flood of updates and information – notifications that pop up all over the place… And there are a lot of people, organisations, brands, NGOs, etc. who all want our attention on social media.

So, today, I want to introduce three main ideas here as a practitioner and marketer…

#1 Right Mindset

Brands want to control everything, absolutely everything… The colour, the font, the images, etc. But now social media says that you have to share your brand in other spaces, to lose some control. And I want to draw on Paul Holmes, a PR expert (see www.holmesreport.com), who says that when he fell in love with social media, there were four key aspects:

  • Brands (in)dependency
  • Possibilities of (non)control
  • Dialogue vs monologue
  • Dynamic 24×7

And I am going to give some examples here. So Gap, the US fashion brand, they looked at updating their brand. They spent a great deal of money to do this – not just the logo but all the paperwork, branded items, etc. They launched it, it went to the media… And it was a disaster. The Gap thought for a few days. They said “Thank you brand lover, we appreciate that you love our brand and we are going to stick with the old one”. And this raises the question of to whom a brand belongs… Shareholders or customers? Perhaps now we must think about customers as owning the brand.

Yesterday I saw a presentation from Syracuse on University traditions – and some of the restrictions of maintaining brand – but in social media that isn’t always possible. So, another example… Lagerhaus (like a smaller scale Ikea). They were launching a new online store, and wanted to build community (see videos), so they targeted six interior design blogs and created “pop up online stores” – bloggers could select products from the store’s selection, and promote them as they liked. That gained media attention, and gained Facebook likes for the store’s Facebook page. And there was then an online store launch, with invitees approached by bloggers, and their pop up stores continue. So this is a great example of giving control to others, and building authentic interest in your brand.

In terms of dialogue vs monologue I’d quote from Michael Dell here, on the importance of engaging in honest, direct conversations with customers and stakeholders. This is all great… But the reality is that many who talk about this are never ever doing it… Indeed some just shut down spaces when they can’t engage properly. However, Dell has set up a social media listening and command centre. 22k+ posts are monitored daily, engaging 1000+ customers per week. This is tightly integrated with the @dellcares Twitter/Facebook team. And they have managed to convert “ranters” to “ravers” in 30% of cases. And there has been a decrease in negative commentary since engaging in this space. Posts need quick responses: a few minutes, or hours, are great; any longer and it becomes less and less useful…

Similarly we’ve seen Scandinavian countries and banks engaging, even when they have been afraid of negative comments. And this is part of the thing about being on social media – the ability to engage in dialogue, to be part of and react to the conversations.

Social media is really dynamic, 24×7. You have to move fast to take advantage. So, Lidl… They heard about a scandal in Lithuania about the army paying a fortune for spoons – some were €40 each. So Lidl ran a promotion for being able to get everything, including spoons there cheaper. It was funny, clever, creative and worked well.

Similarly Starbucks vowing to hire 10,000 refugees in the US (and now in EU) following Trump’s travel ban, that was also being dynamic, responding quickly.

#2 Bold Actions

When we first started doing social media… we faced challenges… Because the future is uncertain… So I want to talk about several social media apps here…

Google+ launched claiming to be bigger than Facebook, to do it all better. Meanwhile WhatsApp did great… but is disappearing as a brand, at least in Lithuania. SnapChat has posts disappearing quickly… Young people love it. The owner has said that it won’t be sold to Facebook. Meanwhile Facebook is trying desperately to copy its functionality. We have clients using SnapChat – fun but challenging to do well… Instagram has been a big success story… And it is starting to be bigger than Facebook in some demographics.

A little history here… If you look at a world map of social networks from December 2009, we see quite a lot of countries having their own social networks which are much more popular. By 2013, it’s much more Facebook, but there are still some national social media networks in Lithuania and Latvia. And then by 2017 we see in Africa uptake of Twitter and Instagram. Still a lot of Facebook. My point here is that things move really quickly. For instance young people love SnapChat, so professionally we need to be there too. You can learn new spaces quickly… And it doesn’t matter, as you don’t have to retain that knowledge for long – everything changes fast. For instance in the US I have read that Facebook is banning posts by celebrities where they promote items… That is good; it means they are not sharing that kind of content…

I want to go in depth on Facebook and Twitter. Of course the most prominent social media platform is Facebook. They are too big to be ignored: 2 billion monthly active Facebook users (June 2017); 1.28 billion people logging onto Facebook daily; 83 million fake profiles. The 25 to 34 age group, at 29.7% of users, is the biggest. Many people check Facebook first thing in the morning when they wake up. And 42% of marketers report that Facebook is very important to their business. And we now have brands approaching us to set up a Facebook presence no matter what their area of work.

What Facebook does well is precise targeting – the more precise, the more you pay, but that’s ok. Targeting can be based on geolocation, demographic characteristics, social status, interests, even real-time location. That works well, but remember that there are 83 million fake profiles too.
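To make that targeting point concrete, here is a rough sketch of what an audience definition looks like when submitted through Facebook’s Marketing API. The field names follow the API’s published targeting spec; the values, and especially the interest entry, are made-up placeholders for illustration, not anything cited in the talk:

```python
# Illustrative audience definition in the style of Facebook's Marketing API
# targeting spec. Field names follow the public API; every value below is a
# made-up placeholder, not a real campaign from the talk.
targeting_spec = {
    "geo_locations": {"countries": ["LT"]},  # geolocation: Lithuania
    "age_min": 25,                           # demographic characteristics:
    "age_max": 34,                           # the 25-34 group mentioned above
    "interests": [
        {"id": 0, "name": "Coffee"}          # hypothetical interest ID
    ],
}
```

The pricing point above maps onto this directly: each extra constraint narrows the audience, and a narrower audience generally costs more per impression in the ad auction.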

So that’s push, what about pull? Well there are the posts, clicks, etc. And there is Canvas – which works for mobile users: story-driven ads (mini landing pages), creative stories, generating better results and click-through rates. (We are watching a Nespresso mobile Canvas demo.) Another key tool is Livestream – free of charge, with notifications for your followers, and live discussion. But you need to be well prepared and tell a compelling story to make proper use of this. And you can do it from anywhere in the world. For instance, one time I saw a livestream of Barack Obama’s farewell – that only had 15k viewers though, so it’s free but you have to work to get engagement.

No matter which tool, “content is king!” (Bill Gates, 1996). Clients want us to create good stories here but it is hard to do… So what makes the difference? The Content Marketing Institute (US), 2015, suggests:

  1. Content
  2. Photos
  3. Newsletters
  4. Video
  5. Article
  6. Blogs
  7. Events
  8. Infographics
  9. Mobile applications
  10. Conferences and Livestreams

So, I will give some examples here… I’ll show you the recent winner of Cannes Lions 2017 for social media and digital category. This is “Project Graham” – a public driver safety campaign about how humans are not designed to survive a crash… Here is how we’d look if we were – this was promoted heavily in social media.

Facebook helps with push too – the algorithms prioritise content that does well, and auctions to reach your audience mean that it is cheaper to run good content that really works for your audience.

And LinkedIn meanwhile is having a renaissance. It was quite dull, but they changed their interface significantly a few months back, and now we see influencers (in Lithuania) using LinkedIn, sharing content there. For instance lawyers have adopted the space. Some were predicting LinkedIn would die, but I am not so sure… It is the biggest professional social network – 467 million users in 200 countries. And it is the biggest network of professionals – a third have a LinkedIn profile. Users spend 17 minutes per day, 40% use it every day, and 28% of all internet users use LinkedIn. And it really functions as a public CV, for recruitment, and for ambassadorship – you can share richer information here.

I wanted to give a recent example – it is not a sexy looking case study – but it worked very well. This was work with Ruptela, a high tech company that provides fleet management based on GPS tracking and real-time vehicle monitoring and control. They needed to rapidly hire 15 new sales representatives via social media. That’s a challenge, as young people, especially in the IT sector, are leaving Lithuania or working in Lithuania-based expertise centres for UK, Danish, etc. brands.

So we ran a campaign, on a tiny budget (incomparable with headhunters for instance), around “get a job in 2 days” and successfully recruited 20 sales representatives. LinkedIn marketing is expensive, but very targeted and much cheaper than you’d otherwise pay.

#3 Right Skills

In terms of the skills for these spaces:

  • copywriter (for good storytelling)
  • visualist (graphics, photo, video)
  • community manager (to maintain appropriate contact) – the skills for that cannot be underestimated.
  • And… Something that I missed… 

You have to be like a one man band – good at everything. But then we have young people coming in with lots of those skills, and can develop them further…

So, I wanted to end on a nice story/campaign… An ad for Budweiser about not drinking and driving.

Q&A

Q1) Authenticity is the big thing right now… But do you think all that “authentic” advertising content may get old and less effective over time?

A1) People want to hear from their friends, from people like them, in their own words. Big brands want that authenticity… But they also want total control which doesn’t fit with that. The reality is probably that something between those two levels is what we need but that change will only happen as it becomes clear to big brands that their controlled content isn’t working anymore.

Q2) With that social media map… What age group was that? I didn’t see SnapChat there.

A2) I’m not sure, it was a map of dominant social media spaces…

Q3) I wanted to talk about the hierarchy of content… Written posts, visual content etc… What seemed to do best was sponsored video content that was subtitled.

A3) Facebook itself prioritises video content – it is cheaper to use this in your marketing. If you do video, yes, you have to have subtitles so that people can read rather than listen to the videos… And videos, especially “authentic video”, will be heavily prioritised by Facebook. So we are doing a lot of video work.

Introduction to ECSM 2018 – Niall Corcoran, Limerick Institute of Technology, Ireland

I wanted to start by thanking our hosts; Vilnius has been excellent this year. Next year we’ll be a bit earlier in the year – late June – and we’ll be at the Limerick Institute of Technology, Ireland. We have campuses around the region with 7000 students and 650 staff, teaching from levels 6 to 10. The nearest airport is Shannon, and we are an easy distance from Cork or Dublin airports.

In terms of social media we do research on the Social Media Interactive Learning Environment, the Limerick Interactive Storytelling Network, social media for teaching and research, and social media for cancer recovery.

In terms of Limerick itself, 80-90% of Europe’s contact lenses are manufactured there! There is a lot of manufacturing in Limerick, with many companies having their European headquarters there. So, I’ve got a short video made by one of our students to give you a sense of the town. And we hope to see you there next year!

Social Media Competition Update

The top three placed entries are: Developing Social Paleontology – Lisa Lundgren; EDINA Digital Footprint Consulting and Training Service – Nicola Osborne (yay!); Traditions Mobile App – Adam Peruta.

Stream A: Mini track on Ethical use of social media data – Chair: Dragana Calic

The Benefits and Complications of Facebook Memorials – White Michelle, University of Hawai’i at Manoa, USA

I wanted to look at who people imagine their audience is for these memorials. This came about because a death made me look at memorials, and I decided to look into them in more depth.

So, I’m using danah boyd’s definition of social networking here. We are talking Facebook, Twitter, SnapChat etc. So, a Facebook Memorial is a group that is created specifically to mark the death of a friend or family member – or of a public figure (e.g. Michael Jackson).

Robert Zebruck and Brubecker talk about the imagined audience and the flattening of realities. So, right now I can see people in the room, I can see who you are, how you react, how to modify my tone or style to meet you, to respond to you. But it is hard to do that on social media. We see context collapse. And we can be sat there alone at our computer and not have that sense of being public. Sometimes with memorials we will say things for that audience, but in other cases perhaps it is sharing memories of drinking together, or smoking weed with someone… Memories that may jar with others.

It was a long road to get to this research. My review board were concerned about emotional distress of interviewees. I agreed in the end to interview via Skype or Facebook and to check everything was ok after every question, to make it easier to see and review their state of mind. I had to wait over a year to interview people, the death had to not be by suicide, and the participants had to be over 18 years old. So I did conduct qualitative research over Skype and Facebook… And I found interviewees by looking at memorial pages that are out there – there are loads there, not all labelled as memorials.

So, my data… I began by asking who people thought they were talking to… Many hadn’t thought about it. They talked about family members, friends… Even in a very controlled group you can have trolls and haters who can get in… But often people assumed that other people were like them. A lot of people would write to the deceased – as if visiting a grave, say. I asked if they thought the person could hear or understand… But they hadn’t really thought about it; it felt like the right thing to do… And they wanted family and friends to hear from them. They felt likes, shares, etc. were validating and therapeutic, and that sense of connection was therapeutic. Some even made friends through going out drinking, or family gatherings… with friends of friends who they hadn’t met before…

This inability to really think about or understand the imagined audience led to context collapse. Usually family is in charge of these pages… And that can be challenging… For instance an up and coming football star died suddenly, and then it was evident that it was the result of a drug overdose… And that was distressing for the family, who tried to remove that content. There is an idea of alternative narratives. Fake news or alternative facts has a particular meaning right now… But we are all used to presenting ourselves in a particular way to different friends, etc. In one memorial site the deceased had owed money to a friend, and they still felt owed that money and were posting about it – like a fight at the funeral… It’s very hard to monitor ourselves and other people…

And there was fighting about who owned the person… Some claiming that someone was their best friend, fights over who was more important or who was more influenced. It happens in real life… But not quite as visibly or with all involved…

So, in conclusion… There are a lot of benefits to Facebook Memorials. Psychologists talk of the benefit of connecting, grieving, not feeling alone, getting support. Death happens. We are usually sad when it happens… Social networking sites provide another way to engage and connect. So if I’m in Lithuania and there is a funeral in Hawaii that I can’t travel to, I can still connect. It is changing our social norms, and how we connect. But we can do more to make it work better – safety and security need improving. Facebook have now added the ability to will your page to someone. And now if someone dies you can notify Twitter – it changes the account slightly, birthday reminders no longer pop up, it acts as a memorial. There are new affordances.

Personally, doing this research was very sad, and it’s not an area I want to continue looking at. It was emotionally distressing for me to do this work.

Q&A

Q1) I am old enough to remember LiveJournal and remember memorials there. They used to turn a page into a memorial, then were deleted… Do you think Facebook should sunset these memorials?

A1) I personally spoke to people who would stare at the page for a month, expecting posts… Maybe you go to a funeral, you mourn, you are sad… But that page sticking around feels like it extends that… But I bet Mark Zuckerberg has some money making plan for keeping those profiles there!

Q2) What is the motivation for such public sharing in this way?

A2) I think young people want to put it out there, to share their pain, to have it validated – “feel my pain with me”. One lady I spoke to, her boyfriend was killed in a mass shooting… Eventually she couldn’t look at it, it was all debate about gun control and she didn’t want to engage with that any more…

Q3) Why no suicides? I struggle to see why they are automatically more distressing than other upsetting deaths…

A3) I don’t know… But my review board thought it would be more distressing for people…

Q4) How do private memorials differ from celebrity memorials?

A4) I deliberately avoided celebrities, but also my IRB didn’t want me to look at any groups without permission from every member of that group…

Comment) I’ve done work with public Facebook groups, my IRB was fine with that.

A4) I think it was just this group really… But there was concern about publicly identifiable information.

Online Privacy: Present Need or Relic From the Past? – Aguirre-Jaramillo Lina Maria, Universidad Pontificia Bolivariana, Colombia

In the influential essay “The Right to Privacy” in the Harvard Law Review (1890), Warren and Brandeis defined privacy as “the right to be let alone”. But in the last ten years or so we have seen sharing of information that not long ago would have been seen and expected to be private. Earl Warren was a famous US judge, and he said “The fantastic advances in the field of electronic communication constitute a greater danger to the privacy of the individual.”

We see privacy particularly threatened by systematic data collection. Mark Zuckerberg (2010) claimed that “privacy is no longer a social norm”. This has been used as evidence of disregard toward users’ rights and data – the manner in which data is stored, changed and used, and the associated threats. But we also see counter arguments such as the American Library Association’s Privacy Revolution campaign.

So, this is the context for this work in Colombia. It is important to understand the literature in this area, particularly around data use, data combinations, and the connection between privacy concerns and behaviours online (Joinson et al 2008). And we also refer to the work of Sheehan (2002) on the characterisation of online users. Particularly we are interested in new privacy concerns and platforms, particularly Facebook. The impact of culture on online privacy has been studied by Cho, Rivera Sanchez and Lim (2009).

The State of the Internet work from the Oxford Internet Institute found that Colombia had between 40 and 60% of people online. Internet uptake is, however, lower than in e.g. the US. And in Colombia 46% of the population is 25-54 years old.

So, my study is currently online. A wider group is also engaging in personal and group interviews. Our analysis will focus on what background knowledge, risk and privacy awareness there is amongst participants; what self-efficacy level is revealed by participants – their knowledge and habits; and what interest and willingness there is to acquire more knowledge and gain more skills to manage privacy. At a later stage we will be building a prototype tool.

Our conclusions so far… Privacy is hard to define and we need to do more to define it. Privacy is not a concept articulated in one single universally accepted definition. Different groups trade off privacy differently. Relevant concepts here include background knowledge, computer literacy, privacy risk, and self-efficacy.

And finally… Privacy is still important but often ignored as important in the wider culture. Privacy is not a relic but a changing necessity…

Q&A

Q1) Did age play a role in privacy? Do young people care as much as older people?

A1) They seem to care when they hear stories of peers being bullied, or harassed, or hear stories of hacked Instagram accounts. But their idea of privacy is different. There is still information that they do not want to have public or stolen. So we are looking more at that, and also at a need to understand how they want to engage with privacy. As my colleague Nicola Osborne from Edinburgh said in her presentation yesterday, we have to remember students already come in with a long internet/social media history and presence.

Q2) I was wondering about cultural aspect… Apps used and whether privacy is important… For instance SnapChat is very exhibitionist but also ephemeral…

A2) I don’t have full answers yet but… Young people share on SnapChat and Instagram to build popularity with peers… But almost none of them are interested in Twitter… At least that’s the case in Colombia. But they do know some content on Facebook may be more vulnerable than on SnapChat and Instagram… It may be that they have the idea of SnapChat as a space they can control, perhaps…

Q3) I often feel more liberal with what I share on Facebook, than students who are 10 or 15 years younger… I would have some privacy settings but don’t think about the long story of that… From my experience students are a lot more savvy in that way… When they first come in, they are very aware of that… Don’t want a bigger footprint there…

A3) That is not exactly true in Colombia. The idea of a Digital Footprint affecting their career is not a thing in the same way… They are just becoming aware of it… But that idea of exhibitionism… I have found that most of the students in Colombia seem quite happy to share lots of selfies and images of their feet… That became a trend in other countries about three years ago… They don’t want to write much… Just to say “I’m here”… And there has been some interesting research in terms of the selfie generation and ideas of expressing yourself and showing yourself… It may be partly to do with other issues… In Colombia many young women have plastic surgery – this came out of the 1980s and 1990s… Many women, young women, have cosmetic surgery and want to share that… More on Instagram than Pinterest – Pinterest is for flowers and little girlie things…

Q4) You were talking about gender, how do privacy attitudes differ between males and females?

A4) The literature review suggests women tend to be more careful about what they publish online… They may be more careful selecting networks and where they share content… More willing to double check settings, and to delete content they might have difficulty explaining… Also more willing to discuss issues of privacy… Things may change over time… Suggestion that people will get to an age where they do care more… But we also need to see how the generation that have all of their images online, even from being a baby, will think about this… But generally seems to be slightly more concern or awareness from women…

Comment) I wanted to just follow up the Facebook comment and say that I think it may not be age but experience of prior use that may shape different habits there… Students typically arrive at our university with hundreds of friends having used Facebook since school, and so they see that page as a very public space – in our research some students commented specifically on that and their changing use and filtering back of Facebook contacts… For a lot of academics and mid career professionals Facebook is quite a private social space, Twitter plays more that public role. But it’s not age per se perhaps, it’s that baggage and experience.

Constructing Malleable Truth: Memes from the 2016 U.S. Presidential Campaign – Wiggins Bradley, Webster University, Vienna, Austria

Now, when I wrote this… Trump was “a candidate”. Then he was nominee. Then president elect… And now President. And that’s been… surprising… So that’s the context.

I look at various aspects in my research, including internet memes. So, in 2008 Obama’s campaign was great at using social media, at getting people out there and sharing and campaigning for them on a voluntary and enthusiastic basis. 2016 was the meme election, I think. Now people researching memes feel they must refer to Richard Dawkins talking about memes. He meant ideas… That’s not the same as internet memes… So what are the differences between Dawkins’ memes and internet memes? Well honestly they are totally different EXCEPT that both require attention, and have to be reproducible…

Mikhail Bakhtin wrote about the Carnivalesque as something that subverts the dominant mode or perspective, it turns the world on its head… The king becomes the jester and the jester becomes the king. So the Trump tie memes… We need no text here, the absurd is made more absurd. It is very critical. It has that circus level laugh… He’s a clown or a buffoon… You know about it and how to reproduce this.

In terms of literature… There is work on memes, but I think when understanding memes with millennials, but also baby boomers, even people in their 70s and 80s… We have to go back to major theorists, concepts and perspectives – Henry Jenkins, Erving Goffman, etc. This is a new mode of communication I think; not a new language, but a new mode.

So method wise… I wanted to do a rhetorical-critical analysis of selected internet memes from the Facebook page Bernie Sanders’ Dank Meme Stash, which had over 420k members when I wrote this slide – more now. It was founded by a college student in October 2015. And there are hundreds of thousands of memes there. People create and curate them.

Two months before and one month after the US Election I took two sets of samples… Memes that received 1000 or more likes/retweets, and memes that received at least 500 likes/reactions and at least 100 shares. As an unexpected side note I found that I needed to define “media narrative”. There doesn’t seem to be a good definition. I spoke to Brooke Gladstone of WNYC, I spoke with colleagues in Vienna… We don’t usually take time to think about media narrative… For instance the shooting at Pulse Nightclub has a narrative on the right around gun control; for others it’s around it being a sad and horrible event…

So, media narrative I am defining as:

  1. Malleable depending upon the ability to ask critical questions
  2. Able to shape opinion as well as perceptions of reality and a person’s decision-making process and…
  3. Linguistic and image-based simulations of real-world events which adhere and/or appeal to ontologically specific perspectives, which may include any intentional use of stereotyping, ideology, allegory, etc.

Some findings… The relational roles between image and text are interchangeable because of the relationship to popular culture. Barthes (1977) talks about the text loading the image, burdening it with culture, a moral, an imagination. And therefore the text in internet memes fluctuates depending on the intended message and the dependence on popular culture.

So, for instance we have an image from Nightmare at 20,000 ft, a classic Twilight Zone image… You need to know nothing here and if I replace a monster face with Donald Trump’s face… It’s instantly accessible and funny. But you can put any image there depending on the directionality of the intended meaning. So you have the idea of the mytheme or function of the monster/devil/etc. can be replaced by any other monster… It doesn’t matter, the reaction will depend on your audience.

Back to Barthes (1977) again – I find him incredibly salient to the work I’ve done here. One thing emerging from this, and from earlier work on Russian memes, is the idea of polysemic directionality. A meme has one direction and intentionality… no matter what version of the image you use…

So, here’s a quick clip of The Silence of the Lambs. And here is Buffalo Bill, who kills women and skins them… A very scary character… We have him as a disturbing advisor in memes. If you get that reference it has more weight, but you don’t need to know the reference.

We have the image of Hillary as Two Face, we have Donald as The Joker… And a poster saying “Choose”. The vitriol directed at Clinton was far worse than that at Trump… Perhaps because Sanders supporters were disappointed at not getting the nomination.

We have intertextuality; we also have inter-memetic references… For example the Hillary-deletes-electoral-colleges meme, which plays on the grandma-on-the-internet memes… You also have Superman vs Trump – particularly relevant to immigrant populations (Jenkins 2010).

So, conclusions… The construction of a meme is affected by and dependent on the media around it… That is crucial… We have heard about fake news, and we see memes in support of that fake news… And you may see that on all sides here. Intertextual references rely on popular culture, and inter-memetic references assume knowledge – a new form of communication. And I would argue that memes are a digital myth – I think Lévi-Strauss might agree with me on that…

And to close, for your viewing pleasure, the Trump Executive Order meme… The idea of a meme, an idea that can be infinitely replaced with anything really…

Q&A

Q1) This new sphere of memes… Do you think that Trump represents a new era of presidency… Do you think that this will pass? With Trump posting to his own Twitter account…

A1) I think that it will get more intense… And offline too… We see stickers in Austrian elections around meme-like images… These are tools for millennials. They are hugely popular in Turkey… Governments in Turkey, Iran and China are using memes as propaganda against other parties… I’m not sure it’s new but we are certainly more aware of it… Trump is a reality TV star with the nuclear keys… That should scare us… But memes won’t go away…

Q2) In terms of memes in real life… What about bumper stickers… ? They were huge before… They are kind of IRL memes…

A2) I am working on a book at the moment… And one of the chapters is on pre-digital memes. In WWII soldiers used to write “Kilroy was here”. Is Magritte’s Ceci n’est pas une pipe a meme? There is definitely a legacy of that… So yes, but it depends on national and regional context…

Q3) So… In Egypt we saw memes about Trump… We were surprised at the election outcome… What happened?

A3) Firstly, there is that bias of the reinforcing narrative… If you looked at the Sanders meme page you might have had the idea that Clinton would not win because, for whatever reason, these people hated Hillary. Real rage and hatred towards her… And Trump as clown Hitler… Won’t happen… Then it did… Then rage against him went up… After the Muslim ban, the women’s march etc…

Q4) There are some memes that seem to be everywhere – Charlie and the Chocolate Factory, Sean Bean, etc… Why are we picking those specific particular memes of all things?

A4) Like the Picard WTF meme… KnowYourMeme.com is a great resource… In the scene that Picard image is from, he’s reciting Shakespeare to get Lwaxana Troi away from the aliens… It doesn’t matter… But it just fits, it has a meaning.

Q5) Gender and memes: I wondered about the aspect of gender in memes, particularly thinking about Clinton – many of those reminded me of the Mary Beard memes and trolling… There are trolling memes – the frog for Trump… the semi-pornographic memes against women… Is there more to that than just (with all her baggage) Clinton herself?

A5) Lisa Silvestri from Gonzaga, Washington State and Limor Shifman in Tel Aviv do work in that area. Shifman looks at online jokes of all types and has done some work on gender.

Q6) Who makes memes? Why?

A6) I taught a course on internet memes and cultures. That was one of the best attended courses ever. My students concluded that the author didn’t matter… But look at 4Chan and Reddit or Know Your Meme… And you can tell who created it… But does that matter? It’s almost a public good. Who cares who created the Trump tie meme? With United Airlines you can see how that video turned into a meme… and the company lost millions in stock value.

Stream B: Mini track on Enterprise Social Media – Chair: Paul Alpar

The Role of Social Media in Crowdfunding – Makina Daniel, University of South Africa, Pretoria, South Africa

My work seeks to find the connection between social media and finance, specifically crowdfunding. The paper introduces the phenomenon of crowdfunding, and how the theory of social networking underpins social media. The theory around social media is still developing… It is underpinned by theories of information systems and technology adoption, though social media have their own distinct characteristics.

So, a definition of crowdfunding. Crowdfunding is essentially an aspect of crowdsourcing, spurred by ubiquitous web 2.0 technologies. And “Crowdfunding refers to the efforts of entrepreneurial individuals and groups – cultural, social and for-profit – to fund their ventures by drawing on relatively small contributions from a relatively large number of individuals using the internet, without standard financial intermediaries” (Mollick 2014).

Since 2010 there have been growing amounts of money raised globally through crowdfunding. Forbes estimates $34 billion in 2015 (compared to $16 billion in 2014, and $880 million in 2010). The World Bank estimates that crowdfunding will raise $93 billion annually by 2025. This growth could not have been achieved in the absence of internet technology, and social media are critical in promoting this form of alternative finance.

Cheung and Lee (2010) examined social influence processes in determining collective social action in the context of online social networks. Their model shows intentional social action, with users considering themselves part of the social fabric. And they explain three processes of social influence: subjective norm – the self outside of any group; group norm – self-awareness as a member of a group; and social identity – the self in context. Other authors explain social media popularity through a lack of trust in traditional media, with people wary of information that emanates from people they do not know personally. Kaplan and Haenlein (2010) define social media as “a group of internet-based applications that build on the ideological and technological foundations of web 2.0 applications that allow the creation and exchange of user generated content”. So it is a form of online interaction that enables people to create, comment, share and exchange content with other people.

So, how does social media facilitate finance, or crowdfunding? Since social media assists in maintaining social ties, this should in turn aid crowdfunding campaigns. Here we can draw on Linus’s Law: “given enough eyeballs, all bugs are shallow”. Large groups are more adept at detecting potential flaws in a campaign than individuals (alone), thus preventing fraudulent campaigns from raising money for crowdfunding projects. Facebook, Twitter, etc. provide spaces for sharing and connection, and are therefore suitable for crowdfunding campaigns. Studies have shown that 51% of Facebook users are more likely to buy a product after becoming a fan of the product’s Facebook page (Knudsen 2015).

Brossman (2015) views crowdfunding as existing in two phases: (i) brand awareness and (ii) targeting people to support/back one’s campaign. And crowdfunding sites such as Kickstarter and IndieGoGo allow project creators to publish pertinent information and updates, as well as to link to social media. Those connections also help deal with a relative lack of social networking functionality within the platforms themselves, where creators are able to publish project descriptions, and the platforms have a community of users and utilise web 2.0 technologies that allow users to comment on projects and attract money.

A study by Moisseyev (2013) on 100 Kickstarter projects found a connection between social media approval and success in funding. Mollick (2014) observed that crowdfunding success is associated with having a large number of friends in online social networks: a founder with ten Facebook friends would have a 9% chance of succeeding; one with 100 friends would have a 20% chance of success; one with 1000 friends would have a 40% chance of success. He cited a film industry example where more friends mapped to much higher potential success rates.
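As a back-of-envelope aside, those three reported rates are consistent with the success probability rising roughly linearly in the logarithm of the friend count, i.e. a logistic-style relationship. The sketch below is my own check of that pattern, not Mollick’s actual model (which controls for many more factors):

```python
import math

# Mollick's (2014) reported success rates by Facebook friend count.
reported = {10: 0.09, 100: 0.20, 1000: 0.40}

# If success were logistic in log10(friends), the log-odds would rise by a
# constant amount per tenfold increase in friends. Check:
for friends, p in reported.items():
    log_odds = math.log(p / (1 - p))
    print(f"{friends:>4} friends: log-odds = {log_odds:+.2f}")

# Prints -2.31, -1.39, -0.41: steps of roughly +0.9 to +1.0 per decade of
# friends, so the reported figures are close to logistic in log10(friends).
```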

So, in conclusion, we don’t have many studies in this area yet. But social media is observed to aid crowdfunding campaigns through its ability to network disparate people through the internet. One notable feature is that although there are many forms of social media, crowdfunding utilises a limited number of spaces, primarily Facebook and Twitter. Future research should examine how the expertise of the creator (the requestor of funds), project type, social network, and online presence influence motivations.

Q&A

Q1) I was wondering if you see any connection between the types of people who back crowdfunding campaigns, and why particular patterns of social media use, or popularity are being found. For instance anecdotally the people who back lots of crowdfunding campaigns – not just one off – tend to be young men in their 30s and 40s. So I was wondering about that profile of backers and what that looks like… And if that profile of backer is part of what makes those social media approaches work.

A1) The majority of people using social media are young people… But think about young people as sources of finance for, say, small businesses… They are mostly either studying or starting their professional careers… Not accumulating money to give out… So we see a disconnect between who is on social media – on Twitter, Facebook, etc. – and who could provide finance… You end up raising funding from people who cannot give much individually… If more people in mid career were using social media, one would expect more money coming through crowdfunding… But one aspect of crowdfunding is that you are asking for small amounts… And young people are able to spare that much…

Q2) So most people giving funding on crowdfunding sites are young people, and they give small amounts…

A2) Yes… And that data from Mollick… combined with evidence of people who are using Facebook…

Q2) What about other specialised crowdfunding networks… ?

A2) There is more work to be done. But even small crowdfunding networks will connect to supporters through social media…

Q3) Have you looked at the relative offerings of the crowdfunding campaigns?

A3) Yes, technology products are more successful on these platforms than other projects…

Using Enterprise Social Networks to Support Staff Knowledge Sharing in Higher Education – Corcoran Niall, Limerick Institute of Technology, Ireland and Aidan Duane, Waterford Institute of Technology, Ireland

This work is rooted in knowledge management; that is the basis for the whole study. So I wanted to start with Ikujiro Nonaka: “in an economy where the only certainty is uncertainty…” And Lew Platt, former CEO of Hewlett-Packard, said “If HP knew what HP knows it would be three times more productive” – highlighting the crucial role of knowledge sharing.

Organisations can gain competitive advantage through encouraging and promoting knowledge sharing – that’s the theory at least. It’s very important in knowledge-intensive organisations, such as public HEIs. HEIs need to compete in a global market place… We need to share knowledge… Do we do this?

And I want to think about this in the context of social media. We know that social media enable the creation, sharing or exchange of information, ideas and media in virtual communities and networks. And organisational applications are close to some of the ideals of knowledge management: supporting group interaction towards establishing communities; enabling creation and sharing of content; helping improve collaboration and communication within organisations; offering distinct technological features that are ideally suited to knowledge sharing; fundamentally disrupting knowledge management; and reinvigorating knowledge management as a field.

We do see Enterprise Social Networks (ESN). But if you just bring one into an organisation, people don’t necessarily just go and use it. People need a reason to share. So another aspect is communities of practice (Lave and Wenger 1991) – an important knowledge management strategy, increasingly used. This is about groups of people who share a passion for something – loose and informal social structures, largely voluntary, and about sharing tacit knowledge. So Communities of Practice (CoP) tend to meet from time to time – in person or virtually.

ESN can be used to create virtual communities. This is particularly suitable for distributed communities – our university has multiple campuses for instance.

So, knowledge sharing in HEIs… Well many don’t do it. A number of studies have shown that KM implementation and knowledge sharing in HEIs are at a low level. Why? Organisational culture, organisational structures, bureaucratic characteristics. And there is a well documented divide/mistrust between faculty and support staff (silos) – particularly in work from Australia, the US and the UK. So, can CoP and ESN help? Well in theory they can bypass structures that reinforce silos. That’s an ideal picture; whether we get there is a different thing.

So our research looked at what the antecedents for staff knowledge sharing are, and what the dominant problems in the implementation of ESN and CoP are. The contextual setting here is Limerick Institute of Technology. I used to work in IT services and this work came significantly from that interest. There is a significant practical aspect to the research, so action research seemed like the most appropriate approach.

So we had a three cycle action research project. We looked at Yammer. It has all the features of social networking you’d expect – conversations can be engaged in, tagged and shared, and content can be uploaded. It lends itself well to setting up virtual communities, with very flexible and powerful tools for them. We started from scratch and grew to 209 users.
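For anyone curious what working with Yammer programmatically looks like, here is a minimal sketch of posting a message into a group via Yammer’s REST API (v1). The endpoint and parameter names are from the public API; the token and group ID are placeholders you would obtain from your own network, and this is not something described in the talk itself:

```python
import requests

# Placeholders: a real token comes from Yammer's OAuth flow, and the group ID
# from a group in your own Yammer network.
ACCESS_TOKEN = "your-oauth-token"
GROUP_ID = 123456

# Post a message into a Yammer group (REST API v1).
resp = requests.post(
    "https://www.yammer.com/api/v1/messages.json",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    data={
        "body": "Welcome to the Office 365 community of practice!",
        "group_id": GROUP_ID,
    },
)
resp.raise_for_status()
print(resp.json()["messages"][0]["id"])  # ID of the newly created message
```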

Some key findings… We found culture and structure are major barriers to staff knowledge sharing. We theorised that and found it to be absolutely the case. The faculty-staff divide in HEIs exacerbates the problem. Management have an important role to play in shaping a knowledge sharing environment. The existence of CoP is essential to building a successful knowledge sharing environment, and community leaders and champions are required for the ESN. Motivation to participate is also crucial. If people feel motivated, and they see benefit, that can be very helpful. And those benefits can potentially lead to culture change, which then affects motivation…

We found that our organisation has a strong hierarchical model – very bureaucratic and rigid. Geographic dispersal doesn’t help. To fix this we need to move away from a transactional culture. The current organisational structure contributes to the faculty-staff divide, and limits opportunities and motivations for staff and faculty to work together. But we also found examples where they were working well together. And in terms of the role of management, they have significant importance, and have to be involved to make this work.

Virtual communities as a Knowledge Management strategy have the potential to improve collaboration and interaction between staff, but they have to be seen as valued, relevant, a valid work activity. Staff motivation wise, there are some highly motivated people, but not all. Management have to understand that.

So management need to understand the organisational culture; recognise the existence of structural and cultural problems; etc. Some of the challenges here are the public sector hierarchical structures – public accountability, media scrutiny, the transactional culture, etc.

Q&A

Q1) On a technical level, which tools are most promising for tacit knowledge sharing…

A1) The whole ability to have a conversation. Email doesn’t work for that – you can’t branch threads… That is a distinctive feature of Yammer groups, as is the ability to like/view/be an onlooker in a conversation. We encourage people to like something if they read it, to show that it is useful. But the key is the ability to have a proper conversation, and to organise meetings and conversations in real time.

Q2) What kind of things are they sharing?

A2) We’ve seen some communities that are large, they have a real sense of collaboration. We’re had research coming out of that, some really positive outcomes.

Q3) Have you seen any evidence of use in different countries… What are barriers across different regions, if known?

A3) I think the barriers are similar to the conceptual model (in the proceedings) – both personal and organisational barriers… People are largely afraid to share stuff… They are nervous of being judged… Also that engagement on this platform might make managers think that they are not working. Age is a limiting factor – economic issues mean we haven’t recruited new staff for almost 10 years, so we are older as a staff group.

Q3) It might be interesting to compare different cultures, with Asian culture more closed I think…

A3) Yes, that would be really interesting to do…

Q4) I am trying to think how and what I might share with my colleagues in professional services, technical staff, etc.

A4) The way this is constructed is in communities… We have staff interested in using Office 365 and Classroom Notebook, and so we set up a group to discuss that. We have champions who lead that group and guide it. So what is posted there would be quite specific… But in Yammer you can also share to all… We monitor posts and also train our users in how and where to post… You can sign up for groups or create new groups… And it is moderated. But it is not limited to specifically work related groups – sports and social groups are there too. And that helps grow the user base and helps people see the benefits.

Q5) Have you looked at Slack at all? Or done any comparison there?

A5) We chose Yammer because of price… We have it in O365, very practical reason for that… We have looked at Slack but no direct comparison.

Finalists in the Social Media in Practice Excellence Competition present their Case Histories

EDINA Digital Footprint Consulting and Training Service – Nicola Osborne

No notes for this one…

Developing Social Paleontology – Lisa Lundgren

This is work with a software development company, funded by the National Science Foundation. And this was a project to develop a community of practice around paleontology… People often think “dinosaur” but actually it’s about a much wider set of research and studies of fossils. For our fossil project to meet its goal, to develop and support that community, we needed to use social media. So we have a My Fossil community, which is closed to the community, but also a Facebook group and Twitter presence. We wanted to use social media in an educative way to engage the community with our work.

We began with design studies which looked at what basic elements contribute to engagement with social media, and how to engage. We were able to assess practical contributions and build an educative and evidence-based social media plan. So we wanted to: create daily posts around social paleontology, e.g. #TrilobiteTuesday; design branded image-focused posts that are practice-specific, meet design principles, and often hyperlink to vetted paleontological websites; and respond to members in ways that encourage chains of communication. There is a theoretical contribution here as well. And we think there are further opportunities to engage more with social paleontology and we are keen for feedback and further discussion. So, I’m here to chat!

 

Traditions Mobile App – Adam Peruta

When new university students come to campus they have lots of concerns, like: what is this place, where do I fit in, how can I make new friends? That is particularly the case at small universities, which want to ensure students feel part of the community, and want them to stay around. This is where the Traditions Challenge app comes in – it provides challenges and activities to engage new students in university traditions and features. This was trialled at Ithaca College. So, for instance, we encourage students to go along to events, meet other new students, etc. We encourage students to meet their academic advisors outside of the classroom. To explore notable campus features. And to explore the local community more – like the farmers market. So we have a social feed – you can like, comment, there is an event calendar, a history of the school, etc. And the whole process is gamified: you gain points through challenges, and you can go on the leaderboard, so there are incentives to gain status… And there are prizes too.
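To make those gamification mechanics concrete, here is a minimal sketch of a points-and-leaderboard model of the kind described: students earn points per completed challenge and are ranked by their totals. The data model and point values are my own guesses for illustration, not Peruta’s actual implementation:

```python
from collections import defaultdict

# Hypothetical point values per challenge (illustrative only).
CHALLENGE_POINTS = {
    "attend-orientation-event": 10,
    "meet-academic-advisor": 15,
    "visit-farmers-market": 5,
}

# student -> set of completed challenges; a set makes completion idempotent,
# so a challenge cannot be claimed twice for double points.
completions = defaultdict(set)

def complete(student, challenge):
    """Record that a student completed a challenge."""
    completions[student].add(challenge)

def leaderboard():
    """Rank students by total points, highest first."""
    totals = {
        student: sum(CHALLENGE_POINTS[c] for c in done)
        for student, done in completions.items()
    }
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

complete("ana", "attend-orientation-event")
complete("ana", "meet-academic-advisor")
complete("ben", "visit-farmers-market")
print(leaderboard())  # [('ana', 25), ('ben', 5)]
```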

Looking at the results this year… We had about 200 students who collectively completed over 1400 challenges; the person who completed the most (and won a shirt) completed 53 challenges. There are about 100 challenges in the app so it’s good they weren’t all done in one year. And we see over 50k screen views, so we know that the app is getting attention whether or not people engage in the challenges. Student focus groups raised themes of: enjoyment of the challenge list; motivation for participation (which varied); app design and user experience – if there’s one key takeaway, it is that this demographic has really high expectations for user interface, design and tone; and contribution to identity… There is lots of academic research showing that the more students are engaged on campus, the more likely they are to remain at that university and stay engaged through their studies and as alumni. So there is loads of potential here, and opportunity to do more with the data.

So, the digital experience is preferred, mobile development is expensive and time consuming, good UI/UX is imperative to success, universities are good at protecting their brands, and we learned that students really want to augment their on-campus academic experiences.

Conference organiser: Those were the finalists from yesterday, so we will award the prizes for first, second and third… and the PhD prize…

Third place is Lisa; Second place is me (yay!); First place is Adam and the Traditions mobile app.

I’m going to rely on others to tweet the PhD winners…

The best poster award went to IT Alignment through Artificial Intelligence – Amir – this was partly based on Amir’s performance: his poster went missing, so he had to present from an A4 version of the poster, and he did a great job of presenting.

Thank you to our hosts here… And we hope you can join us in Limerick next year!

Thanks to all at ECSM 2017.

Apr 09 2017
 
Digital Footprint MOOC logo

Last Monday we launched the new Digital Footprint MOOC, a free three week online course (running on Coursera) led by myself and Louise Connelly (Royal (Dick) School of Veterinary Studies). The course builds upon our work on the Managing Your Digital Footprints research project, campaign and also draws on some of the work I’ve been doing in piloting a Digital Footprint training and consultancy service at EDINA.

It has been a really interesting and demanding process working with the University of Edinburgh MOOCs team to create this course, particularly focusing in on the most essential parts of our Digital Footprints work. Our intention for this MOOC is to provide an introduction to the issues and equip participants with appropriate skills and understanding to manage their own digital tracks and traces. Most of all we wanted to provide a space for reflection and for participants to think deeply about what their digital footprint means to them and how they want to manage it in the future. We don’t have a prescriptive stance – Louise and I manage our own digital footprints quite differently but both of us see huge value in public online presence – but we do think that understanding and considering your online presence and the meaning of the traces you leave behind online is an essential modern life skill and want to contribute something to that wider understanding and debate.

Since MOOCs – Massive Open Online Courses – are courses which people tend to take in their own time for pleasure and interest, but also as part of their CPD and personal development, the format seemed like a good fit for digital footprint skills and reflection, along with some of the theory and emerging trends from our research work. We also think the course has potential to be used in supporting digital literacy programmes and activities, and those looking for skills for transitioning into and out of education, and in developing their careers. On that note we were delighted to see the All Aboard: Digital Skills in Higher Education 2017 event programme running last week – their website, created to support digital skills in Ireland, is a great complementary resource to our course, to which we made a (small) contribution during its development phase.

Over the last week it has been wonderful to see our participants engaging with the Digital Footprint course, sharing their reflections on the #DFMOOC hashtag, and really starting to think about what their digital footprint means for them. From the discussion so far the concept of the “Uncontainable Self” (Barbour & Marshall 2012) seems to have struck a particular chord for many of our participants, which is perhaps not surprising given the degree to which our digital tracks and traces can propagate through others posts, tags, listings, etc. whether or not we are sharing content ourselves.

When we were building the MOOC we were keen to reflect the fact that our own work sits in a context of, and benefits from, the work of many researchers and social media experts both in our own local context and the wider field. We were delighted to be able to include guest contributors including Karen Gregory (University of Edinburgh), Rachel Buchanan (University of Newcastle, Australia), Lilian Edwards (Strathclyde University), Ben Marder (University of Edinburgh), and David Brake (author of Sharing Our Lives Online).

The usefulness of making these connections across disciplines and across the wider debate on digital identity seems particularly pertinent given recent developments that emphasise how fast things are changing around us, and how our own agency in managing our digital footprints and digital identities is being challenged by policy, commercial and social factors. Those notable recent developments include…

On 28th March the US Government voted to remove restrictions on the sale of data by ISPs (Internet Service Providers), potentially allowing them to sell an incredibly rich picture of browsing, search, behavioural and intimate details without further consultation (you can read the full measure here). This came as the UK Government mooted the banning of encryption technologies – essential for private messaging, financial transactions, access management and authentication – claiming that terror threats justified such a wide ranging loss of privacy. Whilst that does not seem likely to come to fruition given the economic and practical implications of such a measure, we do already have the Investigatory Powers Act 2016 in place which requires web and communications companies to retain full records of activity for 12 months and allows police and security forces significant powers to access and collect personal communications data and records in bulk.

On 30th March, a group of influential privacy researchers, including danah boyd and Kate Crawford, published Ten simple rules for responsible big data research in PLoSOne. The article/manifesto is an accessible and well argued guide to the core issues in responsible big data research. In many ways it summarises the core issues highlighted in the excellent (but much more academic and comprehensive) AoIR ethics guidance. The PLoSOne article is notably directed to academia as well as industry and government, since big data research is at least as much a part of commercial activity (particularly social media and data-driven start-ups; see e.g. Uber’s recent attention for profiling and manipulating drivers) as traditional academic research contexts. Whilst academic research does usually build ethical approval processes (albeit conducted with varying degrees of digital savvy) and peer review into research processes, industry is not typically structured in that way, and is often not held to the same standards, particularly around privacy and boundary crossing (see, e.g., Michael Zimmer’s work on both academic and commercial use of Facebook data).

The Ten simple rules… are also particularly timely given the current discussion of Cambridge Analytica and its role in the 2016 US Election and the UK’s EU Referendum. An article published in Das Magazin in December 2016, and a subsequent English language version published on Vice’s Motherboard, have been widely circulated on social media over recent weeks. These articles suggest that the company’s large scale psychometric analysis of social media data essentially handed victory to Trump and the Leave/Brexit campaigns, which naturally raises personal data and privacy concerns as well as influence, regulation and governance issues. There remains some skepticism about just how influential this work was… I tend to agree with Aleks Krotoski (social psychologist and host of BBC’s The Digital Human) who – speaking with Pat Kane at an Edinburgh Science Festival event last night on digital identity and authenticity – commented that she thought the Cambridge Analytica work was probably a mix of significant hyperbole and some genuine impact.

These developments focus attention on access, use and reuse of personal data and personal tracks and traces, and that is something we hope our MOOC participants will have the opportunity to pause and reflect on as they think about what they leave behind online when they share, tag and delete, and particularly when they consider terms and conditions, privacy settings, and how they curate what is available and to whom.

So, the Digital Footprint course is launched and open to anyone in the world to join for free (although Coursera will also prompt you with the – very optional – possibility of paying a small fee for a certificate), and we are just starting to get a sense of how our videos and content are being received. We’ll be sharing more highlights from the course, retweeting interesting comments, etc. throughout this run (which began on Monday 3rd April), but also future runs since this is an “on demand” MOOC which will run regularly every four weeks. If you do decide to take a look then I would love to hear your comments and feedback – join the conversation on #DFMOOC, or leave a comment here or email me.

And if you’d like to find out more about our digital footprint consultancy, or would be interested in working with the digital footprints research team on future work, do also get in touch. Although I’ve been working in this space for a while this whole area of privacy, identity and our social spaces seems to continue to grow in interest, relevance, and importance in our day to day (digital) lives.

 

Oct 072016
 

PS-15: Divides (Chair: Christoph Lutz)

The Empowered Refugee: The Smartphone as a Tool of Resistance on the Journey to Europe – Katja Kaufmann

For those of you from other continents: we had a great many refugees coming to Europe last year, from Turkey, Syria, etc., travelling to Germany and Sweden – and Vienna, where I am from, was also a hub. Some of these refugees had smartphones, and there was coverage in the (right wing) press criticising this group’s ownership of devices, but it was not clear how many had smartphones or how they were being used – and that’s what I wanted to look at.

So we undertook interviews with refugees to see if and how they used them. We were researching empowerment by mobile phones, following Svensson and Wamala Larsson (2015) on the role of the mobile phone in transforming the capabilities of users, and with reference to N. Kabeer (1999), A. Sen (1999), etc. on meanings of empowerment in these contexts. Smith, Spence and Rashid (2011) describe mobiles and their networks as altering users’ capability sets, and phones as increasing access to flows of information (Castells 2012).

So, I wanted to identify how smartphones were empowering refugees through: gaining an advantage in knowledge from the experiences of other refugees; sensory information; cross-checking information; and capabilities to oppose the actions of others.

In terms of an advantage in knowledge, refugees described gaining knowledge from previous refugees: reports, routes, maps, administrative processes, warnings, etc. This was through social networks, and Facebook groups in particular. So, a male refugee (age 22) described learning which people smugglers cannot be trusted, and which can. And another (same age) felt that smartphones were essential to being able to get to Europe – because you can find information, plan, check, etc.

So, there was retrospective knowledge here, but also engagement with others during their refugee experience, and with those ahead on their journey. This was mainly in WhatsApp. So a male refugee (aged 24) described being in Macedonia and speaking to refugees in Serbia, finding out the situation there. This was particularly important last year when approaches were changing and border access changed on an hour-by-hour basis.

In terms of Applying Sensory Abilities, this particularly manifested in identifying one’s own GPS position – whilst crossing the Aegean, or in the woods – finding the road with GPS, or identifying routes and maps. They also used GPS to find other refugees – friends, family members… Using location-based services was also very important as they could share data elsewhere – sending a GPS location to family members in Sweden, for instance.

In terms of Cross-checking information and actions, refugees were able to track routes whilst in the hands of smugglers. A male Syrian refugee (aged 30) checked information every day whilst with people smugglers, to make sure that they were being taken in the right direction – he wanted to head west. But it wasn’t just routes: it was also weather conditions and rumours – cross-checking weather conditions before entering a boat. A female Syrian refugee downloaded an app to check conditions, to ensure her smuggler was honest and her trip would be safer.

In terms of opposing the actions of others, this was about being capable of opposing orders of authorities, potential acts of (police) violence, risks, fraud attempts, etc. Also disobedience by knowledge – the Greek government gave orders about the borders, but smartphones allowed annotated map sharing so that orders could be disobeyed. And access to timely information – exchange rates for example – one refugee described negotiating the price of changing money down by Google searching for the rate. And opposition was also about a means to apply pressure – threatening with, or publishing, photos. A male refugee (aged 25) described holding up phones to threaten to document police violence, and that was impactful. Some refugees also took pictures of people smugglers as a form of personal protection and information exchange, with publication of images held as a threat in case of mistreatment.

So, in summary, the smartphones…

Q&A

Q1) Did you have any examples of privacy concerns in your interviews, or was this a concern for later perhaps?

A1) Some mentioned this; some felt some apps and spaces are more scrutinised than others. There was concern that others may have been identified through Facebook – a feeling rather than proof. One said that she did not send her parents any pictures in case she was mistaken by the Syrian government for a fighter. But mostly privacy wasn’t an immediate concern, access to information was – and it was very successful.

Q2) I saw two women in the data here, were there gender differences?

A2) We tried to get more women but there were difficulties there. On the journey they were using smartphones in similar ways – but I did talk to them and they described differences in use before their journey and talked about picture taking and sharing, the hijab effect, etc.

Social media, participation, peer pressure, and the European refugee crisis: a force awakens? – Nils Gustafsson, Lund University, Sweden

My paper is about receiving/host nations. Sweden took in 160,000 refugees during the crisis in 2015. I wanted to look at this as it was a strange time to live in. A lot of people started coming in late summer and early autumn… Numbers were rising. At first the response in host populations in Germany, Austria and Sweden was quite enthusiastic and welcoming. But as it became more difficult to cope with larger groups of people, there were changes, and organising to address the challenge.

And the organising will remind you of Alexander (??) on the “logic of collective action” – where groups organise around shared ideas that can be joined, almost a brand, e.g. “refugees welcome”. And there were strange collaborations between government, NGOs, and then these ad hoc networks. But there was also a boom and bust aspect here… In Sweden there were statements about opening hearts, of not shutting borders… But people kept coming through autumn and winter… By December Denmark, Sweden, etc. did a 180 degree turn, closing borders. There were border controls between Denmark and Sweden for the first time in 60 years. And that shift had popular support. And I was intrigued by this. This work is all part of a longer 3 year project on young people in Sweden and their political engagement – how they choose to engage, how they respond to each other. We draw on Bennett & Segerberg (2013), social participation, social psychology, and the notion of “latent participation” – where people are waiting to engage and just need asking to mobilise.

So, this is work in progress and I don’t know where it will go… But I’ll share what I have so far. And I tried to focus on recruitment – I am interested in when young people are recruited into action by their peers. I am interested in peer pressure here – friends encouraging behaviours, particularly important given that we develop values as young people that have lasting impacts. But also information sharing through young people’s networks…

So, as part of the larger project, we have a survey, and we added some specific questions about the refugee crisis to it. We asked, “You remember the refugee crisis – did you discuss it with your friends?” – 93.5% had, which is not surprising as it was a major issue. When we asked if they had discussed it on social media it was around 33.3% – much lower, perhaps due to the controversy of the subject matter; this number was also similar for the 16-25 year old age group.

We also asked whether they did “work” around the refugee crisis – volunteering or work for NGOs and traditional organisations. Around 13.8% had. We also asked about work with non-traditional organisations and 26% said that they had (in the 16-25 age group it was 29.6%), which seems high – but we have nothing to compare this to.

Colleagues and I looked at open Facebook refugee groups in Sweden – I scraped these (n=67) and coded them as either set up by NGOs, churches, mosques and other traditional organisations, or as networks… Looking across the autumn and winter of 2015, posting to the traditional groups looked consistent over time, but there was a major spike in the networks around the crisis.
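(Editor’s sketch: the scraping pipeline isn’t described in the talk, but the aggregation behind that traditional-vs-network comparison might look something like the following – the data and column names here are invented for illustration.)

```python
import pandas as pd

# Hypothetical scraped data: one row per post to an open Facebook group,
# with the group's coded type ("traditional" organisation vs ad hoc "network").
posts = pd.DataFrame({
    "group_type": ["traditional", "network", "network", "traditional"],
    "created": pd.to_datetime(["2015-09-01", "2015-09-02",
                               "2015-09-02", "2015-10-15"]),
})

# Weekly post volume per group type: the talk reports roughly flat lines
# for the traditional groups and a crisis spike for the networks.
weekly = (posts.set_index("created")
               .groupby("group_type")
               .resample("W")
               .size())
print(weekly)
```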

We have also been conducting interviews in Malmö, with 16-19 and 19-25 year olds. They commented on media coverage, and the degree to which the media influences them, even with social media. Many commented on volunteering at the central station, receiving refugees. Some felt it was inspiring to share stories, but others talked about their peers doing it as part of peer pressure, with critical comments about “bragging” in Facebook posts. Then as the mood changed, the young people talked about going to the central station being less inviting, about fewer Facebook posts… about feeling that “maybe it’s ok then”. One of our participants was from a refugee background and…

Q&A

Q1) I think you should focus on where interest drops off – there is a real lack of research there. But on the discussion question, I wasn’t surprised that only 30% discussed the crisis there really.

A1) I wasn’t too surprised either here as people tend to be happier to let others engage in the discussion, and to stand back from posting on social media themselves on these sorts of issues.

Q2) I am from Finland, and we also helped in the crisis, but I am intrigued at the degree of public turnaround as it hasn’t shifted like that in Finland.

A2) Yeah, I don’t know… The middleground changed. Maybe something Swedish about it… But also perhaps to do with the numbers…

Q2) I wonder… There was already a strong anti-immigrant movement from 2008, I wonder if it didn’t shift in the same way.

A2) Yes, I think that probably is fair, but I think how the Finnish media treated the crisis would also have played a role here too.

An interrupted history of digital divides – Bianca Christin Reisdorf, Whisnu Triwibowo, Michael Nelson, William Dutton, Michigan State University, United States of America

I am going to switch gears a bit with some more theoretical work. We have been researching internet use and how it changes over time – from a period where there was very little knowledge or use of the internet to the present day. And I’ll give some background then talk about survey data – though that is an issue in itself… I’ll be talking about quantitative survey data, as it’s hard to find systematic collections of qualitative research instruments that I could use in my work.

So we have been asking about internet use for over 20 years… And right now I have data from Michigan, the UK, and the US… I have also just received further data from South Africa (this week!).

When we think about Digital Inequality, the idea of the digital divide emerged in the late 1990s – there was government interest, data collection, academic work. This was largely about the haves vs. have-nots; on vs. off. And we saw a move to digital inequalities (Hargittai) in the early 2000s… Then it went quiet, aside from work from Neil Selwyn in the UK, and from Helsper and Livingstone… But the discussion has moved on to skills…

Policy-wise we have also seen a shift… Lots of policies around the digital divide up to around 2002, then a real pause, as there was an assumption that the problems would be solved. Then, in the US at least, Obama refocused on the divide from 2009.

So, I have been looking at data from questionnaires from Michigan State of the State Survey (1997-2016); questionnaires from digital future survey in the US (2000, 2002, 2003, 2014); questionnaires from the Oxford Internet Surveys in the UK (2003, 2005, 2007, 2009, 2013); Hungarian World Internet Project (2009); South African World Internet Project (2012).

Across these data sets we have looked at the questionnaires and the frequency of use of particular questions on use, lack of use, etc. When internet penetration was lower there was a lot of explanation in questions, but we have shifted away from that, assuming that people understand… and we’ve never returned to explaining. We’ve shifted to device questions, but we don’t ask much beyond that. We used to ask about the number of hours online… But that increasingly made less sense, as the answer is essentially “all day”, so we do that less – shifting instead to how frequently people go online.

Now the State of the State Survey in Michigan is different from the other data here – all the others are World Internet Project surveys, but SOSS does not look at the same areas, as it was not necessarily designed by internet researchers. In Hungary (2009 data) similar patterns of question use emerged, but with a particular focus on mobile use. But the South African questionnaire was very different – they ask how many people in the household are using the internet, whereas we ask about the individual but not others in the house, or others coming to the house. South Africa had around 40% internet penetration (at least in 2012, when we have data), which is a very different context. There they ask about lack of access and use, and the reasons for that. We ask about use/non-use rather than reasons.

So there is this gap in the literature; there is a need for quantitative and qualitative methods here. We also need to consider other factors, particularly technology itself being a moving target – in South Africa they ask about internet use and also Facebook, as people don’t always identify Facebook as internet use. Indeed so many devices are connected – maybe we need…

Q&A

Q1) I have a question about the questionnaires – do any ask about costs? I was in Peru and lack of connections, but phones often offer free WhatsApp and free Pokemon Go.

A1) Only the South African one asks that… It’s a great question though…

Q2) You can get Pew questionnaires and also Ofcom questionnaires from their websites. And you can contact the World Internet Project directly… And there is an issue with people not knowing if they are on the internet or not – increasingly you ask a battery of questions… and then filter on that – e.g. if you use email you get counted as an internet user.

A2) I have done that… Trying to locate those questionnaires isn’t always proving that straightforward.

Q3) In terms of instruments – maybe there is a need to develop more nuanced questionnaires there.

A3) Yes.

Levelling the socio-economic playing field with the Internet? A case study in how (not) to help disadvantaged young people thrive online – Huw Crighton Davies, Rebecca Eynon, Sarah Wilkin, Oxford Internet Institute, United Kingdom

This is about a scheme called the “Home Access Scheme” and I’m going to talk about why we could not make it work. The origins here were a city council’s initiative – they came to us. DCLG (2016) data showed 20-30% of the population were below the poverty line, and we knew around 7-8% locally had no internet access (known through survey responses). And the players here were researchers, local government, schools, and also an (unnamed) ISP.

The aim of the scheme was to raise attainment in GCSEs, to build confidence, and to improve employability skills. The schools had a responsibility to identify students in need, to procure laptops, memory sticks and software, and to provide regular, structured in-school pastoral skills and opportunities – not just in computing class. The ISP was to provide set-up help, technical support, and free internet connections for 2 years.

This scheme has been running two years, so where are we? Well, we’ve had successes: preventing arguments and conflict; helping with schoolwork and job hunting; saving money; and improving access to essential services – this is partly because cost cutting by local authorities has moved transactions online, like bidding for council housing, repeat prescriptions, etc. There was also some intergenerational bonding as families shared interests. Families commented on the success and opportunities.

We did 25 interviews, 84 one-to-one sessions in schools, 3 group workshops, 17 ethnographic visits, plus many more informal meet-ups. So we have lots of data about these families, their context, their lives. But…

Only three families had consistent internet access throughout. Only 8 families are still in the programme. It fell apart… Why?

Some schools were so nervous about use that they filtered and locked down their laptops. One school used the scheme money to buy teacher laptops, gave students old laptops instead. Technical support was low priority. Lead teachers left/delegated/didn’t answer emails. Very narrow use of digital technology. No in-house skills training. Very little cross-curriculum integration. Lack of ICT classes after year 11. And no matter how often we asked about it we got no data from schools.

The ISP didn’t set up connections, didn’t support the families, didn’t do what they had agreed to. They tried to bill families, and one was threatened with debt collectors!

So, how did this happen? Well, maybe these are neoliberal currents? I use that term cautiously, but… We can offer an emergent definition of neoliberalism from this experience.

There is a neoliberal disfigurement of schools: teachers under intense pressure to meet auditable targets; the scheme’s students subject to a range of targets used to problematise a school’s performance – exclusions, attendance, C grades; the scheme shuffled down priorities; ICT not deemed academic enough under Govian school changes; and learning stripped back to a narrow range of subjects and focused towards these targets.

There were effects of neoliberalism on the city council: targets and a “more for less” culture; the scheme disincentivised; erosion of the authority of democratic local councils – schools beyond authority control, and a high turnover of staff.

There were neoliberal practices at the ISP: commodifying philanthropy; unable to treat families as anything other than customers. And there were dysfunctional mini-markets: they subcontracted delivery and set-up; they subcontracted support; they charged for support and charged for the internet connection even when they couldn’t help…

Q&A

Q1) Is the problem not digital divides, but divides more generally… Any attempt to overcome class separation and marketisation works against the attempts to fix this issue here.

A1) We have a paper coming and yes, there were big issues here for policy and a need to be holistic… We found parents unable to attend parents’ evenings due to shift work, and nothing in the school processes to accommodate this. And the measure of poverty for children is “free school meals”, but many do not want to apply as it is stigmatising, and many don’t qualify even on very low incomes… That leads to children and parents being labelled disengaged or problematic.

Q2) Isn’t the whole basis of this work neoliberal though?

A2) I agree. We didn’t set the terms of this work…

Panel Q&A

Q1/comment) RSE and access

A1 – Huw) Other companies the same

Q2) Did the refugees in your work, Katja, have access to SIM cards and internet?

A2 – Katja) It was a challenge. Most downloaded maps and resources… And actually they preferred Apple to Android as the GPS is more accurate without an internet connection – that makes a big difference in the Aegean sea, for instance. So refugees shared SIM cards, and used power banks for energy.

Q3) I had a sort of reflection on Nils’ paper and where to take this next… It occurs to me that you have quite a few different arguments… You have this survey data, the interviews, and then a different sort of participation from the Facebook groups… I have students in Berlin looking at the boom and bust – and I wondered about connecting that Facebook group work up to that type of work – it seems quite separate to the youth participation section.

A3 – Nils) I wasn’t planning on talking about that, but yes.

Comment) I think there is a really interesting aspect of these campaigns and how they become part of social media and the everyday life online… The way they are becoming engaged… And the latent participation there…

Q3) I can totally see that, though challenging to cover in one article.

Q4) I think it might be interesting to talk to the people who created the surveys to understand motivations…

A4) Absolutely, that is one of the reasons I am so keen to hear about other surveys.

Q5) You said you were struggling to find qualitative data?

A5 – Katja) You can usually download quantitative instruments, but that is harder for qualitative instruments including questions and interview guides…

XP-02: Carnival of Privacy and Security Delights – Jason Edward Archer, Nathanael Edward Bassett, Peter Snyder, University of Illinois at Chicago, United States of America

Note: I’m not quite sure how to write up this session… So these are some notes from the more presentation parts of the session and I’ll add further thoughts and notes later… 

Nathanial: We have prepared three interventions for you today and this is going to be kind of a gallery exploring space. And we are experimenting with wearables…

Fitbits on a Hamster Wheel and Other Oddities, oh my!

Nathanial: I have been wearing a FitBit this week… but these aren’t new ideas… People used to have beads for counting, and there are self-training books for wrestling published in the 16th Century. Pedometers were conceived of in Leonardo da Vinci’s drawings… These devices are old, and tie into ideas of posture, and mastering control of our physical selves… And we see the pedometer being connected with regimes of fitness – like the Manpo-Meter (“10,000 steps meter”) (1965). This narrative takes us to the 1970s running boom and the idea of recreational discipline. And now the world of smart devices… Wearables are taking us to biometric analysis as a mental model (Neff – preprint).

So, these are ways to track, but what happens with insurance companies, with those monitoring you? At Oral Roberts University students have to track their fitness as part of their role as students. What does that mean? I encourage you all to check out “unfitbit” – interventions to undermine tracking. Or we could, rather than going to the gym with a FitBit, give it to Terry Crews – he’s going anyway! – and he could earn money… Are fitness slaves in our future?

So, use my FitBit – it’s on my account…

And so, that’s the first part of our session…

?: Now, you might like to hear about the challenges of running this session… We had to think about how to make things uncomfortable… But then how do you get people to take part? We considered a man-in-the-middle site that was ethically far too problematic! And no-one was comfortable participating in that way… Certainly raising the privacy and security issue… But as we talk of data as a proxy for us… As internet researchers a lot of us are more aware of privacy and security issues than the general population, particularly around metadata. But this would have been one day… I was curious whether people might have faked their data for that one-day capture…

Nathanial: And the other issue is why we are so much more comfortable sharing information with FitBit, and other sharing platforms – faceless entities – versus people you meet at a conference… And we didn’t think about a gender aspect here… We are three white guys, and we are less sensitive to that data being publicised rather than privatised. Men talk about how much they can bench press… but personal metadata can make you feel under scrutiny…

Me: I wouldn’t want to share my data and personal data collection tools…

Borrowing laptop vs borrowing phone…

?: In the US there have been a few cases where FitBits have been submitted as evidence in court… But that data is easier to fake… In one case a woman claimed to have been raped, and they used her FitBit data to suggest that…

Nathanial: You talked about not being comfortable handing someone your phone… It is really this blackbox… Is it a wearable? It has all that stuff, but you wear it on your body…

??: On cellphones there is FOMO – Fear Of Missing Out… What you might miss…

Me: Device as security

Comment: There is ableism embedded in these devices… I am a cancer survivor and I first used step counts as part of a research project on chemotherapy and activity… When I see a low step day on my phone now… I can feel the stress of those triggers for someone going through that experience…

Nathanial: FitBits vibrate when you have/have not done a number of steps… Trying to put you in an ideological state apparatus…

Jh: That nudge… That can be good for the able-bodied… But if you can’t move, that is a very different experience… How does that add to their stress load?

Interperspectival Goggles

Again looking at the condition of virtuality – Hayles 2006(?)

Vision is constructed… Thinking of higher resolution… From a small phone to a big phone… From lower resolution to higher resolution TV… We have spectacles, quizzing glasses and monocles… And there is the strange idea of training ourselves to see better (William Horatio Bates, 1920s)… And emotional state interfering with how you do something… Then we have optometry and x-rays as a concept of seeing what could not be seen before… And you have special goggles and helmets… Like the idea of the Image Accumulator in Videodrome (1983), or the Memory recorder and playback device in Brainstorm (1983). We see embodied work stations – the Da Vinci Surgery Robot (2000) – divorcing what is seen from what is in front of the operator…

There are also playful ideas: binocular football; the Decelerator Helmet; the Meta-perceptional Helmet (Cleary and Donnelly 2014); and most recently Google Glass – what is there, plus extra layers… Finally we have Oculus Rift and VR devices – seeing something else entirely… We can divorce what we see from what we are perceiving… We want to swap people’s vision…

1. Raise awareness about the complexity of electronic privacy and security issues.

2. Identify potential gaps in the research agenda through playful interventions, subversions, and moments of the absurd.

3. Be weird, have fun!

Mathias

“Cell phones are tracking devices that make phone calls” (Appelbaum, 2012)

I am interested in the IMSI catcher, which masquerades as a wireless base station, prompting phones to communicate with it. They are used by police, law enforcement, etc. They can be small and handheld, or they can be drone mounted. And they can track people, people in crowds, etc. There are different ways to use one – if you know someone is there, you can scan the crowd for them specifically. So, these tools are simple and disruptive and problematic, especially in activism contexts.

But these tools are also capable of capturing transmitted content, and all the data in your phone. These devices are problematic and have raised all sorts of issues about their use, and about who uses them and how. I’d like to think of this a different way… Is there a right to protest? And to protest anonymously? We do have anti-masking laws in some places – that suggests no right to anonymous protest. But that’s still a different privacy right – covering my face is different from concealing that I participated at all…

Protests are generally about a minority persuading a majority about some sort of change. There is no legal right to protest anonymously, but there are lots of protected anonymous spaces. So, in the 19th century there was a big debate on whether or not the voting ballot should be anonymous – democracy is really the C19th killer app. There is a lovely quote here about “The Australian system” by Bernheim (1889) and the introduction of anonymous voting. It wasn’t brought in to preserve privacy. At the time politicians bought votes – buying a keg of beer or whatever – and anonymity was there to stop that, not to preserve individual privacy. But Jill Lepore (2008) writes about how our forebears considered casting a “secret ballot” to be “cowardly, underhanded and despicable”.

So, back to these devices… There can be an idea that “if you have nothing to fear, you have nothing to hide”, but many of us understand that it is not true. And this type of device silences uncomfortable discourse.

Mathias Klang, University of Massachusetts Boston

Q1) How do you think that these devices fit into the move to allow law enforcement to block/”switch off” the cameras on protestors’/individuals’ phones?

A1) Well, people can resist these surveillance efforts, and you will see subversive moves. People can cover cameras, conceal devices, etc. But with these devices it may be that the phone becomes unusable, requiring protestors to disable phones or leave them at home… And phones are really popular and well used for coordinating protests.

Bryce Newell, Tilburg Institute for Law, Technology, and Society

I have been working on research in Washington State, working with law enforcement on license plate recognition systems and public disclosure law, and looking at what you can tell from the data. So, here is a map of license plate data from Seattle, showing vehicle activity. In Minneapolis, similar data being released led to the mapping of the governor’s registered vehicles…

The second area is about law enforcement and body cameras. Several years ago peaceful protestors at UC Davis were pepper sprayed. Even in the cropped version of that image you can see a vast number of phones out, recording the event. And indeed there are a range of police surveillance apps that allow you to capture police encounters without that being visible on the phone, including: ACLU Police Tape, Stop and Frisk Watch; OpenWatch; CopRecorder2. And some of these apps upload the recording to the cloud right away to ensure capture. And there have certainly been a number of incidents from Rodney King to Oscar Grant (BART), Eric Garner, Ian Tomlinson, Michael Brown. Of these only the Michael Brown case featured law enforcement with bodycams. There has been a huge call for more cameras on law enforcement… During a training meeting some officers told me “Where’s the direct-to-YouTube button?” and “If citizens can do it, why can’t we also benefit from the ability to record in public places?”. There is a real awareness of control and of citizen videos. I also heard a lot of there being “a witch hunt about to begin…”.

So, I’m in the middle of focused coding on police attitudes to body cameras. Police are concerned that citizen video is edited, out of context, distorting. And they are concerned that it doesn’t show wider contexts – when recording starts, perspective, the wider scene, the fact that provocation occurs before filming usually. But there is also the issue of control, and immediate physical interaction, framing, disclosure, visibility – around their own safety, around how visible they are on the web. They don’t know why it is being recorded, where it will go…

There have been a number of regulatory responses to this challenge: (1) restrict collection – not many, usually budgetary and rarely on privacy; (2) restrict access – going back to the Minneapolis case, within two weeks of the map of the governor’s vehicles being published in the paper they had an exemption to public disclosure law, which is now permanent for this sort of data. In the North Carolina protests recently the call was “release the tapes” – and they released only some – then the cry was “release all the tapes”… But on 1st October the law changed to again restrict access to this type of data.

But different states provide different access. Some provide access. In Oakland, California, data was released on how many license plates had been scanned. In Seattle, because the data for many scans of one licence plate over 90 days is quite specific, you can almost figure out the householder. But granularity varies.

Now, we do see bodycam footage of sobriety tests, foot chases, and a half-hour-long interview with a prostitute that discloses a lot of data. Washington shares a lot of video to YouTube. And we see police in Rotterdam, in the Netherlands, doing this too.

But one patrol officer told me that he would never give his information to an officer with a camera. Another noted that police choose when to start recording, with little guidance on when and how to do this.

And we see a “collateral visibility” issue for police around these technologies.

Q&A

Q1) Is there any process where police have to disclose that they are filming with a body cam?

A1) Interesting question… Initially they didn’t know. We used to have a two-party consent process – as for tapings – to ensure consent/implied consent. But the State Attorney General described this as outside of that privacy regulation, saying that a conversation with a police officer is a public conversation. But police are starting to have policies that officers should disclose that they have cameras – partly as they hope it may reduce violence towards police.

Data Privacy in commercial users of municipal location data – Meg Young, University of Washington

My work looks at how companies use Seattle’s location data. I wanted to look at how data privacy is enacted by Seattle municipal government. And I am drawing on the work of Annemarie Mol and John Law (2004), ethnographers working on health, which focuses on lived experience. My data draws on ethnographic work as well as focus groups and interviews with municipal government and local civic technology communities. I really wanted to present the role of commercial actors in data privacy in city government.

We know that cities collect location data to provide services, and share it with third parties to do so. In Washington we have a state freedom of information (FOI) law, which states “The people of this state do not yield their sovereignty to the government…”, making data requestable.

In Seattle the traffic data is collected by a company called Acyclica. The city is growing and the infrastructure is struggling, so they are gathering data to deal with this, to shape traffic signals. This is a large scale, longitudinal data collection process. Acyclica do that with wi-fi sensors that sniff MAC addresses, with the location traces sent to Acyclica (MACs salted). The data is aggregated and sent to the city – the city doesn’t see the detailed, creepy tracking, but the company does. And this is where the FOI law comes in. The raw data sits on the company side. If the raw data were a public record, it would be requestable. The company becomes a shield for collecting sensitive data – it is proprietizing.
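(Editor’s sketch: the talk doesn’t detail Acyclica’s actual salting scheme, but the generic pattern – a keyed hash turning a raw MAC address into a stable pseudonym – looks something like this; the salt value and truncation are illustrative choices, not the company’s.)

```python
import hashlib

SALT = "per-deployment-secret"  # hypothetical value; the real salt is unknown

def pseudonymise_mac(mac: str, salt: str = SALT) -> str:
    """Return a stable pseudonym for a sniffed MAC address."""
    digest = hashlib.sha256((salt + mac.lower()).encode("utf-8")).hexdigest()
    return digest[:16]  # truncated for readability

# The same device yields the same pseudonym at every sensor, so journeys
# can still be reconstructed; only the link to the hardware address is hidden.
print(pseudonymise_mac("AA:BB:CC:DD:EE:FF"))
```

Note that the pseudonym stays stable across sensors, which is exactly the asymmetry described: the company can still see journeys end to end, while the city only receives aggregates.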

So you can collect data and have service needs met, but without it becoming public to you and me. But analysing the contract, the terms do not preclude the resale of data – though a Seattle Dept. of Transport (DOT) worker noted that right now people trust companies more than government. Now, I did ask about this data collection – not approved elsewhere – and was told that having wifi settings on in public makes you open to data collection – as you are in public space.

My next example is the data from parking meters/pay stations. This shows only the start and end of transactions – no credit card numbers, etc. The DOT is happy to make this available via public records requests. But you can track each individual, and they are using this data to model parking needs.

The third example is the Open Data Portal for Seattle. They pay Socrata to host that public-facing data portal. Socrata also sell access to cleaned, aggregated data to companies through a separate API called the Open Data Network. The Seattle Open Data Manager didn’t see this situation as different from any other reseller. But there is little thought about third party data users – they rarely come up in conversations – who may combine this data with other data sets for analysis.

So, in summary, municipal government data is no less by and for commercial actors than it is by and for the public. Proprietary protections around data are a strategy for protecting sensitive data. Government transfers data to third parties…

Q&A

Q1) Seattle has a wifi for all programme…

A1) Promisingly, this data isn’t being held side by side… But the routers that we connect to collect so much data… Seeing an Oracle database of the websites folks…

Q2) What are your policy recommendations based on your work?

A2) We would recommend licensing data with some restrictions on use, so that if the data is used inappropriately their use could be cut off…

Q2) So activists could be blocked by that recommendation?

A2) That is a tension… Activists are keen for no licensing here for that reason… It is challenging, particularly when data brokers can do problematic profiling…

Q2) But that restricts activists from questioning the state as well.

Response – Sandra Braman

I think that these presentations highlight many of the issues that raise questions about values we hold as key as humans. And I want to start from an aggressive position, thinking about how and why you might effectively be an activist in this sort of environment. And I want to say that any concerns about algorithmically driven processes should be evaluated in the same way as we would evaluate social processes. So, for instance, we need to think about how the press and media interrogate data and politicians…

? “Decoding the social” (coming soon) is looking at social data and the analysis of social data in the context of big data. She argues that social life is too big and complex to be predictable from data. Everything that people who use big data “do” to understand patterns is something that activists can do too. We can be just as sophisticated as corporations.

The two things I am thinking about are how to mask the local, and how to use the local… When I talk of masking the local I look back to work I did several years ago on local broadcasting. There is a mammoth literature on TV as locale, on production and how that is separate and misrepresenting, and on the assumptions versus the actual information provided versus actual decision making. My perception of social activism is that there is some brilliant activity taking place – brilliance at moments, often in specific apps. And I think that if you look at the essays Julian Assange wrote before he founded WikiLeaks, particularly on weak links and how those work… He uses sophisticated social theory in a political manner.

But anonymity is practically impossible… What can we learn from local broadcast? You can use phones in organised ways – there was training in phone cameras for the Battle of Seattle, for instance. You can fight with indistinguishable actions – all doing the same things. Encryption is cat and mouse… Often we have activists presenting themselves as mice, although we did see an app discussed at the plenary to alert you to protest and risk. And I have written before on tactical memory.

In terms of using the local… If you know you will be sensed all the time, there are things you can do as an activist to use that. It is useful to think about how we can conceive of ourselves, as activists, as part of the network. And I was inspired by US libel laws – if a journalist with transmission/recording devices is a neutral observer, they are not “repeating” the libel and can share that footage. That goes back to 1970s law, but it can be useful to us.

We are at risk of being censored, but that means that you have choices about what to share, being deliberate in giving signals. We have witnessing, which can be taken as a serious commitment. That can happen with people with phones; you can train witnessing. There are many moments where leakage can be an opportunity – maybe not with the volume or content of Snowden, but we can do that. There are also ways to learn, and to shape learning. But we can also be routers, and be critically engaged in that – what we share, the acceptable error rate. National Security agencies are concerned about where in the stream they should target misinformation – activists can adopt that thinking too. The server functions – see my strategic memory piece. We certainly have community-based wifi and MESH networks, and that is useful politically and socially. We have responsibilities to build the public that is appropriate, and the networking infrastructure that enables those freedoms. We can use more computational power to resolve issues. Information can be an enabler as well as influencing your own activism. Thank you to Anne and her group in Amsterdam for triggering thinking here – but we should be engaging critically with big data. If you can’t make decisions in some way, there’s no point to doing it.

I think there needs to be more robustness in managing and working with data. If you go far, then you need a very high level of methodological trust. Information has to stand up in court, to respect activist contributions to data. Use as your standard what would be acceptable in court. And in a Panspectron (not Panopticon) environment, where data is collected all the time, you absolutely have to ask the right questions.

Panel Q&A

Q1) I was really interested in that idea of witnessing as being part of being a modern digital citizen… Is there more on protections, or on that, which you can say?

A1 – Sandra) We’ve seen all protections for whistle-blowing in government disappear under Bush (II)… We still have protections for private sector whistle-blowers. But there would be an interesting research project in there…

Q2) I wondered about that idea of cat and mouse use of technology… Isn’t that potentially making access a matter of securitisation…?

A2) I don’t think that “securitisation” makes you a military force… One thing I forgot to say was about network relations… If a system is interacting with another system – the principle of requisite variety – it has to be as complex as the system it is dealing with. You have to be at least as sophisticated as the other guy…

Q3) For Bryce and Meg: there are so many tensions over when data should be public and when it should be private… And police desires to show the good things they do. Also Meg, this idea of privatising data to ensure the privacy of data – it’s problematic for the city to collect data, but now a third party can do it instead.

A3 – Bryce) One thing I didn’t explain well enough is that video online comes from police and from activists – it depends on the video. Some videos are accessed via public records requests and published to YouTube channels – in fact in Washington you can make requests for free, and you can do it anonymously. The police department does publish video itself. When they did a pilot in 2014 they held a hackathon to consider how to deal with redaction issues… detect faces, blur them, etc. And there is proactive posting of – only some – video. There is a narrative of sharing everything, but that isn’t the case. The rhetoric has been about being open, about privacy rights, and the new police chief. A lot of it was administrative cost concerns… In the hackathon they asked whether posting video in a blurred form would do away with blanket requests and focus requests. At that time they dealt with all requests for email. They were receiving so many emails, and under state law they had to give up all the data, and for free. But state law varies; in Charlotte they gave up less data. In some states there is a different approach, with press conferences and narratives around the footage as they release parts of videos…

A3 – Meg) The city has worked on how to release data… They have a privacy screening process. They try to provide data in a way that is embedded. They still have a hard core central value that any public record is requestable. Collection limitation is an important and essential part of what cities should be doing… In a way, private companies collecting data results in large data sets that may end up insecure… Going back to what Bryce was saying, the bodycam initiative was really controversial… There was so much footage and it was unclear what should be public and when… And the faultlines have been pretty deep. We have the Coalition for Open Government advocating for full access, and the ACLU worried that these become surveillance cameras… This was really contentious… They passed a version of a compromise, but the bottom line is that the PRA is still a core value for the state.

A3 – Bryce) Much of the ACLU, nationally certainly, was supportive of bodycams, but individuals and local ACLUs change and vary… They were very pro, then backing off, then local variance… It’s a very different picture, hence that variance.

Q4) For Mathias: you talked about anti-masking laws. Are there cases where people have been brought in for jamming signals under that law?

A4 – Mathias) Right now the American cases are looking at keywords – manufacturers of devices, the ways data is discussed. I haven’t seen cases like that, but perhaps it is too new… I am a Swedish lawyer, and that jamming would be illegal in a protest…

A4 – Sandra) Would that be under anti-masking or under jamming law?

A4 – Mathias) It would be under hacking laws…

Q4) If you counter with information… But not if switching phone off…

A4 – Mathias) That’s still allowed right now.

Q5) Do you do work comparing US and UK body cameras?

A5 – Bryce) I don’t, but I have come across the Rotterdam footage. One of my colleagues has looked at this… The impetus for adoption in the Netherlands has been different. In the US it is transparency; in the Netherlands the narrative was protection of public servants. A number of co-authors have just published recently on the use of cameras and how they may increase assaults on officers… Seeing some counter-intuitive results… But the why question is interesting.

Comment) Is there any aspect of cameras being used in higher risk areas that makes that more likely perhaps?

A5 – Sandra) It’s the YouTube on-air question – everyone imagines themselves on air.

Q6) Two speakers quoted individuals accused of serious sexual assault… And I was wondering how we account for the fact that activists are not homogenous here… Particularly when tech activists are often white males, they can be problematic…

A6) Techies don’t tend to be the most politically correct people – to generalise a great deal…

A6 – Sandra) I think they are separate issues; if I didn’t engage with people whose behaviour is problematic it would be hard to do any job at all. Those things have to be fought, but as a woman you should also challenge and call out those white male activists on their actions.

Q7 – me) I was wondering about the retention of data. In Europe there is a lot of use of CCTV, and the model there is to record everything and retain any incident. In the US CCTV is not in widespread use, I think, and the bodycam model is to record incidents in progress only… So I was wondering about that choice in practice, and about the retention of those videos and the data after capture.

A7 – Bryce) The ACLU has looked at retention of data. It is a state-based issue. In Washington there are mandatory minimum periods… They are interesting: due to findings on conduct they are under requirements to keep everything for as long as possible so auditors from the DOJ can access and audit. In Bellingham and Spokane, officers can flag items, and supervisors can too… And that is what dictates the retention schedule. There are issues there of course. The default when I was there was 2 years. If it is publicly available and hits YouTube then it will be far more long-lasting, can pop up again… Perpetual memory there… So the actual retention schedule won’t matter.

A7 – Sandra) A small follow up – you may have answered this with that metadata point… Do they treat bodycam data like other types of police data, or is it a separate class of data?

A7 – Bryce) Generally it is being thought of as data collection… And there is no difference for public disclosure, but they are really worried about public access. And how they share that with prosecutors… They could share on DVD… And wanted to use the share function of the software… But they didn’t want emails to be publicly disclosable with that link… So it is being thought about like email.

Q8 – Sandra) On behalf of colleagues working on visual evidence in court.

Comment – Michael) There is work on video and how it can be perceived as “truth” without awareness of the potential for manipulation.

A8 – Bryce) One of the interesting things in Bellingham was the release of that video I showed of a suspect running away… The footage followed a police pick-up for suspected drug dealing, but it showed evasion of arrest and the whole encounter… And in that case, whether or not he was guilty of the drug charge, that video told a story of the encounter. In preparing for the court case the police shared the video with his defence team, and almost immediately they entered a guilty plea in response… And I think we will see more of that kind of invisible use of footage that never goes to court.

And with that this session ends… 

PA-31:Caught in a feedback loop? Algorithmic personalization and digital traces (Chair: Katrin Weller)

Wiebke Loosen1, Marco T Bastos2, Cornelius Puschmann3, Uwe Hasebrink1, Sascha Hölig1, Lisa Merten1, Jan­-Hinrik Schmidt1, Katharina E Kinder­-Kurlanda4, Katrin Weller4

1Hans Bredow Institute for Media Research; 2University of California, Davis; 3Alexander von Humboldt Institute for Internet and Society; 4GESIS Leibniz Institute for the Social Sciences

?? – Marco T Bastos, University of California, Davis  and Cornelius Puschmann, Alexander von Humboldt Institute for Internet and Society

Marco: This is a long-running project that Cornelius and I have been working on. At the time we started, in 2012, it wasn’t clear what impact social media might have on the filtering of news, but they are now huge mediators of news and news content in Western countries.

Since then there has been some challenge and conflict between journalists, news editors and audiences, and that raises the issue of how to monitor and understand it through digital trace data. We want to think about which topics are emphasized by news editors, and which are most shared by social media users, etc.

So we will talk about taking two weeks of content from the NYT and The Guardian across a range of social media sites – that’s work I’ve been doing. And Cornelius has tracked 1.5-4 years’ worth of content from four German newspapers (Süddeutsche Zeitung, Die Zeit, FAZ, Die Welt).

With the Guardian we accessed data from the API, which tells you which articles were published in print and which were not – that is baseline data for the emphasis editors place on different types of content.

So, I’ll talk about my data from the NY Times and the Guardian, from 2013, though we now have 2014 and 2015 data too. This data from two weeks covers 16k+ articles. The Guardian runs around 800 articles per day, the NYT around 1000. And we could track the items on Twitter, Facebook, Google+, Delicious, Pinterest and StumbleUpon. We do that by grabbing the unique identifier for the news article, then using the social media endpoints of the social platforms to find sharing. But we had a challenge with Twitter – in 2014 they killed the endpoint we and others had been using to track sharing of URLs. The other sites are active, but relatively irrelevant in the sharing of news items! And there are considerable differences across the ecosystems; some of these social networks are not immediately identifiable as social networks – will Delicious or Pinterest impact popularity?
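(Editor’s sketch of that method: the endpoint URLs below are historical examples from that era – several, notably Twitter’s, have since been shut down – and the response parsing is deliberately stubbed out, since each platform wrapped its counts differently.)

```python
import requests

# Historical/illustrative public count endpoints; formats and availability
# varied by platform and most have since been deprecated.
COUNT_ENDPOINTS = {
    "facebook": "https://graph.facebook.com/?id={url}",
    "pinterest": "https://api.pinterest.com/v1/urls/count.json?url={url}",
}

def share_counts(article_url: str) -> dict:
    """Ask each platform's count endpoint how often an article URL was shared."""
    counts = {}
    for platform, template in COUNT_ENDPOINTS.items():
        resp = requests.get(template.format(url=article_url), timeout=10)
        # Each platform wrapped its numbers differently (JSON, JSONP, ...),
        # so real code needs per-platform parsing rather than raw text.
        counts[platform] = resp.text
    return counts
```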

This data allows us to contrast the differences in topics identified by news editors and social media users.

So, looking at the NYT there is a lot of world news, local news, opinion. Looking at the range of articles, Twitter maps relatively well (higher sharing of national news, opinion and technology news), but Facebook is really different – there is huge sharing of opinion, as people share what aligns with their interests, etc. We see outliers in every section – some articles skew the data here.

If we look at everything that appeared in print, we can look at a horrible diagram that shows all shares… When you look here you see how big Pinterest is, but only in fashion and lifestyle areas. The sharing there doesn’t really reflect the ratio of articles published, though. Google+ has sharing in science and technology in the Guardian; in environment, jobs, local news, opinion and technology in the NYT.
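(Editor’s sketch: the underlying comparison – editors’ emphasis as articles published per section, versus audience emphasis as shares per article – can be expressed like this; the table and column names are invented for illustration.)

```python
import pandas as pd

# Hypothetical per-article data: section plus share counts per platform.
articles = pd.DataFrame({
    "section":   ["World", "World", "Opinion", "Fashion"],
    "facebook":  [120, 80, 900, 40],
    "pinterest": [2, 1, 5, 300],
})

# Editorial emphasis: how many articles each section publishes.
published = articles.groupby("section").size()

# Audience emphasis: mean shares per article, by section and platform.
shared = articles.groupby("section")[["facebook", "pinterest"]].mean()

# Comparing the two shows e.g. Pinterest sharing concentrated in fashion
# and lifestyle regardless of how few articles those sections publish.
print(published, shared, sep="\n\n")
```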

Interestingly news and sports, which are real staples of newspapers, barely feature here. Economics is even worse. Now, the articles are English-language but they are available globally… But what about differences in Germany? Over to Cornelius…

Cornelius: So Marco’s work is ahead of mine – he’s already published some of this work. But I have been applying his approach to German newspapers. I’ve been looking at usage metrics, the relationship between audiences and publishers, and how that relationship changes over time.

So, I’ve looked at Facebook engagement with articles in four German newspapers. I have compared comments, likes and shares and how contribution varies… Opinion is important for newspapers but not necessarily where the action is. And it’s not that people care less about stories in some areas – in economics they like and comment, but they don’t share. So it is interesting to think about the social perception of sharability.

So, a graph of Die Zeit here shows articles published and the articles shared on Facebook… You see a real change in 2014 to greater numbers (in both). I have also looked at the type of articles and print vs. web versions.

So, some observations: niche social networks (e.g. Pinterest) are more relevant to news sharing than expected. Reliance on FB at Die Zeit grew suddenly in 2014. Social norms of liking, sharing and discussing differ significantly across news desks. Some sections (e.g. sports) see a mismatch of importance and use versus liking and sharing.

In the future we want to look at temporal shifts in social media feedback and newspaper coverage. Monitoring…

Q&A

Q1) Have you accounted for the possibility of bots sharing content?

A1 – Marco) No, we haven’t. We are looking across the board, but we cannot account for that with the data we have.

Q2) How did you define or find out that an article was shared, from the URLs?

A2) Tricky… We wrote a script for parsing shortened URLs to check that.

A2 – Cornelius) Read Marco’s excellent documentation.

Q3) What do you make of how readers are engaging, what they like more, what they share more… and what influences that?

A3 – Cornelius) I think it is hard to judge. There are some indications, and we have some idea of some functions that are marketed by the platforms being used in different ways… But I wouldn’t want to speculate.

Twitter Friend Repertoires: Inferring sources of information management from digital traces – Jan-Hinrik Schmidt, Lisa Merten, Wiebke Loosen, Uwe Hasebrink, Katrin Weller

Our starting point was to think about shifting the focus of Twitter research. Many studies treat Twitter – explicitly or implicitly – as a broadcast paradigm, but we want to conceive of it as an information management tool, hence the concept of “Twitter Friend Repertoires” – using “friend” in the Twitter terminology: someone I follow. We are looking for patterns in the composition of friend sets.

So we take a user, take their friends list, and compare it to a list of accounts identified previously. Our index has 7,528 Twitter accounts: media outlets (20.8%), organisations – political parties, companies, civil society organisations (53.4%) – and individuals – politicians, celebrities and journalists (25.8%) – all in Germany. We take our sample, compare it with a relational table, and then with our master index. And if an account isn’t found in the master index, we can’t say anything about it yet.
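The matching step is essentially a lookup against the master index. A minimal sketch, with a tiny invented index standing in for the real 7,528-account list:

```python
from collections import Counter

# Toy master index: Twitter account -> category. The real index holds
# 7,528 German accounts across media, organisations and individuals.
master_index = {
    "tagesschau":  "media",
    "spdde":       "organisation",
    "regsprecher": "individual",
}

def friend_repertoire(friend_ids):
    """Categorise a user's friends; anything not in the index is 'unknown'."""
    counts = Counter(master_index.get(f, "unknown") for f in friend_ids)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}

# Example: two of three friends are identifiable via the index.
print(friend_repertoire(["tagesschau", "spdde", "someone_else"]))
```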

To demonstrate the answers we can find with this approach… We have looked at five different samples:

  • Audience_TS – sample following PSB TV News
  • Audience_SZ – sample following quality daily newspapers
  • MdB – members of federal parliament
  • BPK – political journalists registered for the Bundespressekonferenz
  • Random – random sample of German Twitter users (via Axel Bruns)

We can look at the friends here, and we can categorise the account categories. In our random sample 77.8% are not identifiable and 22.2% are in our index (around 13% are individual accounts). That is lower than the percentage of friends in our index for all other samples – for MdB and BPK a high percentage of their friends are in our index. Across the groups there is less following of organisational accounts (in our index) – with the exception of the MdB and political parties. If we look at the media accounts, the two audience samples follow more media accounts than the others, including MdB and BPK… When it comes to individual public figures in our index, celebrities are prominent for audiences, much less so for MdB and BPK; MdB follow other politicians, and journalists tend to follow other journalists. And journalists do follow politicians, and politicians – to a lesser extent – follow journalists.

In terms of patterns of preference we can suggest a model of a fictional user to understand preference between our three categories (organisational accounts, media accounts, individual accounts). And we can use that example profile and compare it with our own data, to see how others’ behaviours fit that typology. So, in our random sample over 30% (37.9%) didn’t follow any organisational accounts. Amongst MdB and BPK there is a real preference for individual accounts.

So, this is what we are measuring right now… I am still not quite happy yet. It is complex to explain, but hard to also show the detail behind it… We have 20 categories in our master index but only three are shown here… Some frequently asked questions that I will ask and answer based on previous talks…

  1. Around 40% identified accounts is not very much, is it?
    Yes and no! We have increased this over time. But initially we did not include international accounts; if we did that we’d increase the share, especially with celebrities, and also international media outlets. However, there is always a trade-off, and there will always be a long tail… And we are interested in specific categorisations and in public speakers as sources on Twitter.
  2. What does friending mean on Twitter anyway?
    Good question! More qualitative research is needed to understand that – but there is some work on journalists (only). Maybe people friend people for information management reasons, reciprocity norms, public signal of connection, etc. And also how important are algorithmic recommendations in building your set of friends?

Q&A

Q1 – me) I’m glad you raised the issue of recommendation algorithms – the celebrity issue you identified is something Twitter really pushes as a platform now. I was wondering, though, if you have been looking at how long the people you are looking at have been on Twitter – as behavioural norms change…

A1) It would be possible to collect that, but we don’t now. For journalists and politicians we do gather the list of friends each month to get a longitudinal idea of changes. Over a year, there haven’t been many changes yet…

Q2) Really interesting talk – could you go further with the repertoire? Could there be a discrepancy between the repertoire and its use in terms of retweeting, replying etc.?

A2) We haven’t so far… We could see which types of tweets accounts are favouriting or retweeting – but we are not there yet.

Q3) A problem here…

A3) I am not completely happy to establish preference based on indexes… But I am not sure how else to do this, so maybe you can help me with it.

Analysing digital traces: The epistemological dimension of algorithms and (big) internet data – Katharina Kinder-Kurlanda and Katrin Weller

Katharina: We are interested in the epistemological aspects of algorithms, and in how we research these. So, our research subjects are researchers themselves.

So we are seeing a real focus on algorithms in internet research, and we need to understand the (hidden) influence of algorithms on all kinds of research, including on researchers themselves. So we have researchers interested in algorithms… And in platforms, users and data… But all of these aspects are totally intertwined.

So let’s take a Twitter profile… A user of Twitter gets recommendations of who to follow at a given moment in time, and they see newsfeeds at a given moment in time. That user has context that, as a researcher, I cannot see – nor can I interpret the impact of that context on the user’s choice of, e.g., who they then follow.

So, algorithms observe, count, sort and rank information on the basis of a variety of different data sources – they are highly heterogeneous and transient. Online data can be user-generated content or activity, traces or location data from various internet platforms. That promises new possibilities, but also raises significant challenge, including because of its heterogeneity.

Social media data has uncertain origins; there is uncertainty about users and their motivations, and often uncertain provenance of the data. The “users that we see are not users” but highly structured profiles, the result of careful image-management. And we see renewed discussion of methods and epistemology, particularly within the social sciences – suggestions include “messiness” (Knupf 2014), and ? (Kitchen 2012).

So, what does this mean for algorithms? Algorithms operate on an uncertain basis and present real challenges for internet research. So I’m going to now talk about work that Katrin and I did in a qualitative study of social media researchers (Kinder-Kurlanda and Weller 2014). We conducted interviews at conferences – highly varied – speaking to those working with data obtained from social media. There were 40 interviews in total and we focused on research data management.

We found that researchers developed very individual ways to address epistemological challenges in order to realise the potential of this data for research. And there were three real concerns here: accessibility, methodology, and research ethics.

  1. Data access and quality of research

Here there were challenges of data access, restrictions on the privacy of social media data, and technical skills; adjusting research questions due to data availability; and the struggle for data access often consuming much effort. Researchers talked about difficulty in finding publication outlets, recognition, and jobs in the disciplinary “mainstream” – it is getting better but it is a big issue. There was also comment on this being a computer-science-dominated field – with highly formalised review processes and few high-ranking conferences – which enforces highly strategic planning of resources and research topics. So researchers’ attempts to achieve validity and good research quality are constrained. So, this is really challenging for researchers.

  2. New Methodologies for “big data”

Methodologies in this research often defy traditional ways of achieving research validity – through ensuring reproducibility or sharing of data sets (ethically not possible). There is a need to find patterns in large data sets by analysis of keywords, or automated analysis. It is hard for others to understand the process and validate it. Data sets cannot be shared…

  3. Research ethics

There is a lack of users’ informed consent to studies based on online data (Hutton and Henderson 2015). There is ethical complexity. Data cannot really be anonymised…

So, how do algorithms influence our research data and what does this mean for researchers who want to learn something about the users? Algorithms influence what content users interact with, for example: How do we study user networks without knowing the algorithms behind follower/friend suggestions? How do we study populations?

To get back to the question of observing algorithms: the problem is that various actors, in the most diverse situations, react out of different interests to the results of algorithmic calculations, and may even try to influence algorithms. You see that with tactics around trending hashtags as part of protest, for instance. And the results of algorithmic analyses are presented to internet users without information on how the algorithms took part.

In terms of next steps, researchers need to be aware that online environments are influenced by algorithms, and so are the users and the data they leave behind. It may mean capturing the “look and feel” of the platform as part of research.

Q&A

Q1) One thing I wasn’t sure about… Is your sense when you were interviewing researchers that they were unaware of algorithmic shaping… Or was it about not being sure how to capture that?

A1) Algorithms wasn’t the terminology when we started our work… They talked about big data… the framing and terminology is shifting… So we are adding the algorithms now… But we did find varying levels of understanding of platform function – some were very aware of platform dynamics, but some felt that if they have a Twitter dataset that’s a representation of the real world.

Q1) I would think that if we think about recognising how algorithms and platform function come in as an object… Presumably some working on interfaces were aware, but others looking at, e.g., friendship groups took the data and weren’t thinking about platform function – but that is something they should be thinking about…

A1) Yes.

Q2) What do you mean by the term “algorithm” now, and how is that term different from previously…

A2) I’m sure there is a messiness to this term. I do believe that looking at programmes wouldn’t solve that problem. You have the algorithm in itself, gaining attention… from researchers and industry… So you have programmers tweaking algorithms here… as part of different structures and pressures and contexts… But algorithms are part of a lot of people’s everyday practice… It makes sense to focus on those.

Q3) You started at the beginning with an illustration of the researcher in the middle, then moved onto the agency of the user… And the changes to the analytical capacities of working with this type of data… But how much awareness is there amongst researchers of the data and the tools they work with, and how they are inscribed into the research…

A3) Thank you for making that distinction here. The problem in a way is that we saw what we might expect – highly varied awareness… This was determined by disciplinary background – whether STS researchers in sociology, or computer scientists, say. We didn’t find too many disciplinary trends, but we looked across many disciplines… There were huge ranges of approach and attitude here – our data was too broad.

Panel Q&A

Q1 – Cornelius) I think that we should say that if you are wondering about “feedback” here, it’s about thinking about metrics and how they then feed back into practice, whether there is a feedback loop… from very different perspectives… I would like to return to that – maybe next year when research has progressed. More qualitative understanding is needed. But a challenge is that stakeholder groups vary greatly… What if one finding doesn’t hold for other groups…

Q2) I am from the Wikimedia Foundation… I’m someone who does data analysis a lot. I am curious if in looking at these problems you have looked at recommender systems research which has been researching this space for 10 years, work on messy data and cleaning messy data… There are so many tiny differences that can really make a difference. I work on predictive algorithms, but that’s a new bit of turbulence in a turbulent sea… How much of this do you want to bring this space…

A2 – Katrin) These communities have not come together yet. I know people who work in socio-technical studies who do study interface changes… There is another community that is aware that this work exists, but not so closely… They see these as tiny bits of the same puzzle… And it can be harder to understand for historical data, and to get an idea of what factors influence your data set. In our data sets we have interviewees more like you, and some like the people at sessions like this… There is some connection, but not all of those areas are coming together…

A2 – Cornelius) I think that there is a clash between computational social science data work and this stuff here… That predictive aspect screws with big claims about society… Maybe an awareness but not a keenness. There is older computer science research that we are not engaging with, but should be… But often there is a conflict of interests… I saw a presentation that showed changes to the interface changing behaviour… But companies don’t want to disclose that manipulation…

Comment) We’ve gone through a period – and I am disheartened to see it is still there – in which researchers are so excited to trace human activities that they treat hashtags as the political debate… This community helpfully problematises or contextualises this… But I think that these papers are raising the question of people orientating practices towards the platform, and towards machine learning… I find it hard to talk about that… how behaviour feeds into machine learning… The system tips to behaviour, and the technology shifts and reacts to that, which is hard.

Q3) I wanted to agree with that idea of the need to document. But I want to push at your implicit position that this is messy and difficult and hard to measure… I think that applies to *any* method… Standards of data removal arise elsewhere, messiness occurs elsewhere… Some of those issues apply across all kinds of research…

A3 – Cornelius) Christian would have had an example on his algorithm audit work that might have been helpful there.

Comment) I wanted to comment on social media research versus traditional social science research… We don’t have much power over our data set – that’s quite different in comparison with those running surveys or undertaking interviews, where I have control of that tool… And I think that argument isn’t just about survey analysis, but other qualitative analysis too… Your research design can fit your purposes…

 


Time: Friday, 07/Oct/2016, 4:00pm – 5:30pm
Location: HU 1.205, Humboldt University of Berlin, Dorotheenstr. 24, Building 1, second floor

Presentations

Wiebke Loosen1, Marco T Bastos2, Cornelius Puschmann3, Uwe Hasebrink1, Sascha Hölig1, Lisa Merten1, Jan-Hinrik Schmidt1, Katharina E Kinder-Kurlanda4, Katrin Weller4

1Hans Bredow Institute for Media Research; 2University of California, Davis; 3Alexander von Humboldt Institute for Internet and Society; 4GESIS Leibniz Institute for the Social Sciences

Aug 202015
 

Today I am back for another talk which forms part of the IFIP Summer School on Privacy and Identity Management hosted in Informatics at the University of Edinburgh.

Today’s talk is from Angela Sasse, Professor of Human Centred Technology at University College London, and she also oversees their Computer Security group (her presentation will include work of Anthony Morton). She is also head of the first research group in the UK researching the science of cyber security. Apparently she also authored a seminal paper in the ’90s entitled “Users are not the enemy” which addressed mismatches of perceptions and behaviours. That motif, that users are not the enemy, is still something which has not quite yet been learned by those designing and implementing systems even now.

I think my title gives you a good idea of what I will be talking about: I will be starting with talking about how people reason about privacy. That is something which is often not accounted for properly, but is important in understanding behaviours. Then I will be talking about why current technologies do not meet their preferences. Then I will look to the future – both some dystopian and utopian scenarios there.

So, how do people reason about privacy? Some work with Adams (2001) looked at this, and we used the crucial subtitle “protecting users not just data”. There we pointed out that there is a real difference between how the law treats this, and how people understand privacy. Individuals are pragmatic in their choices; they are thinking about the risks and the benefits – they trade those off. Some of this stuff came out of early internet networking, video calls, etc. but it has stood the test of time as these things have become commonplace.

There has been a raft of research over the last 15 years, not just by computer scientists but also social scientists, ethicists, and economists. And we have come to a place where we understand that people do trade risks for benefits, but that is not always efficient in an economic sense, and it is not always logical… There are a number of reasons for this: they may not be aware of all risks and consequences – around secondary-level information, and around secondary and tertiary usage and aggregation with other data sources; their perception may be skewed by hyperbolic discounting – entirely dismissing things with low risk; and there is a paradox here, as people do believe in privacy and security but their actions are not always reflective of this.

So, why don’t people act in line with their own preferences? Well there is “Confusology” (Odlyzko) which I’ll come back to. Hyperbolic discounting is about risk in the future and potential, vs rewards that are immediate and tangible (sometimes). Sometimes users say “they know this anyway” – there is no point obfuscating information as “they” know this stuff already – they are just testing honesty or willingness. When you have done a lot of work on financial disclosure this argument comes up a lot. It also comes in with ISPs and perceptions of surveillance. Sometimes this reaction is plausible and logical, but sometimes it is much more of a cognitive dissonance defence, something of an excuse to minimise workload. That is also why we really do need to work on the public discourse, because the more false information is in the public discourse, the more this encourages individuals to make choices in that way. The more we allow that kind of nonsense to be out there, the more it undermines important discussions of privacy. The final reason is that technology does offer the protection people want – but they still want the benefits.

Back to Confusology (Odlyzko 2014) – I really recommend Odlyzko’s work here. He talks about several factors: inadvertent disclosure – complex tools make the consequences of actions hard to predict; too much work – rules and legal jargon make privacy too much work, and people are loath to expend effort on tasks they see as secondary to their goal (legal jargon is practically an orchestrated campaign – “I agree with the terms and conditions…” is the biggest lie on the internet!); lack of choice (so consent is not meaningful) – I challenge you to find a provider who offers genuinely meaningful terms of consent; and the hidden persuaders – temptation, nudging, exploiting cognitive biases… encouraging users to think that sharing more is the preferred option. I have seen Google encouraging researchers in privacy to work on “opinionated design” because they have tried everything to get people to click through in the right way – they make warnings different every time, hide other options etc. I think this is a slippery slope. In the privacy area we see this choice as pretty fake, particularly if you hide and obscure other options.

The inadvertent disclosure stuff is still happening. Many users do not understand how technology works and that can catch users out – a key example is peer-to-peer file sharing, but we also see this with apps and the requests they make of your device (use of contacts, data, etc.), and there will be lots more inadvertent disclosures associated with that coming out.

Too much work leads to over-disclosure. Once you are in the habit of doing something, you don’t have to think about it too much. It is less work to fill in a form disclosing information you have given before than to stop and think about what the implications of sharing that data actually are.

We also see successfully adopted technologies that fail on privacy. The Platform for Privacy Preferences (P3P) was far too much work to be useful to many people. It was only IE that implemented it, and they did so in a way that websites could systematically escape cookie blocking. It was too complex and too ambiguous for browser vendors. And there is absolutely no means to verify websites do what they say – 5% of TRUSTe “verified” websites had implementation errors in 2010. This is a place where cognitive dissonance kicks in again – people fixate on something that they see as helping with one form of security and don’t necessarily look at other risks. Meanwhile users of Do Not Track are identified more quickly than non-users through web fingerprinting. And advertisers circumvent it with supercookies.

So, it really isn’t clear what you need to do to ensure that the privacy people want is enabled in websites and tools.

To change tack slightly, it is worth reflecting on the fact that privacy preferences vary. It can be useful to frame this in a Technology Adoption Framework – TAM offers a useful framework – but privacy needs vary across cultures, and between people. You need to speak to different people in different ways to get the message across. Westin’s is a three-point scale around privacy that you could use, but that is too coarse-grained since it basically only differentiates between hardcore secure users, pragmatists, and those unconcerned.

However there have been various studies with the Westin Scale (see Berkeley Survey 2009; Harris Poll 2003; Harris Poll 1999) and most users fall into the Privacy Pragmatists category. But behaviours, when studied, consistently DO NOT match their preferences! So we need something better.

There have been attempts to improve the Westin scale, but there has been limited uptake of alternative measures of privacy concern, e.g. IUIPC (Malhotra et al 2005) and CFIP (Smith et al 1996). And people engage in information-seeking behaviours (Beldad et al 2011): people seek trust signals (trust symbols and trust symptoms) (Riegelsberger et al 2005). Asking people about the provider of a service, and their trust in that provider, is important in terms of understanding their behaviour and their preferences.

So my PhD student (Morton) looked to develop the Westin scale to better align preferences and behaviours, using a mixture of qualitative and quantitative methods, investigating subjective viewpoints. He has been interviewing people, analysing their statements, and ordering those statements with research participants, asking them how well those statements reflected their views. The number of participants (31 offline, 27 online) is relatively small, but the number of statements generated by them was into the thousands – so this is a really complex picture. So, participants ranked statements as important or unimportant with a Q-sort process (a version of a card-sorting task).

Morton has found that people sort into five categories:

  • Information Controllers – those really aware of the data, looking at the data and what it says about them. These are sceptical people who do not have high trust in the cloud and want control over the collection, use and dissemination of personal information. For them, things that are not important include organisational assurances and others’ use of the technology service.
  • Security Concerned – their principal focus is on the security of the technology platform, the providing organisation’s security processes, and the potential impact on personal security and finances. They are trading off the benefits and risks here. They are less interested in the technology in the abstract.
  • Benefit Seekers – those happy to trade off the risks for the benefits.
  • Crowd Followers – trust in others’ use to make decisions about privacy and security.
  • Organisational Assurance Seekers – they look for the organisation to say the right things, disclaimers etc. They expect bad things to happen, and want assurance against that.

Now I think that work is quite interesting. And we are now undertaking a large scale study with 1000 participants in the UK and US with all participants sorted into one of these categories, and several scenarios to assess. The first 300 participants’ contributions already suggest that this is a better model for connecting preference with behaviour.

I did want to talk about why we need to make privacy more salient. Ultimately privacy is about relationships. People manage relationships with other people through selective disclosure of information – that is a fundamental part of how we engage, how we present different personas. As more information is disclosed, that is undermined. And that is most obviously taking place in university admissions or potential employers’ searches for individuals. The inability to make selective disclosures can undermine relationships.

For example: a chocolate biscuit purchase. Someone buys the main shop on card, then buys the chocolate biscuits in cash. It turns out this person’s partner is a health food nut and manages the finances tightly. So that person and their child agree to the healthy food rules at home, but then have access to chocolate biscuits elsewhere. This is how people manage relationships. That sort of lack of disclosure means you do not need to revisit the same argument time and again; it helps illustrate why privacy is so fundamental to the fabric of society.

We do have ways of making privacy costs more salient. There is this trade-off around privacy – we are often told these things are “for your own good”. And without a significant push for evidence that is hard to counter. We don’t force accountability for promised/stated benefits. CCTV in the UK is a great example. It took almost two decades for any investigation into that investment, and when there was research it was all pretty damning (Gill and Spriggs 2005; Metropolitan Police Review 2008 – CCTV only contributes to prevention or resolution in 3% of crime, it is costly, and there is only 1 crime per 100 cameras). And we have had misuse of CCTV also coming through the courts. Investigations into inappropriate behaviour by the London Met Police over a year show that inappropriate disclosure – as in the CCTV case – makes up a huge percentage of the issue.

We have the extension of the state into something of military surveillance. We see the rise of drones, robots and autonomous vehicles. There is an increasing number of networks and devices – and we see mission creep in this “deeply technophilic” industry. We also see machine learning and big data being advertised as the solve-all solution here… But as Stephen Graham notes, “emerging security policies are founded on… profiling” of individuals – a Minority Report state. David Murakami Wood from the Surveillance Studies Network talks about automatic classification and risk-based profiling as adding up to “social sorting”, and we see this with tools like Experian MOSAIC and ACLU Pizza. We must not let this happen without debate, push-back, and a proper understanding of the implications.

Odlyzko raised the issue of who controls the information – it is often big global mega-corporations. The decline of privacy actually undermines the fundamentals of capitalism and the dynamic nature of the market system – a truly dystopian outcome.

So, do people really not care? Post Snowden it can seem that way but there are signs to the contrary: the UK Investigatory Powers Tribunal ruled GCHQ surveillance to be illegal; major tech companies are distancing themselves from government, putting up legal resistance; and deploying better security (encryption) and we see talk of a Digital Charter from Tim Berners Lee, progressing this debate. Privacy protection behaviours are not always obvious though.

We also see the idea that “Digital Natives Don’t Care” – now that is not true, they just care about different things, they engage in “social steganography” hiding in plain sight (boyd 2014).

So, in conclusion: technology has a profound impact on privacy, in many ways that people don’t understand – at least not immediately; people often eagerly assume and overestimate benefits, and underestimate and discount risks; we need to counter this by better communication about risks and benefits; and communication needs to relate to what matters to people with different preferences.

Q&A

Q1) It seems to me that some of the classical social science sources about relationships – what information to ignore and which to note – can be updated and adapted to the modern world, and that you can analogise up to a point.

A1) Yes, you look at this area and there are really three people I always go back to from the 1960s: Goffman, Luhmann and Giddens.

Q1) And more recently Henry Jenkins too.

Q2) From your presentation many people make poor decisions around privacy, but those are pragmatic choices. But I really do think we don’t see people understanding the impact of surveillance – there is a lack of understanding that not only might they look for terrorists, but of the other implications of machine learning and other uses of data, and that that is a level of data use that is not proportionate to the problem.

A2) That is the debate we need to see in the public discourse so urgently. There is a pushing out of tools without any consideration of those implications. Using the language of cost and waste around data can be useful here, but some want a story of the negative consequences in order to make sense of this – for instance someone being denied a job because of errors or disclosure.

Q3) Do you think that education institutions in the United Kingdom have any role to set an example, for themselves or others, by practising what academics would advise?

A3) Online privacy protection is part of the national curriculum now. If I was running a school I wouldn’t want to turn it into a prison – metal detectors etc. But there is also the tracking of learning behaviours and activities, data mining to identify individual learning paths – risks there are also something to think about. It is often the most mundane and banal stories that often hit home: what if someone is worried to search for treatment for a disease, lest their own status be disclosed by that? Being tracked changes behaviour.

Q4) The detection rate of terrorism is so low that it is not just a waste of money, it is also an ineffective method.

A4) But then it is more convenient to sit behind a computer than to actually be out on the street facing direct human interaction and risk, that may also be part of it.

Q5) Going back to the topic of education, there are quite a lot of primary schools in the UK where they are using apps, ebooks etc. Is there…

A5) There are three technologists who did a fantastic study. They found it makes kids more obedient, and they start to behave like people in prison which is damaging to individuals as well as to society. This will foster rather than discourage criminal activity.

Comment) Emmeline Taylor, in Australia, has done a book on how kids respond to technology in schools.

And with that we close a really interesting talk with clear relevance for some of the findings and recommendations coming out of our Managing Your Digital Footprint research work.

Aug 182015
 

All of this week, whilst I am mainly working on Managing Your Digital Footprint research work, there is a summer school taking place at the University of Edinburgh School of Informatics on Security and Privacy with several talks on social media. This afternoon I’ll be blogging one of these: “Policing and Social Media Surveillance : Should We Have any Privacy in Public?” from the wonderful Professor Lilian Edwards from University of Strathclyde and Deputy Director, CREATe.

I come to you as a lawyer. I often say what I do is translate law to geek, and vice versa. How many here would identify themselves as from a legal discipline? (About 10 do.) I know most of you are from a computer science or HCI area. What I will talk about is an overlap between law and computer science.

So, a nice way to start is probably David Cameron saying: “In extremis, it has been possible to read someone’s letter, to listen to someone’s call, to listen in on mobile communications,” he said. “The question remains: are we going to allow a means of communications where it simply is not possible to do that? My answer to that question is: no, we must not.”

I’m going to argue that encryption, privacy, etc. is a good thing and that there should be some aspect of privacy around all of those social media posts we make etc. Now, what if you didn’t have to listen to secret conversations? Well right now the security services kind of don’t… they can use Tumblr, Facebook, Twitter etc..

So, a quick note on the structure of this talk. I will set some context on open source intelligence (OSINT), and Social Media Intelligence (SOCMINT). Then I will talk about legal issues and societal implications.

So, SOCMINT and OSINT. In the last 5-7 years we’ve seen the rise of something called “intelligence-led” policing – some talk about this as the Minority Report world – trying to detect crime before it takes place. We have general risk aversion, predictive profiles, and we see big data. And we see “assemblages” of data via private intermediaries. So we see not only the use of policing and intelligence data, but also a wide range of publicly available data.

There has been the growth of open source intelligence, the kind of stuff that is easy to get for free, including SOCMINT – the stuff people share on social media. You can often learn a great deal from friends graphs, the social graph – even with good privacy settings that can be exposed (it used to always be open) and it is used in friend-of-friends analysis etc. The appeal of this is obvious – there is a lot of it and it is very cheap to get hold of (RUSI and Anderson Report 2015): 95% of intelligence gathered is from this sort of “open source” origin (ISC 2015). There have been a number of reports in the last year with incredibly interesting information included. Another report stated that 90% of what you need to know is from this sort of open source, and it’s great because it is cheap.

In terms of uses (Bartlett and Miller 2013) these are various, but worth noting things like sentiment analysis – e.g. to predict a riot, apparently very useful. Acquiring information from the public – have you seen this robber, etc. – is very useful. Horizon scanning is about predicting disturbances, riots etc. We are also seeing predictive analytics (e.g. IBM with Memphis P.D.; PredPol in Kent), which is very popular in the US, and increasingly in the UK too – towards that Minority Report world. Now, in all of these reports there is talk of prediction and monitoring, but little mention of monitoring individuals – though clearly that is one of the things this data usage enables.

These practices are raising policy challenges (Omand 2012) of public trust, legitimacy and necessity, and transparency. And there is the issue of the European Convention on Human Rights: article 8 gives us the right to a private life, which this sort of practice may breach. Under that article you can only invade privacy for legitimate reasons, only when necessary, and the level of invasion of privacy must be proportionate to the need in society.

So, looking at what else is taking place here in contemporary practice: we had the Summer Riots in 2011, where the security services used tweets, BB texts etc., and the post-riot reports really capture some of the practice and issues there; a Flickr stream of suspect photos led to 770 arrests and 167 charges, alongside the Facewatch mobile app. During the 2012 Olympics the police wanted to use social media data, but basically did not know how. So issues here include police managerial capacity; sampling bias (see “Reading the Riots”), as Twitter is a very partial view of what is occurring; and human error – e.g. in crowdsourced attempts to identify and locate the Boston bombers.

So I want to talk about the possibility of using public social media posts and question whether they have any protection as private material.

An individual tweets something, says she didn’t intend for it to be seen by the police; commentators online say “What planet is this individual on? Her tweets are public domain” – and that is the attitude one tends to see, including in the law courts, e.g. “a person who walks down the street will inevitably be visible” (PG v UK 2008 ECt HR). In the UK that seems to be the standard perspective: that there is no reasonable expectation of privacy when expressing yourself in public.

In the US there is even less privacy of social media posts – cf. Bartow (2011), who says “Facebook is a giant surveillance tool, no warrant required, which the government can use… with almost no practical constraints from existing laws”. There is effectively no idea of privacy in the US constitution.

You’d think that the EU would be better, but where do our traditional concepts of when a “reasonable expectation of privacy” arises apply? Is it in our body, our home (Rynes ECJ 2013), our car? What about data “relating to you” vs the “public sphere” (cf. Koops)?

So, what are the legal controls? Well the Data Protection law seems obvious but there are strong UK exemptions around detection and prevention of crime – so there is no need for consent.

How about the European Convention on Human Rights article 8, the right to a “private life”? The start of my argument is Von Hannover ECtHR (2004), about intrusion by press rather than police – Princess Caroline of Monaco was being followed by the press in all of her activities. The Court said, seminally, that this was absolutely an invasion of her private life – even though she is a public figure in a public sphere. So we have a concept of privacy being beyond the bounds of your home, of being able to have a right to privacy when out in public.

Now, that was an important case… But it hasn’t had that much impact. So you have cases where the police take photos of people (Wood v Metropolitan Police 2008) or CCTV (re application by JR38 for Judicial Review (2015)). In the case of Wood a serial activist was going to a corporate AGM, expected to cause trouble, so police followed him and photographed him. The judge said that he was an activist and well known, and could expect to be followed. The argument was that the image was a one-off thing – not part of an ongoing profile.

The most recent case, which was in Northern Ireland, involved someone caught on CCTV during the NI equivalent of the London Riots. The person in question was a 14 year old, and images were circulated widely, possibly including to the Derry Journal. Again he loses, but in an interesting way. There are at least three judgements.

Lord Kerr says “The fact that the activity… is suspected to be criminal… will not alone be sufficient to remove it from… application of article 8”. That’s a big deal – suspicion of criminal activity isn’t enough for your rights to be exempt. However in this case the second test, whether the intrusion is justified, was found to be met. And they took very little time to decide it was a justified act. Weighing the rights of the individual against the rights of the community to protect itself, they felt this intrusion was justified. They say that he’d benefit too – saying that that 14 year old might be diverted from a life of crime. They lay it on a bit, but they are under pressure to justify why they have not stigmatised this youth through sharing his image. So, an interesting case.

So, there is some expectation of privacy in public, but even so interference can be justified. Interference must be justified as necessary, proportionate and according to law. But security usually seems to win in the UK (Wood, JR38). Even if there is no reasonable expectation of privacy, material may still be part of “private life”. But all of this assumes that you know you are being surveilled, or that your information is being accessed. You may not know if your data is being used to build up profiles, to build up an airport stop list, etc.

Now, in response to Snowden, we have something called RIPA – an envisioned “digital” scheme to cover surveillance of personal data. This scheme covers real-time interception of emails, with a warrant from the Secretary of State needed. But social media isn’t part of this. They just seem to be making up how they manage that data.

Now I want to argue that use of SOCMINT shouldn’t have any special exemption…

Demos in 2013 asserted that “open” SOCMINT collection (and processing) needs no authorisation of any kind. Why? They argued that there is no expectation of privacy so long as the user knew from the T&Cs that public data might be collected, especially via API. I think that is just egregiously stupid… Even if you believed that, it would apply to the platform – not to the police, the rest of the world, etc.

The other argument is the detailed profile argument. And that is that even if we admit that this material is “public”, there is still a part of the ECHR which says that detailed profiles of this sort need to be treated with respect – that comes from the practices of the Stasi and concerns around the possibility of a secret police state; the jurisprudence (Rotaru v Romania) covers this.

So, my perspective is that there is a real difference between structured and unstructured data… Even if in public, is SOCMINT an automatic dossier? With Google, most of the internet is a structured dossier. With that in mind, ECtHR case law has seen structured dossiers maintained over time as a key threat – Rotaru v Romania’s dictum: “public information can fall within the scope of private life where it is systematically collected and stored in files held by authorities”. So does the Rotaru distinction between structured data in files held by police, and unstructured data, hold up in the age of Google and data mining (e.g. Google Spain (ECJ 2014), the UK RIPA case (2015))?

As we move into the internet as the main site for key publishing of data, and as the internet of things and smart cities come online…

Q&A

Q1) Should we be able to do data mining on large sets of social data?

A1) Big data, data mining and the internet of things can be seen as the three horsemen of the apocalypse in a way. And that’s the other talk I could have given. The police, using this sort of data, are using data in a different context, and that isn’t OK under ECHR art 8.

Q2) I remember a paper about a year ago about the distinction between what an individual can do in terms of asking about others etc. They have more rights than the police in some contexts.

A2) There is this weird thing where if you are not looking at specific people, you aren’t as restrained. That’s because it used to be the case that you could find out very little without investigating an individual. That has changed considerably but the law hasn’t been updated to reflect that.

Q3) A lot about us is public, so don’t we just have to deal with this? I see the concerns of a police state, but I don’t understand where you are drawing the line on legal controls on policing. If they can only do the same as a member of the public then there shouldn’t be an issue there…

A3) You’ve given that answer yourself – the power dynamic is asymmetrical. They have the capacity to join data up to their own databases – which may include your being a witness to or victim of crime, not always suspect or perpetrator. There is a lot of black-boxing of data here…

Q3) What controls are you proposing?

A3) Honestly, I don’t know the quick answer. But if we look at the requirements: those for intercepting letters, email and telephone are strict; searching homes, pretending to be a friend etc. are less strict… But that scooping up of mass data is something different in terms of implications, and we need some form of safeguarding around that, even if less strict than some other approaches/interceptions.

There is overwhelming evidence that young people don’t realise the potential implications of their sharing of data, and see these spaces as a private space away from other areas of their life in which they find themselves surveilled. So there is a reasonable presumption of privacy there.

Q3) I think there is a need for appropriate controls on police activities, I agree with that. If I share things only with friends on Facebook and police look at that, that is an investigation. But if I tweet something, it is public…

A3) This is the classic liberal argument I don’t agree with. Tweeting is a bit different. Facebook is the new mall, the new social space; people use openness to serve them socially, believing it will only be read by peers. So they have a reasonable expectation of privacy. Part of Bartlett and Miller’s work is about the use of the word “rape” – in gaming culture it is used for taking a game. Imagine that being crunched. That’s the sort of issue that can arise in big data. I’m not saying police need a warrant for all Twitter data capture; I’m saying we need to think about what is appropriate.

Q4) There is a perspective that taking the UK out of the Human Rights Act is a red herring to distract from other legislation.

A4) Even if we left the Human Rights Act, the UK Government would find many of its protections are embedded in other parts of EU law, so it would still be required to show appropriate respect for individual rights to privacy. But that’s a political conversation really.

Q5) So, in terms of the issues you have raised, how do we understand what is private and what is public data?

A5) I think essentially that we need to safeguard certain points in what has become a continuum in privacy around human rights, something that will set some barriers about the types of interventions that can occur, and what kind of oversight they require.

And with that Lilian’s excellent and information-packed talk is done. Really interesting and there were clearly plenty more questions arising. Particularly interesting for me thinking about the Digital Footprints work, and the legislative context for the research we have been undertaking on student expectations, experiences, practices. 

Apr 272015
 

This afternoon I am attending a talk on the Privacy of Online Social Networks which has been arranged by the Social Network Analysis in Scotland Group (SNAS) and is taking place at the University of Edinburgh. The speakers are Jordi Herrera-Joancomarti, Cristina Perez-Sola, and Jordi Casas-Roma, all from Universitat Autonoma de Barcelona (UAB). I’ll be taking notes throughout, although I think that the talk is also being recorded so may be available later. As ever this is a liveblog so corrections, comments, etc. welcome. (I will also be adding some images from the event later today as some of the processes discussed were quite complex and require illustration!)

We are opening with an introduction to the SNAS group, which meets at the University of Edinburgh on the last Tuesday of every month… They have a mailing list and I’ll add a link here later. Dr Jordi Herrera-Joancomarti is leading the talk, and is an expert on privacy and security.

Dr Jordi H-J: This is collaborative work with my colleagues Cristina and Jordi. My background is not social sciences but mathematics, so it is a good challenge for me to speak to a non-technical audience here… Hopefully there are no scary mathematical equations here! I’ll open with an introduction, talk about Online Social Networks and graph theory, talk about the data you can mine, and then talk about Online Social Network data anonymisation – how you can release data from networks without compromising privacy – before coming to my conclusions.

So, to start, the definition of Online Social Network I am using is an “online service, platform or site that allows users to create a user profile which can be connected with other user profiles of the network…” – a very computer science definition.

So this can be about specialisms like Flickr, LastFM, WikiLoc; specialised formats (e.g. Twitter); scope-limited services (e.g. LinkedIn); general purpose services (e.g. Facebook, Google+) etc. The denomination of connectivity can be network dependent (e.g. Facebook: friends; Twitter: followers). And interactions between user profiles are also network dependent (e.g. Facebook: “like” action, post a message; Twitter: tweet, retweet etc.).

So, why are OSNs interesting or important? Well, they have become an important part of people’s everyday communications, with huge volumes of users. But there is also a book, Big Data (Viktor Mayer-Schonberger and Kenneth Cukier), which includes chapter 5, “Datafication”, talking about the quantification of the world over time from different aspects. So: when words became data (Google Books in 2004); when location became data (GPS); and when relationships became data (OSN). For instance Facebook datafied relationships, most notably with the introduction of the “social graph”.

To graph theory then. A graph is a mathematical tool used to represent objects (nodes) that can be connected by links (edges). OSNs can be modelled using graphs and analysed with graph theory. So… you can represent connections between individuals etc.

There are different OSN properties that determine the type of the corresponding social graph:

– Undirected graphs are those in which an edge has no direction. The Facebook social graph is an undirected graph. So, no arrows between individuals, and no value on an edge.

– Directed graphs (digraphs) are those in which the edges have a direction associated with them. The Twitter social graph is a directed graph. For instance you can follow someone and they don’t have to follow you back… So we have arrows here to indicate connection and direction.

– Weighted graphs assign a weight to every edge in a graph.

So, once you represent the network as a graph you can borrow many analysis tools from graph theory. Let’s try with the degree of a node in an undirected graph… The degree of a node is the number of edges incident to that node, denoted deg(vi).

In a directed graph the same concept applies but it is more complex… The in-degree of a node is the number of head endpoints adjacent to that node, denoted deg−(vi). Similarly, the out-degree is the number of tail endpoints, denoted deg+(vi).

So, in the Facebook social graph the degree of a node is the number of friends of that user. In the Twitter social graph, the in-degree can be seen as the number of followers of that user – a high in-degree may indicate a popular user – and the out-degree can be seen as the number of users that person follows.

We can also talk about the clustering coefficient. The local clustering coefficient of a node is the proportion of edges between the nodes within its neighbourhood divided by the number of edges that could possibly exist between them… So it measures how close the neighbourhood of a node is to being a clique – that is, how well the friends of a node are connected to each other. These techniques can be used to understand user connections and relationships.
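Since degree and clustering coefficient are standard graph measures, a small sketch (using the networkx library – my choice, not something mentioned in the talk) shows how they look on toy Facebook-style and Twitter-style graphs:

```python
import networkx as nx

# Undirected "Facebook-style" graph: degree = number of friends.
g = nx.Graph([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])
print(g.degree("c"))          # 3 friends
print(nx.clustering(g, "c"))  # 1/3: of c's 3 friend pairs, only (a, b) are friends

# Directed "Twitter-style" graph: in-degree = followers, out-degree = following.
d = nx.DiGraph([("a", "b"), ("c", "b"), ("b", "a")])
print(d.in_degree("b"), d.out_degree("b"))  # 2 followers, 1 followed
```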

We study OSN privacy from an information-fundamental point of view, analysing OSN privacy from a graph mining perspective. We do not study specific OSN services, configurations or vulnerabilities. In some cases we do make some assumptions about the type of OSN: open vs closed profiles. For instance Facebook is more difficult to extract data from than Twitter, an open social network.

So there are two kinds of user information that can be extracted:

1) Node information – data about a specific user, details contained in the user’s profile on a specific OSN

2) Edge information – data about the relationship between members of the network – and that is what we are most interested in.

Edge information can, however, directly disclose node attributes – e.g. an edge representing a sentimental relationship between two individuals of the same sex would reveal their sexual orientation. It is more difficult to protect edge information than node information, as it depends on the behaviour of both connected people, whereas node information is controlled by just one user. Relations between users can also reveal communities, and further node attributes.

So, I wanted to explain about data retrieval. How do you obtain social network information? Well, you can ask OSN providers – but many are not that cooperative or put a great deal of restrictions/agreements in the way, and they provide local and/or anonymised data. Or you can take the data from the OSN yourself – that is not always possible and depends on how open the OSN service is. And it is very important to take care over the mechanism used to obtain information, as that may determine the bias of the data you collect.

You can gather data several ways. You can use a web crawler to gather data from an open OSN (like Twitter). Web crawlers are computer programs that retrieve web pages starting from a single (or multiple) page, exploring all its linked pages, and then the pages linked to those, and so on. Since most OSNs are accessed through the web, you can use web crawlers for OSN data retrieval… The process is iterative…

A downloader is the interface between the OSN and the crawler – it downloads user profiles and passes them to the parser, which then parses that data. You draw out the friends of that user and add them to the queue, which contains all the users awaiting exploration, found while crawling each user. And the scheduler selects which user, from the ones in the queue, will be explored next, and sends the decision to the downloader. The scheduler impacts both performance and data quality.

If you are exploring the whole network then it is not so important to consider the crawler details… if I crawl every member I will find all of the connections at the end… the order you gather data in doesn’t matter in that case. BUT you cannot crawl all of the network available now… So you will have to, at some point, decide to take a partial view of the network. So to do that we have to think about notation and assumptions…

So we distinguish: a crawled user is one whose profile information and friends are all known to the crawler (v ∈ V_crawl); a discovered user is one connected to a crawled user; and an explored user is one found through its relationship to a discovered user.

So… for instance a Breadth-First Search (BFS) algorithm would start with one user (h)… you find they have two friends (d and j)… I crawl j and then discover they connect to users l, k and g (and I’ve already discovered d and crawled h)… Then I crawl user d, finding connections to f, e, b, c… the others are already found… Then I crawl l, finding further connections, etc…

So, that is your schedule, the order you crawl, and the idea is that you end up with all the elements of the network… This is quite a linear process. So, this is one approach, and the BFS algorithm produces graph samples quite dissimilar to those from other algorithms you could use.

An alternative approach is Depth-First Search (DFS), which works as a traditional stack: the first nodes to be crawled are the last ones that have been discovered (LIFO management). So, in this approach… if you start with user h… you discover j and d… but the next node you explore is d… then you find connections to f, g, e, b, c… and you next explore node c. At the end you will end up with all the nodes as well, but in a different order than before… So, again, if you do this with a group of users (the example here being 162 Flickr nodes) it looks quite different…
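
Contrasting the two schedulers just described on a toy network: swapping FIFO for LIFO queue management is the only difference (node names follow the talk’s example, but the adjacency list is my reconstruction, so the visiting order depends on how friends happen to be listed):

```python
from collections import deque

# Toy OSN adjacency list (my invented reconstruction of the example).
OSN = {"h": ["d", "j"], "d": ["h", "f", "e", "b", "c"], "j": ["h", "l", "k", "g"],
       "f": ["d"], "e": ["d"], "b": ["d"], "c": ["d"], "l": ["j"], "k": ["j"], "g": ["j"]}

def crawl(seed, pick):
    queue, crawled = deque([seed]), []
    while queue:
        user = pick(queue)           # the scheduler decision
        crawled.append(user)
        for friend in OSN[user]:     # downloader + parser, simulated
            if friend not in crawled and friend not in queue:
                queue.append(friend)  # newly discovered user
    return crawled

print(crawl("h", deque.popleft))  # BFS (FIFO): h, d, j, f, e, b, c, l, k, g
print(crawl("h", deque.pop))      # DFS (LIFO): h, j, g, k, l, d, c, b, e, f
```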

Then you can do more intelligent things… You can use “greedy” algorithms:

– Real-degree greedy (hypothetical greedy or highest-degree crawler) takes its decisions based on the real degree of the nodes in the OSN (which may be unknown to the crawler before the node is crawled). So a user has degree 5, degree 7, etc. based on the edges between different nodes… You can gather the whole network, or you may have restrictions and only capture part of the network…

– Explored-degree greedy (greedy) uses the degree of the nodes as actually known to the crawler so far… So if you graph that you see many, many connections – you look more deliberately at the most connected nodes.

You can also choose to introduce more variance in the network, to randomise your sample to an extent. This can be done with a lottery algorithm… (a sketch of both ideas follows below).
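
The talk names these schedulers without code, so here is my reading of them as selection functions over the queue; the known_degree numbers are invented, and the degree-weighted “lottery” is my interpretation of the randomised variant:

```python
import random

# Discovered users and their degree *as known so far* to the crawler
# (invented numbers; in practice updated as new edges are found).
known_degree = {"l": 1, "k": 3, "g": 5, "f": 2}
queue = list(known_degree)

def greedy_pick(queue):
    # Explored-degree greedy: crawl the discovered user whose known
    # degree is highest, homing in on hubs quickly.
    user = max(queue, key=known_degree.get)
    queue.remove(user)
    return user

def lottery_pick(queue, rng=random.Random(1)):
    # "Lottery" (my reading): randomised selection weighted by known
    # degree, which adds variance to the resulting sample.
    user = rng.choices(queue, weights=[known_degree[u] for u in queue])[0]
    queue.remove(user)
    return user

print(greedy_pick(list(queue)))   # 'g' - the highest known degree
print(lottery_pick(list(queue)))  # random, but biased towards 'g'
```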

So, if you take information from a social network or a social network graph you have to be really well aware of what you are getting – when you sample from different profiles, etc., you need to understand what your sample is actually of. And you can adjust the scheduler to get what you want… you can do that to focus on particular users, or types of users.

Schedulers have implications for privacy… depending on the scheduler you select, there are different implications… Your scheduler can have different objectives for the crawler – taking the privacy attacker’s point of view – and you can then understand which scheduler algorithm fits those objectives most appropriately…

You can also do more tricky things… For instance the classification of users from a graph point of view. So, I want to classify users: identifying the set of categories a new observation belongs to, where the decision is made on the basis of a training set of data containing observations whose category membership is already known. When you try to classify users within the network, you can see link information which may help you to classify a user – connections to a community, for instance.

The idea is that you can see classification as a privacy attack – user classification allows an attacker to infer private attributes of the user. Attributes may be sensitive by themselves, and attribute disclosure may have undesirable consequences for the user. So the design of a user (node) classifier that uses the graph structure alone (no semantic information needed)… So, for instance… we may classify the user, with a neighbourhood analysis to better classify the user… The classifier analyses the graph structure and maps each node to a 2-dimensional sample using degree and clustering coefficient. The output is an initial assignation of nodes to categories…

And you can use that neighbourhood information to classify the node… You can also have a relational classifier, which maps users to n-dimensional samples, using both degree and clustering coefficient and the neighbourhood information to classify users…
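
As an illustration of the structure-only version (the k-NN classifier and the karate-club stand-in graph are my choices, not the speaker’s), mapping each node to the 2-dimensional sample (degree, clustering coefficient):

```python
import networkx as nx
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for a crawled OSN: networkx's built-in karate club graph,
# where each node carries a known community label ("club") to train on.
G = nx.karate_club_graph()
nodes = list(G)
feats = [[G.degree(v), nx.clustering(G, v)] for v in nodes]  # structure only
labels = [G.nodes[v]["club"] for v in nodes]

# Train on most users, then classify the rest from graph structure alone.
clf = KNeighborsClassifier(n_neighbors=3).fit(feats[:-5], labels[:-5])
print(list(zip(nodes[-5:], clf.predict(feats[-5:]))))
```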

So, coming to the issue of data and data release… When you obtain a collection of data… you may have a more anonymised view of the data… You may see connections etc. but without user names, for instance. The intention is to preserve the privacy of users. But is this enough? Well, no… this naive anonymisation potentially reveals huge amounts about the user… if you know other data (other than names), you may be able to deduce who is in the network; you might find one user in the network and thus expose others. Removing the identifiers is not enough… So, you have to do something more elaborate…

One approach is to modify the edges – adding or deleting edges to hinder re-identification… But the problem is that you have two opposite objectives: on the one hand you want to maximise the data utility and minimise the noise in that data; but you also want to preserve users’ privacy…

So, there are different ways to quantify the objective… There are generic information loss (GIL) measures – measures like average distance, diameter, harmonic mean of shortest distance, etc… You want to preserve those in your data. So… you take the original network and compute one metric… you end up with a different, anonymised network, and you can apply the same metric afterwards to compare… In statistical databases you can preserve, say, the mean of all the records of customers who bought boots… If you know the questions to be asked of that data, you know how to keep that anonymised data close to the original data set…
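
A toy before/after comparison of such measures (the perturbation here is arbitrary, purely to show the idea of comparing the same metric on both networks):

```python
import networkx as nx

G = nx.karate_club_graph()  # the "original" network
anon = G.copy()
anon.remove_edge(0, 1)      # a toy perturbation standing in
anon.add_edge(0, 9)         # for a real anonymisation step

# Generic information loss: compare structural metrics before and after.
for name, metric in [("average distance", nx.average_shortest_path_length),
                     ("diameter", nx.diameter)]:
    print(f"{name}: {metric(G):.3f} -> {metric(anon):.3f}")
```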

You can also use specific information loss measures (e.g. for a clustering process)… A similar problem here… You have the original clusters, and you apply the same clustering method to the anonymised (perturbed) version to compare.

So, some measures behave in a similar way independently of the data in which they are gathered.

And then you have the idea of k-anonymity: a model stating that an attacker cannot distinguish between k different records, even if they manage to find a group of quasi-identifiers, and therefore cannot re-identify an individual. In a graph, node degree can be the quasi-identifier… We can presume the attacker may know the degree of some of the nodes in the network… We can preserve the degree sequence, and the ordered degree sequence. And you can measure the k-degree anonymity by counting how many nodes share each degree: if only two nodes in the network have degree 4, and no degree is rarer, then the k-degree anonymity is 2. You can then make use of this to preserve the graph…
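
A quick way to measure this on a graph, as a sketch under the definition just given:

```python
from collections import Counter
import networkx as nx

def k_degree_anonymity(G):
    # A graph is k-degree anonymous if every degree value is shared by
    # at least k nodes: an attacker who knows a victim's degree cannot
    # then narrow them down to fewer than k candidates.
    counts = Counter(dict(G.degree()).values())
    return min(counts.values())

G = nx.Graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
print(k_degree_anonymity(G))  # all four nodes have degree 2 -> k = 4
```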

To modify the graph you can use edge modification (adding and/or deleting edges) or node modification (adding and/or deleting nodes). You can also use uncertain graphs – adding or removing edges “partially” by assigning a probability to each edge: the set of all possible edges is considered and a probability is assigned to each one.

Edge modification can include edge rotation, random perturbation, relevant edge identification, and k-anonymity-oriented anonymisation. These can allow you to keep the data you want to keep, whilst preserving user privacy.
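
As a sketch of one of these, random perturbation might look like the following (the probability parameter and the choice to preserve the edge count are my assumptions, not details from the talk):

```python
import random
import networkx as nx

def random_perturbation(G, p=0.1, seed=42):
    # Delete each edge with probability p, then add the same number of
    # randomly chosen non-edges, hindering re-identification while
    # keeping the total edge count unchanged.
    rng = random.Random(seed)
    H = G.copy()
    removed = [e for e in list(H.edges()) if rng.random() < p]
    H.remove_edges_from(removed)
    H.add_edges_from(rng.sample(list(nx.non_edges(H)), len(removed)))
    return H

G = nx.karate_club_graph()
print(G.number_of_edges(), random_perturbation(G).number_of_edges())  # equal
```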

So, in conclusion: OSNs can be modelled as social graphs and analysed using graph mining techniques. Web crawlers may retrieve sensitive information from OSNs, but the quality of the collected information will depend on the specifics of the scheduler algorithm. Relational classifiers may reveal relevant user information by analysing the graph structure alone… Data anonymisation is needed for releasing OSN data without compromising users’ privacy. This is a research field that is quite new and quite difficult… unlike statistical databases, where you can change one user without impacting on others, any change here does affect the network. And anonymisation algorithms need a trade-off between information loss and user anonymity loss.

Q&A

Q1) You talked about how much stuff is being datafied… Soon with smart watches we’ll have health data available. And because crawlers take some time… things could change whilst you are crawling.

A1) One of the problems in social networks and graph theory is that algorithms for this sort of data are complex and time consuming… And that is a problem… especially at scale. And sometimes you have the information, you do a lot of computation, but the information is not static… so there is a lot of work not only on algorithms but also on understanding differences and changes in the network – what happens when a node is removed, for instance. There are people working on algorithms for dynamic data… but much more work is needed there.

Q2) What kind of research questions have you been using this with?

A2) There are two different issues for me in terms of social sciences… We don’t start with research questions… we start with a problem and try to solve it… So when AOL released data about lots of searches… you could identify individuals from the data… but you shouldn’t be able to… That happens because they don’t understand or care about anonymising data. So we are trying to provide tools to enable that anonymisation. We also have ideas about the crawling approach… As a social network provider you might want to avoid this type of crawler… you might use this approach to trap or mislead the crawler… so the crawler ends up in a dead end… and cannot crawl the network.

Q3) Some of the techniques you showed there were about anonymisation… do you use removal of nodes for that purpose?

A3) There are several approaches involving adding or removing nodes… Sometimes those approaches collapse nodes together… so you anonymise those nodes too… But the techniques most generally used are those that perturb and move the nodes.

Q4) One of the last things you said was about that trade off of utility of analysis and user privacy. My question is who makes that decision about the trade off? Would the people being studied agree with those decisions for instance, in the real world?

A4) The real world is much more complex, of course. The problem is deciding the level of usefulness of the data… At the present time these methods are not used as widely as they could be… For statistical data this is often fixed by government… for instance with Census data you can see the method by which the data has been anonymised. But for OSNs there is nothing of that type, and nobody is telling… and basically no-one is releasing data… Data is money… So if we can provide good algorithms to enable that, then maybe the OSN companies can release some of this kind of data. But at this moment, nobody is putting that idea of privacy forward… Generally the privacy level tends to be low, and the information level high…

Q5) I didn’t totally understand how you set the boundaries of the network… Is it the crawling process?

A5) The idea is that there are no boundaries… The crawler goes… maybe it completes within 1,000 nodes, or 3 hours… or similar. You won’t crawl everything and you want some data. So 10 million users might be the boundary, for instance… Then you have data to look at… So I have 10 million users out of a pool of 500 million… But which ones do I have? How representative are they? That needs consideration…

Q6) The crawler gathers a model of relationships and behaviours, and I’m sure that marketers are very interested. Is there potential to predict connections, behaviours, intentions etc.?

A6) Yes, there are lots of techniques in graph theory that allow that sort of interpretation and prediction. OSNs use these sorts of approaches for recommendations and so on…

Q6) How reliable is that data?

A6) Understanding similarities there can help make it more reliable… similarity rather than distance between nodes can be helpful for understanding behaviour… But I will say that they are quite accurate… And the more information they gather, the more accurate they are…

Q7) I was wondering, when you were talking about measuring the effectiveness of different anonymisation methods… is there a way to take account of additional data that could affect anonymisation?

A7) In computer security in general, when you model a system you have to define the adversary model… what the adversary is able to do… what information the attacker is able to have available… So the more information is available, the harder it is to protect the individual. It is a complex scenario.

Q8) Is there a user-friendly web crawler that can be used by non-technicians?

A8) No, sorry about that… Well, there are some frameworks… but you don’t have one solution to fit all… and the frameworks that exist tend to be more suited to computer science people… Tomorrow in the workshop we will explain extracting information from Twitter… and those techniques will let us explore how we could develop a crawler on Twitter… exploring connections and followers, etc.

Q9) What are the ethics of web crawling in social sciences? And what are the positions of the OSN on that?

A9) You can crawl an OSN when the information is public. So you can crawl Twitter, as the information is public. If you want to crawl Facebook, you have to be authorised by the user to look at the profile… and you need to develop an algorithm to run as an app in Facebook… and have users authorise that… though that doesn’t mean the user understands what they have authorised… For instance in the last US election, the Obama campaign had an application on Facebook that did that – graphing their supporters and friends – and used that in the campaign…

Q9) I was wondering about the crawling of discussion forums… where you cannot get authorisation. And you also mentioned that providers are not keen… is it legitimate to do that?

A9) I think that it is… if you are crawling public information… The OSN not liking it is another matter – they can put some restrictions in place. And if I do things that work around those OSN restrictions, that is fine… you can do that.

Q10) I wanted to follow up on that… There are legal and ethical issues associated with crawling websites, and you have to consider them extremely carefully. If I use a website that says it does not allow crawlers, I don’t expect it to be crawled, and crawling it would not be legal under data protection law. And some research about 10 years ago found that bloggers, although posting in public, didn’t expect to be analysed and interpreted… You do have to think about the ethics here… and you need to think about the user’s expectation when they put the data up.

A – Christina) Everyone uses Google – when you put something on the internet you have to expect it to be crawled.

A – Jordi) From my perspective, as a researcher doing cryptography, what you say is quite strange… My work is about protecting information… It assumes people will not necessarily be trustworthy with your information…

Q10) No, I’m saying an ethical researcher should not be breaking the law.

Comment) There can be an expectation of privacy in a “public” space…

Comment – from me) I would recommend the Association of Internet Researchers Ethics Guide for more on how you can mediate expectations of users in your research. For your cryptography work that may not be as relevant, but for others in this audience that guide is very helpful for understanding ethical research processes, and for thinking about appropriate research methods and approaches for ethical approval.

And with a gracious close from Jordi, we are done! There is a workshop running tomorrow on this type of analysis – I won’t be there but others may be tweeting or blogging from it.