Cambridge Analytica Harvested 50 Million Facebook Profiles
A whistleblower has revealed to the Observer how Cambridge Analytica – a company owned by the hedge fund billionaire Robert Mercer, and headed at the time by Trump’s key adviser Steve Bannon – used personal information taken without authorisation in early 2014 to build a system that could profile individual US voters, in order to target them with personalised political advertisements.
[…]
Documents seen by the Observer, and confirmed by a Facebook statement, show that by late 2015 the company had found out that information had been harvested on an unprecedented scale. However, at the time it failed to alert users and took only limited steps to recover and secure the private information of more than 50 million individuals.
[…]
The data was collected through an app called thisisyourdigitallife, built by academic Aleksandr Kogan, separately from his work at Cambridge University. Through his company Global Science Research (GSR), in collaboration with Cambridge Analytica, hundreds of thousands of users were paid to take a personality test and agreed to have their data collected for academic use.
However, the app also collected the information of the test-takers’ Facebook friends, leading to the accumulation of a data pool tens of millions-strong. Facebook’s “platform policy” allowed only collection of friends’ data to improve user experience in the app and barred it being sold on or used for advertising.
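The jump from hundreds of thousands of test-takers to tens of millions of profiles is simple multiplication. Here is a back-of-the-envelope sketch. Every input is an illustrative assumption (install count, average friend count, and the share of friend profiles left after removing overlap); none of them is a figure from the reporting.

```python
def estimated_reach(installs: int, avg_friends: int, unique_percent: int) -> int:
    """Estimate how many distinct profiles an app can see when each install
    also exposes the installer's friends.

    unique_percent is the assumed share (0-100) of friend profiles that are
    not duplicates of one another.
    """
    if not 0 <= unique_percent <= 100:
        raise ValueError("unique_percent must be between 0 and 100")
    # Integer arithmetic keeps the estimate exact and reproducible.
    friend_profiles = installs * avg_friends * unique_percent // 100
    return installs + friend_profiles


# Assumed inputs: 300,000 installs, 200 friends each, 85% of friends unique.
reach = estimated_reach(installs=300_000, avg_friends=200, unique_percent=85)
print(f"{reach:,} profiles")
```

Under these assumed inputs the estimate lands in the tens of millions, even though only the installers themselves ever saw a consent screen.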
Chief Security Officer of Facebook @alexstamos says that Cambridge Analytica misusing the data from 50M profiles was a feature of their platform at the time.
Cool man. Great PR work.
I have deleted my Tweets on Cambridge Analytica, not because they were factually incorrect but because I should have done a better job weighing in.
Facebook was doing things covered under the ToS. For the first time in the history of Facebook — and countless people like me screaming about it for years — people decided to be upset.
CA acted dishonestly in using an unrelated quiz to harvest user and friends’ profile, etc data, but it really isn’t any different than what a ton of people were doing at the time. That’s on Facebook, and on them for not notifying the public about it when they discovered it.
If your API allows access to more data than I’m granted, that’s a vulnerability. And if I access it, that’s a breach. The honor system is not a valid layer of defense in depth.
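The point about the honor system can be made concrete: the server, not the terms of service, should decide which fields leave the building. Below is a minimal sketch of deny-by-default field filtering. The scope names, field names, and the `filter_profile` helper are all hypothetical and do not describe Facebook's actual API.

```python
from typing import Any

# Hypothetical mapping from each profile field to the scope that unlocks it.
FIELD_SCOPES = {
    "name": "basic",
    "email": "contact",
    "friends": "friends",
    "messages": "inbox",
}


def filter_profile(profile: dict[str, Any], granted_scopes: set[str]) -> dict[str, Any]:
    """Return only the fields whose required scope was explicitly granted.

    Fields with no entry in FIELD_SCOPES are dropped, so a newly added
    field is never exposed by accident.
    """
    return {
        field: value
        for field, value in profile.items()
        if FIELD_SCOPES.get(field) in granted_scopes
    }


profile = {
    "name": "Ada",
    "email": "ada@example.com",
    "friends": ["Grace", "Alan"],
    "shoe_size": 38,  # unmapped field: never returned
}
print(filter_profile(profile, {"basic"}))
```

With this shape, an app that was granted only the basic scope cannot receive the friend list no matter what it asks for, so there is nothing left for a policy document to police.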
This was not a security breach. This is simply what Facebook is: a massive surveillance machine.
“This was a scam — and a fraud,” Paul Grewal, a vice president and deputy general counsel at the social network, said in a statement to The Times earlier on Friday. He added that the company was suspending Cambridge Analytica, Mr. Wylie and the researcher, Aleksandr Kogan, a Russian-American academic, from Facebook.
So the Cambridge Analytica Whistleblower has been ‘depersonned’ by @facebook without any chance to retrieve his contacts or private materials.
Facebook preempted the publication of both of these stories with a press release indicating that they’ve suspended Strategic Communications Laboratories — Cambridge Analytica’s parent — from accessing Facebook, including the properties of any of their clients.
However, the reason for that suspension is not what you may think: it isn’t because Kogan, the developer of the thisisyourdigitallife app, passed information to Cambridge Analytica, but rather because he did not delete all of the data after Facebook told him to.
[…]
Facebook can make all the policy changes it likes, but I don’t see any reason why something like this can’t happen again at some point in the future.
Facebook is a machine built to collect your personal information and hand it to others, en masse. Not surprised that a hostile actor acquired that information. I expect there are many, many, many more that we will never hear about.
[…]
Anyone who builds a Facebook app (and any rookie can do this) has access to an absurd amount of information about you and your loved ones. And there is nothing stopping them from giving it away, besides the “Terms”.
It’s been said many times before but it takes a while to sink in: The cloud is just someone else’s computer. If you’re giving up your data or attention in exchange for free social, mail, messaging, photograph, document, or other transit or storage, then you’re really just taking the drive from your computer, unencrypted, and mailing it to those companies to do with it whatever they will.
[…]
The only thing we can do is delete Facebook. And Messenger, and Whatsapp, and Instagram, and every app like them.
There is a widespread belief that Facebook is a frivolous thing people should just quit. Two billion people use it. For many of them, it is the Internet. For others, it’s the only way to stay in contact with family or loved ones. Facebook has worked hard to get ubiquitous.
In large areas of the Third World, Facebook has offered free data plans as long as you stay on the site. WhatsApp and Messenger are integral parts of people’s lives. Before you say ‘just get off Facebook’, ask yourself if you really understand what Facebook is (I know I don’t).
The company routinely ignores or downplays the worst-case scenarios, idealistically building products without the necessary safeguards, and then drags its feet to admit the extent of the problems.
[…]
Here’s an incomplete list of the massive negative consequences and specific abuses that stem from Facebook’s idealistic product development process.
Google is already facing significant antitrust challenges in the E.U., which is exactly what you would expect from a company in a dominant position in a value chain able to dictate terms to its suppliers. Facebook, meanwhile, has always seemed more immune to antitrust enforcement: its users are its suppliers, so what is there to regulate?
That, though, is the answer: user data. It seems far more likely that Facebook will be directly regulated than Google; arguably this is already the case in Europe with the GDPR. What is worth noting, though, is that regulations like the GDPR entrench incumbents: protecting users from Facebook will, in all likelihood, lock in Facebook’s competitive position.
This episode is a perfect example: an unintended casualty of this weekend’s firestorm is the idea of data portability: I have argued that social networks like Facebook should make it trivial to export your network; it seems far more likely that most social networks will respond to this Cambridge Analytica scandal by locking down data even further.
Dean:
The dark patterns @facebook uses to get me to give access to my personal contacts in Messenger are pretty sickening and shouldn’t be allowed on the @AppStore.
- No option for “No”
- “Learn More” leads to a real option
- In-app notification shaming
- Push notification shaming
Update (2018-03-23): Bob Burrough:
The con-job is that this is a Facebook-specific “breach,” and therefore theirs to address. The problem is much bigger than that. Why are the New York Times, CNN, and The Guardian reporting what you’re reading to Facebook?
never forget you also give up data to Facebook by not ever signing up for Facebook and just visiting any web page with a like button 🙃
But while Facebook has been on the receiving end of some heated and justified media criticism for its privacy abuses, that criticism feels detached from a broader context: namely that we’ve increasingly approved of the wholesale collection and sale of our private data without anything even vaguely resembling transparency, accountability, or oversight.
Nothing personifies this more clearly than the telecom industry, which has been gobbling up and selling consumer data on an industrial scale for the better part of the last few decades. Often with only an iota of the outrage we’ve already seen during Facebook’s latest scandal.
More than a decade ago, ISPs like Comcast began hoovering up your clickstream data (data on every website you visit) and selling it with little accountability and absolutely no transparency. When press outlets back then asked ISPs about what data they were collecting, most would simply refuse to respond. And regulators (and most press outlets) saw no real problem with that.
I’ve written software against the Facebook API, and accessing information about the social graph is part of the API. We may not like what Cambridge Analytica did with the data, but I don’t think they did anything that every other company that makes products that work with Facebook doesn’t already do. Including of course Facebook itself.
The API conundrum(s):
--legit researchers using APIs to expand human knowledge, track fake news and abuse, etc = GOOD
--fake researchers siphoning data for Cambridge Analytica = BAD
--APIs open enough to allow competitive/innovative use of data with user permission = GOOD
Still, it seems to me that a lot of these wounds are self-inflicted. Not just in choices the company makes from a product and policy standpoint, but also how they choose to react to issues when they arise. Even on Friday night, when it seemed like they were doing the right thing by making a swift, decisive move around a very complicated situation, it turns out, no — Facebook was simply reacting quickly because publications were about to run stories about the pilfering of data from their network for mass political profiling. And what’s worse, Facebook was apparently threatening said publications if they ran said stories.
Sandy Parakilas, the platform operations manager at Facebook responsible for policing data breaches by third-party software developers between 2011 and 2012, told the Guardian he warned senior executives at the company that its lax approach to data protection risked a major breach.
“My concerns were that all of the data that left Facebook servers to developers could not be monitored by Facebook, so we had no idea what developers were doing with the data,” he said.
Parakilas said Facebook had terms of service and settings that “people didn’t read or understand” and the company did not use its enforcement mechanisms, including audits of external developers, to ensure data was not being misused.
Facebook Inc. tried to get ahead of its latest media firestorm. Instead, it helped create one.
The company knew ahead of time that on Saturday, the New York Times and The Guardian’s Observer would issue bombshell reports that the data firm that helped Donald Trump win the presidency had accessed and retained information on 50 million Facebook users without their permission.
Facebook did two things to protect itself: it sent letters to the media firms laying out its legal case for why this data leak didn’t constitute a "breach." And then it scooped the reports using their information, with a Friday blog post on why it was suspending the ad firm, Cambridge Analytica, from its site.
It’s not just that he’s silent in public. Facebook CEO and co-founder Mark Zuckerberg declined to face his employees on Tuesday to explain the company’s role in a widening international scandal over the 2016 election.
[…]
Nor, The Daily Beast has learned, did chief operating officer Sheryl Sandberg attend the internal town hall.
Mr. Stamos, who plans to leave Facebook by August, had advocated more disclosure around Russian interference of the platform and some restructuring to better address the issues, but was met with resistance by colleagues, said the current and former employees. In December, Mr. Stamos’s day-to-day responsibilities were reassigned to others, they said.
Mr. Stamos said he would leave Facebook but was persuaded to stay through August to oversee the transition of his responsibilities and because executives thought his departure would look bad, the people said. He has been overseeing the transfer of his security team to Facebook’s product and infrastructure divisions. His group, which once had 120 people, now has three, the current and former employees said.
So Facebook is forcing out Stamos, the one executive with the moral backbone to do the right thing in response to what they’d allowed to happen.
First, we will investigate all apps that had access to large amounts of information before we changed our platform to dramatically reduce data access in 2014, and we will conduct a full audit of any app with suspicious activity. We will ban any developer from our platform that does not agree to a thorough audit. And if we find developers that misused personally identifiable information, we will ban them and tell everyone affected by those apps. That includes people whose data Kogan misused here as well.
Second, we will restrict developers’ data access even further to prevent other kinds of abuse. For example, we will remove developers’ access to your data if you haven’t used their app in 3 months. We will reduce the data you give an app when you sign in -- to only your name, profile photo, and email address. We’ll require developers to not only get approval but also sign a contract in order to ask anyone for access to their posts or other private data. And we’ll have more changes to share in the next few days.
Third, we want to make sure you understand which apps you’ve allowed to access your data. In the next month, we will show everyone a tool at the top of your News Feed with the apps you’ve used and an easy way to revoke those apps’ permissions to your data. We already have a tool to do this in your privacy settings, and now we will put this tool at the top of your News Feed to make sure everyone sees it.
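The second commitment above (access lapses after three months of disuse, and login shares only name, profile photo, and email address) can be sketched as a server-side check. This is only an illustration under stated assumptions: "3 months" is approximated as 90 days, and the field names and function names are hypothetical, not Facebook's implementation.

```python
from datetime import datetime, timedelta

# Assumption: "3 months" approximated as 90 days.
INACTIVITY_LIMIT = timedelta(days=90)

# The three fields the post says login will share; the key names are hypothetical.
DEFAULT_LOGIN_FIELDS = ("name", "profile_photo", "email")


def access_allowed(last_used: datetime, now: datetime) -> bool:
    """True while the user has opened the app within the inactivity limit."""
    return now - last_used <= INACTIVITY_LIMIT


def login_payload(profile: dict, last_used: datetime, now: datetime) -> dict:
    """Return the reduced login data, or nothing once access has lapsed."""
    if not access_allowed(last_used, now):
        return {}
    return {f: profile[f] for f in DEFAULT_LOGIN_FIELDS if f in profile}


profile = {
    "name": "Ada",
    "profile_photo": "ada.jpg",
    "email": "ada@example.com",
    "posts": ["hello world"],  # not part of the default login data
}
now = datetime(2018, 3, 21)
print(login_payload(profile, last_used=datetime(2018, 2, 1), now=now))
print(login_payload(profile, last_used=datetime(2017, 11, 1), now=now))
```

A check like this only limits what an app can fetch from now on. It does nothing about copies of data that already left the platform, which is the problem the first commitment's audits are meant to address.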
The problem with Zuckerberg’s post is this. In 2011, FB was caught deceiving people about how it violated their privacy. It signed an agreement w/the FTC pledging to stop doing that. Today, Zuckerberg is outlining the steps he promised to take in 2011.
They did not disclose this at the time, nor did they notify the fifty million users whose information was accessed by Cambridge Analytica. So their claim in their press statement that they felt deceived is bunk: they knew, and did nothing when it mattered first.
Dear Mark Zuckerberg, you offered interviews to lots of outlets but not the @guardian & Observer. We broke the story first in 2015. We led the reporting last weekend. You used legal threats to try and stop us. And now, you’re... ignoring us?
This is 100% right. Zuckerberg threatening to sue the outlets who broke the stories while giving interviews to the ones who didn’t shows that the leadership of Facebook is a part of the problem.
Zuckerberg’s multiple apologies are undercut by a ruthless legal strategy to attack critics in the press, a huge lobbying operation against things like the Honest Ads Act, and massive financing of researchers and academics through dollars and access to data.
Facebook was so kind as to offer up each user’s unique Facebook User_ID when it returned these data requests. This means that all the data from all the different apps, quizzes and games can be immediately and instantly recombined into one massive database… just like Facebook’s!
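To illustrate why a shared identifier matters, here is a small sketch of how records from unrelated apps could be recombined once they all carry the same user ID. The record layout, the `user_id` key, and the sample data are made up for illustration.

```python
from collections import defaultdict


def merge_by_user_id(*datasets: list[dict]) -> dict:
    """Combine records from several sources into one profile per user ID.

    Each record is a dict with a "user_id" key plus whatever that source
    collected. Later sources overwrite earlier ones when keys collide.
    """
    merged: dict = defaultdict(dict)
    for records in datasets:
        for record in records:
            uid = record["user_id"]
            merged[uid].update(
                {key: value for key, value in record.items() if key != "user_id"}
            )
    return dict(merged)


# Made-up exports from two unrelated apps.
quiz_app = [{"user_id": 1, "personality": "open"}]
game_app = [
    {"user_id": 1, "likes": ["chess"]},
    {"user_id": 2, "city": "Oslo"},
]
print(merge_by_user_id(quiz_app, game_app))
```

The join is a few lines long because the identifier is stable across apps. If each app received its own per-app identifier instead, no shared key would exist to join on.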
[…]
To give a sense of how many apps were out there doing this: here’s an AdWeek article back in 2012, quoting Facebook as saying there were 9 million apps and websites integrated with Facebook. And 2012 was three years before Facebook cut off API access to pulling this kind of data.
[…]
For the longest period of time, Facebook was an advertising business that dreamed of being something else other than an advertising business. It wanted to be a platform.
[…]
And if those are the grand illusions that you’ve got, it’s not your proprietary data that you view as the secret to your success (which you only need to advertise). Instead, it’s developers, and getting them to build on top of your precious platform.
FB is incentivized to keep your data only to themselves. So ONLY THEY can target with it.
We’ll never let apps do this again!
Ya, I bet you won’t. Why WOULD you give them free data when you can charge for it, per ad.
In a wide-ranging interview with Recode this afternoon, the Facebook CEO and co-founder said that he would appear before legislators if he was the “right” one inside the company to give lawmakers information about what happened.
You deserve to have your information protected - and we’ll keep working to make sure you feel safe on Facebook. Your trust is at the core of our service. We know that and we will work to earn it.
Facebook: here’s a photo montage of your random friend anniversary we send you every week!
Also Facebook: we’re not sure we can notify people affected by Cambridge Analytica because we’re not sure if we know who your friends were in 2014
In the process of deleting my little used #Facebook account, I’ve downloaded my data & found worrying things…
This is bonkers. I definitely never authorized Facebook to share this information.
Privacy settings on Facebook are sadly opt-out. When Facebook introduces a new privacy invading feature (like facial recognition), it’s always on by default.
If you need any more evidence for how important selling your info is to Facebook, look no further than how long it takes to opt out of everything you can.
If you can’t quite bring yourself to close down your account - maybe there’s a support group or family connections you’d like to keep active - then here’s how to restrict the amount of data Facebook has got on you.
A few years back, I reworked my Facebook account to lock down my personal information; given everything going on with the social media giant this week, I figured I’d walk everyone through the steps I took to keep Facebook from accidentally broadcasting valuable data to the world.
Something Apple would never do, but should - indicate on the App Store page for each app which analytics SDKs are included within it.
Update (2018-03-25): Taryn Luna (via Hacker News):
The California Consumer Privacy Act would require big companies to disclose the type of information they gather, explain how it is shared or sold and give people the right to prevent businesses from spreading their personal data.
The initiative has months to qualify for the November ballot and will likely become one of the most expensive fights this year.
Google, Facebook, AT&T, Verizon and Comcast have contributed $200,000 each to a campaign finance committee opposing the initiative since mid-February. The proponents, a trio of Bay Area business professionals, expect the Internet behemoths will eventually pour in over $100 million to try to stop the measure from passing.
brockhopper (via Sonya Mann):
What was the Facebook friend suggestion that made you go “OK, that’s just creepy, how did FB know to suggest them”?
After I changed all my Facebook settings and deleted API access, the next time I opened Messenger I saw these two screens trying to trick me into giving Facebook full Address Book access. Shady as hell.
The New York Times apparently offers powerful third parties the ability to edit away–that is, to delete from the internet–unfavorable coverage appearing in the paper of record’s online edition.
[…]
The Times’ original story made reference to Facebook COO Sheryl Sandberg–and mentioned her “consternation” at Stamos’ efforts to shepherd the tech giant towards being more transparent about Russian trolls’ electoral interference.
Among other things (all correct), Zeynep explains that “Facebook makes money, in other words, by profiling us and then selling our attention to advertisers, political actors and others. These are Facebook’s true customers, whom it works hard to please.”
Irony Alert: the same is true for the Times, along with every other publication that lives off adtech: tracking-based advertising. These pubs don’t just open the kimonos of their readers. They treat them as naked beings with necks bared to vampires ravenous for the blood of personal data, all ostensibly so those persons can be served with “interest-based” advertising.
Apple is complicit with the power Facebook has amassed by refusing to provide their own identity management service.
Zuck wants regulation because it serves him. Not because it’s doing the right thing.
Facebook is gonna turn this into an opportunity to strengthen the walls of its data silo, invite regulation that disadvantages new entrants, & avoid conversations about their propaganda amplification machine.
I don’t understand the take that this is bad for FB. This was a gift.
Update (2018-03-27): Josh Constine (Hacker News):
Meanwhile, if the government instituted new rules for tech platforms collecting personal information going forward, it could effectively lock in Facebook’s lead in the data race. If it becomes more cumbersome to gather this kind of data, no competitor might ever amass an index of psychographic profiles and social graphs able to rival Facebook’s.
The ironic thing about the Facebook data mess is after they get regulated other advertising companies will need huge legal and compliance teams to deal with the new regulations.
The regulations could actually build a nearly insurmountable moat for FB.
The message is clear: Zuckerberg thinks we’re idiots. How are we to believe Facebook didn’t know about, and derive benefits from, the widespread abuse of user data by its developers? We just became aware of the Cambridge Analytica cockroach… how many more are under the sink? In more lawyerly terms: “What did you know, and when did you know it?”
Ben and James discuss Facebook’s current crisis, and why almost everyone misunderstands what the company did wrong: the problem isn’t advertising, it was Facebook’s desire to be a platform.
Apple handed over the role of managing our identities to Facebook - with their system level account login control
So the best thing that Apple could do for users - to protect their privacy - would be provide a better alternative that did so
The worst thing from a privacy POV would be to bury their head in sand...not offer a safer alternative, and push their users to G/FB without privacy
[…]
see the most recent update to Safari with Intelligent Tracking Prevention
It solidifies FB/Goog monopoly - while destroying market competition in online ad marketplace (from strategic POV, that’s the last thing Apple wants)
Facebook responded to reports that it collected phone and SMS data without users’ knowledge in a "fact check" blog post on Sunday.
[…]
This contradicts the experience of several users who shared their data with Ars. Dylan McKay told Ars that he installed Messenger in 2015, but only allowed the app the permissions in the Android manifest that were required for installation. He says he removed and reinstalled the app several times over the course of the next few years, but never explicitly gave the app permission to read his SMS records and call history. McKay’s call and SMS data runs through July of 2017.
In my case, a review of my Google Play data confirms that Messenger was never installed on the Android devices I used. Facebook was installed on a Nexus tablet I used and on the Blackphone 2 in 2015, and there was never an explicit message requesting access to phone call and SMS data. Yet there is call data from the end of 2015 until late 2016, when I reinstalled the operating system on the Blackphone 2 and wiped all applications.
For what it’s worth, this story applies only to Android users, because of course it does; iOS has never allowed a third-party app to silently monitor call or messaging history.
Oh! Guys. We just misunderstood! Everything is on the up-and-up here. Let’s go have a cup o’ tea!
When an app uses the Facebook SDK, Facebook gets access to the same permissions that the containing app has. Let that sink in.
[…]
Using VSCO, you’d have no idea it’s talking to Facebook. We wager they’re just using it to track ad conversion, but who knows? Sadly, the web has tools like Ghostery to block trackers, but there’s no such solution for mobile apps.
On a locked down platform such as iOS, your privacy and security are entirely in the hands of the OS vendor. On an open platform such as macOS, you can take your life into your own hands. Little Snitch on iOS? No. Reverse engineering 3rd party apps on iOS? Not without jailbreak.
I find it incomprehensible how Google-associated people still comment critically on Facebook’s business practices when 84% of their revenue (and what pays for all the free services and research) comes from precisely the targeted advertising that’s suddenly so contemptible.
Want to freak yourself out? I’m gonna show just how much of your information the likes of Facebook and Google store about you without you even realising it
Update (2018-03-29): See also: The Menu Bar.
i think one of the reasons facebooks reaction to the past few weeks seems so caught off guard is that this level of data collection and manipulation has literally been the standard for years
imagine them wondering “why does everyone suddenly care now?”
Facebook successfully managed to keep Instagram out of this debate, but as far as I know, it’s basically a different UI on the same platform at this point. What percentage of users connect IG accounts to FB? Must be >80%.
Update (2018-03-30): BuzzFeed:
The Bosworth memo reveals the extent to which Facebook’s leadership understood the physical and social risks the platform’s products carried — even as the company downplayed those risks in public. It suggests that senior executives had deep qualms about conduct that they are now seeking to defend. And as the company reels amid a scandal over improper outside data collection on its users, the memo shows that one senior executive — one of Zuckerberg’s longest-serving deputies — prioritized all-encompassing growth over all else, a view that has led to questionable data collection and manipulative treatment of its users.
Update (2018-04-02): Vox (Hacker News, MacRumors):
Ezra Klein: One of the things that has been coming up a lot in the conversation is whether the business model of monetizing user attention is what is letting in a lot of these problems. Tim Cook, the CEO of Apple, gave an interview the other day and he was asked what he would do if he was in your shoes. He said, “I wouldn’t be in this situation,” and argued that Apple sells products to users, it doesn’t sell users to advertisers, and so it’s a sounder business model that doesn’t open itself to these problems.
[…]
Mark Zuckerberg: You know, I find that argument, that if you’re not paying that somehow we can’t care about you, to be extremely glib and not at all aligned with the truth. The reality here is that if you want to build a service that helps connect everyone in the world, then there are a lot of people who can’t afford to pay. And therefore, as with a lot of media, having an advertising-supported model is the only rational model that can support building this service to reach people.
[…]
But if you want to build a service which is not just serving rich people, then you need to have something that people can afford. I thought Jeff Bezos had an excellent saying on this in one of his Kindle launches a number of years back. He said, “There are companies that work hard to charge you more, and there are companies that work hard to charge you less.” And at Facebook, we are squarely in the camp of the companies that work hard to charge you less and provide a free service that everyone can use.
Update (2018-04-03): Josh Barro:
I don’t think this is a very good line for Zuckerberg. Apple is a company that works hard to charge you more. Amazon is a company that works hard to charge you less. Facebook is a company that works hard to charge someone else more for access to you.
Fair Zuckerberg counterpunch to Tim Cook. BUT. Apple has a 27% operating profit margin and Facebook is 50%. So Facebook is making a healthy amount from its paying customers (advertisers).
Jobs told me that Apple had held unsuccessful talks with Facebook about a variety of unspecified partnerships related to Ping. The reason, according to Jobs: Facebook wanted “onerous terms that we could not agree to,” related to connecting with Facebook friends on Ping.
Jobs let that word hang in the air and even raised a disdainful eyebrow when I asked what he meant, including whether Ping would incorporate connecting with Facebook or even using Facebook Connect, which would make it much easier to find friends to share music with.
“We could, I guess,” he shrugged without much enthusiasm for Ping and, most of all, for linking Apple customers with Facebook.
If Zuckerberg really is holding the sales team back from doing even more intrusive things, as he suggests, I don’t find that a comforting thought that leaves me feeling better about Facebook.
The linguistic trick Zuckerberg pulls here is that nowhere in the entire interview does he mention the words user or customer. He only says you (in the plural sense) and people. That’s a dodge, because unlike Apple — and Amazon — Facebook’s users are not its customers — and most of the controversies they are dealing with today all stem from the fact that they favored their customers (advertisers willing to pay ever-higher sums for ever-more-invasively-targeted ads) at the expense of their users.
Update (2018-04-05): Olivia Solon (Hacker News):
The Facebook data of up to 87 million people – 37 million more than previously reported – may have been improperly shared with Cambridge Analytica, the company has revealed.
This larger figure, which included over a million UK users, was buried in the penultimate paragraph of a blogpost by the company’s chief technology officer, Mike Schroepfer, published on Wednesday, which also provided updates on the changes Facebook was making to better protect user information.
The drip-drip-drip PR strategy is an old trick, and Facebook utilizes it every time they have bad news involving a number of users.
Update (2018-04-06): Josh Constine (Hacker News):
Facebook admits it deleted Fb messages sent by Zuckerberg & other execs from non-employees’ inboxes with no disclosure. Seems like a breach of trust to me.
Facebook now acknowledges it has a two-tiered privacy system in which regular users have to live with their dumb old texts forever and the CEO’s disappear into a memory hole. Let’s remember that next week when they tell Congress how seriously they take our privacy
Update (2018-04-10): Issie Lapowsky:
The data consulting firm Cambridge Analytica, which harvested as many as 87 million Facebook users' personal data, also could have accessed the private inbox messages of some of those affected. Facebook slipped this previously undisclosed detail into the notifications that began appearing at the top of News Feeds on Monday. These alerts let users know whether they or their friends had downloaded a personality quiz app called This Is Your Digital Life, which would have caused their data to be collected and potentially passed on to Cambridge Analytica.
Update (2019-10-21): Jason Kint:
Finally. Here in SEC docs is what Facebook has painfully avoided public knowing and press has mostly missed documenting. Facebook data was ****SOLD**** to Cambridge Analytica. Can everyone please now say that Facebook personal data was sold rather than captured, transferred, etc?