Do Not Track — Blogs

This page collects blog posts from researchers, privacy advocates and tech companies leading the way on Do Not Track. To be notified of new posts, subscribe to the RSS feed or follow @DoNotTrack.

Gone Fishin'

by Christopher Soghoian (Slight Paranoia) at August 15, 2013 07:30 PM UTC
This blog is not currently active. If you want to see what I'm up to, find me on Twitter at @csoghoian or at the ACLU Free Future blog.

Analyzing Yahoo's PRISM non-denial

by Christopher Soghoian (Slight Paranoia) at June 09, 2013 01:35 AM UTC

Today, Yahoo's General Counsel posted a carefully worded denial regarding the company's alleged participation in the NSA PRISM program. To the casual observer, it might seem like a categorical denial. I do not believe that Yahoo's denial is as straightforward as it seems.

Below, I have carefully parsed Yahoo's statement, line by line, in order to highlight the fact that Yahoo has not in fact denied receiving court orders under 50 USC 1881a (AKA FISA Section 702) for massive amounts of communications data.


We want to set the record straight about stories that Yahoo! has joined a program called PRISM through which we purportedly volunteer information about our users to the U.S. government and give federal agencies access to our user databases. These claims are false. [emphasis added]

No one has claimed that the PRISM program is voluntary. As the Director of National Intelligence has confirmed, the PRISM program involves court orders granted using Section 702 of the Foreign Intelligence Surveillance Act.

By falsely describing PRISM as a voluntary scheme, Yahoo's general counsel is then able to deny involvement outright. Very sneaky.

Yahoo! has not joined any program in which we volunteer to share user data with the U.S. government. We do not voluntarily disclose user information.
Again, PRISM has nothing to do with voluntary disclosures. These are compelled disclosures, pursuant to an order from the FISA court.
The only disclosures that occur are in response to specific demands.
The government can make a specific demand for information about all communications coming to or from a particular country. This is an empty statement.
And, when the government does request user data from Yahoo!, we protect our users.
Claiming to "protect our users" means nothing.
We demand that such requests be made through lawful means and for lawful purposes. We fight any requests that we deem unclear, improper, overbroad, or unlawful.
When the law allows blanket surveillance, "lawful means and lawful purposes" doesn't mean anything.
We carefully scrutinize each request, respond only when required to do so, and provide the least amount of data possible consistent with the law.
When a FISA court order demands blanket surveillance, responding only when required to do so is an empty promise, as is providing the least amount of data possible.
The notion that Yahoo! gives any federal agency vast or unfettered access to our users’ records is categorically false.

Elsewhere in the post, Yahoo uses the terms "user data" and "user information". Why the sudden switch to the term "users' records"? This seems designed to deny participation in a Section 215 metadata disclosure program (see: the Verizon Business order revealed earlier this week), which has nothing to do with PRISM.

In any case, the PRISM scandal is not about unfettered access to users' data. It is about the government obtaining data on communications in which one party is outside the US. Yahoo is not accused of giving the government unfettered access to communications where all parties are in the US.

Of the hundreds of millions of users we serve, an infinitesimal percentage will ever be the subject of a government data collection directive.
Note the use of the word directive in this statement; a directive is an order, not a voluntary request. Now see below.
Where a request for data is received, we require the government to identify in each instance specific users and a specific lawful purpose for which their information is requested.
Here, Yahoo switches to using the term "requests", which are voluntary, not demands. The government is not obligated to describe "a specific lawful purpose" when it has obtained a court order compelling the disclosure of data. It is only when the government is making a voluntary request of Yahoo that the company has the ability to set terms for the disclosure.
Then, and only then, do our employees evaluate the request and legal requirements in order to respond—or deny—the request.
Yahoo has flexibility when the government makes a request for data. The company has far less flexibility when it receives a court order demanding the disclosure of data.
We deeply value our users and their trust, and we work hard every day to earn that trust and, more importantly, to preserve it.
If that were true, Yahoo would protect the privacy and security of its customers by enabling HTTPS by default for Yahoo Mail. Yahoo was the last big email provider to even offer HTTPS as an opt-in option, and has still not enabled it by default.

what ever happened to the second party?

by Sid Stamm (Sid Stamm) at March 19, 2013 12:23 AM UTC

I got into a terminology discussion with Brendan this week, and it turns out there's general confusion over these labels we give to businesses on the web: first party and third party.  This topic has been debated ad nauseam in the TPWG, but I want to share my thoughts on what it means in the context of cookies and the general browser/webpage point of view.

The Marx brothers have a take on this in A Night at the Opera, when they get into a discussion of parties and contracts, and I think they're on to something, but on the web these party labels probably come from business-focused contractual engagements. So which party am I?  I'm not a party (though that sounds like fun).

In the case of cookies, the party labels are all about contractual arrangements to produce a good or service for you. You, the surfer, are not part of the contract, but you benefit from a system of first, second and third party businesses.

Here, the first party is the business you seek to engage.  The second party in question is a contractor doing business explicitly for the first party. For example, when you visit the grocery store, someone might help bag your groceries. Maybe they're a temp worker and are actually employed by a different company, but their sole job is to do what the grocery store asks, and they do their work in the store. In these cases there's a direct business agreement between first (business) and second (contractor) parties to produce one service or good. For all intents and purposes, the bagger seems like part of the store.
 
Second-party cookies don't make much sense in the online cookie context, since to the web browser there's no technical distinction between first-party and second-party web software. The assumption here is that second parties operate within the "umbrella" of the first party, so the cookies are part of the first-party offering.

Any third-party players are peripheral to the transaction and may add value, but their primary purpose is something other than the sought-after good or service. These third parties are more like the flier guy who walks around the parking lot while you shop and puts discount fliers for his car dealership on everyone's windshields.  (Wow, zero down, $169 a month?)  He's not stocking shelves or bagging your groceries at the grocery store, but he is still a peripheral part of the whole grocery shopping experience. Customers' expectations for the third party here are likely different than those for the temp worker.  (What's maybe not obvious is that if you go to his dealership, the flier may inform him what kind of groceries you bought, and tracking cookies can be even more invisible than these fliers -- but that's a blog post for a different day.)

So how's this work online?  The first party on this blog is me: blog.sidstamm.com.  There's a second party here too: the folks who made my blog framework software.  They maintain the software (I'm too lazy), and I use it to publish my blog, but it all comes through on this same domain name.  When you read this, the two of us are working together with the goal of bringing you my thoughts.  There also happen to be a "G+ Share" button and search bar on the site, but they're third party: controlled by some other entity, served from a different domain, and showing up here only to augment your experience beyond the blog you seek.

So don't panic: the second parties are still there!  We just don't use the term much because they're so tightly integrated with first parties that they usually appear the same.
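To make the domain-based distinction concrete, here's a minimal sketch of how a browser might label a resource as first- or third-party purely by comparing domains. The function name is made up for illustration, and real browsers consult the Public Suffix List rather than naively comparing the last two host labels, but the point survives the simplification: a second party serving from the first party's domain is indistinguishable from the first party itself.

```python
def classify_party(page_host: str, resource_host: str) -> str:
    """Label a resource as first- or third-party relative to the page.

    Simplification: compares the last two host labels instead of using
    the Public Suffix List, which real browsers rely on.
    """
    def site(host: str) -> str:
        # "blog.sidstamm.com" -> "sidstamm.com"
        return ".".join(host.split(".")[-2:])

    return "first-party" if site(page_host) == site(resource_host) else "third-party"


# The blog and its second-party framework software share a domain,
# so the browser sees only one party:
print(classify_party("blog.sidstamm.com", "blog.sidstamm.com"))  # first-party

# The "G+ Share" button is served from another entity's domain:
print(classify_party("blog.sidstamm.com", "plus.google.com"))    # third-party
```

Since the classification sees only domains, "second party" never appears as an output: anything operating under the first party's domain umbrella simply classifies as first-party.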

Who uses the password manager?

by Sid Stamm (Sid Stamm) at March 06, 2013 08:13 PM UTC

Who uses the password manager, and why? My colleague Monica Chew tries to answer these questions and more by measuring password manager use. 

Check out her blog post.

what is privacy?

by Sid Stamm (Sid Stamm) at December 27, 2012 06:41 PM UTC
Oftentimes when I find myself in a conversation about Privacy, there's a lack of clarity around what exactly we're discussing.  It's widely assumed that people who are experts on privacy all speak the same language and have the same goals.

I'm not so sure this is true.

This came up in a discussion with Jishnu yesterday, and we needed a common starting place.  So I'd like to take a little time to lay out what I'm thinking when I talk about Privacy, especially since I'm mainly focused on empowering individuals with control over data sharing and not so much on keeping secrets.
Privacy is the ability for an individual to have transparency, choice, and control over information about themselves.
At the risk of sounding too cliché, I'm gonna use a pyramid to explain my thinking.  There are three parts to establishing privacy:

First, an organization's (or individual's) collection, sharing and use of data must be transparent.  This is crucial because choice and control cannot be realized without honesty and fairness.

Second, individuals must be provided choice.  This means data subjects (those people whose data is being collected, used or shared) must be able to understand what's going to happen with their data and have the ability to provide dissent or consent.

Third, when it's clear what's happening and individuals have an understanding about what they want, they must be given control over collection, sharing or use of the data in question.

This means control depends on choice which depends on transparency.  You cannot make decisions unless you're given the facts.  You cannot make your desires reality unless you've decided what you want.

For the engineers out there (like me), these dependencies can be modeled as such:
[Transparency] = Awareness of Data Practices
[Choice] = [Transparency] + Individual's Wants
[Control] = [Choice] + Organizational Cooperation
Control is the goal, but it requires Transparency and Choice to work -- as well as some additional inputs.  Privacy is the whole thing: all three pieces acting together with support from both data controllers and data subjects to empower individuals with a say in how their data is used.
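The pseudo-formulas above can also be sketched as a short dependency chain in code. This is just an illustration of the layering (all names here are made up for the example), not an implementation of anything real:

```python
def has_transparency(data_practices_disclosed: bool) -> bool:
    # Transparency is simply awareness of data practices.
    return data_practices_disclosed


def has_choice(transparency: bool, individual_knows_wants: bool) -> bool:
    # Choice = Transparency + the individual's own wants.
    return transparency and individual_knows_wants


def has_control(choice: bool, org_cooperates: bool) -> bool:
    # Control = Choice + organizational cooperation.
    return choice and org_cooperates


def privacy(disclosed: bool, knows_wants: bool, cooperates: bool) -> bool:
    """Privacy holds only when all three layers of the pyramid hold."""
    t = has_transparency(disclosed)
    c = has_choice(t, knows_wants)
    return has_control(c, cooperates)


# Without transparency at the base, no amount of cooperation
# higher up the pyramid yields control:
print(privacy(False, True, True))  # False
print(privacy(True, True, True))   # True
```

The `and` at each layer encodes the point made above: you cannot make decisions unless you're given the facts, and you cannot make your desires reality unless you've decided what you want.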

The privacy perception gap is a symptom of ineffective transparency and choice; it is the result of people's inability to really understand what's going on, so they have no chance to establish positions about what is okay.  When transparency and choice are built into a system, the gap shrinks and people have most of what they need to regain control over their privacy.

What is privacy to you?

A few words on patronage

by Christopher Soghoian (Slight Paranoia) at November 24, 2012 10:14 PM UTC

Over the past couple years, I've taken several big companies to task for their woeful privacy and security practices. Just as it is important to call out these flaws, I believe it is also important to give companies credit when they go the extra mile to protect their customers.

When Google began protecting Gmail with HTTPS by default, I praised the company. When it started voluntarily publishing statistics for government requests, I again praised the company. When AT&T protected its customers' voicemail accounts from caller ID spoofing by forcing users to enter PINs, I praised the company. When Twitter asked the government to unseal the 2703(d) order that it had obtained as part of its investigation into Wikileaks, I praised the company. When Facebook started to offer HTTPS, and then this month enabled it by default, I praised the company. When Mozilla switched to encrypted search by default for Firefox, I praised the organization.

You get the idea.

Of course, just because I praise a particular action by a company, it doesn't mean that I am suddenly giving the company or its products my seal of approval. As an example, I'm of course glad that Facebook is enabling transport encryption to protect its customers' communications from network based interception. That doesn't mean I suddenly love Facebook, or bless the company's other business practices. Turning on HTTPS by default is a great move, but it isn't enough to get me to open a Facebook account, or trust the company with my data.

It is unfortunate then that I must defend myself against Nadim Kobeissi's latest attempt at reputation assassination.

Earlier this month, I praised Silent Circle for the company's fantastic law enforcement compliance policy. [Silent Circle sent me an early draft of their policy, sought feedback, and even accepted some of my suggestions]. Compared to the industry norm, in which companies merely disclose that they will hand over their customers' data to the government when forced to do so, Silent Circle's policy is an absolutely stellar example of the ways in which companies can approach this issue in a clear, transparent and honest manner.

I have spent several years researching the ways in which law enforcement agencies force service providers to spy on their customers. Most companies are not willing to discuss their law enforcement policies, let alone publish them online. It is for that reason that I praised Silent Circle - because they have set a great example that I hope other companies will follow.

However, as with the numerous other examples I highlighted above, just because I praise a particular action by a company, it doesn't mean that I now stand behind the company or its products.

Although I have praised Silent Circle's legal policies, I've made no public statements regarding the technical merits of their products. When I've been questioned by journalists about the extent to which consumers should trust the company's technology, I've been consistently conservative. As I recently told Ryan Gallagher at Slate:

Christopher Soghoian, principal technologist at the ACLU's Speech Privacy and Technology Project, said he was excited to see a company like Silent Circle visibly competing on privacy and security but that he was waiting for it to go open source and be audited by independent security experts before he would feel comfortable using it for sensitive communications.

Nadim has suggested that I am endangering my independence and that I have some kind of conflict of interest regarding Silent Circle, possibly because the company loaned me an iPod Touch so that I could get a chance to try out the iOS version of their software while they work out the kinks in the Android version. (How does Nadim even know the company loaned me an iPod? Because I disclosed it in a discussion with him on a public mailing list.)

Let me be perfectly clear. I am not a consultant to Silent Circle or any other company. I am not on an advisory board for Silent Circle or any other company. The only employer I have is the American Civil Liberties Union. Yes, I regularly talk with people who work at the company, and offer suggestions for ways that they can better protect the privacy of their customers. However, I regularly give solicited (and even more frequently, unsolicited) feedback to many companies, big and small. Most ignore me, but some occasionally change their practices. I am a privacy activist, and that is what I do.

ownership and transparency in social media

by Sid Stamm (Sid Stamm) at October 11, 2012 04:14 PM UTC
Les Writes:
"You don’t own the spaces you inhabit on Facebook. You’re enjoying a party at someone’s house, and you barely know the guy. In fact, your content is the currency that pays for the booze (ie. the privilege of using their servers). That’s why it’s free-as-in-beer: You’ve given them what you post, instead of money. That’s valuable stuff, if they can ever quite figure out how to sell it."  [link]
It's not completely fair to expect FB users to realize that the data about them they so generously contribute to FB no longer belongs to them.  My hypothesis is that many people feel that no matter who has facts about you and prints them, they're still *yours*.  After all, companies have trademarks; can't things about me be mine and reserved for me?

On a smaller scale, the monetization of facts about me is not surprising; I give an interview to a magazine, they print it, it gets syndicated, no surprise.  On a large scale (lots of data collection,  frequently) I think people lose track of with whom they are communicating and get immersed in the task at hand.  Is it my FB friends, or is it FB, who is helpfully telling my friends things?  This system is flexible, crazy, complex, shiny and distracting!  Can I use it to video chat with my friends?  That's neat.  Oh, geez, I forgot FB is in the middle of all this communication...

People who sign up for FB are not signing up to contribute their life to this stranger throwing a party.  They sign up assuming it is a tool they can use to communicate with their friends; it is a machine they've "bought" (for free, heh) to help them communicate.  Nobody reads the terms of service.  Nobody reads the privacy policy.  They accept them since other people have and only read what their friends write.  Many are in denial or do not realize that what they contribute to the site is just that: a contribution.

I think there is shared responsibility here; consumers should be a little bit wary--but this isn't their area of expertise.  As such, the site operator also has a duty to be more forthcoming with what's going on.  My communications tool is supposed to be a communications tool.  If you market it as a "free communications tool that sells my data," I am better informed than if it's just marketed as a "communications tool."

Responding to Wired's ad hominem hatchet job

by Christopher Soghoian (Slight Paranoia) at August 08, 2012 10:08 AM UTC

I have long been a fan of Wired's coverage of privacy and security issues, particularly the insightful reporting and analysis by Ryan Singel, currently the editor of the Threat Level blog. It is for that reason that I am saddened to see Ryan stoop to twisting my words in support of a lengthy character assassination piece targeted against me.

Brief background

Two weeks ago, Wired published a glowing, 2,000-word story by Quinn Norton about Cryptocat, an encrypted chat tool. Quinn was not the first journalist to shower praise upon Cryptocat -- writers at the New York Times and Forbes had previously done so too.

I subsequently published a lengthy blog post, which compared the media's coverage of Cryptocat, a relatively new, unproven security tool, to the media's previous fawning coverage of Haystack, a tool which, once analyzed by experts, was revealed to be pure snakeoil.

The message in my blog post -- that journalists risk exposing their readers to harm when they hype unproven security technologies -- was directed at the media as a whole. In support of my argument, I cited glowing praise for such technologies printed in the Guardian, the New York Times, Newsweek, Forbes, and Wired.

Today, Ryan Singel, the editor of Wired's Threat Level blog, responded to my blog post, but incorrectly framed my criticism as if it were solely directed at Quinn Norton and her coverage of Cryptocat. In doing so, Ryan inaccurately paints me as a sexist, security-community insider who is unfairly criticizing a tool "created by an outsider to the clubby crypto community and one that’s written up by a woman and reviewed by a female security expert."

The importance of dissenting technical experts

One of the biggest criticisms of Norton's story that I expressed in my blog post was the fact that she did not quote a single technical expert who was critical of Cryptocat, even though there are quite a few who have been vocal with their concerns:

Other than Kobeissi, Norton's only other identified sources in the story are Meredith Patterson, a security researcher that was previously critical of Cryptocat who is quoted saying "although [Cryptocat] got off to a bumpy start, he’s risen to the occasion admirably" and an unnamed active member of Anonymous, who is quoted saying "if it's a hurry and someone needs something quickly, [use] Cryptocat."
As I also noted in my post:
Even though their voices were not heard in the Wired profile, several prominent experts in the security community have criticized the web-based version of Cryptocat. These critics include Thomas Ptacek, Zooko Wilcox-O'Hearn, Moxie Marlinspike and Jake Appelbaum.
Singel frames my criticism here as sexist. Meredith Patterson is a woman, whereas the Cryptocat critics I named were all men. Singel claims that, "Patterson, one of the all-too few female security researchers, doesn’t seem to count for much in Soghoian’s analysis." He adds later, "instead, Soghoian believes, Norton should have turned to one of four more vocal critics he names — all of them men."

As an initial matter, let me say that I have genuine respect for Meredith and her skills as a security researcher. We've known each other for several years, have attended several privacy conferences together, and have a shared goal in keeping the communications of users out of the prying hands of the government. Nowhere in my prior blog post do I dismiss Patterson's skills, credentials, or technical opinions.

My criticism of Norton's piece, in this respect, is not about the specific technical expert who is quoted as saying positive things about Cryptocat, but rather, the total lack of any dissenting quotes. If the rest of the security community were agnostic about the merits of Cryptocat, then it would perhaps be fine to quote a single technical expert who has positive things to say. In this case though, there are several technical experts who have deep concerns about the security of Cryptocat, experts whose research and views Wired has covered at length in the past.

As Singel has described it, I would have liked Norton to talk to a more qualified expert, and to not print Patterson's opinions. That is not the case. I just think that a dissenting expert should be quoted too.

To summarize, the gender of the technical expert quoted saying positive things about Cryptocat has absolutely nothing at all to do with my belief that a responsible journalist would have spoken to, and quoted at least one technical expert who is critical of the tool. Even more so when the headline of the story is "This Cute Chat Site Could Save Your Life and Help Overthrow Your Government."

On the issue of privilege

In my blog post, I quoted from a few of Norton's recent tweets, in which she criticizes the crypto community, which she believes is filled with "privileged", "mostly rich 1st world white boys w/ no real problems who don't realize they only build tools [for] themselves."

After I published my blog post, Singel criticized me for quoting Norton's tweets, claiming that I was using "an outsider's critique of your boys club as a way to discredit them."

Although Singel clearly disagrees, I felt, and still feel that it is relevant to highlight the fact that Norton believes that the crypto community, and in particular, the critics of Cryptocat, are just privileged, paranoid geeks who have no real problems.

As I mentioned in my blog post, two of the most vocal critics of Cryptocat's web based chat app, Jake Appelbaum and Moxie Marlinspike, have faced pretty extreme real world problems of surveillance and government harassment.

After Appelbaum was outed by the press as being associated with WikiLeaks, Twitter, Google and Sonic.net were forced to provide his communication records to the FBI as part of its investigation into WikiLeaks. At least one of Appelbaum's friends and colleagues has been forced to testify before a federal grand jury, and he has been repeatedly stopped at the border, harassed, and had digital devices seized by the authorities.

Likewise, for some time, Marlinspike was routinely stopped at the border by US authorities, had his laptop and phones searched, and in at least one case, was questioned by a US embassy official, who had a photo of Marlinspike at hand, before he could get on a plane back to the US.

While Appelbaum and Marlinspike have (thankfully) not been physically tortured by government agents, their paranoia and dedication towards improving the state of Internet security is by no means theoretical. Their concerns are legitimate, and their paranoia is justified.

On telling journalists to unplug

Singel's most vicious, yet totally unfair criticism relates to the two paragraphs that concluded my Cryptocat blog post:

Although human interest stories sell papers and lead to page clicks, the media needs to take some responsibility for its ignorant hyping of new security tools and services. When a PR person retained by a new hot security startup pitches you, consider approaching an independent security researcher or two for their thoughts. Even if it sounds great, please refrain from showering the tool with unqualified praise.

By all means, feel free to continue hyping the latest social-photo-geo-camera-dating app, but before you tell your readers that a new security tool will lead to the next Arab Spring or prevent the NSA from reading peoples' emails, step back, take a deep breath, and pull the power cord from your computer.

Singel states that the main point of my post "seemed to be to tell a woman to shut up and unplug from the net." He further twists my words by writing:
Moreover, Soghoian suggesting that if Quinn Norton ever wanted to write about about encryption tools in the future, she ought to "step back, take a deep breath, and pull the power cord from your computer" isn't just rude and obnoxious, it’s border-line sexist and an outright abuse of Soghoian's place in the computer security world.

The harsh words in my conclusion, which Singel quotes, were aimed at "the media." This of course includes Wired, but also many other journalists and news organizations who regularly publish stories on the latest new snake-oil product that uses "military-grade encryption."

In fact, the words "ignorant hyping" in the blog post's conclusion link to a recent New York Times article about Wickr, a new mobile app that the Times reveals will let "users transmit texts, photos and videos through secure and anonymous means previously reserved for the likes of the military and intelligence operatives."

(This is, of course, rubbish. There are no anonymity technologies that have been "reserved for the likes of the military and intelligence operatives.")

Finally, in support of his charge that I am sexist, Singel twists my words by stating that "Soghoian suggest[s] that if Quinn Norton ever wanted to write about about encryption tools in the future, she ought to 'step back, take a deep breath, and pull the power cord from your computer.'"

Let me be clear: Nowhere in my blog post do I tell Quinn that she should never again write about encryption tools. Instead, I warn journalists who are planning to write that "that a new security tool will lead to the next Arab Spring or prevent the NSA from reading peoples' emails." That is very different than "ever writing about encryption tools in the future."

Of course I want journalists to write about encryption, privacy, security and the importance of protecting data. I want users to be safe, and one of the best ways for them to discover and then adopt safe practices is by reading about them in the media.

(Strangely enough, Wired's chilling coverage this week of the devastating hack against Mat Honan has been absolutely fantastic, offering a clear demonstration of how difficult it is for users to protect their data even when using tools and services created by billion dollar corporations.)

What I wish to avoid though, is news stories that hype technologies that simply cannot, and will not deliver what has been promised to users. By all means, please tell users about two-factor authentication, encrypted cloud backups with keys not known to providers, and VPN services. Just don't claim that these technologies will plunge the NSA into darkness or lead to the overthrow of authoritarian governments.

I do not hate female journalists

As an activist that uses media coverage to pressure companies to change their privacy invading practices, I regularly work with journalists around the world, feeding them stories, tips, and when they want them, quotes. In the more than six years that I have been working with the media (including Wired on countless occasions), never once has the gender of the reporter played any role in whether or not I went to them with a scoop, or returned their phone calls or emails.

The media are of course not equal in their understanding of technology or their willingness to dig deep into a tech issue. In my experience, gender plays absolutely no role in determining the quality of a tech journalist.

For example, of the entire news media, the What They Know team at the Wall Street Journal (Julia Angwin and Jennifer Valentino-DeVries) are by far the best in the business when it comes to covering privacy and security. They break major stories, do great investigative research, and routinely seek the confirmation of multiple technical experts in order to verify claims before they print them. On this beat, their coverage is first rate, and quite frankly, puts the New York Times, the Washington Post, Wired, Ars and others to shame. It is not surprising then, that when a great scoop lands in my lap, I take it to the WSJ first.

I judge, praise and criticize journalists on the tech beat based on the quality of their reporting, not by their gender. In this case, I criticized Quinn Norton's Wired story because it was deeply flawed, not because she is a woman. To claim otherwise is pure bullshit.

Tech journalists: Stop hyping unproven security tools

by Christopher Soghoian (Slight Paranoia) at July 30, 2012 09:43 PM UTC

Preface: Although this essay compares the media's similar hyping of Haystack and Cryptocat, the tools are, at a technical level, in no way similar. Haystack was at best, snake oil, peddled by a charlatan. Cryptocat is an interesting, open-source tool created by a guy who means well, and usually listens to feedback.

In 2009, media outlets around the world discovered, and soon began to shower praise upon Haystack, a software tool designed to allow Iranians to evade their government's Internet filtering. Haystack was the brainchild of Austin Heap, a San Francisco software developer, who the Guardian described as a "tech wunderkind" with the "know-how to topple governments."

The New York Times wrote that Haystack "makes it near impossible for censors to detect what Internet users are doing." The newspaper also quoted one of the members of the Haystack team saying that "It's encrypted at such a level it would take thousands of years to figure out what you’re saying."

Newsweek stated that Heap had "found the perfect disguise for dissidents in their cyberwar against the world’s dictators." The magazine revealed that the tool, which Heap and a friend had built in "less than a month and many all-nighters" of coding, was equipped with "a sophisticated mathematical formula that conceals someone’s real online destinations inside a stream of innocuous traffic."

Heap was not content to merely help millions of oppressed Iranians. Newsweek quoted the 20-something developer revealing his long-term goal: "We will systematically take on each repressive country that censors its people. We have a list. Don’t piss off hackers who will have their way with you."

The Guardian even selected Heap as its Innovator of the Year. The chair of the award panel praised Heap's "vision and unique approach to tackling a huge problem" as well as "his inventiveness and bravery."

This was a feel-good tech story that no news editor could ignore. A software developer from San Francisco taking on a despotic regime in Tehran.

There was just one problem: The tool hadn't been evaluated by actual security experts. Eventually, Jacob Appelbaum obtained a copy of the software and analyzed it. The results were not pretty -- he described it as "the worst piece of software I have ever had the displeasure of ripping apart."

Soon after, Daniel Colascione, the lead developer of Haystack, resigned from the project, saying the program was an example of "hype trumping security." Heap ultimately shuttered Haystack.

After the proverbial shit hit the fan, the Berkman Center's Jillian York wrote:

I certainly blame Heap and his partners–for making outlandish claims about their product without it ever being subjected to an independent security review, and for all of the media whoring they’ve done over the past year.

But I also firmly place blame on the media, which elevated the status of a person who, at best was just trying to help, and a tool which very well could have been a great thing, to the level of a kid genius and his silver bullet, without so much as a call to circumvention experts.

Cryptocat: The press is still hypin'

In 2011, Nadim Kobeissi, then a 20-year-old college student in Canada, started to develop Cryptocat, a web-based secure chat service. The tool was criticized by security experts after its initial debut, but stayed largely below the radar until April 2012, when it won an award at the Wall Street Journal's Data Transparency Codeathon. Days later, the New York Times published a profile of Kobeissi, whom the newspaper described as a "master hacker."

Cryptocat originally launched as a web-based application, which required no installation of software by the user. As Kobeissi told the New York Times:

"The whole point of Cryptocat is that you click a link and you’re chatting with someone over an encrypted chat room... That’s it. You’re done. It’s just as easy to use as Facebook chat, Google chat, anything.”

There are, unfortunately, many problems with the entire concept of web-based crypto apps, the biggest of which is the difficulty of securely delivering JavaScript code to the browser. In an effort to address these legitimate security concerns, Kobeissi released a second version of Cryptocat in 2011, delivered as a Chrome browser plugin. The default version of Cryptocat on the public website was the less secure, web-based version, although users visiting the page were informed of the existence of the more secure Chrome plugin.

Forbes, Cryptocat and Hushmail

Two weeks ago, Jon Matonis, a blogger at Forbes included Cryptocat in his list of 5 Essential Privacy Tools For The Next Crypto War. He wrote that the tool "establishes a secure, encrypted chat session that is not subject to commercial or government surveillance."

If there is anyone who should be reluctant to offer such bold, largely-unqualified praise to a web-based secure communications tool like Cryptocat, it should be Matonis. Several years ago, before he blogged for Forbes, Matonis was the CEO of Hushmail, a web-based encrypted email service. Like Cryptocat, Hushmail offered a 100% web-based client, and a downloadable java-based client which was more resistant to certain interception attacks, but less easy to use.

Hushmail had claimed in public marketing materials that "not even a Hushmail employee with access to our servers can read your encrypted e-mail, since each message is uniquely encoded before it leaves your computer." It was therefore quite a surprise when Wired reported in 2007 that Hushmail had been forced by a Canadian court to insert a backdoor into its web-based service, enabling the company to obtain decrypted emails sent and received by a few of its users.

The moral of the Hushmail story is that web based crypto tools often cannot protect users from surveillance backed by a court order.

Wired's ode to Cryptocat

This past Friday, Wired published a glowing, 2,000-word profile of Kobeissi and Cryptocat by Quinn Norton. It begins with a bold headline: "This Cute Chat Site Could Save Your Life and Help Overthrow Your Government," after which Norton describes the Cryptocat web app as something that can "save lives, subvert governments and frustrate marketers."

In her story, Norton emphasizes the usability benefits of Cryptocat over existing secure communications tools, and on the impact this will have on the average user for whom installing Pidgin and OTR is too difficult. Cryptocat, she writes, will allow "anyone to use end-to-end encryption to communicate without ... mucking about with downloading and installing other software." As Norton puts it, Cryptocat's no-download-required distribution model "means non-technical people anywhere in the world can talk without fear of online snooping from corporations, criminals or governments."

In short, Norton paints a picture in which Cryptocat fills a critical need: secure communications tools for the 99%, for the tl;dr crowd, for those who can't, don't know how to, don't have time to, or simply don't want to download and install software. For such users, Cryptocat sounds like a gift from the gods.

Journalists love human interest stories

Kobeissi presents the kind of human interest story that journalists dream about: A Lebanese hacker who has lived through 4 wars in his 21 years, whose father was killed, whose house was bombed, who was interrogated by the "cyber-intelligence authorities" in Lebanon and by the Department of Homeland Security in the US, and who is now building a tool to help others in the Arab world overthrow their oppressive governments.

As such, it isn't surprising that journalists and their editors aren't keen to prominently highlight the unproven nature of Cryptocat, even though I'm sure Kobeissi stresses it in every interview. After all, which journalist in their right mind would want to spoil this story by mentioning that the web-based Cryptocat system is vulnerable to trivial man in the middle, HTTPS stripping attacks when accessed using Internet Explorer or Safari? What idiot would sabotage the fairytale by highlighting that Cryptocat is unproven, an experimental project by a student interested in cryptography?

And so, such facts are buried. The New York Times waited until paragraph 10 of a 16-paragraph story to reveal that Kobeissi told the journalist that his tool "is not ready for use by people in life-and-death situations." Likewise, Norton waits until paragraph 27 of her Wired profile before she reveals that "Kobeissi has said repeatedly that Cryptocat is an experiment" or that "structural flaws in browser security and Javascript still dog the project." The preceding 26 paragraphs are filled with feel-good fluff, including a description of his troubles at the US border and a three-paragraph no-comment from US Customs.

At best, this is bad journalism, and at worst, it is reckless. If Cryptocat is the secure chat tool for the tl;dr crowd, burying its known flaws 27 paragraphs down in a story almost guarantees that many users won't learn about the risks they are taking.

Cryptocat has faced extensive criticism from experts

Norton acknowledges in paragraph 23 of her story that "Kobeissi faced criticism from the security community." However, she never actually quotes any critics. She quotes Kobeissi saying that "Cryptocat has significantly advanced the field of browser crypto" but doesn't give anyone the opportunity to challenge the statement.

Other than Kobeissi, Norton's only other identified sources in the story are Meredith Patterson, a security researcher who was previously critical of Cryptocat and is quoted saying "although [Cryptocat] got off to a bumpy start, he’s risen to the occasion admirably," and an unnamed active member of Anonymous, who is quoted saying "if it's a hurry and someone needs something quickly, [use] Cryptocat."

It isn't clear why Norton felt it wasn't necessary to publish any dissenting voices. From her public Tweets, it is however, quite clear that Norton has no love for the crypto community, which she believes is filled with "privileged", "mostly rich 1st world white boys w/ no real problems who don't realize they only build tools [for] themselves."

Even though their voices were not heard in the Wired profile, several prominent experts in the security community have criticized the web-based version of Cryptocat. These critics include Thomas Ptacek, Zooko Wilcox-O'Hearn, Moxie Marlinspike and Jake Appelbaum. The latter two, coincidentally, have faced pretty extreme "real world [surveillance] problems" documented at length, by Wired.

Security problems with Cryptocat and Kobeissi's response

Since Cryptocat was first released, security experts have criticized the web-based app, which is vulnerable to several attacks, some possible using automated tools. The response by Kobeissi to these concerns has long been to point to the existence of the Cryptocat browser plugin.

The problem is that Cryptocat is described by journalists, and by Kobeissi in interviews with journalists, as a tool for those who can't or don't want to install software. When Cryptocat is criticized, Kobeissi then points to a downloadable browser plugin that users can install. In short, the only technology that can protect users from network attacks against the web-only Cryptocat also neutralizes its primary, and certainly most publicized feature.

Over the past few weeks, criticism of the web-based Cryptocat and its vulnerability to attacks has increased, primarily on Twitter. Responding to the criticism, on Saturday, Kobeissi announced that the upcoming version 2 of Cryptocat will be browser-plugin only. At the time of writing this essay, the Cryptocat web-based interface also appears to be offline.

Kobeissi's decision to ditch the no-download-required version of Cryptocat came just one day after the publication of Norton's glowing Wired story, in which she emphasized that Cryptocat enables "anyone to use end-to-end encryption to communicate without ... mucking about with downloading and installing other software."

This was no doubt a difficult decision for Kobeissi. Rather than leading the development of a secure communications tool that Just Works without any download required, he must now rebrand Cryptocat as a communications tool that doesn't require operating system install privileges, or one that is merely easier to download and install. This is far less sexy, but, importantly, far more secure. He made the right choice.

Conclusion

The technology and mainstream media play a key role in helping consumers to discover new technologies. Although there is a certain amount of hype with the release of every new app or service (if there isn't, the PR people aren't doing their jobs), hype is dangerous for security tools.

It is by now well documented that humans engage in risk compensation. When we wear seatbelts, we drive faster. When we wear bike helmets, cars drive closer to us. These safety technologies at least work.

We also engage in risk compensation with security software. When we think our communications are secure, we are probably more likely to say things that we wouldn't if our calls were going over a telephone line or via Facebook. However, if the security software people are using is in fact insecure, then the users of the software are put in danger.

Secure communications tools are difficult to create, even by teams of skilled cryptographers. The Tor Project is nearly ten years old, yet bugs and design flaws are still found and fixed every year by other researchers. Using Tor for your private communications is by no means 100% safe (although, compared to many of the alternatives, it is often better). However, Tor has had years to mature. Tools like Haystack and Cryptocat have not. No matter how good you may think they are, they're simply not ready for prime time.

Although human interest stories sell papers and lead to page clicks, the media needs to take some responsibility for its ignorant hyping of new security tools and services. When a PR person retained by a new hot security startup pitches you, consider approaching an independent security researcher or two for their thoughts. Even if it sounds great, please refrain from showering the tool with unqualified praise.

By all means, feel free to continue hyping the latest social-photo-geo-camera-dating app, but before you tell your readers that a new security tool will lead to the next Arab Spring or prevent the NSA from reading peoples' emails, step back, take a deep breath, and pull the power cord from your computer.

The known unknowns of Skype interception

by Christopher Soghoian (noreply@blogger.com) (Slight Paranoia) at July 26, 2012 09:15 PM UTC

Over the past few weeks, the technical blogosphere, and most recently, the mainstream media have tried to answer the question: What kind of assistance can Skype provide to law enforcement agencies?

Most of the stories have been filled with speculation, sometimes informed, but mostly not. In an attempt to paint as clear a picture as possible, I want to explain what we do and don't know about Skype and surveillance.

Skype has long provided assistance to governments

The Washington Post reported yesterday that:
Skype, the online phone service long favored by political dissidents, criminals and others eager to communicate beyond the reach of governments, has expanded its cooperation with law enforcement authorities to make online chats and other user information available to police

The changes, which give the authorities access to addresses and credit card numbers, have drawn quiet applause in law enforcement circles but hostility from many activists and analysts.

To back up its claim, the Post cites interviews with "industry and government officials familiar with the changes" who "spoke on the condition of anonymity because they weren’t authorized to discuss the issue publicly." Ugh.

However, a quick Google search for "Skype law enforcement handbook" turns up an official-looking document on the whistleblower website cryptome.org, dated October 2007, which makes it clear that Skype has long been providing the assistance that the Post claims is new.

From Skype's 2007 law enforcement handbook:

In response to a subpoena or other court order, Skype will provide:
• Registration information provided at time of account registration
• E-mail address
• IP address at the time of registration
• Financial transactions conducted with Skype in the past year, although details of the credit cards used are stored only by the billing provider used (for instance, Bibit, RBS or PayPal)
• Destination telephone numbers for any calls placed to the public switched telephone network (PSTN)
• All service and account information, including any billing address(es) provided, IP address (at each transaction), and complete transactional information
While Skype's law enforcement handbook suggests that the company does not have access to IP address session logs, a high-profile criminal case from 2006 suggests that it does.
Kobi Alexander, the founder of Comverse, was nabbed in Negombo, Sri Lanka yesterday by a private investigator. He is wanted by the US government in connection with financial fraud charges. He is accused of profiting from some very shady stock-option deals, to the detriment of Comverse shareholders. Once the deals became public and he was indicted, he resigned as CEO and fled the US.

Alexander was traced to the Sri Lankan capital of Colombo after he placed a one-minute call using Skype. That was enough to alert authorities to his presence and hunt him down.

This makes sense. Skype clients connect to Skype's central servers (so that users can make calls to non-Skype users, and learn which of their friends are online and offline), and so the servers naturally learn the IP address that the user is connecting from. This is not surprising.

Skype voice call encryption

So while it is clear that Skype can provide government agencies with basic subscriber information and IP login info, what remains unclear is the extent to which governments can intercept the contents of Skype voice calls.

Skype has always been rather evasive when it comes to discussing this issue. Whenever questions come up, the company makes it a point to mention that it provides end to end encryption, but then dodges all questions about how it handles encryption keys.

Skype's strategy is genius - most journalists, even those that cover tech, know very little about the more granular aspects of cryptography. When Skype says it provides end to end call encryption, journalists then tell their readers that Skype is wiretapping proof, even though Skype never made that specific claim. Conveniently enough, Skype never bothers to correct the many people who have read a tad bit too much into the company's statements about security.

As Seth Schoen from EFF told Forbes recently, "my view is that Skype has gotten a reputation for impregnable security that it has never deserved." Exactly. Consumers think the service is secure, and Skype has absolutely no incentive to correct this false, yet positive impression.

The mud puddle test

Last year, I directed a bit of a media firestorm at Dropbox, after I filed an FTC complaint alleging that the company had been misleading its customers about the "military grade" security it used to protect the files uploaded by users. Earlier this year, the tech press started to ask similar questions about the cryptography and key management used by Apple's iCloud service.

Soon after, cryptographer Matt Green proposed the 'mud puddle test' for easily determining if a cloud-based storage solution has unencrypted access to your data.

1. First, drop your device(s) in a mud puddle.
2. Next, slip in said puddle and crack yourself on the head. When you regain consciousness you'll be perfectly fine, but won't for the life of you be able to recall your device passwords or keys.
3. Now try to get your cloud data back.
Did you succeed? If so, you're screwed. Or to be a bit less dramatic, I should say: your cloud provider has access to your 'encrypted' data, as does the government if they want it, as does any rogue employee who knows their way around your provider's internal policy checks.

Both Dropbox and iCloud fail the mud puddle test. If a user's laptop is destroyed and they forget their password, both services permit a user to reset the password and then download all of their data that was stored with the service. Both of these companies have access to your data, and can be forced to hand it over to the government. In contrast, SpiderOak, a competing online backup service (which I use) passes the test. If a SpiderOak user forgets their password, they lose their data.

What about Skype? After all, the company isn't an online backup service, but rather a communications service, right?

Well, as an initial matter, if you forget your password, Skype sends you a reset link by email, which lets you into your account, maintaining the same account balance and restoring your full contact list. Likewise, if you install Skype on a new computer, your contact list is downloaded, and you can conduct conversations that, to the other caller, will not in any way reveal that you recently installed Skype on a new device, or reset your password. It just works.

Encrypted communications require encryption keys.

With some protocols, like Off The Record (built into several Instant Messaging clients, but not to be confused with Google's fake, unencrypted Off The Record), random keys are created by the IM client, and users are then expected to exchange and verify them out of band (usually by phone, or in person).

The OTR developers realized that users don't like manually verifying random alpha-numeric crypto fingerprints, and so the developers introduced a slightly easier method of verifying OTR keys in recent versions that uses secret questions or shared secrets selected by users (obviously, this is less secure, but more likely to be actually followed by users).

Another scheme, the ZRTP encrypted VOIP protocol, created by Phil Zimmermann of PGP fame, avoids the static fingerprint method, and instead requires users to verify a random phrase at the beginning of each conversation. ZRTP (which is also used by Whisper Systems' RedPhone and the open source Jitsi chat tool) can rely on these pass phrase exchanges, because users presumably know each others' voices. Text based IM schemes don't have this voice recognition property, and so slightly heavier weight verification schemes are required there.

While these key/identity verification methods are a pain for users, they are important. Encryption is great, but without some method of authentication, it is not very helpful. That is, without authentication, you can be sure you have an encrypted session, but you have no idea who is at the other end (someone pretending to be your friend, a government device engaging in a man in the middle interception attack, etc). The key verification/exchange methods used by OTR and ZRTP provide a strong degree of authentication, so that users can be sure that no one else is snooping on their communications.

Thanks for the crypto lesson

In contrast to the complex, user-visible fingerprint exchange and verification methods employed by OTR and ZRTP, Skype does nothing at all. Skype handles all the crypto and key exchange behind the scenes. When a Skype user installs the software on a brand new device and initiates a conversation with a friend already in their contact list, that friend is not told that the caller's device/software has a new crypto key and that it should be verified. Instead, the call just connects.

While we don't know the full details of how Skype handles its key exchange, what is clear is that Skype is in a position to impersonate its customers, or, should it be forced, to give a government agency the ability to impersonate its customers. As Skype acts as the gatekeeper of conversations, and the only entity providing any authentication of callers, users have no way of knowing if they're directly communicating with a friend they frequently chat with, or if their connection is being intercepted using a man in the middle attack, made possible due to the disclosure of cryptographic keys by Skype to the government.

I suspect that Skype does not create a new private encryption key for each device running Skype. Instead, my guess is that it creates a key once, when the user sets up their account, and then stores this online, along with the user's contact list. When the user installs Skype on a new device, the key is downloaded, along with all of their other account data. The user's public/private key pair would then be used to authenticate a session key exchange. If this is the design that Skype uses, the company can be compelled to disclose the private crypto keys it holds, allowing the government to impersonate users, and perform active man in the middle interception attacks against their communications.

One alternate, but equally insecure approach would be for the Skype clients to create a new public/private keypair each time a user installs Skype on their computer, and for Skype to digitally sign the user's public key using a certificate pre-installed in all Skype clients. In that scenario, while Skype the company won't have access to your private key, it will be able to sign public keys in your name for other people (including the government) that other Skype clients will accept without complaint. Such impersonation methods can then be used to perform man in the middle attacks.

Whatever the key exchange method that Skype uses, as long as users rely on Skype for all caller authentication, and as long as the company provides account access after a forgotten password, and seamless communications after the installation of Skype on a new computer, the company will fail the mud puddle test. Under such circumstances, Skype is in a position to give the government sufficient data to perform a man in the middle attack against Skype users.

Government agencies and encryption keys

Ok, so Skype has access to users' communications encryption keys (or can enable others to impersonate Skype users). What does this mean for the confidentiality of Skype calls? Skype may in fact be telling the truth when it tells journalists that it does not provide CALEA-style wiretap capabilities to governments. It may not need to. If governments can impersonate Skype users and perform man in the middle attacks on their conversations (with the assistance of broadband ISPs or wireless carriers), then they can decrypt the voice communications without any further assistance from Skype.

Do we know if this is happening? No. But that is largely because Skype really won't comment on the specifics of its interactions with governments, or the assistance it can provide. However, privacy researchers (pdf) have for many years speculated about governments compelling companies to hand over their own encryption keys or provide false certificates (pdf) for use in MiTM attacks. In such cases, when the requests come, there isn't really anything that companies can do to resist.

We need transparency

I suspect that 99% of Skype's customers have never given a moment's thought to the ease or difficulty with which government agencies can listen to their calls. Most likely use the service because it is free/cheap, easy, and enables them to talk to their loved ones with a minimum of hassle. There are, however, journalists, human rights activists and other at-risk groups who use Skype because they think it is more secure. In terms of Skype's hundreds of millions of users, these thousands of privacy-sensitive users are a tiny rounding error, a drop in the bucket.

Skype is not transparent about its surveillance capabilities. It will not tell us how it handles keys, what kind of assistance it provides governments, under what circumstances, or which governments it will and won't assist. Until it is more transparent, Skype should be assumed to be insecure, and not safe for those whose physical safety depends upon confidentiality of their calls.

Skype of course can't talk about the requests for assistance it has received from intelligence agencies, since such requests are almost certainly classified. However, Skype could, if it wished to, tell users about its surveillance capabilities. It doesn't.

I personally don't really care if Skype is resistant to government surveillance or not. There are other schemes, such as ZRTP, which are peer reviewed, open, documented protocols which activists can and should use. What I would like though, is for Skype to be honest. If it is providing encryption keys to governments, it should tell its customers. They deserve the truth.

Tracking Not Required: Behavioral Targeting

by Arvind Narayanan (33 Bits of Entropy) at June 11, 2012 10:42 PM UTC

Co-authored by Jonathan Mayer and Subodh Iyengar.

In the first installment of the Tracking Not Required series, we discussed a relatively straightforward case: frequency capping. Now let’s get to the 800-pound gorilla, behaviorally targeted advertising, putatively the main driver of online tracking. We will show how to swap a little functionality for a lot of privacy.

Admittedly, implementing behavioral targeting on the client is hard and will require some technical wizardry. It doesn’t come for “free” in that it requires a trade-off in terms of various privacy and deployability desiderata. Fortunately, this has been a fertile topic of research over the past several years, and there are papers describing solutions at a variety of points on the privacy-deployability spectrum. This post will survey these papers, and propose a simplification of the Adnostic approach — along with prototype code — that offers significant privacy and is straightforward to implement.

Goals. Carrying out behavioral advertising without tracking requires several things. First, the user needs to be profiled and categorized based on their browsing history. In nearly all proposed solutions, this happens in the user’s browser. Second, we need an algorithm for selecting targeted ads to display each time the user visits a page. If the profile is stored locally and not shared with the advertising company, this is quite nontrivial. The final component is for reporting of ad impressions and clicks. This component must also deal with click fraud, impression fraud and other threats.

Existing approaches

The chart presents an overview of existing and proposed architectures.

“Cookies” refers to the status quo of server-side tracking; all other architectures are presented in research papers summarized in the Do Not Track bibliography page. CoP stands for “Client-only Profiles,” the architecture proposed by Bilenko and Richardson.

Several points of note. First, everything except PrivAd — which uses an anonymizing proxy — reveals the IP address, and typically the User Agent and Referer to the ad company as part of normal HTTP requests. Second, everything except CoP (and the status quo of tracking cookies) requires software installation. Opinions vary on just how much of a barrier this is. Third, we don’t take a stance on whether PrivAd is more deployable than ObliviAd or vice-versa; they both face significant hurdles. Finally, Adnostic can be used in one of two modes, hence it is listed twice.

There is an interesting technological approach, not listed above, that works by exposing more limited referer information. Without the referer header (or an equivalent), the ad server may identify the user but will not learn the first-party URL, and thus will not be able to track. This will be explored in more depth in a future article.

New approach. In the solution we propose here, the server is recruited for profiling, but doesn’t store the profile. This avoids the need for software installation and allows easy deployability. In addition, non-tracking is externally verifiable, to the extent that IP address + User-Agent is not nearly as effective for tracking as cookie-based unique identifiers.[1] Like CoP, and unlike Adnostic, each ad company can only profile users during visits to pages that it has a third-party presence on, rather than all pages.

Profiling algorithm.

1. The user visits a page that has embedded content from the ad company.
2. JavaScript in the ad company’s content sends the top-level URL to a special classifier service run by the ad company.  (The classifier is run on a separate domain.  It does not have any cookies or other information specific to the user.)
3. The classifier returns a topic classification of the page.
4. The ad company’s JavaScript receives the page classification and uses it to update the user’s behavioral profile in HTML5 storage.  The JavaScript may also consider other factors, such as how long the user stayed on the page.

There is a fair degree of flexibility in steps 3 and 4 — essentially any profiling algorithm can be implemented by appropriately splitting it into a server-side component that classifies individual web pages and a client-side component that analyzes the user’s interaction with these pages.

Ad serving and accounting.

The ad serving process in our proposal is the same as in Adnostic — the server sends a list of ads along with metadata describing each ad, and the client-side component picks the ad that best matches the locally stored profile. To avoid revealing which ad was displayed, the client can either download all (say, 10) ads in the list while displaying only one, or the client downloads only one ad, but ads are served from a different domain which does not share cookies with the tracking domain. Note the similarity to our frequency capping approach, both in terms of the algorithm and its privacy properties.

Accounting, i.e., billing the right advertiser, is also identical to Adnostic for the cost-per-click and cost-per-impression models; we refer the reader there. We defer discussion of the cost-per-action model to a future post.

Implementation. We implemented our behavioral targeting algorithm using HTML5 local storage. As with our frequency capping implementation, we found that lookups and profile updates were fast in modern desktop and mobile browsers. For simplicity, our implementation uses a static local database mapping websites to interest segments and a binary threshold for determining interests. In practice, we expect implementers would maintain the mapping server-side and apply more sophisticated logic client-side.
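A minimal sketch of the static-mapping-plus-threshold approach just described, with an invented mapping and threshold value:

```python
# Hypothetical static database mapping sites to interest segments.
SEGMENTS = {"espn.com": ["sports"], "kayak.com": ["travel"]}
THRESHOLD = 3  # visits before a segment counts as an interest

def interests(visit_counts):
    """Binary threshold: a segment is an interest once it crosses THRESHOLD."""
    return {seg for seg, n in visit_counts.items() if n >= THRESHOLD}

counts = {}
for site in ["espn.com", "espn.com", "espn.com", "kayak.com"]:
    for seg in SEGMENTS.get(site, []):
        counts[seg] = counts.get(seg, 0) + 1
print(interests(counts))  # {'sports'}
```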

We also present a different work-in-progress implementation that’s broader in scope, encompassing retargeting, behavioral targeting and frequency capping.

Conclusion. Certainly there are costs to our approach — a “thick-client” model will always be slightly more inconvenient to deploy and maintain than a server-based model, and will probably have a lower targeting accuracy. However, we view these costs as minimal compared to the benefits. Some compromise is necessary to get past the current stalemate in web tracking.

Technological feasibility is necessary, but not sufficient, to change the status quo in online tracking. The other key component is incentives. That is why Do Not Track, standards and advocacy are crucial to the online privacy equation.

[1] The engineering and business reasons for this difference in effectiveness will be discussed in a future post.

To stay on top of future posts, subscribe to the RSS feed or follow me on Google+.


Adding Privacy to Apps Permissions

by Sid Stamm (noreply@blogger.com) (Sid Stamm) at May 22, 2012 10:37 PM UTC
I've been thinking about app permission models, especially as we're working on B2G and need a way for users to safely and thoughtfully manage the apps on their device.  Most permission models strive to do precisely one thing: allow apps to ask for consent to use features.

The problem I have with "allow/deny" consent to use features is that there's not a clear usage intention in having the access; a mirror app that asks for access to your camera probably doesn't need to store data it gets from the sensor, but it could go so far as to store video (and perhaps send it to "sneakyprivacyinvadors.com" to spy on you).

If apps can explain their usage intentions, consumers of the apps have more context and can make better decisions about the permissions they grant.  While the software probably can't make sure the usage intentions are actually followed, this commitment to customers puts the app developers on the hook for doing the right thing.

Head on over to the discussion in mozilla.dev.webapps where I've posted my thoughts, and let us know what you think.

Edit (23-May-2012 / 9:33 PDT): Google Groups (the public archive) did not pick up my original post to the group.  If you're not subscribed via NNTP or the dev-webapps mailing list, you can see my original post in the quoted text of the first reply by Paul.

Congressmen pushing awful cybersecurity bill fail cybersecurity 101

by Christopher Soghoian (noreply@blogger.com) (Slight Paranoia) at April 18, 2012 03:36 PM UTC

Over the last several months, several cybersecurity bills have been proposed by various Congressional committees. One of the leading bills, the Cyber Intelligence Sharing and Protection Act (CISPA), has been proposed by Rep. Mike Rogers (R-Mich.) and Rep. Dutch Ruppersberger (D-Md.). Many of the major civil liberties groups like EFF and ACLU have legitimately criticized the substance of the bill, which would give companies a free pass to share their customers' private information with the government.

I'm not going to get into the weeds and criticize specific portions of this bill. Instead, I want to make a broader point - Congress knows absolutely nothing about cybersecurity, and quite simply, until it knows more, and starts leading by example, it has no business forcing its wishes on the rest of us.

Congressmen Rogers and Ruppersberger are, respectively, the chairman and ranking member of the House Intelligence Committee. Although it is no secret that most members of Congress do not have technologists on staff providing them with policy advice, we can at least hope that the two most senior members of the Intelligence Committee have in-house technical advisors with specific expertise in the area of information security. After all, without such subject-area expertise, it is hard to see how they could even evaluate, let alone put their names on, the cybersecurity legislation that was almost certainly ghostwritten by other parts of the government - specifically, the National Security Agency.

So, given that these two gentlemen feel comfortable forcing their own view of cybersecurity on the rest of the public, I thought it would be useful to look at whether or not they practice what they preach. Specifically, how good is their own information security? While I am not (for legal reasons) going to perform any kind of thorough audit of the two members' websites or email systems, even the most cursory evaluation is pretty informative.

HTTPS and Congressional websites

HTTPS encryption is the most basic form of security that websites should use - providing not only confidentiality, but also authentication and integrity, so that visitors to a site can be sure they are indeed communicating with the site they believe they are visiting. All big banks and financial organizations use HTTPS by default, Google has used it for Gmail since January 2010, and even the CIA and NSA websites use HTTPS by default (even though there is absolutely nothing classified on either of the two spy agency public sites). Some in Congress have even lectured companies about their lack of default HTTPS encryption - one year ago, Senator Schumer wrote to several major firms including Yahoo and Amazon, telling them that "providers of major websites have a responsibility to protect individuals who use their sites and submit private information. It’s my hope that the major sites will immediately put in place secure HTTPS web addresses.”

It is now 2012. HTTPS is no longer an obscure feature used by a few websites. It is an information security best practice and increasingly becoming the default across the industry. It is therefore alarming that not only do Congressional websites not offer HTTPS by default, but most members' websites don't support HTTPS at all.

Rogers

For example, the webserver running Congressman Mike Rogers's website seems to support HTTPS, however, attempting to visit https://mikerogers.house.gov/ (or https://www.mikerogers.house.gov/) will result in a certificate error.

This is perhaps a bit better than Congressman Rogers's campaign website, which does not appear to be running a HTTPS webserver at all. Attempting to visit https://www.mikerogersforcongress.com/ results in a connection error.

Ruppersberger

When I manually tried to visit the HTTPS URL for Congressman Ruppersberger's website last night, it instead redirected me to the Congressional Caucus on Intellectual Property Promotion. Soon after I called the Congressman's office this morning to question his team's cybersecurity skills, the site stopped redirecting visitors, and now instead displays a misconfiguration error.

Congressman Ruppersberger's campaign webserver appears to support HTTPS, but returns a certificate error.

Congressional websites could do HTTPS

While most Congressional websites return HTTPS certificate errors, the problems largely seem to be configuration issues. The webserver that runs all of the house.gov websites is listening on port 443, and it looks like Akamai has issued a wildcard *.house.gov certificate that can be used to secure any Congressional website. As an example, Nancy Pelosi's website supports HTTPS without any certificate errors (although it looks like some content on that page is still delivered over unencrypted HTTP). This means that the Congressional IT staff can enable HTTPS encryption for Rogers, Ruppersberger and every other member without having to buy any new HTTPS certificates or set up new webservers. The software is already all there - and the fact that these sites do not work over HTTPS connections suggests that no one in the members' offices has asked for it. After all, if Nancy Pelosi's site can offer a secure experience, other members of Congress should be able to get similar protections too.

Remember SOPA

During the SOPA debate several months ago, a few members seemed to take pride in acknowledging their total ignorance regarding technology, proclaiming that they were not nerds, didn't understand the Internet, but even so still thought that SOPA was a good bill. Those members were justifiably ridiculed for ignoring technical experts while voting for legislation that would significantly and negatively impact the Internet.

Here, we have members who've not even bothered to ask the Congressional IT staff to make sure that their websites support HTTPS, let alone use it by default, who are now telling the rest of the country that we should trust their judgment on the complex topic of cybersecurity.

Until the respective Congressional committees that deal with technology issues actually hire subject matter experts, any legislation they propose will lack legitimacy and will most likely be ineffective. Likewise, if Congress thinks that cybersecurity is a priority, perhaps it should lead by example.

In the summer of 2010, I filed an FTC complaint (pdf) against Google for deceiving its users about the extent to which it knowingly leaks user search queries to third parties via the referrer header sent by web browsers. Shortly after my complaint was made public, a class action firm hit Google with a lawsuit over the practice.

Like many privacy class actions, the lawyers included every possible legal argument they could think of. One of their claims was that Google had violated the Stored Communications Act, which prohibits companies from sharing the contents of users' communications contents with other parties (even law enforcement agencies, unless they have a warrant).

The federal judge assigned to the case recently threw out all but one of the class action firm's claims, but has permitted the case to continue, focusing solely on Google's alleged violations of the Stored Communications Act. As such, one of the next big, important issues that the court is going to have to address is whether or not search queries are considered communications content under the Stored Communications Act.

As law professor Eric Goldman recently observed, "the SCA's poor drafting means that no one (including the judges) knows exactly what's covered by the statute." This is certainly true, and made worse by the fact that the statute hasn't really been updated since it was passed in 1986, long before the first web search engine or referrer header. It is for this very reason that DOJ has argued that the government should be able to get search engine query data without a warrant. Thankfully, Google disagrees.

Google: Search queries are content

At a recent event at San Francisco Law School, Richard Salgado, Google's Director of Law Enforcement and Information Security spoke publicly (for the first time) about Google's aggressively pro-privacy legal position on search queries and government access:

As far as search warrants and content go, Google and I think a lot of providers are taking this position, sees the 4th amendment particularly as it has been applied in the Warshak cases, as establishing that there is a reasonable expectation of privacy such that disclosure of the contents held with the third party is protected by the 4th Amendment. And not limited to email, but other material that is uploaded to the service provider to be handled by the service provider.

You hear a lot about ECPA about electronic communications service, ECS and remote computing service, RCS, and the crazy rules that apply [for example], the 180 day rule. I think most providers now, although I really should only speak to Google, view the way the case law is going and certainly viewing the 4th Amendment as applying to any content that is provided by the user to the service, so that, for Google, would include things like Calendar and Docs, and all those others, even where there is not a communication function going on, that there's not another party involved in the Doc that you're uploading, the notes that you're keeping for yourself. It's still material that you've put with the service provider as part of the service that the company, in this case Google, is holding on your behalf. It's our view that that is protected by the 4th amendment, and unless one of the exceptions to the warrant requirement apply, it's not to be disclosed to a government entity as a matter of compulsion.

Question: Where does search fall in that?

Answer: Search is one where we take a pretty hard stance, the same with other material, so we view search as it's provided to us the way that other information is provided to us. That is very consistent with the litigation with the Department of Justice back in 2006.

Now, it seems pretty clear that Salgado is primarily talking about Google's view that the 4th Amendment protects user search queries, and is not arguing that they are communications content under the Stored Communications Act. Prior to this public event, I had heard reliable rumors that Google had adopted a warrant position for search queries based on the Stored Communications Act. Perhaps my sources were wrong, or perhaps Google realizes that it is going to be difficult to simultaneously argue two different positions on search engine queries and the SCA.

Even so, I suspect Google's legal team is still going to have a difficult time convincing the judge in this case that search engine queries are private enough for the company to repeatedly argue that they deserve warrant protections under the 4th Amendment, yet not private enough to deserve protections under the Stored Communications Act's prohibition against sharing communications content.

After all, as Al Gidari, Google's top privacy outside lawyer himself said at Brookings last year:

"[C]ontent is content, I don’t care how many times you try to repackage it into something else, content is still content, and the standards that we try to apply that give lesser protection to that content inevitably falls short, as well, when people stop and think about it."

ACLU docs reveal real-time cell phone location spying is easy and cheap

by Christopher Soghoian (noreply@blogger.com) (Slight Paranoia) at April 03, 2012 04:25 PM UTC
"Technological progress poses a threat to privacy by enabling an extent of surveillance that in earlier times would have been prohibitively expensive."
-- US v. Garcia, 474 F. 3d 994 - Court of Appeals, 7th Circuit 2007

In 2009, I attended a surveillance industry trade show (the "wiretapper's ball") in Washington DC where I recorded an executive from Sprint describing, in depth, the location tracking capabilities his company provided to law enforcement agencies:

"[M]y major concern is the volume of requests. We have a lot of things that are automated but that's just scratching the surface. One of the things, like with our GPS tool. We turned it on the web interface for law enforcement about one year ago last month, and we just passed 8 million requests. So there is no way on earth my team could have handled 8 million requests from law enforcement, just for GPS alone. So the tool has just really caught on fire with law enforcement. They also love that it is extremely inexpensive to operate and easy, so, just the sheer volume of requests they anticipate us automating other features, and I just don't know how we'll handle the millions and millions of requests that are going to come in."
-- Paul Taylor, Electronic Surveillance Manager, Sprint Nextel.

The information that I gathered was one of the first real data points revealing the scale and ease with which law enforcement and intelligence agencies can now collect real-time location data from wireless phone carriers. This is because unlike wiretaps, there are no annual statistics produced by the courts that detail the number of location surveillance orders issued each year.

My disclosure of this information led to significant news coverage, but also to a citation from Judge Kozinski of the 9th Circuit, who observed in dissent in U.S. v. Pineda-Moreno that:

When requests for cell phone location information have become so numerous that the telephone company must develop a self-service website so that law enforcement agents can retrieve user data from the comfort of their desks, we can safely say that "such dragnet-type law enforcement practices" are already in use.

ACLU FOIA docs reveal other carriers have followed Sprint's lead

It appears that Sprint is not the only wireless company to provide law enforcement agencies with an easy way to track the location of targets in real-time.

Among the 5,500 pages of documents obtained by the ACLU as part of a nationwide FOIA effort are a few pages from Tucson, AZ detailing (or at least hinting at) the real-time location tracking services provided to the government by the major wireless carriers.

AT&T's Electronic Surveillance Fee Schedule reveals that the company offers an "E911 Tool" to government agencies, which it charges $100 to activate, and then $25 per day to use.

While it is no secret that Sprint provides law enforcement agencies subscriber real-time GPS data via its "L-Site" website (read the L-site manual), Sprint's Electronic Surveillance Fee Schedule reveals that the company charges just $30 per month for access to this real-time data.

The documents from T-Mobile provide by far the greatest amount of information about the company's real-time location tracking capabilities. The company's Locator Tool service, which it charges law enforcement agencies $100 per day to access, generates pings at customizable 15-, 30- or 60-minute intervals, after which the real-time location information is emailed directly to the law enforcement agency.

Unfortunately, Verizon's surveillance pricing sheets do not reveal any information about GPS tracking. It is almost certain that the company does provide real-time location data, but for now, we don't know how it is provided, or at what cost.

Federal judge: Google free to tell user about mysterious gov requests, likely related to Wikileaks

by Christopher Soghoian (noreply@blogger.com) (Slight Paranoia) at March 26, 2012 09:13 PM UTC

Summary

In two 1-page orders issued today, a Federal judge in Virginia has (for a second time) ruled that Google is permitted to tell a customer (and only that customer) about two mysterious surveillance orders -- a 2703(d) order and a search warrant -- issued in June, 2011 for records (likely including communications content) associated with their Google account.

While Google is only permitted to notify the subscriber who was the subject of surveillance, that person is then free to tell anyone else they wish.

Background

One month ago, a federal judge published two (pdf) orders (pdf) [hereafter the February 2012 orders], related to two previously secret surveillance orders obtained in June, 2011 by the government seeking data about a Google subscriber. In the two February 2012 orders, the judge ruled that Google could tell the user about the earlier surveillance orders.

Soon after, the government filed a motion with the court, seeking to clarify whether Google could tell any person about the orders, or merely the impacted user.

In the two orders issued today, the judge seems to have been convinced by the government's clarifying motion. Thus, in 14 days (unless the government appeals), Google will be free to tell the impacted user (and no one else) about the June 2011 surveillance orders.

This may involve Wikileaks

When Jeff Rollins at PaidContent first highlighted the existence of these two mysterious court orders, he suggested that they might be related to the Megaupload investigation. The Megaupload connection was mere speculation on his part (as he acknowledged), as there simply isn't anything solid in those two brief court orders that identifies a particular target.

However, for the reasons I outline below, I believe that these surveillance orders are actually related to the investigation of Wikileaks.

First, in one of the February 2012 orders (page 2), the judge noted that "[t]he existence of the investigation in issue and the government’s wide use of § 2703(d) orders and other investigative tools has been widely publicized now."

The only high-profile federal investigation that I can think of in recent times involving 2703(d) orders is the government's investigation of individuals associated with Wikileaks. That is, while the Megaupload indictment was also filed in the Eastern District of Virginia, there has been little publicity surrounding the actual investigative legal instruments used in the case.

Specifically, I've not seen any published media report indicating that a 2703(d) order was used in that investigation. In contrast, the 2703(d) order issued to Twitter as part of the Wikileaks investigation has itself been a major story, as have the (failed) efforts of the ACLU, EFF and others to quash the order.

In December 2010, a judge from the same court issued a 2703(d) order to Twitter, forcing the company to disclose information about several users associated with Wikileaks. A month later, the Twitter judge agreed to unseal that order, allowing Twitter to notify the impacted individuals. Once the existence of the surveillance order was made public, the media went crazy.

The Wall Street Journal later revealed that Google and California broadband provider Sonic had received similar requests as part of the same investigation. At the time of the WSJ report, those surveillance orders remained sealed.

Second, one persistent rumor in Washington DC over the past year has been that one of the main reasons DOJ has cited justifying the continued sealing of the Wikileaks/Google/Sonic orders is a fear of harassment from the Internet community directed at the prosecutors involved in the case.

As the WSJ revealed earlier this year, the address of Tracy Doherty McCormick, the prosecutor whose name was on the original Twitter order "was spread online, and the person's email account [tracy.mccormick@usdoj.gov] was subscribed to a pornography site." According to the unnamed officials quoted by the WSJ, she was also "bombarded with harassing phone calls."

The WSJ also reported that fear of similar harassment led "the government to take the rare step of keeping officials' names out of news releases and public statements when the government shut down the website Megaupload.com." It is likely that similar fears were the reason that no prosecutors names were listed in the recently published Lulzsec indictments.

Why do I mention this? Well, the two orders issued by the judge today specifically state that Google may share a copy of the 2703(d) order and search warrant with the impacted subscriber, but that the email address and name of the attesting official must be redacted first.

This suggests that someone at DOJ has told the judge they are fearful of retaliation from the Internet community -- thus also suggesting that this surveillance is related to a high-profile investigation of a target for whom Anonymous and other Internet activists may feel some sympathy. While this certainly could be the Megaupload case, I'd be willing to bet a few dollars that this involves Wikileaks.

Firefox switching to HTTPS Google search by default (and the end of referrer leakage)

by Christopher Soghoian (noreply@blogger.com) (Slight Paranoia) at March 21, 2012 12:10 PM UTC

A few days ago, Mozilla's developers quietly enabled Google's HTTPS encrypted search as the default search service for the "nightly" developer trunk of the Firefox browser (it will actually use the SPDY protocol). This change should reach regular users at some point in the next few months.

This is a big deal for the 25% or so of Internet users who use Firefox to browse the web, bringing major improvements in privacy and security.

First, the search query information from these users will be shielded from their Internet service providers and governments who might be using Deep Packet Inspection (DPI) equipment to monitor the activity of users or censor and filter search results.

Second, the search query information will also be shielded from the websites that consumers visit after conducting a search. This information is normally leaked via the "referrer header". Google has in the past gone out of its way to facilitate referrer header based data leakage (which led to me filing a FTC complaint against the firm in 2010).

However, in October 2011, Google turned on HTTPS search by default for signed-in users, and at the same time, began scrubbing the search query from the non-HTTPS URL that HTTPS users are redirected to (and that subsequently leaks via the referrer header) before they reach the destination website:

Over the next few weeks, many of you will find yourselves redirected to https://www.google.com (note the extra “s”) when you’re signed in to your Google Account. This change encrypts your search queries and Google’s results page....

What does this mean for sites that receive clicks from Google search results? When you search from https://www.google.com, websites you visit from our organic search listings will still know that you came from Google, but won't receive information about each individual query.

At the time of the announcement, Google told the search engine optimization (SEO) industry (a community that very much wants to be able to continue to passively receive this kind of detailed user data) that the percentage of users whose search queries would be shielded would be a "single digit" -- and thus, at least 90% of Google users would still continue to unknowingly leak their search queries as they browse the web.

Shortly after Google's October announcement, search engine industry analyst Danny Sullivan told the SEO community that the days of referrer leakage were doomed:

But the future is clear. Referrer data is going away from search engines, and likely from other web sites, too. It’s somewhat amazing that we’ve had it last this long, and it will be painful to see that specific, valuable data disappear.

But from a consumer perspective, it’s also a better thing to do. As so much more moves online, referrers can easily leak out the location of things like private photos. Google’s move is part of a trend of blocking that already started and ultimately may move into the browsers themselves.

It looks like Danny was right.

Google's October 2011 decision to start proactively scrubbing search queries from the referrer header was a great first step, but only a small percentage of Google's search users benefited. Now that Mozilla is switching to HTTPS search, hundreds of millions of Firefox users will have their privacy protected, by default.

The only surprising aspect of this otherwise great bit of good news is that the first major browser to use HTTPS search is Firefox and not Chrome. I had reasonably assumed that as soon as Google's pro-privacy engineers and lawyers won the internal battle against those in the company sympathetic to the needs of the SEO community, Google's flagship browser would be the first to ship HTTPS search by default.

Just as it showed strong privacy leadership by being the first browser to embrace Do Not Track, Mozilla is similarly showing its users that privacy is a priority by being the first to embrace HTTPS search by default. For Mozilla, this is a clear win. For the Chrome team, whose browser has otherwise set the gold standard for security (and who have proposed and implemented a mechanism to enable websites to limit referrer leakage), this must be extremely frustrating and probably quite embarrassing. Hopefully, they will soon follow Mozilla's lead by protecting their users with HTTPS search by default.

(Just to be clear - the ultimate decision to enable HTTPS search by default was largely in the hands of Google's search engineers, who are responsible for dealing with the increased traffic. Mozilla's privacy team deserves the credit for pressuring Google, and Google's search engine team deserve a big pat on the back for agreeing to cope with encrypted searches from hundreds of millions of users.)

FBI seeks warrant to force Google to unlock Android phone

by Christopher Soghoian (noreply@blogger.com) (Slight Paranoia) at March 14, 2012 02:47 PM UTC

Today, I stumbled across a recent FBI application and accompanying affidavit for a search warrant ordering Google to unlock a screen-locked Android phone. The application asks Google to: "provide law enforcement with any and all means of gaining access, including login and password information, password reset, and/or manufacturer default code ("PUK"), in order to obtain the complete contents of the memory" of a seized phone.

The phone in question was seized from a gentleman named Dante Dears, a founding member of the "Pimpin' Hoes Daily" street gang. On January 17, 2012, a cellphone was seized from Dears by an FBI agent, who then obtained a search warrant to look through the device. According to the affidavit, the technicians at the FBI Regional Computer Forensics Lab (RCFL) were unable to get past the electronic "pattern lock" access controls protecting the phone (apparently, entering multiple incorrect unlock sequences will lock the memory of the phone, which can then only be accessed by entering the user's Gmail username and password).

So why is this interesting and noteworthy?

First, it suggests that the FBI's computer forensics lab in Southern California is unable, or unwilling to use commercially available forensics tools or widely documented hardware-hacking techniques to analyze seized phones and download the data from them.

Second, it suggests that a warrant might be enough to get Google to unlock a phone. Presumably, this is not the first time that the FBI has requested Google unlock a phone, so one would assume that the FBI would request the right kind of order. However, we do not know if Google has complied with the request. Given that an unlocked smartphone will continue to receive text messages and new emails (transmitted after the device was first seized), one could reasonably argue that the government should have to obtain a wiretap order in order to unlock the phone.

Third, on page 13 of the warrant application, the government asks that the owner of the phone not be told about the government's request to unlock his phone. It is surprising then that the warrant and the associated affidavit have not been sealed by the court.

making DNT easier for web sites

by Sid Stamm (noreply@blogger.com) (Sid Stamm) at March 12, 2012 11:53 PM UTC
Jos Boumans has done some analysis of the effect of turning on Do Not Track in your browser, and his findings show that sites in general have been slow to indicate that they support the feature.
"As it stands, only 4 out of 482 measured top 500 sites are actively responding to the DNT header being sent." (Link)
As a user, it's hard to tell if sites are honoring my Do Not Track request, and as a site developer, it might be a daunting task to hack up my back-end code.  The W3C Tracking Protection Working Group is working on improving transparency and implementations, but in the meantime Jos has released his mod_cookietrack Apache module to make it easier for site owners to track their users' clicks in a respectful way -- right now.
The Apache module, mod_cookietrack, does all the same sorts of things as mod_usertrack, but one thing it does better is honor DNT: if a server using this module sees "DNT: 1" in an HTTP request, it replaces the tracking cookie with one whose value is simply "DNT" -- something that is not unique to a visitor.
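The behavior described above is easy to sketch outside of Apache. This is an illustrative Python sketch of the cookie logic, not the actual mod_cookietrack C code; the cookie name and function names are my own:

```python
import uuid

COOKIE_NAME = "Apache"  # mod_usertrack-style cookie name (illustrative)

def tracking_cookie_value(request_headers: dict) -> str:
    """Return the value a DNT-respecting server sets for its tracking cookie.

    If the request carries "DNT: 1", every such visitor gets the same
    constant value "DNT", so the cookie can no longer single anyone out.
    Otherwise, the visitor gets a unique identifier, as mod_usertrack would.
    """
    if request_headers.get("DNT") == "1":
        return "DNT"            # shared, non-unique value: no tracking
    return uuid.uuid4().hex     # unique per-visitor identifier

# A DNT user and a non-DNT user get very different cookies:
print(tracking_cookie_value({"DNT": "1"}))  # prints "DNT"
print(tracking_cookie_value({}))            # prints a random 32-char hex id
```

The nice property is that the server keeps its click-counting machinery intact for users who haven't opted out, while DNT users blend into a single anonymous bucket.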

Apparently it was a lot of work to get DNT supported properly in mod_cookietrack, a native Apache module that performs well and is thread-safe, so thanks, Jos, for your hard work making it easier for more organizations to support DNT on their web sites.

Malware and Phishing Protection in Firefox

by Sid Stamm (noreply@blogger.com) (Sid Stamm) at February 27, 2012 06:30 AM UTC
For a while, Firefox has included malware and phishing protection to keep our users safe on the web.  Recently, Gian-Carlo Pascutto made some significant improvements to our Firefox support for the feature, resulting in much more efficient operation and use of the Safe Browsing API for this protection.

Privacy in the Safe Browsing API

I want to take a little time to explain how this feature works and why I like it from a privacy perspective:  Firefox can check whether or not a web site is on the Safe Browsing blacklist without actually telling the API what the web site is called.

At a high level, using this API to find URLs on the "bad" list is like asking your friend to identify whether or not he likes things you show him through a dirty window.  Say you hold up an apple to the dirty window, and your friend on the other side sees a fuzzy image of what you're holding.  It looks round and red and pretty small, but he's not sure what it is.  Your friend looks at his list of things he doesn't like and says he likes everything like that except for plums and red tennis balls.  While he still does not know exactly what you're holding, you can know for sure he likes the apple.

More technically, this uses a hash function to turn web URLs into numbers; each URL corresponds to exactly one number.  For each site you visit, Firefox hashes the URL and sends only the first part of the resulting number to the Safe Browsing API.  The API responds with all the values on its list of bad URLs that start with the prefix it received.  When Firefox gets that list of "bad" site hash values, it looks to see if the site's entire hash is in the list.  Based on whether or not it is, Firefox can determine whether the URL is on the Safe Browsing blacklist.

Consider this hypothetical example of two sites and their (fake) hash values:

Site                      Hash Value
http://mozilla.com        1339
http://phishingsite.com   1350

When you visit http://mozilla.com, Firefox calculates the hash of the URL, which is 1339.  It then asks the Safe Browsing API what bad sites it knows about that start with "13".  The API returns a list of numbers including "1350".  Firefox takes that list, notices that 1339 (http://mozilla.com) is not in it, and concludes the site must be okay.

If you repeat the same procedure with http://phishingsite.com, the same prefix "13" is sent to the API, and the same list of bad sites (including 1350) is returned.  In this case, however, the site's hash is "1350", so Firefox knows it's on the list of bad sites and gives you a warning.

For you techies and geeks out there: yeah, I'm glossing over a few protocol details, but the gist is that you don't need to tell Google exactly where you browse in return for the bad-stuff blocking. 
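For the curious, the prefix-matching exchange above can be sketched in a few lines of Python. This is a simplified model, not the real protocol: I'm assuming SHA-256 hashes, 4-byte prefixes, and a made-up blacklist, and I'm skipping canonicalization and local list caching entirely:

```python
import hashlib

# Hypothetical server-side blacklist (illustrative URLs only).
BAD_URLS = ["http://phishingsite.com/", "http://malware.example/"]

def url_hash(url: str) -> bytes:
    # Hash the URL into a fixed-size number (here, SHA-256).
    return hashlib.sha256(url.encode("utf-8")).digest()

# Server side: index the full bad-site hashes by their 4-byte prefix.
SERVER_INDEX: dict[bytes, list[bytes]] = {}
for u in BAD_URLS:
    h = url_hash(u)
    SERVER_INDEX.setdefault(h[:4], []).append(h)

def server_lookup(prefix: bytes) -> list[bytes]:
    """Server: return every full bad-site hash sharing this prefix."""
    return SERVER_INDEX.get(prefix, [])

def is_blacklisted(url: str) -> bool:
    """Client: send only a short hash prefix, compare full hashes locally."""
    h = url_hash(url)
    candidates = server_lookup(h[:4])  # the server never sees the URL itself
    return h in candidates

print(is_blacklisted("http://mozilla.com/"))       # False
print(is_blacklisted("http://phishingsite.com/"))  # True
```

The privacy property lives in `is_blacklisted`: only the short prefix crosses the wire, so many unrelated URLs share what the server sees, and the final full-hash comparison happens on the client.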

Keeping the Safe Browsing Service Running Smoothly

Google hosts the Safe Browsing service on the same infrastructure as many of its other services, and it needs to ensure that our users aren't blocked from accessing the malware and phishing blacklists, as well as make sure it invests the right resources to keep the service operating well.  One of the mechanisms Google needs for this quality-of-service assurance is a cookie, so the first request Firefox makes to the Safe Browsing API results in the setting of a Google cookie.

I know that not everyone likes that cookie, but Google needs it to make sure their service is working well so I've been working with them to ensure that they can use it for quality of service metrics but not track you around the web.  The most straightforward way to do this is to split the Firefox cookie jar into two: one for the web and one for the Safe Browsing feature.  It's not there yet, but with a little engineering work, in a future version of Firefox that cookie will only be used for Safe Browsing, and not sent with every request to Google as you browse the web.

The cookie can be turned off entirely if you disable third-party cookies in Firefox.  When you turn off third-party cookies, even if the cookie has been previously set, your browser will not send the Google cookie -- unless you visit a Google website. You can also turn off malware and phishing protection, but I really don't recommend it.

Making "Safer Browsing"

While Firefox has been using Safe Browsing for a while, Google has started experimenting with a couple of new features in Safe Browsing for additional malware and phishing filtering.  Both features are still experimental, and it's not yet clear how effective they are or what fraction of my browsing history would be traded for the improvement.  Both involve sending whole URLs to Google, and departing from Firefox's current privacy-preserving approach requires evidence of a significant gain in protection. When Google measures and shares how much protection its pilot deployment in Chrome actually gains, we can take a deeper look and consider whether these new features are worth it.

For now, Firefox users are getting a lot of protection for very little in return and there does seem to be good reason for Google to use cookies with Safe Browsing.  We are always looking out for things we can do to give Firefox users both the best of privacy and security.

Blogs