The Social Graph API and Surprises
On Friday, Google released an interesting new API called the "Social Graph API" which can be used to connect accounts online. The API is a concrete step in opening the social graph, and it is great to see Google demonstrating just how important decentralized social networks are as it really reinforces what we, Plaxo, the DiSo Project, and many others have been building. But we also wanted to highlight some of the complex issues of making social applications on the web that pop up with these kinds of initiatives -- issues that are especially important since some people (including danah boyd and ReadWriteWeb) have concerns about the privacy implications of Google's work.
We're excited about APIs to access social relationships online. But we're concerned that existing systems may result in ugly surprises when your relationships are available in a new or unexpected context. We think that access to the social graph must give individuals control over how and where their relationships are shared.
I've had many conversations with Brad Fitzpatrick (who joined Google last August to develop this API) about the work we're trying to accomplish at both of our companies, as you can see from the initial publication of the "Thoughts on the Social Graph" last year. When I re-joined Six Apart last August I started developing an open source service that would crawl online relationship data and expose it via an API. I showed snippets of it on my blog (pictures and text), gave a brief demo at the Data Sharing Summit, and even came close to releasing an online tool which visualized all of your accounts and friends; instead we opted for demonstrating its power with a screencast showing how you could use it to find your friends. While this implementation of the API was based on publicly discoverable information (like Google's), we simply didn't feel comfortable shipping that project based on current implementations.
Why not? Well, for us to be comfortable that we weren't doing any evil, we wanted to make sure that we first had a way to clearly explain to non-technical users a few important points:
- Where each point of data about relationships is coming from
- How to hide or control sharing of data on each service you use
- A way to prompt the services to update relationships when they change, to make sure they're up to date, as we prototyped along with Ma.gnolia in October
There are many more requirements we could add to this short list, but these seemed like fundamentals to make sure that people have a very high degree of control over their relationship data, and the current implementation of Google's Social Graph API falls a bit short. This is even more important, for example, when we learn in ReadWriteWeb's post on privacy concerns that Aber Whitcomb (CTO of MySpace) has said that Google's "API includes a custom mechanism to extract social connections between friends on MySpace." This means that Google isn't just using profile information designed to be aggregated, but is already willing to extract data as needed (much as Plaxo did to Robert Scoble's Facebook account) from services that don't explicitly share it. While Facebook easily blocked Plaxo's specific crawler, it seems extremely unlikely that MySpace would block Google. This is even more fraught because Google is likely basing this API on the data they collect when routinely crawling the web for their search engine. Controlling access to relationship data from a service should not require completely blocking all of Google's crawlers.
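To make the three requirements above a little more concrete, here is a minimal Python sketch of how an aggregator might store relationship data so that each connection carries its source, each person can hide a service, and stale data can be flagged for a refresh. Every name in it (`Relationship`, `SharingPreferences`, `visible_relationships`, `needs_refresh`, the seven-day default) is hypothetical and chosen for illustration; it is not how our prototype or Google's Social Graph API works.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Relationship:
    """One claimed connection, tagged with where it was found (hypothetical model)."""
    source_service: str   # e.g. "myspace"; label is an assumption
    source_url: str       # page the claim was read from
    from_account: str
    to_account: str
    fetched_at: datetime  # when the claim was last read


@dataclass
class SharingPreferences:
    """Per-person opt-outs, keyed by service name (hypothetical model)."""
    hidden_services: set = field(default_factory=set)

    def allows(self, rel: Relationship) -> bool:
        # A relationship is shareable unless its source service is hidden.
        return rel.source_service not in self.hidden_services


def visible_relationships(rels, prefs):
    """Return only the relationships the person has agreed to share."""
    return [r for r in rels if prefs.allows(r)]


def needs_refresh(rel, now, max_age_days=7):
    """Flag old data so the source service can be asked for an update."""
    return (now - rel.fetched_at).days > max_age_days


if __name__ == "__main__":
    now = datetime(2008, 2, 4, tzinfo=timezone.utc)
    rels = [
        Relationship("myspace", "http://example.com/a", "alice", "bob",
                     now - timedelta(days=10)),
        Relationship("linkedin", "http://example.com/b", "alice", "carol",
                     now - timedelta(days=1)),
    ]
    prefs = SharingPreferences(hidden_services={"myspace"})
    for rel in visible_relationships(rels, prefs):
        print(rel.source_service, rel.to_account, needs_refresh(rel, now))
```

The point of the sketch is only that provenance and per-service control are stored alongside each relationship rather than bolted on afterward; a real system would also need a way for the source service itself to signal changes.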
Don't get us wrong: having social networking become a feature of every application, rather than a product by itself, will dramatically change the web for the better. Tim O'Reilly discusses this where he says "It's a lot like the evolutionary value of pain. Search creates feedback loops that allow us to learn from and modify our behavior. A false sense of security helps bad actors more than tools that make information more visible." Google's Social Graph API certainly is powerful and we intend to use it within our products (such as recommending accounts to add to your Action Streams), though we will always balance our use with your privacy, as we have in the past. But we do hope to kick-start a conversation about how we can all be given more control over the way our relationships are used.
The guiding principle here is one that Brad Fitzpatrick and many others have relied on many times: The Principle of Least Surprise. PoLS is one of the things that's always guided our work at Six Apart, and we've found that many of our biggest mistakes have come when we forget the lessons it teaches us. There's an obvious point of frustration or embarrassment that can arise from exposing our personal MySpace connections in a context where our professional LinkedIn contacts can see them, for example. The fact that much of this data could theoretically be discovered anyway isn't the point. Just as much of the information in Facebook's News Feed could have been discovered anyway, the fact that these relationships are being moved from "possible to find" to "easy to discover" means that we should be thinking of how this affects social behaviors in this new context.
And the truth is, we don't know the right answer. We're hoping to start a useful conversation in the community to help find the answers. That way, we can make sure everyone who benefits from these new social features finds them to be a pleasant surprise.

3 Comments
David -
I'm glad you're thinking about this. There is a very strong difference between making something available and making something easily findable.
There is an analog in electronic court filing/records. Courts make court records public if you show up and ask for specific items, and sometimes they make the records public, but they have been hesitant to make them easily searchable on the web due to privacy and abuse concerns. I think these are very similar to the concerns you raise. EPIC has a page on privacy issues related to court records.
This is definitely a place where policy should be considered along with technology - I'm glad someone is thinking about should before could!!!
Also, another excellent resource on the court records and privacy issue is CourtAccess.org
I am glad the privacy perspective is in sight. That is going to be a much tougher beast than mapping connections. It seems the balance is to put control in the hands of the individual and the services as to how (and if) the information is gathered. Those of us who understand this stuff are not the ones who will likely have a difficult time sorting through how we mitigate the problems; it is the 98% who are not geeks who will be dealing with wonderfully crafted social engineering SPAM and other added digital mire they do not want to deal with or think about.
There are many layers missing in access and privacy around the social graph, including prevalent tools like ClaimID that help humans understand who other people are in systems by verifying personal control of digital identities they say are theirs. The human name space is thin and collisions in the digital realm are easy to overlap (I have 6 dual named people pairs in my address book, which leads to innocent problems). People friend my dad thinking it is me more than should happen.
This is a big messy issue and I am glad you are helping sort out the space with patience and understanding.