January 09, 2010

The Future: Operating System And Application-Neutral Data

We are now growing accustomed to the concept of the "cloud", where our data will increasingly be stored in Web services rather than on local disk, accessible from any computer, operating system or browser. But despite the adoption of standards by the major players that store our personal data, the choice of services causes serious vendor lock-in, because our data, be it with Microsoft, Google, Apple or other providers, is not only interpreted by their offerings but stored there as well. This storage and management of our data makes migration between services incredibly difficult, and still leaves us at the mercy of a large company whose priorities may not be the same as our own.

The time has come to start on a path to true ownership of data by the individual, reducing applications and Web services to the role of filters and containers rather than hosts, which can propagate lock-in as these services spread from their desktop roots to mobile devices and tablets.

What makes a digital device mine, be it a laptop, a cellphone or a music player, is the personal content stored on it, and how that data is translated, presented and categorized. Similarly, when we choose our preferred Web services or technology providers, we are, for the most part, passing our content to them exclusively. While we may have made the data location-independent, it is far from being service-independent, and any future switching will have a dramatic impact on time and productivity, including:
  • Complicated export and import of personal data
  • Differences in the interpretation of data between similar applications
  • Potential loss of metadata between services
  • Reduced backup stability as differing instances of our data are housed at differing services
This headache is a major part of vendor lock-in. Today, when I make a choice as to what brand computer to buy, or what phone to purchase, while I may be committing to a brand or a suite of applications, all I am really doing is asking this product to provide its own interpretation of my data, including:
  • My Contacts and Relationships
  • My Music Files
  • My Videos
  • My Photos
  • My E-mails and Hierarchy
  • My Documents
  • My Bookmarks and Hierarchy
Increasingly more important than the actual data itself is its metadata, or data about data. How is the data structured? Do I have my e-mails in folders with subfolders and rules? Do I have photos in specific albums? Do I know how often I have played a specific song or genre? When were documents created or last edited?

Today, for the most part, we are choosing from three major service providers, although there are alternatives. We can select Google, which offers Google Contacts, Calendar, YouTube, Picasa, Gmail, Google Apps and Google Chrome for the majority of our needs. We can, instead, select Apple and use Address Book, iTunes, iPhoto, Mail, and Safari. Or we can stick with Microsoft, and leverage Outlook, Windows Media Player, and Internet Explorer (or their online equivalents).

Despite standards adoption, not all programs interoperate well. No doubt there remain issues with meetings from Microsoft Exchange being received by Apple Mail, and Web browser bookmarks and history are not shared between Google Chrome and Safari. These minor problems are greatly magnified when you consider the potential for future switching, as you migrate from one platform to another or one computer to another - largely because in every case, the applications themselves, even those that are Web services, are storing our data for us, and interpreting it in their own way.

I think it is time for a change that lets us own our data, turning the situation on its head.

Instead of hosting our data with the service provider of the day, we should host our own data in a standard format, which will be adopted by the leading providers, whose applications will tap into us directly, and pull down our data and its metadata. If I chose to log in with Gmail one day, I would authenticate who I was, and Gmail would pull down my e-mail stream, complete with e-mail activity history (such as replies and forwards). Gmail would act more like a viewer than a host: changes to data, including sent items, would not be stored in Gmail, but written back to my personal "cloud", if you will. Similarly, if I opted to log in to Microsoft Outlook, tapping in to my own authenticated account, I could browse my contacts in their application, through their filter, but the data would reside with me.
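To make the idea a little more concrete, here is a rough Python sketch of what I have in mind. Nothing here is a real API: the names `PersonalStore`, `Record`, `MailApp`, `pull` and `write_back` are hypothetical, and the shared-secret check is a stand-in for whatever real authentication standard would be used. It only illustrates the shape of the arrangement, where the application is a lens and the store is the single home of the data.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    kind: str                      # e.g. "email", "contact", "photo"
    body: str
    # Metadata travels with the data: folders, play counts, timestamps...
    metadata: dict = field(default_factory=dict)


class PersonalStore:
    """Hypothetical user-owned store: the only place the data lives."""

    def __init__(self, owner_id, secret):
        self.owner_id = owner_id
        self._secret = secret      # placeholder for a real auth standard
        self._records = []

    def _check(self, owner_id, secret):
        if owner_id != self.owner_id or secret != self._secret:
            raise PermissionError("authentication failed")

    def pull(self, owner_id, secret, kind):
        """Hand an authenticated application the records of one kind."""
        self._check(owner_id, secret)
        return [r for r in self._records if r.kind == kind]

    def write_back(self, owner_id, secret, record):
        """Changes made inside an application land here, not with the vendor."""
        self._check(owner_id, secret)
        self._records.append(record)


class MailApp:
    """Hypothetical provider application: a filter over the store, not a host."""

    def __init__(self, name):
        self.name = name           # display label only; the app keeps no records

    def mail_view(self, store, owner_id, secret):
        # Each provider is free to present the same records in its own way.
        return [
            f"[{self.name}] {r.metadata.get('folder', 'Inbox')}: {r.body}"
            for r in store.pull(owner_id, secret, "email")
        ]

    def send(self, store, owner_id, secret, text):
        # The sent item is written straight back to the personal store.
        store.write_back(owner_id, secret, Record("email", text, {"folder": "Sent"}))


store = PersonalStore("me", "s3cret")
gmail, outlook = MailApp("Gmail"), MailApp("Outlook")
gmail.send(store, "me", "s3cret", "Hello from my own cloud")
print(outlook.mail_view(store, "me", "s3cret"))
```

A message sent through one application shows up in the other, with its folder metadata intact, because neither application ever held a copy of its own.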

Hosting one's own personal cloud with one's own data is not an end run around large corporations in fear of Big Brother, but a way to achieve true portability. In this situation, a longtime iPhone user could pick up an Android phone, enter a personal ID (be it through OpenID or some other standard), and pull down their details into all of Google's native applications. Similarly, I could log in to any Microsoft, Apple or Google powered device and become me, not with my data hosted on the new machine, but with my data being read, like a Web page, on that device, through its own lens.

Even as we on the Web are rallying around these concepts of standards and the cloud, vendor lock-in is as real as ever. The switching costs from hardware device to hardware device, OS to OS and Web service to Web service remain far too high, and the way around this problem is to take back our data, make it personal, and enforce standards that get the major players to come aboard. While we may not all have the broadband access necessary to make this a solution today, it's 2010, and we should be well beyond the same issues we have been facing in computing for the last 20 years.

So how do we make this happen?