Blockchain can help secure digital transactions while also protecting user privacy
Note: Professor Alex Pentland of MIT’s Media Lab will host a panel on Big Data 2.0: Next-Gen Privacy, Security, and Analytics at the upcoming MIT CIO Symposium on May 18. This article focuses on one aspect of his work.
The exponential growth of mobile and ubiquitous computing, together with big data analysis, is transforming the entire digital landscape, or “data ecology.” These shifts are having a dramatic impact on people’s awareness of and sensitivity to personal data sharing, as well as on their cybersecurity: the more data is created and shared, the more concerns rise. Recently, apprehension about privacy and the use of personal data has reached critical mass, partly because of media exposure of cybersecurity breaches and intelligence scandals. The surge of mobile transactions, micropayments, and connected sensors in both private and public spaces is expected to exacerbate this tension further.
What’s needed is a “new deal on data” in which security concerns are matched with transparency, control and privacy, all designed into the core of any data-driven service [1].
To demonstrate that such a win-win data ecology is possible, my students and I have developed Enigma, a decentralized computation platform that enables different participants to jointly store data and run computations on it while keeping the data completely private [2]. Enigma promotes a viable digital environment by supporting four key requirements:
- That data is always encrypted
- That computation happens on encrypted data only
- That data owners control access precisely, absolutely and with an audit trail
- That there are means to reliably pay data owners for use of their data
From the user’s perspective, Enigma is a hosted computing cloud that ensures both the privacy and the integrity of their data. The system also allows any type of computation to be outsourced to the cloud while guaranteeing the privacy of the underlying data and the correctness of the result. A core feature of the system is that it allows data owners to define and control who can query their data, ensuring that approved parties learn only the output of their queries. Moreover, no other data leaks to any other party.
The Enigma cloud itself consists of a network of computers that store data and execute queries. Using secure multi-party computation, each computer sees only random pieces of the data, which prevents information leaks. Furthermore, each query carries a micropayment to the provider of the computing resources, as well as a payment to the users whose data is queried, thus providing the foundation for a sustainable, secure data market.
To illustrate how the Enigma platform works, consider the following example: a group of data analysts at an insurance company wishes to test a model that leverages people’s mobile phone data. Instead of sharing their raw data with the analysts, customers can securely store their data in Enigma and grant the analysts permission only to execute their study. The analysts are therefore able to execute their code and obtain the results, but nothing else. In the process, the users are compensated for giving access to their data, and the owners of the network computers are paid for their computing resources.
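The flow in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Ledger` class, the `run_query` function, the account names and the fee amounts are all hypothetical and are not Enigma’s API. The computation here also runs on plaintext values purely for brevity, whereas in Enigma it would run on data that stays hidden from the computing nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical in-memory stand-in for a shared ledger (not Enigma's API)."""
    permissions: set = field(default_factory=set)   # (owner, service) pairs
    balances: dict = field(default_factory=dict)    # account name -> funds
    audit_log: list = field(default_factory=list)   # append-only event record

    def grant(self, owner: str, service: str) -> None:
        """Record that `owner` allows `service` to query their data."""
        self.permissions.add((owner, service))
        self.audit_log.append(f"grant {owner} -> {service}")

    def pay(self, payer: str, payee: str, amount: int) -> None:
        """Move funds between accounts and log the transfer."""
        if self.balances.get(payer, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[payer] -= amount
        self.balances[payee] = self.balances.get(payee, 0) + amount
        self.audit_log.append(f"pay {payer} -> {payee}: {amount}")


def run_query(ledger, service, data, compute, owner_fee, party, party_fee):
    """Run `compute` over the data of owners who approved `service`.

    Assumption: `data` maps owner name -> plaintext value. A real system
    would hand the computing party only masked pieces of each value.
    """
    approved = {o: v for o, v in data.items()
                if (o, service) in ledger.permissions}
    if not approved:
        raise PermissionError(f"{service} holds no permissions")

    # Check up front that the service can cover every payment.
    total_cost = owner_fee * len(approved) + party_fee
    if ledger.balances.get(service, 0) < total_cost:
        raise ValueError("insufficient funds for query")

    result = compute(list(approved.values()))
    for owner in approved:
        ledger.pay(service, owner, owner_fee)    # compensate data owners
    ledger.pay(service, party, party_fee)        # compensate the compute node
    ledger.audit_log.append(f"query by {service} over {len(approved)} owners")
    return result                                # the service sees only this


if __name__ == "__main__":
    ledger = Ledger(balances={"insurer": 100})
    ledger.grant("alice", "insurer")
    ledger.grant("bob", "insurer")
    data = {"alice": 4, "bob": 6, "carol": 9}    # carol has granted nothing
    mean = run_query(ledger, "insurer", data,
                     lambda xs: sum(xs) / len(xs),
                     owner_fee=2, party="node-1", party_fee=1)
    print(mean)             # 5.0: carol's value is never touched
    print(ledger.balances)
```

The point of the sketch is the shape of the exchange: the analysts receive only the result, every grant and payment leaves a log entry, and owners who have not granted permission are excluded from the computation entirely.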
Three types of entities are defined in Enigma, and each can play multiple roles (see Figure 1). Owners are those who share their data with the system and control who can query it; Services, if approved, can query the data without learning anything beyond the answer to their query; and Parties (or computing parties) are the nodes that provide computational and storage resources but see only encrypted or random bits of information. In addition, all entities are connected to a blockchain, as shown below.
When owners share data, the data is split into several random pieces called shares. Shares are created through a process called secret sharing, and they perfectly hide the underlying data while preserving the properties needed to compute on it. This allows the data to be queried later in this masked form. Since users in Enigma are the owners of their data, we use the blockchain as a decentralized, secure database that is not owned by any party. This also permits an owner to designate which services can access its data and under what conditions, and it permits parties to query the blockchain and ensure that the requesting service holds the appropriate permissions. In addition to being a secure and distributed public database, the blockchain is also used to facilitate payments from services to computing parties and owners, while enforcing correct permissions and verifying that queries execute correctly.
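To make the idea of shares concrete, here is a minimal sketch of additive secret sharing, one of the simplest secret-sharing schemes. It is chosen here only for illustration; this article does not specify which scheme Enigma uses. The modulus `PRIME` and the function names `split`, `reconstruct` and `add_shared` are assumptions of the sketch. Each share on its own is a uniformly random number, yet parties can add the shares they hold locally and the sums still reconstruct to the correct total.

```python
import secrets

# Assumed field modulus for this sketch (a Mersenne prime); not from Enigma.
PRIME = 2**61 - 1


def split(secret: int, n_parties: int) -> list[int]:
    """Split `secret` into n additive shares that sum to it modulo PRIME."""
    if n_parties < 2:
        raise ValueError("need at least two parties to hide the secret")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    # The first n-1 shares are uniformly random; the last one is chosen
    # so that all shares add up to the secret.
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recover the secret; requires every share."""
    return sum(shares) % PRIME


def add_shared(per_owner_shares: list[list[int]]) -> list[int]:
    """Party i adds up the i-th share from every owner, seeing no raw value."""
    n_parties = len(per_owner_shares[0])
    return [sum(owner[i] for owner in per_owner_shares) % PRIME
            for i in range(n_parties)]


if __name__ == "__main__":
    readings = [120, 75, 310]                    # three owners' private values
    shared = [split(r, 3) for r in readings]     # each owner sends 3 shares
    total_shares = add_shared(shared)            # each party computes locally
    print(reconstruct(total_shares))             # 505, the sum of the readings
```

Addition comes for free with this scheme. Supporting arbitrary computations, as described above, requires more elaborate protocols that this sketch does not attempt to show.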
In summary, a sustainable data ecology requires that data is always encrypted, and that computation happens on encrypted data only. It also requires that owners of the data control access to their data precisely, absolutely and in a way that can be audited. Finally, it requires that data owners are reliably paid for use of their data. Enigma meets these requirements, demonstrating that a sustainable, secure data ecology is possible.
The major design question remaining about this ecology is one of policy: What is the trade-off between user security and access by law enforcement and intelligence services? In the current Enigma system this trade-off is handled by leaving metadata encrypted, but visible. Other trade-offs are possible, including full anonymization, or building in the ability for court-ordered investigators to penetrate anonymity through zero-knowledge proofs (which is different from back-door approaches).
For additional information see http://trust.mit.edu
[1] Pentland, A. (2008). Reality mining of mobile communications: Toward a new deal on data. World Economic Forum Global IT Report 2008, Chapter 1.6, 75–80.
[2] Zyskind, G., Nathan, O., and Pentland, A. (2015). Decentralizing privacy: Using blockchain to protect personal data. In Proceedings of IEEE Symposium on Security and Privacy Workshops, 180–184.