This blog post was originally published at Intel’s website. It is reprinted here with the permission of Intel.
Privacy for machine learning (ML) and other data-intensive applications is increasingly threatened by sophisticated methods of re-identifying anonymized data. And while encryption protects data at rest on storage devices and in transit over networks, the data must be decrypted to be operated on, which exposes it to malicious actors and inadvertent leakage.
Privacy-Preserving Machine Learning (PPML), including rapid advances in cryptography, statistics, and other building block technologies, provides powerful new ways to maintain anonymity and safeguard privacy.
At the RSA Conference (RSAC) in San Francisco this week, I’ll discuss these advances and show how they open the door to an exciting new category of collaborative ML use cases. The session, Protecting Privacy in a Data-Driven World: Privacy-Preserving Machine Learning, takes place Feb. 25 at 1 pm in Moscone West.
Advancing Privacy for Machine Learning
Traditional approaches to privacy and ML rely on removing identifiable information, encrypting data at rest and in flight, and limiting data sharing to a small set of trusted partners. But ML often involves massive data volumes and numerous parties, with separate organizations owning or providing the ML models, training data, inference data, infrastructure and ML service. Collaboration has meant running the risks of exposing the data to all these partners, as well as working on unencrypted data.
PPML combines complementary technologies to address these privacy challenges. Working together, these technologies make it possible to learn about a population while protecting data about the individuals within the data set. For example:
- Federated learning and multi-party computation let institutions collaboratively study data without sharing the data and potentially losing control of it. In addition to bringing together previously siloed data, these technologies help provide secure access to the massive quantities of data needed to increase model training accuracy and generate novel insights. They also avoid the costs and headaches of moving huge data sets among partners.
- Homomorphic encryption (HE) is a public/private key cryptosystem that allows applications to perform inference, training and other computation on encrypted data, without exposing the data itself. Dramatic performance advances are making HE practical for mainstream use.
- Differential privacy adds mathematical noise to personal data, protecting individual privacy but enabling insights into patterns of group behavior.
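The secure-aggregation idea behind multi-party computation can be illustrated with additive secret sharing: each party splits its private value into random shares that sum to the value, so no single server ever sees an individual input. This is a minimal, illustrative Python sketch with made-up party counts, not any particular production protocol:

```python
import secrets

PRIME = 2**61 - 1  # field modulus for the shares

def share(value, n_parties):
    """Split `value` into n_parties additive shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

# Three hypothetical institutions each hold a private count.
private_counts = [120, 75, 240]
all_shares = [share(v, 3) for v in private_counts]

# Each aggregation server j sums only the j-th share from every party,
# so it learns nothing about any individual input...
server_sums = [sum(col) % PRIME for col in zip(*all_shares)]

# ...and combining the server sums reveals only the total.
total = sum(server_sums) % PRIME
print(total)  # 435
```

The individual shares look uniformly random on their own; only the final combination recovers the aggregate.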
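The core idea of homomorphic encryption, computing on ciphertexts so that decryption yields the result of the computation, can be glimpsed in textbook RSA, which happens to be multiplicatively homomorphic. The schemes behind practical HE systems are far more elaborate; this toy, with deliberately insecure parameters, only shows the algebraic property:

```python
# Textbook RSA with tiny, insecure primes, purely to illustrate that
# multiplying two ciphertexts yields a ciphertext of the product.
p, q = 61, 53
n = p * q                          # modulus
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

a, b = 7, 6
c_prod = (enc(a) * enc(b)) % n  # operate on ciphertexts only
assert dec(c_prod) == a * b     # decrypts to 42: Enc(a) * Enc(b) = Enc(a * b)
```

Fully homomorphic schemes extend this so that both addition and multiplication, and hence arbitrary circuits such as neural network inference, can be evaluated on encrypted data.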
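The noise addition behind differential privacy can be sketched with the classic Laplace mechanism: for a counting query (sensitivity 1), adding Laplace(0, 1/ε) noise yields ε-differential privacy. A minimal illustrative sketch, with a function name and parameters of our own choosing rather than any specific library's API:

```python
import random

def dp_count(true_count, epsilon):
    """Release a count with Laplace(0, 1/epsilon) noise (sensitivity 1).

    The difference of two independent Exp(epsilon) draws is
    Laplace-distributed, which avoids inverse-CDF edge cases.
    """
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# Each released answer stays close to the truth on average, but any one
# individual's presence shifts the output distribution only slightly.
noisy_answer = dp_count(1000, epsilon=0.5)
```

Smaller ε means more noise and stronger privacy; repeated queries consume a privacy budget, which is why deployments track cumulative ε.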
Expanding ML Collaboration
The advances provided by PPML and its component technologies can enable a new class of private ML services that let competitors collaborate to achieve mutual benefit without losing a competitive advantage. These new methods of collaboration present particularly exciting opportunities in healthcare, financial services and retail, fields that collect highly sensitive data about their clients and customers and that comprise 22 percent of the US Gross Domestic Product (GDP). Rival banks could create joint models for combating money laundering, potentially reducing fraud. Hospitals could use remote, third-party analytics on patient data, potentially leading to new clinical insights and breakthrough treatments. Retailers could monetize their purchase data while protecting user privacy and retaining their ability to develop unique products and services.
Faster Time-to-Value for Machine Learning
Intel is working on multiple fronts to accelerate progress on PPML. Data owners can use HE-Transformer, the open source backend to our nGraph neural network compiler, to gain valuable insights without exposing the underlying data. Alternatively, model owners can use HE-Transformer to deploy their models in encrypted form, helping protect their intellectual property. Researchers, developers and institutions can accelerate their PPML building blocks and protect their federated learning environments by running them in a trusted execution environment (TEE) such as Intel® Software Guard Extensions (Intel SGX).
PPML and its component technologies will bring new power to AI and ML services while strengthening protections for sensitive data. I’m excited to discuss these technologies with the RSAC community—and to follow the results as PPML applications mature.
Intel at RSAC
Three additional RSAC 2020 sessions will highlight other aspects of Intel’s work to build a trusted foundation for data-driven computing.
- Security Policy and Regulation Trends for Developers. Intel’s director of global cybersecurity policy, Dr. Amit Elazari, will discuss legal and regulatory issues shaping the future of cybersecurity. Feb. 24, 12:15 pm, Moscone West.
- Nowhere to Hide: How Hardware Telemetry and Machine Learning Can Make Life Tough for Exploits. Rahuldeva Ghosh and Zheng Zhang will describe considerations for building CPU telemetry and ML solutions for runtime threat and anomaly detection. Ghosh is a senior staff architect and Zheng is an engineering manager. Feb. 25, 3:40 pm, Moscone West.
- “I’m Still Standing,” Says Each Cyber-Resilient Device. Principal engineer Abhilasha Bhargav-Spantzel and senior firmware engineer Nivedita Aggarwal will present a new perspective on creating resilient computer systems in the face of increasing cyberthreats. Feb. 27, 8 am, Moscone West.
Resources
- Intel Software Guard Extensions
- nGraph-HE: A Graph Compiler for Deep Learning on Homomorphically Encrypted Data
- HE-Transformer for nGraph: Enabling Deep Learning on Encrypted Data
- Eye on AI Podcast
- Privacy Now: Can AI and Privacy Coexist?
 Bureau of Economic Analysis News Release, April 19, 2019. https://www.bea.gov/system/files/2019-04/gdpind418_0.pdf
Senior Director, Office of the CTO, Artificial Intelligence Products Group, Intel