DataGuard: Guaranteeing Private Training in Systolic-array Based Accelerators

Abstract

Differential privacy (DP) and federated learning (FL) have emerged as important privacy-preserving approaches for training machine learning (ML) models on sensitive data. FL ensures that raw sensitive data does not leave the users' devices by training the model locally on the device. DP ensures that the model does not leak information about any individual by clipping the gradients and adding noise to them before updating the model. DP also provides a formalism to constrain the privacy loss during training to a privacy budget determined a priori by the owner of the sensitive data. However, real-life deployments of FL algorithms typically assume that a third-party FL application can be trusted to implement DP algorithms correctly, so the third-party application is given full access to sensitive data. In this work, we propose DataGuard, a hardware-based mechanism that guarantees that the only data that can leave the device is the result of computation that meets DP requirements. DataGuard can thus be used to ensure that the privacy budget defined by the data owner is not exceeded during FL training, without the need to trust a third-party application. We evaluate DataGuard in simulations of four accelerators for various ML models and demonstrate area overheads of less than 0.01% and performance slowdowns of less than 0.3%.
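The clip-and-noise step that the abstract attributes to DP can be illustrated in software. The following is a minimal sketch of a generic DP-SGD-style gradient update, not DataGuard's hardware mechanism, which the abstract does not detail. The function names `clip_gradient` and `privatize` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for illustration, and Gaussian noise is assumed as the noise distribution.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def privatize(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each gradient, sum them, add Gaussian noise, then average.

    Assumption: the noise standard deviation is noise_multiplier * clip_norm,
    as in common DP-SGD formulations; the paper may use different settings.
    """
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


# Illustrative inputs only: two per-example gradients of dimension 2.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]
update = privatize(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single example can change the summed gradient, and the noise scale is tied to that bound. In the setting the abstract describes, a third-party application would normally be trusted to run a step like this correctly; DataGuard's stated goal is to remove that trust requirement by enforcing the DP requirement in hardware.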
