P2RAG: Efficient Privacy-Preserving RAG Service Supporting Arbitrary Top-k Retrieval

Abstract

Retrieval-Augmented Generation (RAG) enables large language models to use external knowledge, but outsourcing the RAG service raises privacy concerns for both data owners and users. Privacy-preserving RAG systems address these concerns by performing secure top-k retrieval, which is typically implemented using secure sorting to identify relevant documents. However, existing systems struggle to support arbitrary k because of their inability to change k, new security issues, and, in particular, efficiency degradation with large k. This is a significant limitation because applications in domains such as finance, law, and healthcare require a k large enough to incur substantial overhead in existing systems. Moreover, modern long-context models generally achieve higher accuracy with larger retrieval sets. We propose P2RAG, an efficient privacy-preserving RAG service that supports arbitrary top-k retrieval. Unlike existing systems, P2RAG avoids sorting candidate documents. Instead, it uses an interactive bisection method to determine the set of top-k documents. For security, P2RAG uses secret sharing on two semi-honest, non-colluding servers to protect the data owner's database and the user's prompt. It enforces restrictions and verification to defend against malicious users and tightly bounds the information leakage of the database. Experiments show that P2RAG is 3× to 300× faster than the state-of-the-art PRAG for k from 16 to 1024.
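
To make the idea of selecting the top-k set without sorting concrete, the following is a minimal plaintext sketch of threshold bisection. It is not the P2RAG protocol: the abstract does not specify the protocol's details, and the sketch ignores secret sharing, the two-server interaction, and leakage bounds entirely. The function names (`count_at_least`, `top_k_by_bisection`), the integer score range, and the assumption that all scores are distinct are hypothetical choices made for illustration only.

```python
from typing import List


def count_at_least(scores: List[int], threshold: int) -> int:
    """Count how many scores are >= threshold (done in the clear here)."""
    return sum(1 for s in scores if s >= threshold)


def top_k_by_bisection(scores: List[int], k: int, lo: int, hi: int) -> List[int]:
    """Return the indices of the k highest scores without sorting.

    Assumptions (illustrative, not from the paper):
      - scores are distinct integers within [lo, hi];
      - 1 <= k <= len(scores).
    The search finds the largest threshold t such that at least k scores
    are >= t; with distinct scores, exactly k scores pass that threshold.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of scores")
    if any(s < lo or s > hi for s in scores):
        raise ValueError("all scores must lie within [lo, hi]")

    # Invariant: count_at_least(scores, lo) >= k.
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the interval always shrinks
        if count_at_least(scores, mid) >= k:
            lo = mid       # threshold can still be raised
        else:
            hi = mid - 1   # threshold is too high, fewer than k pass
    return [i for i, s in enumerate(scores) if s >= lo]


if __name__ == "__main__":
    example_scores = [12, 97, 45, 3, 78, 60]
    print(top_k_by_bisection(example_scores, k=3, lo=0, hi=100))  # [1, 4, 5]
```

Each round of the loop needs only one count of how many scores pass a threshold, so the number of rounds depends on the width of the score range rather than on k. If scores can tie, the final threshold may admit more than k documents, and a tie-breaking rule would be needed.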
