Language models (LMs) are now widely used (1) in personalized contexts and (2) to build agents that can use external tools. But do they respect privacy when helping with daily tasks such as emailing?

Figure 1: Inference-time privacy risks emerge as an agent may access private data through prompt inputs and potentially leak it while executing personal tasks on the user's behalf.

While many studies have investigated LMs memorizing training data, a great deal of private or sensitive information is actually exposed to LMs at inference time (e.g., when users let LMs work with content retrieved from their mailbox). To quantify LMs' awareness of privacy norms and these emerging inference-time privacy risks, our paper, accepted to the NeurIPS 2024 D&B track, proposes PrivacyLens, a novel framework that extends privacy-sensitive seeds into vignettes and further into agent trajectories, enabling multi-level evaluation of privacy leakage in LM agents' actions.

Figure 2: Data construction pipeline in PrivacyLens. PrivacyLens starts with contextual privacy-sensitive seeds (A). It extends each seed into a vignette through template-based generation (B) and further into an LM agent trajectory through sandbox simulation (C). The data construction pipeline is open-sourced in our GitHub repo.
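To make the three data levels concrete, here is a minimal Python sketch of how a seed could be carried through stages (A)→(B)→(C). The class and function names (`Seed`, `Vignette`, `Trajectory`, `extend`) are illustrative assumptions, not the released PrivacyLens API; see the GitHub repo for the actual implementation.

```python
# Illustrative sketch of PrivacyLens's three data levels.
# All names below are hypothetical, not the released API.
from dataclasses import dataclass, field


@dataclass
class Seed:
    """(A) A contextual privacy-sensitive seed: who shares what, about whom, with whom."""
    data_type: str               # e.g., "health condition"
    data_subject: str            # whose information it is
    source: str                  # where the agent sees it, e.g., an email thread
    recipient: str               # who would receive it if leaked
    transmission_principle: str  # e.g., "reply to a work email"


@dataclass
class Vignette:
    """(B) A short story instantiating the seed with concrete names and details."""
    seed: Seed
    story: str
    user_instruction: str


@dataclass
class Trajectory:
    """(C) A sandbox-simulated agent trajectory ending just before the final action."""
    vignette: Vignette
    tool_calls: list = field(default_factory=list)  # simulated tool observations
    final_action_prompt: str = ""                   # what the evaluated LM must complete


def extend(seed: Seed, generate_vignette, simulate_trajectory) -> Trajectory:
    """Seed -> vignette -> trajectory, mirroring stages (A) -> (B) -> (C) in Figure 2."""
    vignette = generate_vignette(seed)     # template-based LM generation
    return simulate_trajectory(vignette)   # sandboxed tool-use simulation
```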

Using our framework, we reveal a discrepancy between LM performance on probing questions and LM behavior when executing user instructions. While GPT-4 and Claude-3-Sonnet answer nearly all probing questions correctly, they leak sensitive information in 26% and 38% of cases, respectively, when acting on user instructions. In our paper, we also explore the impact of prompting; unfortunately, simple prompt engineering does little to mitigate privacy leakage in LM agents' actions. We further examine the safety-helpfulness trade-off and conduct qualitative analysis to uncover more insights. Current language models have yet to occupy the crucial region that is both safe and helpful.
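The two evaluation levels contrasted above can be summarized with a small sketch: one metric scores answers to probing questions, the other scores whether the agent's generated final action leaks the sensitive data. The function names, case fields, and the `leaks` judge below are hypothetical placeholders, not the paper's exact evaluation code.

```python
# Hypothetical sketch of the two evaluation levels: probing-question
# accuracy versus leakage in the agent's final action.
from typing import Callable, Sequence


def probing_accuracy(cases: Sequence[dict], answer_fn: Callable[[str], str]) -> float:
    """Fraction of probing questions (e.g., 'Is it appropriate to share X with Y?')
    that the model answers correctly."""
    correct = sum(answer_fn(c["probing_question"]) == c["expected_answer"] for c in cases)
    return correct / len(cases)


def action_leakage_rate(cases: Sequence[dict],
                        act_fn: Callable[[str], str],
                        leaks: Callable[[str, dict], bool]) -> float:
    """Fraction of trajectories whose generated final action reveals the sensitive data.

    `leaks` stands in for a judge (e.g., an LM-based checker) deciding whether the
    action text exposes the seed's sensitive information to the recipient.
    """
    leaked = sum(leaks(act_fn(c["final_action_prompt"]), c) for c in cases)
    return leaked / len(cases)
```

A gap between a high `probing_accuracy` and a high `action_leakage_rate` is exactly the discrepancy reported above: models that "know" the norm in question-answering form can still violate it when acting.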

Figure 3: PrivacyLens reveals the trade-off between respecting privacy norms and maximizing helpfulness in current LM agents.

Our paper, dataset, and code can be found at https://salt-nlp.github.io/PrivacyLens/.