Federated learning (FL) trains models collaboratively across multiple clients while keeping each client's data local. Yet this privacy can be compromised by gradient inversion attacks, which reconstruct the original data from the shared gradients. To address this threat, and specifically to tackle the challenge of text recovery, researchers from INSAIT, Sofia University, ETH Zurich, and LogicStar.ai have created an algorithm called DAGER.
The DAGER algorithm exploits the discrete nature of token embeddings and the low-rank structure of self-attention layer gradients to verify which token sequences appear in the client data, achieving exact recovery of entire batches of input text without requiring any prior knowledge of the data. It works efficiently on both encoder and decoder architectures, using heuristic search and greedy approaches, respectively. When tested on large language models such as GPT-2, Llama-2, and BERT, DAGER showed improved speed, scalability, and reconstruction quality over prior attacks, handling batches as large as 128.
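The low-rank idea can be illustrated with a toy NumPy sketch (not the paper's implementation; all dimensions, names, and the random embedding table are illustrative assumptions). For a linear layer, the weight gradient is a sum of outer products `g_i x_i^T` over the inputs, so its row space is spanned by the input embeddings. Because tokens are discrete, an attacker can test every vocabulary embedding for membership in that span and keep only the ones that could have occurred in the batch:

```python
import numpy as np

rng = np.random.default_rng(1)
V, d, out = 10, 8, 6                  # toy vocabulary size, embedding dim, layer width
E = rng.standard_normal((V, d))       # toy embedding table (assumed known to the attacker)
tokens = [2, 5]                       # ground-truth client tokens (unknown to the attacker)

# Simulated weight gradient of the first linear layer: sum_i g_i x_i^T,
# whose row space is spanned by the input embeddings E[tokens].
G = rng.standard_normal((out, len(tokens)))
grad_W = G @ E[tokens]

# Orthonormal basis of the gradient's row space via SVD
_, s, Vt = np.linalg.svd(grad_W, full_matrices=False)
basis = Vt[: (s > s.max() * 1e-10).sum()]

def in_span(x, tol=1e-6):
    # Is x (numerically) inside the gradient's row space?
    residual = x - basis.T @ (basis @ x)
    return np.linalg.norm(residual) < tol * np.linalg.norm(x)

recovered = [t for t in range(V) if in_span(E[t])]
print(recovered)  # -> [2, 5]: only the true client tokens pass the check
```

Since the span has low rank (at most the number of input tokens), random out-of-batch embeddings almost surely fail the residual test, which is what makes the filtering exact rather than approximate.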
Gradient leakage attacks like DAGER typically occur in one of two settings: honest-but-curious attacks, where the attacker passively observes the shared updates, and malicious-server attacks, where the attacker can also modify the model. Although most prior research focuses on image data, DAGER targets text, outperforming previous methods by supporting larger batches and longer sequences and by working on both next-token prediction and sentiment analysis tasks.
DAGER identifies the correct tokens in each client input sequence from the shared gradients using a two-stage process. First, gradient subspace checks filter the vocabulary down to the embeddings that could have appeared in the input; then candidate partial sequences are grown token by token and verified against the gradients of subsequent self-attention layers. Together, these stages enable efficient reconstruction of the full input sequences.
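The second stage can be sketched in the same toy setting. The sketch below is a deliberately simplified stand-in, not DAGER's actual procedure: `attn_proxy` is a hypothetical nonlinear function playing the role of the self-attention output for a prefix, and the stage-1 result is assumed given. The key point it illustrates is that only the *true* prefix produces a hidden state lying in the row space of the next layer's gradient, so greedy extension recovers the token order:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 12, 16
E = rng.standard_normal((V, d))      # toy embedding table
seq = [2, 5, 7]                      # ground-truth client sequence

def attn_proxy(prefix):
    # Hypothetical stand-in for the causal self-attention output at the
    # last position: a nonlinear function of the prefix embeddings and length.
    pos = np.sin(np.arange(d) * len(prefix))
    return np.tanh(E[prefix].mean(axis=0) + pos)

# Simulated weight gradient of the layer after attention: sum_i g_i h_i^T,
# so its row space is spanned by the true per-position hidden states h_i.
H = np.stack([attn_proxy(seq[: i + 1]) for i in range(len(seq))])
G = rng.standard_normal((6, len(seq)))
grad_W2 = G @ H

_, s, Vt = np.linalg.svd(grad_W2, full_matrices=False)
basis = Vt[: (s > s.max() * 1e-10).sum()]

def fits(prefix, tol=1e-6):
    # Does this candidate prefix produce a hidden state in the gradient's row space?
    h = attn_proxy(prefix)
    residual = h - basis.T @ (basis @ h)
    return np.linalg.norm(residual) < tol * np.linalg.norm(h)

stage1_tokens = [2, 5, 7]            # token set, assumed recovered by stage 1
recovered = []
for _ in range(len(seq)):            # greedily extend by the first token that verifies
    recovered.append(next(t for t in stage1_tokens if fits(recovered + [t])))
print(recovered)  # -> [2, 5, 7]: order recovered greedily
```

Because wrong prefixes yield hidden states outside the low-rank gradient span, each position usually admits a single verified candidate, which is why a greedy search suffices for decoders while encoders need a heuristic search over position-unordered candidates.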
The experimental evaluation of DAGER demonstrated superior performance over previous methods across a range of settings. It consistently outperformed TAG and LAMP on models such as BERT, GPT-2, and Llama-2 7B, and on datasets including CoLA, SST-2, Rotten Tomatoes, and ECHR. DAGER achieved near-perfect sequence reconstructions with reduced computation times, remained robust on long sequences and larger models, and maintained high ROUGE scores even at larger batch sizes.
Despite its efficiency, DAGER faces limitations tied to the embedding dimension when attacking decoder-based models. Future research could explore DAGER's resilience against defense mechanisms such as DP-SGD and its applicability to more complex FL protocols. For encoder-based attacks, heuristics that shrink the search space could be key to overcoming the computational cost that large batch sizes impose. Overall, DAGER demonstrates the inherent vulnerability of decoder-based LLMs to data leakage, underscoring the need for robust privacy measures in collaborative learning.