The privacy of users participating in online communities is a pressing concern. Websites like Reddit allow users to post under pseudonyms to preserve anonymity, although anonymity can also enable abusive behavior. Even so, pseudonyms do not always guarantee privacy, because a user's writing style can reveal their identity. These identifying signals in text, studied in the field of stylometry, can be used to link an author's writings across different texts and platforms, posing a significant privacy risk.
Authorship obfuscation techniques aim to address this issue by automatically rewriting text to conceal the identity of the author, helping preserve privacy in online discussions. Traditional methods are often limited to narrow scenarios and rely heavily on surface-level modifications, which can produce odd or unnatural text that weakens both the privacy protection and the quality of communication.
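To illustrate why such surface-level edits tend to read unnaturally, here is a minimal sketch of a naive rule-based obfuscator. It is purely illustrative: the substitution table and rules are hypothetical and are not drawn from any specific prior system.

```python
# Illustrative sketch of a naive, rule-based obfuscator of the kind
# "surface-level modifications" refers to. The synonym table is a
# hand-picked placeholder (assumption), not a real system's rule set.
import re

SYNONYMS = {
    "big": "large",
    "quick": "fast",
    "however": "but",
    "utilize": "use",
}

def naive_obfuscate(text: str) -> str:
    """Swap a few words for synonyms while preserving capitalization."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        repl = SYNONYMS.get(word.lower(), word)
        return repl.capitalize() if word[0].isupper() else repl
    return re.sub(r"[A-Za-z]+", swap, text)

print(naive_obfuscate("However, the quick fix was to utilize a big cache."))
# -> "But, the fast fix was to use a large cache."
```

Because such rules only replace words in isolation, the output often loses the original tone or fluency, which is precisely the limitation the new framework targets.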
Researchers from the University of Maryland, College Park, have developed an automatic text privatization framework that uses a large language model (LLM) to generate rewrites balancing intelligibility, sense, and privacy. The model is fine-tuned via reinforcement learning to trade off privacy protection against the coherence and naturalness of the rewritten text.
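The article does not spell out the exact reward used for fine-tuning, but the sketch below shows, under assumptions, how a composite reward balancing those objectives could look: a privacy term that penalizes confident authorship attribution, a meaning-preservation term, and a naturalness term. Every scorer here is a simplistic placeholder, not the models used in the paper.

```python
# A minimal sketch of a composite reward an RL fine-tuning loop could
# optimize. The component scorers are simplistic stand-ins (assumptions).

def privacy_reward(rewrite: str, attribution_confidence: float) -> float:
    """Higher reward when an authorship-attribution model is less confident
    that the rewrite belongs to the original author."""
    return 1.0 - attribution_confidence

def meaning_reward(original: str, rewrite: str) -> float:
    """Stand-in for a semantic-similarity model: crude token overlap."""
    a, b = set(original.lower().split()), set(rewrite.lower().split())
    return len(a & b) / max(len(a | b), 1)

def naturalness_reward(rewrite: str) -> float:
    """Stand-in for a fluency score (e.g., from a language model)."""
    n = len(rewrite.split())
    return 1.0 if 5 <= n <= 200 else 0.5

def total_reward(original, rewrite, attribution_confidence,
                 w_priv=1.0, w_mean=1.0, w_nat=1.0):
    """Weighted sum used as the scalar reward for the policy update."""
    return (w_priv * privacy_reward(rewrite, attribution_confidence)
            + w_mean * meaning_reward(original, rewrite)
            + w_nat * naturalness_reward(rewrite))

# Example: a rewrite that an attribution model assigns only 20% confidence
print(total_reward("I really love hiking in the mountains",
                   "Hiking through mountain trails is something I enjoy",
                   attribution_confidence=0.2))
```

The weights governing the trade-off between privacy, meaning, and naturalness would be tuning choices; the point is simply that all three objectives feed a single scalar reward for reinforcement learning.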
The new framework preserves the original text's coherence and readability while concealing the author's identity. Its effectiveness was evaluated on a large dataset of English Reddit posts from 68,000 authors. The posts varied in length, representative of typical online discussion board content. The study examined how the obfuscation performs against different authorship detection methods and with author profiles of varying lengths.
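A natural way to measure that kind of protection is to check how often an attribution attack that succeeds on the original post fails on the rewrite. The sketch below is a hypothetical evaluation helper; `attribute_author` and `obfuscate` are placeholder callables standing in for an attack model and the rewriting framework, not the paper's actual components.

```python
# Hypothetical evaluation sketch: how often does obfuscation defeat an
# attribution attack that succeeded on the original text?

def evasion_rate(posts, authors, attribute_author, obfuscate):
    """Fraction of posts whose author is correctly attributed before
    obfuscation but no longer attributed correctly afterwards."""
    evaded = total = 0
    for post, author in zip(posts, authors):
        if attribute_author(post) == author:                 # attack succeeds on original
            total += 1
            if attribute_author(obfuscate(post)) != author:  # attack fails on rewrite
                evaded += 1
    return evaded / total if total else 0.0
```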
Automatic metrics and human evaluations confirmed that the new technique maintained high text quality, so readers could still understand and engage with the rewritten text. The method also evaded numerous automated authorship attribution attacks, demonstrating its effectiveness at preserving user privacy.
By using reinforcement learning to fine-tune a large language model, the new method represents a significant advance over previous attempts and offers a more robust and practical approach to concealing authorship. It helps ensure that people can interact freely and securely online without sacrificing either the quality of their communication or their privacy.