Tuning for TraceTarnish: Techniques, Trends, and Testing Tangible Traits

Abstract

In this study, we more rigorously evaluated our attack script TraceTarnish, which leverages adversarial stylometry principles to anonymize the authorship of text-based messages. To assess the efficacy and utility of the attack, we sourced, processed, and analyzed Reddit comments, which were then transformed into TraceTarnish data. We passed the transformed data through StyloMetrix to extract stylometric features, then pruned those features with the Information Gain criterion, retaining only the most informative, predictive, and discriminative ones. Our results show that function words and function word types (L_FUNC_A & L_FUNC_T); content words and content word types (L_CONT_A & L_CONT_T); and the Type-Token Ratio (ST_TYPE_TOKEN_RATIO_LEMMAS) yielded significant Information Gain readings. These stylometric cues (function-word frequencies, content-word distributions, and the Type-Token Ratio) serve as reliable indicators of compromise (IoCs), revealing when a text has been deliberately altered to mask its true author. They could likewise function as forensic beacons, alerting defenders to the presence of an adversarial stylometry attack; granted, in the absence of the original message this signal may go largely unnoticed, as it appears to depend on a pre- and post-transformation comparison. "In trying to erase a trace, you often imprint a larger one." Armed with this understanding, we framed TraceTarnish's operations and outputs around these five isolated features, using them to conceptualize and implement enhancements that further strengthen the attack.
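
To make the feature-selection step concrete, the following is a minimal sketch of ranking features by Information Gain, together with a plain Type-Token Ratio. It is an illustration under stated assumptions, not the study's pipeline: the feature names and values are invented toy data, each feature is split at its mean (the abstract does not say how continuous features were discretized), and the ratio is computed over raw tokens rather than lemmas.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting `labels` on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Toy, invented values (NOT results from the study): one score per text.
labels = ["original", "original", "transformed", "transformed"]
features = {
    "function_word_rate": [0.52, 0.50, 0.31, 0.29],  # hypothetical, separates classes
    "noise_feature": [0.10, 0.90, 0.12, 0.88],       # hypothetical, uninformative
}

# Assumption: binarize each feature at its mean before scoring it.
scores = {
    name: information_gain(vals, labels, sum(vals) / len(vals))
    for name, vals in features.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.3f} bits")
```

On this toy data the first feature splits the two classes perfectly and scores 1 bit, while the second scores 0 bits and would be culled. A real run would score every StyloMetrix feature over the full corpus and keep only the top-ranked ones.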
