QTM: Wikipedia Vandalism

A Wikipedia revision is said to exhibit vandalism if it is non-value-adding, offensive, or destructive in its removal of content. The motives of vandals vary (profit, narcissism, political agendas, etc.), but their impact is large. Studies have suggested that hundreds of millions of Wikipedia page views have been marred by incidents of vandalism.

The vast majority of existing attempts to detect Wikipedia vandalism, both academic and internal to Wikipedia, are based on natural-language-processing (NLP) techniques (often regular expressions matching offensive words/phrases). While NLP filters are capable of detecting large amounts of vandalism, they may not be resilient to sophisticated attack models: experience in the analogous domain of spam-email detection has shown NLP filters to be computationally complex and often evadable.

These deficiencies are compounded by the fact that locating vandalism on Wikipedia is an extremely hard problem. The insertion of a single "not" can falsify an otherwise true statement, and someone can replace an obscure name in a historical narrative with their own. Such subtle incidents of vandalism prove extremely hard to detect, and automated tools must be conservative to avoid false positives.

Thus, large amounts of vandalism on Wikipedia remain undiscovered. To complement the progress made by NLP efforts, the QTM project leverages reputation management fundamentals to uncover additional vandalism. Once a set of bad edits (i.e., negative feedback) has been identified, one can lower the reputations of the entities involved (so that their future edits will be viewed suspiciously). Further, the metadata of poor edits may exhibit patterns which can be exploited to aid in the discovery of future vandalism. Critically, these examinations can be performed without ever having to examine the article or diff text:

DETECTING WIKIPEDIA VANDALISM VIA SPATIO-TEMPORAL ANALYSIS OF REVISION METADATA (EuroSec '10) - Analysis begins by locating vandalous revisions via an administrative form of reversion called rollback. The metadata properties of these guilty edits are contrasted with those of 'normal' edits. Differences between the two sets guide the construction of a feature set built upon spatio-temporal properties. Simple features include things such as the time-of-day an edit was made, and the length (spatial) of the revision comment. More interesting are the aggregate features that combine time-decayed behavioral observations (rollbacks/feedback) to create reputations for single entities (users, articles) and spatial groupings thereof (geographical region, content categories). Ultimately, these features are combined into a classifier which performs comparably to NLP efforts.
STiki (SPATIO-TEMPORAL ANALYSIS OVER WIKIPEDIA) SOFTWARE - A software tool that builds the logic of the EuroSec '10 paper into a live, on-Wikipedia implementation. It consists of: (1) a server-side engine that processes and 'scores' Wikipedia edits in real-time (computes a value that speaks to the probability the edit is vandalism), and (2) a client-side GUI that allows users to examine and definitively classify the probable vandalism found by the back-end (and revert the edit on Wikipedia). All user feedback helps to improve future scoring on the back-end.
OTHER STiki EXPOSURE - Whereas the EuroSec '10 paper introduced the logic that would later be used to implement the STiki tool, several more recent publications and documents address the STiki software more directly. STiki and its functionality were presented at a [WikiSym 2010 Formal Demonstration]. At the same venue, a [WikiSym 2010 Poster] (and its [write-up]) reported statistics about STiki's use and growing user base. Less technical versions of the same presentations were given at Wikimania 2010.
WIKIPEDIA VANDALISM DETECTION: COMBINING NATURAL LANGUAGE, METADATA, AND REPUTATION FEATURES (CICLing '11) - Here, the metadata signals of STiki are combined with those from other anti-vandalism techniques to create the most effective classifier at the time of publication. Over 50 features are analyzed and taxonomized.
MULTILINGUAL VANDALISM DETECTION USING LANGUAGE-INDEPENDENT & EX POST FACTO EVIDENCE (PAN-CLEF '11) - Here, the feature set of our CICLing 2011 paper was extended to reflect novelties in the 2011 edition of the PAN-CLEF vandalism detection competition (novelties relative to most vandalism research as well). First, the competition spanned three natural languages. Second, "ex post facto" evidence was permitted for use. Addressing these tasks, our approach won the associated competition. Slides are also available for this work: ([SLIDES-PDF] [SLIDES-PPT]).
COLLABORATIVE LINK SPAM - Having studied vandalism broadly, we focused our attention on an interesting subset thereof: external link spam. Attackers of this type are likely to be well-motivated (possibly financially), distinguishing them from the vast majority of vandals, whose behavior is immature. A CEAS 2011 paper ([ABSTRACT-TXT] [PDF]) began by surveying the existing link spam problem. We found that existing strategies based on attaining link persistence failed given the diligence of the Wikipedia editor community. In turn, we proposed an aggressive attack model that exploits the latency of the human defenses. Having put forth a viable attack model, our WikiSym 2011 paper ([ABSTRACT-TXT] [PDF]) sought to mitigate these concerns. Using signature-based detection, we made progress towards mitigating both existing spam strategies and our novel proposals. The technique was subsequently implemented in a live fashion and integrated into our STiki tool. A Wikimania 2011 presentation described and demonstrated this live implementation.
BROAD OVERVIEWS - Those looking for a broad overview of all research undertaken as part of our anti-vandalism efforts should consult our MURI Grant Review Presentation. For an explanation of how our work fits into the larger body of literature, consult our Wikimania 2011 presentation "Anti-Vandalism: The Year in Review."
RELATED WORK - Our anti-damage and anti-vandalism efforts have also produced publications that lie slightly outside of the wiki domain. For example, a controversial methodology used in our link spam assessments prompted "Spamming for Science: Active Measurement in Web 2.0 Abuse Research" at WECSR 2012 (ethics in computer security; [ABSTRACT-TXT] [PDF]). Finally, we have applied our wiki-inspired techniques to collaborative non-wiki applications, as in "Towards Content-driven Reputation for Collaborative Code Repositories" at WikiSym 2012 ([ABSTRACT-TXT] [PDF]).
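
To make the metadata features from the EuroSec '10 item above more concrete, the following is a minimal sketch, not the implementation used by the papers or by STiki. It assumes an exponential decay with a configurable half-life, takes time-of-day in UTC, and averages member scores to form a group reputation; the actual decay function, time zone handling, and group normalization are not specified here. All function and parameter names (simple_features, decayed_reputation, group_reputation, half_life_days) are hypothetical.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400.0


def simple_features(edit_time: float, comment: str) -> dict:
    """Per-edit metadata features; edit_time is a Unix timestamp."""
    # Assumption: hour is taken in UTC. A local time-of-day would need
    # a geolocation step that this sketch omits.
    hour = datetime.fromtimestamp(edit_time, tz=timezone.utc).hour
    return {"hour_of_day": hour, "comment_length": len(comment)}


def decayed_reputation(offense_times, now: float,
                       half_life_days: float = 7.0) -> float:
    """Sum of past offenses (e.g. rollbacks), each weighted by its age.

    Assumption: an offense counts 1.0 when it happens and halves every
    half_life_days. Higher values mean a worse recent history.
    """
    half_life = half_life_days * SECONDS_PER_DAY
    total = 0.0
    for t in offense_times:
        age = now - t
        if age < 0:
            continue  # skip feedback timestamped after `now`
        total += 0.5 ** (age / half_life)
    return total


def group_reputation(offenses_by_member: dict, now: float,
                     half_life_days: float = 7.0) -> float:
    """Reputation of a spatial grouping (e.g. a region or category).

    Assumption: the group score is the mean of its members' scores, so
    that large groups are not penalized for size alone.
    """
    if not offenses_by_member:
        return 0.0
    total = sum(decayed_reputation(times, now, half_life_days)
                for times in offenses_by_member.values())
    return total / len(offenses_by_member)


if __name__ == "__main__":
    now = 30 * SECONDS_PER_DAY
    # Hypothetical rollback history: one recent offense, one old offense.
    user_offenses = [now - 1 * SECONDS_PER_DAY, now - 28 * SECONDS_PER_DAY]
    print(simple_features(now, "fixed typo"))
    print(decayed_reputation(user_offenses, now))
    print(group_reputation({"user_a": user_offenses, "user_b": []}, now))
```

In a sketch like this, each incoming edit would be described by its simple features plus the current reputation of its user, article, and their groupings, and that vector would be passed to a classifier. A short half-life lets an entity recover quickly after a single rollback, while a long one keeps older offenses relevant.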
