iSchool Capstone

Making Sense of Misinformation At Scale

Project tags:

data science & visualization

information policy & ethics

software development

Research Award
Project poster

Fact-checking organizations have limited resources to keep up with misinformation in this viral age. Prioritizing incoming information allows for more efficient and accurate topic selection. We discuss an implementation of an automated NLP topic-modeling system within a data pipeline that crowdsources misinformation reports. Users submit reports through a website; the HTML is scraped, parsed, and grouped into clusters for the reporting staff to act on. Topic modeling provides metrics for prioritization and effective allocation of resources. Our data pipeline reduces the amount of manual curation required of editors at Snopes, which lets them reallocate their resources to debunking rumors.
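
As a rough illustration of the flow described above (HTML in, ranked clusters out), the sketch below strips markup from submitted reports and groups them by their most widely shared keyword. It is not the project's implementation: the keyword grouping is a deliberately simple stand-in for a real topic model, and the names `html_to_text`, `tokenize`, `group_reports`, and the `STOP_WORDS` list are hypothetical, introduced only for this example.

```python
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "and", "that", "this", "for", "are", "was", "with", "from"}


class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML fragment, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of `html` with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def tokenize(text):
    """Lowercase alphabetic tokens longer than two letters, minus stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOP_WORDS]


def group_reports(reports):
    """Group report indices by each report's most widely shared keyword.

    Returns (keyword, [report indices]) pairs, largest cluster first, so the
    cluster size can serve as a crude prioritization metric.
    """
    tokenized = [set(tokenize(html_to_text(r))) for r in reports]
    # Document frequency: in how many reports does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in tokens)
    clusters = defaultdict(list)
    for idx, tokens in enumerate(tokenized):
        if not tokens:
            clusters[None].append(idx)  # nothing usable in this report
            continue
        # Most shared term wins; ties are broken alphabetically.
        key = min(tokens, key=lambda t: (-doc_freq[t], t))
        clusters[key].append(idx)
    return sorted(clusters.items(), key=lambda kv: (-len(kv[1]), str(kv[0])))
```

Calling `group_reports` on a list of raw HTML strings returns the clusters ordered by size, which an editor could read as a first-pass priority queue. A production system would replace the keyword step with an actual topic model and a fuller stop-word list.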

Project participants:

Lucy Eun

Ethan Anderson

Jesse Chamberlin

Evan Frawley