‘Culturomics 2.0’ forecasts human behavior by supercomputing global news

September 6, 2011

A paper published yesterday in the peer-reviewed journal First Monday combines advanced supercomputing with a quarter-century of worldwide news to forecast and visualize human behavior, from civil unrest to the movement of individuals. The paper, titled “Culturomics 2.0: Forecasting Large-Scale Human Behavior Using Global News Media Tone in Time and Space,” uses the tone and location of news coverage from across the world to forecast country stability (including retroactively predicting the recent Arab Spring), estimate Osama Bin Laden’s final location as a 200-kilometer radius around Abbottabad, and uncover the six world civilizations of the global news media. The research also demonstrates that the news is indeed becoming more negative and even visualizes global human societal conflict and cooperation over the last quarter century.

Using the large shared-memory supercomputer Nautilus, Kalev Leetaru of the University of Illinois at Urbana-Champaign combined three massive news archives totaling more than 100 million articles worldwide to explore the global consciousness of the news media. The complete New York Times from 1945 to 2005, the unclassified edition of Summary of World Broadcasts from 1979 to 2010, and an archive of English-language Google News articles from 2006 to 2011 were used to capture a cross-section of the U.S. media spanning half a century and the global media over a quarter-century.

Advanced tonal, geographic, and network analysis methods were used to produce a network 2.4 petabytes in size containing more than 10 billion people, places, things, and activities connected by over 100 trillion relationships, capturing a cross-section of Earth from the news media. A subset of findings from this analysis was then reproduced for this study using more traditional methods and smaller-scale workflows that offer a model for a new class of digital humanities research that explores how the world views itself.


Funded by the National Science Foundation and managed by the University of Tennessee’s Remote Data Analysis and Visualization Center, the Nautilus supercomputer is a part of the National Institute for Computational Sciences network of advanced computing resources at Oak Ridge National Laboratory.

Read more about results from this research in the full paper at http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3663/3040. A highlight about the research is also available at www.utk.nics.edu.

About NICS:

The National Institute for Computational Sciences (NICS) is a joint effort of the University of Tennessee and Oak Ridge National Laboratory. Founded in 2007, NICS is supported by the National Science Foundation’s Extreme Science and Engineering Discovery Environment program and is located at Oak Ridge National Laboratory, home to the world’s most powerful computing complex.

About RDAV:

RDAV is the University of Tennessee’s Center for Remote Data Analysis and Visualization, sponsored by the National Science Foundation as part of the Extreme Science and Engineering Discovery Environment program. RDAV is a partnership between the National Institute for Computational Sciences, Oak Ridge National Laboratory, Lawrence Berkeley National Laboratory, the University of Wisconsin, and the National Center for Supercomputing Applications at the University of Illinois.

Contact: Kalev Leetaru
