How Wikipedia reading habits can successfully predict the spread of disease

Elahe Izadi in the Washington Post:

People's Internet usage has opened a new door for predictive data. There are already some tools out there, such as Google Trends, which tries to “nowcast,” or show what's happening right now with the spread of certain diseases in the world. There have been studies, too, on whether Twitter can accurately predict how a disease is spreading.

But getting access to Google Trends or Twitter data is not always easy — or cheap. So a team of mathematicians, biologists and computer scientists got together to see if they could use something that's completely open and free: Wikipedia.

As it turns out, they could accurately forecast how influenza and dengue spread based purely on people's reading habits of Wikipedia articles. Last week, they showed how their algorithm could predict flu season in the United States. The full results of their research are published in this week's PLOS Computational Biology.

Researchers looked at seven diseases and 11 countries over a period of three years, starting in 2010, and compared page views on Wikipedia articles about those diseases to official data from health ministries. By looking at readers' habits, they successfully predicted the spreads of influenza in the United States, Poland, Thailand and Japan and dengue in Brazil and Thailand at least 28 days in advance.
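
To make the idea of comparing page views with official case counts concrete, here is a minimal sketch of a lagged linear fit: today's page views are regressed against case counts reported a fixed number of days later. This is only an illustration and should not be read as the researchers' actual model. The function names, the 28-day lag used as the example value, and all of the numbers are assumptions; the data is synthetic and generated inside the script.

```python
import math


def fit_lagged_line(page_views, case_counts, lag_days):
    """Least-squares fit of case_counts[t + lag_days] ~ slope * page_views[t] + intercept.

    Returns (slope, intercept, pearson_r). Hypothetical helper for illustration only.
    """
    if len(page_views) != len(case_counts):
        raise ValueError("series must have the same length")
    if not 0 <= lag_days < len(page_views) - 1:
        raise ValueError("lag_days must leave at least two paired observations")

    n = len(page_views) - lag_days
    xs = page_views[:n]          # page views on day t
    ys = case_counts[lag_days:]  # official cases on day t + lag_days

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series cannot be fitted")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r


def forecast_cases(slope, intercept, views_today):
    """Predicted case count lag_days after a day with views_today page views."""
    return slope * views_today + intercept


# Synthetic data (assumed, not real): cases follow page views exactly 28 days later.
LAG = 28
days = range(200)
views = [1000 + 500 * math.sin(2 * math.pi * d / 100) for d in days]
cases = [0.1 * (1000 + 500 * math.sin(2 * math.pi * (d - LAG) / 100)) + 5 for d in days]

slope, intercept, r = fit_lagged_line(views, cases, LAG)
prediction = forecast_cases(slope, intercept, 1000)
```

Because the synthetic cases are built as an exact delayed copy of the page views, the fit here is perfect by construction. Real page-view and surveillance data would be far noisier, and the lag giving the best fit would have to be searched for rather than assumed.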

Read the rest here.