Tracking the flu with technology and Twitter
POSTED: Wednesday, January 30, 2013 - 6:22am
UPDATED: Wednesday, January 30, 2013 - 1:02pm
Complaining on social networks about being sick might annoy your friends and followers, but it can be useful for tools that track the spread of illnesses.
A new method for filtering tweets, developed by researchers at Johns Hopkins University, could make the real-time data pouring in more accurate.
The United States is in the middle of one of its most severe flu seasons in years. Tech companies, universities and health organizations are harnessing the wealth of data from social networks and search engines, in addition to the usual reports from vital statistics offices, hospitals, doctors and public health departments, to keep the public informed and better prepare public health workers.
The Centers for Disease Control and Prevention releases a weekly influenza update for the United States that includes stats on people with flu-like symptoms, hospitalizations and deaths. But the detailed information is about two weeks old by the time it comes out.
"There are a lot of gaps in the system that Twitter can fill," said assistant research professor Mark Dredze, who headed up the Johns Hopkins project.
Real-time information like tweets is becoming a popular source of public health information. It can be used to do more than just track outbreaks. Dredze's goal is to get ahead of the curve and actually predict where and when illnesses will spread. This information could be invaluable for public health departments, giving them advance warning and time to plan for additional doctors, hospital beds or school closings. Thanks to the GPS information attached to tweets, the location information gathered from Twitter is more finely detailed than the CDC's.
Early warnings are good for regular people as well. They push people to get vaccinated before they catch the flu, and individuals with health issues who might be more vulnerable can take extra precautions.
With an average of 340 million tweets a day, Twitter is a firehose of muddled and misleading information. Taken at face value, keywords would indicate the entire country is suffering from an ongoing fever epidemic of the Bieber variety. Much of the running commentary on Twitter is a reaction to news events, so when a flu epidemic becomes a national news story, the number of people talking about it spikes, regardless of their own health status.
"Most people have just focused on the presence of flu. The very simple thing is you look at Twitter and look at the number of people using the word flu or sick every day," said Dredze. "The problem with that is if you look a little more closely it doesn't really work."
Dredze's team is using algorithms to correct for these issues, filtering out the noise to isolate the useful information. They were already researching how to use Twitter to track health issues before this winter's outbreak but quickly switched tracks to focus on the flu. The group plans to share the information it gathers with public health officials.
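To give a rough sense of what filtering out the noise can mean, here is a minimal rule-based sketch in Python. It is not the Johns Hopkins team's method, which the article does not describe in detail. The word lists, patterns and the function name looks_like_personal_report are all hypothetical, chosen only to illustrate separating "I am sick" tweets from news chatter and pop-culture "fever."

```python
import re

# Hypothetical patterns, picked for illustration only.
ILLNESS_WORDS = re.compile(r"\b(flu|sick|fever)\b", re.IGNORECASE)
NOISE_PATTERNS = [
    re.compile(r"\bbieber\b", re.IGNORECASE),                   # "Bieber fever"
    re.compile(r"https?://", re.IGNORECASE),                    # links often mean news sharing
    re.compile(r"\b(epidemic|outbreak|cdc)\b", re.IGNORECASE),  # talk about the flu as news
]
FIRST_PERSON = re.compile(r"\b(i|my|me)\b", re.IGNORECASE)


def looks_like_personal_report(text: str) -> bool:
    """Guess whether a tweet is someone describing their own illness."""
    if not ILLNESS_WORDS.search(text):
        return False  # no illness keyword at all
    if any(pattern.search(text) for pattern in NOISE_PATTERNS):
        return False  # keyword present, but it looks like noise
    return bool(FIRST_PERSON.search(text))  # require a first-person cue


# Invented example tweets, not real data.
samples = [
    "I think I have the flu, staying in bed",
    "I've got Bieber fever!",
    "Flu epidemic declared in Boston http://example.com",
]
results = [looks_like_personal_report(t) for t in samples]
```

Hand-written rules this crude still misfire on slang (a tweet such as "My new car is sick" would pass), which is one reason a real system would more likely rely on statistical models trained on labeled tweets.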
Tracking influenza and other illnesses based on social media isn't new. In a 2011 paper, researchers reported being able to accurately track disease levels for the swine flu outbreak two years earlier by searching for keywords such as flu, vaccine, illness, Tamiflu and pneumonia.
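A keyword approach of that kind can be sketched in a few lines of Python. This is an illustration under assumptions, not the 2011 paper's actual procedure: the keyword list comes from the article, while the function name daily_keyword_counts, the input format and the sample tweets are invented for the example.

```python
import re
from collections import Counter
from datetime import date

# Keywords named in the article, lowercased for matching.
KEYWORDS = {"flu", "vaccine", "illness", "tamiflu", "pneumonia"}


def daily_keyword_counts(tweets):
    """Count, per day, the tweets that contain at least one keyword.

    `tweets` is an iterable of (date, text) pairs (an assumed format).
    """
    counts = Counter()
    for day, text in tweets:
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & KEYWORDS:  # any keyword present as a whole word
            counts[day] += 1
    return counts


# Invented sample data for illustration.
sample_tweets = [
    (date(2013, 1, 28), "Got my flu vaccine today"),
    (date(2013, 1, 28), "Lunch was great"),
    (date(2013, 1, 29), "Tamiflu is not fun"),
    (date(2013, 1, 29), "Pneumonia scare at work"),
]
counts = daily_keyword_counts(sample_tweets)
```

The resulting daily totals are the raw signal. Comparing them against official figures, as the researchers did, is a separate step not shown here.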
The University of Rochester has turned research about predicting the spread of diseases with social media into a web application called Germ Tracker. The colorful interactive map pulls up geo-tagged tweets that contain keywords related to illness. If you find a tweet that's clearly mislabeled (like someone bragging that their new car is "sick"), you can click a button to let the app know they're not actually sick. It also allows self-reporting with a slider you can set to Awful, Sick, Yuck, Meh or Good and options to share specific symptoms.
Sick Weather is a similar project that can pull from Facebook and Twitter to map and show animations of contagious illnesses such as chicken pox, colds and whooping cough, as well as other issues like allergies, stress and depression.
Even as researchers get better at filtering tweets, the social network presents limitations. It isn't an accurate representation of the entire population in terms of age or location. Two of the groups hardest hit by influenza outbreaks, the elderly and children, are the least likely to be live-blogging their symptoms.
Google's Flu Trends looks at search terms to create real-time estimates of where the flu is flaring up. Developed in 2009 in collaboration with the Centers for Disease Control, the tool digs through massive amounts of search data (all anonymous) looking for flu-related searches around the world and maps out the intensity of outbreaks.
"The advantage of using these social media tools and Google is they're much faster than the CDC," said Michael Paul, a doctoral student working on the Johns Hopkins Twitter research. "As an early warning, they're useful to the government when it needs to plan."
Armchair influenza trackers will still get the most detailed information from weekly reports released by the Centers for Disease Control. The reports dig deep into the data gathered from doctors, hospitals and other health officials to outline things like number of deaths, flu-related hospitalizations and a breakdown of strains. The CDC also has its own simple app called FluView that plots the volume of influenza-like illnesses by state.
"Our job is really to figure out what viruses are going around and what effect they're having," said Lynnette Brammer, an influenza epidemiologist at the CDC. "It's more laboratory based; we try to get as close as we can to the viruses."
The CDC teams do check tools like Google Flu Trends, and Brammer says that while the results don't always match up, Google is close to the CDC's own findings most of the time. The CDC has slightly different goals than tech tools that track symptoms in real time. It's drilling down to find out what strains are being reported, see if they are close to the current vaccine and, if not, determine whether the vaccine needs to be changed to include new virus candidates.