Mining data, like mining any resource, can be dirty and difficult work, with plenty of heavy lifting.
The end result, however, can be precious.
Where can you start? Is there a surefire topic that will help get the data mine pumping?
During a recent presentation in the GateHouse Media Newsroom Development Professional Development Series, Jill Riepenhoff and Mike Wagner of the Columbus Dispatch shared some tips for data journalism strategy as well as some of their greatest hits.
HERE’S AN IDEA
While every community is different, here’s what Riepenhoff referred to as “low-hanging fruit,” things that should be accessible to almost every newsroom: inspections. These reports are easy to get, they’re typically handled by the county or state (so records are centralized and homogeneous), and processing year-over-year data isn’t too overwhelming.
The pair suggest getting inspection reports for:
• Pet stores
• Stadium food vendors
• Grocery stores
• Gas stations
• Nail salons
Once you get the data for these inspection reports, spotting trends and forming story ideas come more easily. Riepenhoff and Wagner regularly work with such spreadsheets at The Dispatch, wading through data sets either for packages already in progress or in search of new story ideas.
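For newsrooms comfortable with a little scripting, the year-over-year comparison can be sketched in a few lines of Python. This is only an illustration: the field names (`year`, `violations`) and the numbers are made up for the example, and the presenters did not describe using code for this step.

```python
from collections import Counter

# Made-up sample rows; "year" and "violations" are assumed field names.
inspections = [
    {"year": 2013, "violations": 2},
    {"year": 2013, "violations": 3},
    {"year": 2014, "violations": 3},
    {"year": 2015, "violations": 4},
    {"year": 2015, "violations": 3},
]

def violations_by_year(rows):
    """Add up the violations recorded in each year."""
    totals = Counter()
    for row in rows:
        totals[row["year"]] += row["violations"]
    return dict(sorted(totals.items()))

def year_over_year_change(totals):
    """For each year after the first, return the change from the prior year."""
    years = sorted(totals)
    return {
        year: totals[year] - totals[prev]
        for prev, year in zip(years, years[1:])
    }

totals = violations_by_year(inspections)
changes = year_over_year_change(totals)
print(totals)
print(changes)
```

A sharp jump or drop in one year is a prompt for a reporting question, not an answer in itself.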
KEEP IT CLEAN
Using an Excel or Google spreadsheet isn’t easy right off the bat. That’s why both reporters suggested playing with data sets in various ways to get more comfortable. In fact, Riepenhoff admitted that she has used Excel to track Christmas presents and who she’d purchased them for.
And not all data comes in clean, both reporters admit. Some sets have missing or duplicate records, while others have no standardization. Often dates are entered differently (sometimes with dashes, other times without) and that means some processing needs to be done.
Also, data doesn’t solve anything on its own.
“You need to make sure you know what questions can and can’t be answered,” Riepenhoff said.
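The cleanup problems mentioned above (duplicate records, missing values, dates typed with and without dashes) can be handled in a spreadsheet, but a short script is another option. The Python sketch below is one possible approach, not the reporters’ method; the column names (`business`, `date`), the list of date formats and the sample rows are all assumptions made for the example.

```python
from datetime import datetime

# Assumed date formats; a real data set may need a different list.
DATE_FORMATS = ("%m-%d-%Y", "%m/%d/%Y", "%m%d%Y", "%Y-%m-%d")

def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if missing or unrecognized."""
    text = raw.strip()
    if not text:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    return None

def clean_records(rows):
    """Standardize dates and drop duplicates, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for row in rows:
        record = {
            "business": row["business"].strip(),
            "date": normalize_date(row["date"]),
        }
        # Treat rows with the same business name and date as duplicates.
        key = (record["business"].lower(), record["date"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned

# Made-up sample rows for illustration only.
sample = [
    {"business": "Corner Grocery", "date": "03-14-2015"},
    {"business": "Corner Grocery", "date": "03142015"},
    {"business": "Main St. Nails", "date": ""},
]
cleaned = clean_records(sample)
print(cleaned)
```

Rows with a missing date are kept and flagged with `None` rather than dropped, so a reporter can decide whether the gap is itself worth asking about.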
FACE THE ISSUE
Once a reporter finally gets a data set together, it’s easy to fall in love with the numbers. While these facts can create the spine for a great package, what still makes these pieces resonate is a name or face.
For example, after pulling a huge set of numbers for an investigative series on suicides in Ohio, Riepenhoff, Wagner and fellow reporter Lori Kurtzman found a specific example to shed light on the issue: a young woman named Amy Luxenburger, who fell through the cracks of the state’s mental health system and took her own life.
“Amy was the face that pulled the whole series together,” Wagner said. “It gave readers a person to identify with.”