I was talking to a colleague the other day who was interested in getting into computer programming and taking on more data projects. He asked where the best place to start was.
My gut reaction was to tell him to learn the basics of spreadsheets. Almost all of the data I have used in projects -- whether it ended up as a map or a graphic -- was initially set up in spreadsheet form.
Most of us are familiar with spreadsheets and have likely worked with Microsoft Excel. Agencies both big and small often gather spreadsheets of useful information for us to use with our stories.
The problem is that, most of the time, the data isn't formatted correctly or contains inaccuracies like misspellings. That is where the magic of spreadsheet formulas can come in to help organize your data.
Here are a few resources you might find handy for working with spreadsheets.
This walkthrough is great because it is written directly for journalists. It is also intended for beginners so those with no spreadsheet knowledge will be able to keep up. Finally, the walkthrough is intended for both print and online journalists. So journalists who have no intention of making visualizations will still find a handful of features in Excel to help them with their day-to-day reporting.
This PDF provided by Mary Jo Webster at the St. Paul Pioneer Press is a great, concise list of Excel formulas she uses all the time. It includes formulas on how to format dates, run sums and even run if-else statements in Excel. It's one of my favorite resources for Excel and definitely worth bookmarking.
Another great resource for Excel-related topics is the website Easy Excel. Check it out.
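If you are curious how the kinds of formulas mentioned above (formatting dates, running sums and if-else statements) look outside of Excel, here is a minimal sketch in Python. The rows, the column names and the 1,000 cutoff are all made up for illustration, and the helper functions `reformat_date` and `size_label` are hypothetical names, not anything from the resources listed here.

```python
from datetime import datetime

# Hypothetical rows, standing in for a spreadsheet an agency might send over.
rows = [
    {"city": "Springfield", "date": "3/7/2012", "amount": 1200},
    {"city": "Shelbyville", "date": "11/21/2012", "amount": 450},
    {"city": "Springfield", "date": "1/15/2013", "amount": 980},
]


def reformat_date(text):
    """Turn a month/day/year string into year-month-day form."""
    return datetime.strptime(text, "%m/%d/%Y").strftime("%Y-%m-%d")


def size_label(amount, cutoff=1000):
    """Label a value the way a simple if-else formula would (cutoff is an assumption)."""
    return "large" if amount >= cutoff else "small"


# Add a cleaned-up date and a label to every row, like filling two new columns.
cleaned = [
    {**row, "date": reformat_date(row["date"]), "size": size_label(row["amount"])}
    for row in rows
]

# Add up one column, like a sum at the bottom of a spreadsheet.
total = sum(row["amount"] for row in rows)

for row in cleaned:
    print(row["city"], row["date"], row["amount"], row["size"])
print("Total:", total)
```

The idea is the same as in a spreadsheet: one small rule per column, applied to every row.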
3. Google Docs - Spreadsheet resources
Not a fan of Excel or don't have it installed on your work computer? Fortunately, you can make a spreadsheet with Google by logging into Google Drive and clicking "Create > Spreadsheet." The best part is that spreadsheets you create with Google can be accessed from any computer as long as you log in with your Google account.
Here are some resources for getting started with Google spreadsheets:
- Knight Digital Media Center - Spreadsheets
- Canadian Journalism Project - Data journalism basics with Google spreadsheets
- National Center for Business Journalism - Magic with Google Spreadsheets
4. Google Refine resources
Sometimes you need more than just spreadsheet formulas to clean dirty data. That's where the powerful Google Refine program can come into play. The program was designed to clean dirty data by finding inconsistencies in your spreadsheets. It can also help you sort data, add to data, transform it from one service to another and much, much more.
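To give a rough feel for what "finding inconsistencies" can mean, here is a minimal sketch in Python of one simple approach: build a normalized key for each value and group the values that share a key. It is loosely modeled on the key-collision style of clustering that Refine offers, but it is my own simplified illustration, not Refine's actual code. The agency names and the function names `fingerprint` and `cluster` are assumptions made up for the example.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key: trim, lowercase, drop punctuation, sort unique words."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values that share a fingerprint; keep only groups with more than one variant."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical column of agency names entered inconsistently.
names = [
    "St. Paul Police",
    "st paul police ",
    "Police, St. Paul",
    "Minneapolis Police",
]

for group in cluster(names):
    print("Probably the same:", group)
```

This only catches differences in capitalization, punctuation, spacing and word order. True misspellings need fuzzier matching, which is one reason a dedicated tool is worth learning.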
Here are some resources you might find handy:
- Dan Nguyen - Google Refine for Investigative Journalism
- Tom Meagher - Clean data is the best weapon against the monkey insurrection (via Chris Keller)
- Google Refine introduction videos
Still stuck? Fortunately, there is a wonderful community of computer-assisted reporters who are more than willing to help others out. If you want great information on spreadsheets or any other data journalism topic, check out the National Institute for Computer-Assisted Reporting's email list. Questions on Excel come up almost every day.
Have any other useful resources not listed here? E-mail me at email@example.com.