Often, the Excel files that my collaborators send me include all kinds of calculations and graphs. I feel strongly that your primary data file should contain just the data and nothing else: no calculations, no graphs.

If you’re doing calculations in your data file, that likely means you’re regularly opening it and typing stuff into it. Doing so incurs some risk that you’ll accidentally type junk into your data.

(Has this happened to you? You open an Excel file and start typing and nothing happens, and then you select a cell and you can start typing. Where did all of that initial text go? Well, sometimes it got entered into some random cell, to be discovered later during data analysis.)

Your primary data file should be a pristine store of data. Write-protect it, back it up, and don’t touch it.
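If you'd rather script the backup and write-protection than click through file dialogs, here is a minimal Python sketch. The function name `protect_and_backup` and the file and folder names in the usage note are hypothetical, made up for illustration; the sketch assumes the backup folder sits somewhere you trust (ideally on a different disk or machine).

```python
import shutil
import stat
from pathlib import Path


def protect_and_backup(data_file, backup_dir):
    """Copy data_file into backup_dir, then mark both copies read-only.

    Hypothetical helper for illustration; adapt the paths to your setup.
    """
    data_file = Path(data_file)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    backup = backup_dir / data_file.name
    if backup.exists():
        # Never silently replace an earlier backup.
        raise FileExistsError(f"refusing to overwrite existing backup: {backup}")
    shutil.copy2(data_file, backup)  # copies contents and timestamps

    # Read-only for everyone (mode 0o444). On Windows this sets the
    # file's read-only attribute instead.
    read_only = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
    data_file.chmod(read_only)
    backup.chmod(read_only)
    return backup
```

You might call it as `protect_and_backup("primary_data.xlsx", "backup")`. Write-protection only guards against accidents: anyone with access to the file can still switch it back to writable, so the backup is what really protects you.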

If you want to do some analyses in Excel, make a copy of the file and do your calculations and graphs in the copy.
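Making that copy can also be scripted. The following is a small sketch, not a prescribed workflow; the helper name `make_working_copy` and the `_analysis` suffix are my own assumptions.

```python
import shutil
from pathlib import Path


def make_working_copy(primary, suffix="_analysis"):
    """Copy the primary data file to a sibling file for calculations and graphs.

    Hypothetical helper: 'data.xlsx' becomes 'data_analysis.xlsx'.
    """
    primary = Path(primary)
    working = primary.with_name(primary.stem + suffix + primary.suffix)
    if working.exists():
        # Don't clobber an analysis file you may already have worked in.
        raise FileExistsError(f"working copy already exists: {working}")
    # copyfile copies the contents only, not the permission bits, so the
    # copy stays editable even if the original is write-protected.
    shutil.copyfile(primary, working)
    return working
```

For example, `make_working_copy("primary_data.xlsx")` would create `primary_data_analysis.xlsx` next to the original and leave the original untouched.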

