Sunday, September 22, 2019
Washington Post, An Alarming Number of Scientific Papers Contain Excel Errors:
A surprisingly high number of scientific papers in the field of genetics contain errors introduced by Microsoft Excel, according to an analysis recently published in the journal Genome Biology.
A team of Australian researchers analyzed nearly 3,600 genetics papers published in a number of leading scientific journals — like Nature, Science and PLoS One. As is common practice in the field, these papers all came with supplementary files containing lists of genes used in the research.
The Australian researchers found that roughly 1 in 5 of these papers included errors in their gene lists that were due to Excel automatically converting gene names to things like calendar dates or random numbers. ...
[T]he researchers note that there's no way to permanently disable automatic date formatting within Excel. Researchers still have to remember to manually format columns to "Text" before you type anything in new Excel sheets — every. single. time. ...
Genetics isn't the only field where a life's work can potentially be undermined by a spreadsheet error. Harvard economists Carmen Reinhart and Kenneth Rogoff famously made an Excel goof — omitting a few rows of data from a calculation — that caused them to drastically overstate the negative GDP impact of high debt burdens. Researchers in other fields occasionally have to issue retractions after finding Excel errors as well. ...
[O]ne perfectly free spreadsheet program did not have any issues storing the gene names as typed — Google Sheets.
For the time being, the only fix for the issue is for researchers and journal editors to remain vigilant when working with their data files. Even better, they could abandon Excel completely in favor of programs and languages that were built for statistical research, like R and Python.