Spurious Extrapolations: Novel and unique research abstracts

Last Christmas, BMJ published a funny article exploring the mentions of positive and negative words in research abstracts over the past 40 years. I’ve recreated their research for two of the phrases below — “novel” and “unique.”

[Figure: frequency of “novel” and “unique” in research abstracts]

Your eyes aren’t fooling you: Over the past 40 years, researchers have started using the word “novel” so much that it appears in roughly 8% of all published research abstracts on PubMed. “Unique” has similarly grown in use — now used in about 3% of all research abstracts on PubMed — albeit not quite as dramatically.

If you’re interested in the details on how they looked up these phrases, read their supplemental information document.

In the spirit of the Spurious Extrapolations series, I had to ask: If “novel” and “unique” kept growing in use at the same rate they have been, how long would it take until they were used in every research abstract?

Disclaimer: Extrapolating beyond the bounds of a data set is extremely precarious, and the result is most likely wrong. Most statisticians would recommend against extrapolating more than a few time points past the end of a data set.

[Figure: frequency of “novel” and “unique” in research abstracts, extrapolated]

To compute these extrapolations, I fit polynomial regressions (Usage_Pct ~ Year + Year^2) to the time series and used those models to predict when the terms would reach 100% usage. As shown in the chart above, “novel” will reach 100% usage by 2130 and “unique” by 2674. It’s comforting to know that by 2674, our research will be novel and unique, just like all other research.
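
For readers who want to try this kind of extrapolation themselves, here is a minimal sketch of a quadratic fit followed by solving for the 100% crossing. It assumes NumPy and is not necessarily the code behind the charts. The function name `year_reaching` and the demo numbers are hypothetical: the data is a made-up smooth curve, not the actual PubMed series, so the printed year should not be compared with the figures quoted above.

```python
import numpy as np


def year_reaching(years, usage_pct, target=100.0):
    """Fit usage_pct ~ year + year^2 and return the first (fractional) year
    after the data at which the fitted curve hits `target`, or None."""
    years = np.asarray(years, dtype=float)
    usage_pct = np.asarray(usage_pct, dtype=float)

    # Shift years so the fit works with small numbers (better conditioning).
    origin = years.min()
    x = years - origin

    # Least-squares quadratic; polyfit returns the highest power first.
    c2, c1, c0 = np.polyfit(x, usage_pct, deg=2)

    # Solve c2*x^2 + c1*x + c0 = target, keeping real roots past the data.
    roots = np.roots([c2, c1, c0 - target])
    future = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > x.max()]
    return origin + min(future) if future else None


# HYPOTHETICAL data: a made-up smooth curve, NOT the PubMed series.
demo_years = np.arange(1975, 2015)
demo_usage = 0.5 + 0.005 * (demo_years - 1975) ** 2

print(year_reaching(demo_years, demo_usage))
```

Returning None when the fitted curve never reaches the target (for example, when usage is falling) keeps the sketch from reporting a crossing year that does not exist.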

Really, these charts are just an exaggeration of an already-silly phenomenon: Funding agencies and journals have placed considerable pressure on academics to perform “novel” research, which has in turn fooled many academics into thinking that some research paths are worthwhile simply because they’re novel. Let’s not forget that many of the greatest breakthroughs were achieved not because they were “novel,” but because they built on the findings of hundreds of scientists from the past. That’s how science works, and that’s the kind of science we should encourage.

Oh, and seriously: Please stop using the word “novel” when describing your research. We know. It’s research.

Dr. Randy Olson is a Senior Data Scientist at the University of Pennsylvania, where he develops state-of-the-art machine learning algorithms with a focus on biomedical applications.
