We have all been there. You’ve run the experiments, you’ve cleaned up the spreadsheets, and it’s time to sit down and analyze your results. Now for the big decision: what is the best test to use?
Hmm, an ANOVA would make sense, since the variable is continuous and there are multiple treatment groups (assuming the data have met the appropriate assumptions). When it’s time to press “run,” you realize you could analyze the results as a percent change from baseline, or you could just compare final endpoints, or you could try to use a general linear model with baseline as a covariate. *pulls some hair out*
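Those three choices really can give three different answers on the same data. Here's a minimal sketch (all numbers invented, assuming numpy and scipy are available) that runs each analysis on one simulated two-group trial with baseline and endpoint measurements:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical trial: two groups, baseline (pre) and endpoint (post).
n = 30
pre_a = rng.normal(100, 10, n)
pre_b = rng.normal(100, 10, n)
post_a = pre_a + rng.normal(5, 8, n)   # group A: small improvement
post_b = pre_b + rng.normal(0, 8, n)   # group B: no true change

# Option 1: compare percent change from baseline
pct_a = 100 * (post_a - pre_a) / pre_a
pct_b = 100 * (post_b - pre_b) / pre_b
p_pct = stats.ttest_ind(pct_a, pct_b).pvalue

# Option 2: compare final endpoints only
p_end = stats.ttest_ind(post_a, post_b).pvalue

# Option 3: linear model with baseline as a covariate
# (ANCOVA-style fit via ordinary least squares on [1, pre, group])
pre = np.concatenate([pre_a, pre_b])
post = np.concatenate([post_a, post_b])
group = np.concatenate([np.ones(n), np.zeros(n)])
X = np.column_stack([np.ones(2 * n), pre, group])
beta, *_ = np.linalg.lstsq(X, post, rcond=None)
resid = post - X @ beta
sigma2 = np.sum(resid**2) / (2 * n - 3)          # residual variance
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[2, 2])
t = beta[2] / se                                  # group coefficient
p_ancova = 2 * stats.t.sf(abs(t), df=2 * n - 3)

print(f"percent change:      p = {p_pct:.3f}")
print(f"endpoints only:      p = {p_end:.3f}")
print(f"baseline covariate:  p = {p_ancova:.3f}")
```

The point isn't which p value is "right"; it's that you'll get a p value either way, which is exactly why the choice belongs in the plan, not in the moment.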
I think most scientists would agree that it’s best practice to decide which statistical tests to use before beginning the experiment. But problems can surface if the statistical analysis plan isn’t fleshed out until after the study is complete. “Oops, now that I think about it, I really would have liked to account for X variable, but I didn’t measure it.” And how often do we generate data that we didn’t plan for? What do we do with that data?!
Enter the statistical consultant on a white horse. With a little assuming here and some adjusting there, voila! You have a p value. Of course you have a p value. You will always get a p value. But is it as meaningful as it could have been?
Even though we had a statistical consultant on our research team while I was in grad school, it was my job as the lead on the project to write out the statistical analysis plans, and in retrospect I’m really glad it was. I learned so much about statistics and avoided a lot of mistakes in doing so, since I was able to hash out a lot of the details with my lab mates before we started the study. We even changed questionnaires (with IRB approval, of course) at the last minute because we realized some of the data weren't being collected at the right time points, all thanks to my wonderful adviser forcing me to write out our statistical analysis plan.
Some statistical analysis plans are incredibly detailed: “if it meets this assumption but not that one, then this test will be used. The data will be fitted to this covariance structure, but if it doesn’t meet this assumption after that, then this structure will be applied.”
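A decision rule like that can be written down literally, as code, before any data arrive. Here's a toy sketch (assuming scipy; the alpha threshold, the specific checks, and the nonparametric fallback are illustrative choices, not a recommendation):

```python
import numpy as np
from scipy import stats

def choose_and_run(groups, alpha=0.05):
    """Pre-specified decision rule: use a one-way ANOVA if every group
    looks normal (Shapiro-Wilk) and variances look equal (Levene);
    otherwise fall back to the Kruskal-Wallis test."""
    normal = all(stats.shapiro(g).pvalue > alpha for g in groups)
    equal_var = stats.levene(*groups).pvalue > alpha
    if normal and equal_var:
        return "one-way ANOVA", stats.f_oneway(*groups).pvalue
    return "Kruskal-Wallis", stats.kruskal(*groups).pvalue

# Made-up example data: three treatment groups with shifted means.
rng = np.random.default_rng(0)
groups = [rng.normal(10 + shift, 2, 25) for shift in (0, 1, 3)]
test_name, p = choose_and_run(groups)
print(test_name, f"p = {p:.4f}")
```

Writing the branches out like this forces you to notice, before the study starts, which assumptions you're betting on and what you'll do when one fails.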
Do you honestly think about all of these things before you decide how the data will be collected? If not, join the club! However, it can be very important to get a second set of eyes looking over a proposal to make sure the experimental design has the appropriate controls (easier said than done, sometimes!) and will account for important possible confounders. A colleague can sometimes offer their time, or you could employ a statistician or experimental design consultant who will consider these decisions from a different point of view, one that may ultimately save your research study.
Bottom line: save yourself the headaches and anxiety, and plan all of this out before the experiment begins. And make back-up plans, predicting what may go wrong in the data collection. When I was in grad school and began to mentor undergraduate students in their thesis projects, I realized that it can be very helpful to draw out the graphs of the expected results. That way, I could visually see what test needed to be done and whether the representation of the graph made sense (and whether the data were being collected in the right way!). I’ve recently had the privilege of helping a few friends with their data analyses, and I’ve kept up this very practice.
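You don't even need data to make that mock figure. Here's a tiny sketch (group names, means, and spreads all invented; assumes matplotlib) of what "drawing the expected results" might look like:

```python
# Mock up the figure you expect *before* collecting data: if you can't
# draw it, you probably can't analyze it. All values here are made up.
import matplotlib
matplotlib.use("Agg")  # headless backend; we only save to a file
import matplotlib.pyplot as plt

groups = ["Control", "Low dose", "High dose"]
expected_means = [100, 108, 115]   # what you predict you'll see
expected_sd = [10, 10, 12]         # plausible spread per group

fig, ax = plt.subplots()
ax.bar(groups, expected_means, yerr=expected_sd, capsize=4)
ax.set_ylabel("Outcome (mock units)")
ax.set_title("Expected results (sketch, not data)")
fig.savefig("expected_results.png")
```

If the predicted bars suggest a group comparison but your planned measurements can't produce those bars, you've just caught a design problem for the price of a doodle.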
Once you’re staring at your data, it is easy to get really lost, really fast. Data can be analyzed any of a million different ways. In the name of good science, and of preserving your emotional health, consider seeking the help of a statistician, or at the very least consulting with a peer for experimental design advice, to hammer out your statistical analysis plan ahead of time.