I was at the ISBE in Exeter last week, and one thing I noticed was that several people were plotting their categorical data in ways that showed all the data, such as jittered scatter plots (sometimes with model predictions or means, sometimes with errors around that prediction or mean). I have been trying recently to move towards figures that show more of the data, like boxplots, but I really like the idea of showing all of it. In my talk, however, I (lazily) used the figures I had used previously, which were all mean-plus-error bars.
Then Tom Houslay tweeted this:
And I felt bad.
So I had a go at replotting the figure from my final slide in three different ways: as the original (means and standard errors), as a box plot, and as the raw data with a line for the mean (although this could be the prediction from the model instead). The figure shows the number of neighbours each fish had both before (white) and after (grey) a startle stimulus, in clear and turbid water.
The latter two are definitely much more informative about the data. There’s maybe a bit too much going on in the last one (each category has 120 data points) but I am definitely going to use this approach in my next paper and encourage/tell my students to do the same.
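For anyone wanting to try the same comparison, here is a rough sketch of the three versions side by side. It is not the code behind my figures: it assumes Python with matplotlib, the function names (`mean_and_se`, `jitter`, `plot_three_ways`) are made up for illustration, and the numbers are randomly generated placeholders rather than my fish data.

```python
import random
import statistics


def mean_and_se(values):
    """Return the mean and the standard error of the mean."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / len(values) ** 0.5
    return m, se


def jitter(center, n, width=0.15, rng=random):
    """Return n x-positions spread uniformly within +/- width of center."""
    return [center + rng.uniform(-width, width) for _ in range(n)]


def plot_three_ways(groups, path="three_ways.png"):
    """Draw mean +/- SE, a box plot, and jittered raw data in three panels."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display needed
    import matplotlib.pyplot as plt

    labels = list(groups)
    data = [groups[k] for k in labels]
    xs = list(range(1, len(labels) + 1))
    stats = [mean_and_se(v) for v in data]
    fig, axes = plt.subplots(1, 3, sharey=True, figsize=(12, 4))

    # Panel 1: means with standard error bars
    axes[0].errorbar(xs, [m for m, _ in stats],
                     yerr=[se for _, se in stats], fmt="o", capsize=4)
    axes[0].set_title("Mean and SE")

    # Panel 2: box plots
    axes[1].boxplot(data, positions=xs)
    axes[1].set_title("Box plot")

    # Panel 3: every data point, jittered, with a line at the mean
    for x, values, (m, _) in zip(xs, data, stats):
        axes[2].scatter(jitter(x, len(values)), values, s=10, alpha=0.4)
        axes[2].hlines(m, x - 0.25, x + 0.25, colors="black")
    axes[2].set_title("Raw data and mean")

    for ax in axes:
        ax.set_xticks(xs)
        ax.set_xticklabels(labels, rotation=30)
    axes[0].set_ylabel("Number of neighbours")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


# Placeholder data only: 120 made-up counts per category, NOT the real results.
_rng = random.Random(1)
groups = {
    name: [_rng.randint(0, 9) for _ in range(120)]
    for name in ("clear, before", "clear, after",
                 "turbid, before", "turbid, after")
}
```

Calling `plot_three_ways(groups)` should write the three panels to `three_ways.png`. The line in the third panel is the raw mean; a model prediction could be drawn in its place.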
The thing that is masked by all of these, however, is that the 120 data points in each category actually come from 12 shoals of 10 fish, and it should be possible to display this too (it’s in the stats, don’t worry).
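One possible way of showing the shoal structure, which I have not tried on the real data, is to give each shoal its own column of points within a category and mark each shoal's mean. The sketch below again assumes Python with matplotlib; `group_by_shoal` and `plot_by_shoal` are hypothetical names, and the records are randomly generated placeholders.

```python
import random
import statistics
from collections import defaultdict


def group_by_shoal(records):
    """Map shoal id -> list of values, given (shoal_id, value) pairs."""
    by_shoal = defaultdict(list)
    for shoal_id, value in records:
        by_shoal[shoal_id].append(value)
    return dict(by_shoal)


def plot_by_shoal(records, path="by_shoal.png"):
    """Plot one jittered column of points per shoal, with each shoal's mean."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display needed
    import matplotlib.pyplot as plt

    by_shoal = group_by_shoal(records)
    rng = random.Random(0)  # fixed seed so the jitter is reproducible
    fig, ax = plt.subplots(figsize=(6, 4))
    for i, (shoal_id, values) in enumerate(sorted(by_shoal.items()), start=1):
        xs = [i + rng.uniform(-0.2, 0.2) for _ in values]
        ax.scatter(xs, values, s=12, alpha=0.4)
        ax.hlines(statistics.mean(values), i - 0.3, i + 0.3, colors="black")
    ax.set_xticks(range(1, len(by_shoal) + 1))
    ax.set_xticklabels([str(k) for k in sorted(by_shoal)])
    ax.set_xlabel("Shoal")
    ax.set_ylabel("Number of neighbours")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


# Placeholder data only: 12 shoals of 10 fish for a single category,
# with made-up counts, NOT the real results.
_rng = random.Random(2)
records = [(shoal, _rng.randint(0, 9))
           for shoal in range(1, 13) for _ in range(10)]
```

Calling `plot_by_shoal(records)` should write one category's worth of points to `by_shoal.png`, with the 12 shoals side by side. Colouring the points by shoal within the existing four-category figure would be another option.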