A growing body of research shows a strong relationship between the gut microbiome and overall health in humans[1], but what about in cows? For this project, a client in the animal and poultry sciences department contacted us to help answer the question: “Is there a difference in the stomach microbiome between calves on an enhanced treatment feed and calves on the control feed?”

Had the quantity of interest in this study been a single measurement, say calf weight or height, this would have been a simple problem and we could have answered the client’s question within the hour. However, the client was more interested in the relative presence of DNA sequences, and a great many of those were measured (some calves had >100,000 unique sequences).

The client had found previous literature from researchers at another university. For each diet, those researchers fit a distribution over the counts of each sequence, found that the two distributions had different parameters, and concluded that the diets caused a difference in microbiome composition. However, my associate and I felt that this approach was incorrect. Instead, we recommended and implemented a novel solution for the client.

We noted that, if the diets did in fact have different effects, then the relative frequencies of the sequences should differ between the two diets. If that were the case, a model with a separate sequence distribution for each diet would fit better than a single model for both diets. When we fit that model (a multinomial distribution over the sequences), the diet-specific version did fit better. However, splitting the calves into any two groups would likely result in a better fit, because a model with two distributions is much more flexible than a model with a single distribution. To understand whether the effect we were seeing was real, we needed a baseline for the model improvement: a null model.
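The comparison between the pooled and diet-specific fits can be sketched as follows. This is a minimal illustration, not the code used in the project: the function names and the toy counts are hypothetical, and it assumes that "model improvement" is measured as the gain in maximized multinomial log-likelihood. The multinomial coefficient is the same under both models, so it is dropped.

```python
import numpy as np

def multinomial_loglik(counts):
    """Maximized multinomial log-likelihood of a group of calves,
    up to a constant that does not depend on the grouping.

    counts: array of shape (n_calves, n_sequences).
    """
    totals = counts.sum(axis=0)            # total count per sequence in the group
    probs = totals / totals.sum()          # maximum-likelihood frequencies
    nonzero = totals > 0                   # skip unseen sequences (0 * log 0 = 0)
    return float(np.sum(totals[nonzero] * np.log(probs[nonzero])))

def model_improvement(counts, diet):
    """Log-likelihood gain from diet-specific distributions over one pooled one."""
    diet = np.asarray(diet)
    pooled = multinomial_loglik(counts)
    split = sum(multinomial_loglik(counts[diet == d]) for d in np.unique(diet))
    return split - pooled

# Toy data (made up for illustration): 4 calves, 3 sequences, 2 diets.
toy_counts = np.array([[10, 0, 5], [8, 1, 6], [2, 9, 4], [1, 10, 5]])
toy_diet = [0, 0, 1, 1]
print(model_improvement(toy_counts, toy_diet))
```

Because the pooled model is a special case of the diet-specific one, this gain can never be negative, which is exactly why a baseline is needed before reading anything into it.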

Permutation tests are commonly used in situations where we do not have an analytical method for estimating the underlying variation in the process. Were this a simple test like a t-test, where the central limit theorem justifies using normal distributions, a permutation test would still work but would not be the most efficient method. To conduct a permutation test, we randomly allocate calves to the two diets (such that the number of calves in each diet is equal) and compute the model improvement (going from a single distribution to a diet-specific distribution). If we repeat that many times, we get an estimate of the distribution of model improvement *if there was no effect due to diet*. We can then compare our observed improvement against that distribution to see how extreme it is.
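The procedure described above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the project's actual code: `loglik_gain` and `permutation_test` are hypothetical names, the improvement is taken to be the gain in maximized multinomial log-likelihood, and the number of permutations and the seed are arbitrary. Shuffling the diet labels keeps the size of each group fixed.

```python
import numpy as np

def loglik_gain(counts, labels):
    """Gain in maximized multinomial log-likelihood from fitting one
    distribution per label instead of a single pooled distribution."""
    def loglik(group):
        totals = group.sum(axis=0)
        keep = totals > 0                  # unseen sequences contribute nothing
        return np.sum(totals[keep] * np.log(totals[keep] / totals.sum()))
    split = sum(loglik(counts[labels == d]) for d in np.unique(labels))
    return float(split - loglik(counts))

def permutation_test(counts, labels, n_perm=1000, seed=0):
    """Return the observed gain, the null gains, and a permutation p-value."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = loglik_gain(counts, labels)
    # Shuffle diet labels across calves to simulate "no effect due to diet".
    null = np.array([loglik_gain(counts, rng.permutation(labels))
                     for _ in range(n_perm)])
    # Add-one correction: the observed labelling counts as one permutation.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, p_value
```

A call such as `observed, null, p = permutation_test(counts, diet)` takes a calves-by-sequences count matrix and one diet label per calf. A small `p` means the observed improvement is rarely matched when labels are assigned at random.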

The client was elated to discover that the diet did cause differences in gut microbiomes, even after accounting for chance. They were happy with the results and eager to describe this novel solution, which was better grounded in statistical theory than what other researchers were using.

[1] Bull MJ, Plummer NT. Part 1: The Human Gut Microbiome in Health and Disease. Integrative Medicine: A Clinician’s Journal. 2014;13(6):17-22.