
Statistical Methods and Terminology

Statistical Calculations

Statistical analyses are performed on the log-transformed, batch-normalized, imputed data using Metabolon’s internal pipeline, which uses R (http://cran.r-project.org/) to perform statistical computations via a Jupyter Notebook user interface. Below are examples of frequently employed significance tests and classification methods, followed by a discussion of p- and q-value significance thresholds.

Welch’s two-sample t-test

Welch’s two-sample t-test is used to test whether the unknown means of two independent populations are different.

This version of the two-sample t-test allows for unequal variances (variance is the square of the standard deviation) and has an approximate t-distribution with degrees of freedom estimated using Satterthwaite’s approximation. We typically use a two-sided test (tests whether the means are different) instead of a one-sided test (tests whether one mean is greater than the other).
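As a minimal sketch of the calculation described above, the following Python snippet computes the Welch t statistic and the Satterthwaite degrees of freedom using only the standard library. It is illustrative and is not the pipeline's R code; the function name `welch_t_test` and the sample values are hypothetical.

```python
import math
import statistics


def welch_t_test(x, y):
    """Return (t statistic, Satterthwaite degrees of freedom) for two samples."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    # Sample variances (n - 1 denominator); they are not pooled,
    # which is what allows the two groups to have unequal variances.
    v1, v2 = statistics.variance(x), statistics.variance(y)
    se1, se2 = v1 / n1, v2 / n2  # squared standard errors of each mean
    t = (m1 - m2) / math.sqrt(se1 + se2)
    # Satterthwaite's approximation to the degrees of freedom.
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical log-transformed values for one compound in two groups.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, df = welch_t_test(control, treated)
print(f"t = {t_stat:.4f}, df = {df:.4f}")
```

The two-sided p-value is then obtained from a t-distribution with `df` degrees of freedom. In R, `t.test(x, y)` performs the Welch version by default (`var.equal = FALSE`).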

Two-way ANOVA

When performing an analysis of variance (ANOVA) significance test, it is assumed that variance is the same across all populations.

In a two-way ANOVA, three statistical tests are typically performed: the main effect of each factor individually and that of the interaction. Suppose we have two factors, A and B, where A represents the genotype and B represents diet in a mouse study. Suppose each of these factors has two levels (A: wild type, knock out; B: standard diet, high-fat diet). In this example, there are four possible combinations (“treatments”): A1B1, A1B2, A2B1, and A2B2. The overall ANOVA F-test yields the p-value for testing whether all four of these means are equal or whether at least one pair is different.

However, we are also interested in the individual effects of genotype and diet. A main effect is a contrast that tests one factor across all levels of the other factor. Hence the A main effect compares (A1B1 + A1B2)/2 vs. (A2B1 + A2B2)/2, and the B main effect compares (A1B1 + A2B1)/2 vs. (A1B2 + A2B2)/2. The interaction is a contrast that tests whether the mean difference for one factor depends on the level of the other factor, which is (A1B2 + A2B1)/2 vs. (A1B1 + A2B2)/2.
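The contrasts can be made concrete with a small Python sketch that evaluates them on four cell means. The cell means and the sign convention (second level minus first level) are assumptions chosen for illustration. The sketch only estimates the contrasts; the actual significance tests also require the within-group variability.

```python
# Hypothetical mean values (log scale) for the four treatments.
cell_means = {
    ("A1", "B1"): 1.0,
    ("A1", "B2"): 1.0,
    ("A2", "B1"): 2.0,
    ("A2", "B2"): 4.0,
}


def contrasts(m):
    """Return (A main effect, B main effect, interaction) contrast estimates."""
    a1b1, a1b2 = m[("A1", "B1")], m[("A1", "B2")]
    a2b1, a2b2 = m[("A2", "B1")], m[("A2", "B2")]
    # A main effect: average over both levels of B within each level of A.
    a_main = (a2b1 + a2b2) / 2 - (a1b1 + a1b2) / 2
    # B main effect: average over both levels of A within each level of B.
    b_main = (a1b2 + a2b2) / 2 - (a1b1 + a2b1) / 2
    # Interaction: nonzero when the effect of B differs between A1 and A2.
    interaction = (a1b1 + a2b2) / 2 - (a1b2 + a2b1) / 2
    return a_main, b_main, interaction


a_main, b_main, interaction = contrasts(cell_means)
print(a_main, b_main, interaction)
```

In this example the effect of B is 0 at A1 and 2 at A2, so the interaction contrast is nonzero, and the B main effect (1.0) is the average of those two level-specific effects.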

Some sample plots are shown below. The first plot illustrates a B main effect only: the effect of B does not depend on the level of A, and there is no A main effect and no interaction. In the second plot, the mean difference for B is the same at each level of A, and the mean difference for A is the same at each level of B, indicating the absence of a statistical interaction. The final plot illustrates main effects for A and B as well as an interaction: the effect of B depends on the level of A (0 for A1 but 2 for A2); in other words, the effect of diet depends on the genotype. When an interaction is present, the main effects must be interpreted with care, because the effect of one factor is not the same at every level of the other.

[Figure 1: main effect plots]

p-values

For statistical significance testing, p-values are provided. The lower the p-value, the greater the evidence that the null hypothesis (typically that two population means are equal) is false. If “statistical significance” is declared for p-values less than 0.05, then in cases where the means are actually the same, the incorrect conclusion that they are different is reached 5% of the time.

The p-value is the probability that the test statistic is at least as extreme as observed in this experiment, given that the null hypothesis is true. Hence, the more extreme the statistic, the lower the p-value and the more evidence the data give against the null hypothesis.
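To illustrate this definition, the sketch below computes an exact two-sided p-value by enumeration. It uses a permutation test rather than the t-test described earlier, because the null distribution can then be listed explicitly: under the null hypothesis the group labels are interchangeable, so every relabeling of the samples is equally likely. The function name and data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(x, y):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(x) + list(y)
    observed = abs(mean(x) - mean(y))
    indices = range(len(pooled))
    n_extreme = 0
    n_total = 0
    # Enumerate every way of assigning len(x) samples to the first group.
    for chosen in combinations(indices, len(x)):
        chosen_set = set(chosen)
        group1 = [pooled[i] for i in chosen]
        group2 = [pooled[i] for i in indices if i not in chosen_set]
        n_total += 1
        # Count relabelings at least as extreme as the observed difference
        # (small tolerance guards against floating-point rounding).
        if abs(mean(group1) - mean(group2)) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total


group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
p = permutation_p_value(group_a, group_b)
print(p)
```

Here only 2 of the 20 possible relabelings give a difference as large as the observed one, so the p-value is 0.1. With three samples per group, this is the smallest two-sided p-value the test can produce.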

q-values

A significance level of 0.05 is the false positive rate when there is one test. However, when a large number of tests is performed, false positives accumulate and need to be accounted for. There are different methods to correct for multiple tests. The oldest methods are family-wise error rate adjustments (Bonferroni, Tukey, etc.), but these tend to be extremely conservative for very large numbers of tests.
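The scale of the problem can be seen with a short calculation. The sketch below assumes that every null hypothesis is true and that the tests are independent, which is a simplification, and the function names are hypothetical.

```python
ALPHA = 0.05


def expected_false_positives(n_tests, alpha=ALPHA):
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=ALPHA):
    """Family-wise error rate, assuming independent tests and true nulls."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Per-test cutoff that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


for n in (1, 20, 1000):
    print(
        n,
        expected_false_positives(n),
        round(prob_at_least_one_false_positive(n), 4),
        bonferroni_threshold(n),
    )
```

With 1,000 compounds tested, about 50 false positives are expected at p < 0.05, while the Bonferroni cutoff of 0.00005 per test shows why family-wise adjustments become very conservative.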

With gene arrays, using the False Discovery Rate (FDR) is more common. The family-wise error rate adjustments provide a high degree of confidence that there are zero false discoveries. With FDR methods, by contrast, a small number of false discoveries is tolerated. The FDR for a given set of compounds can be estimated using the q-value [1].

To interpret the q-value, the data must first be sorted by the p-value, then the significance cutoff (typically p < 0.05) must be chosen. The q-value gives the false discovery rate for the selected list (i.e., an estimate of the proportion of false discoveries for the list of compounds whose p-value is below the significance cutoff). In Table 1 below, if the whole list is declared significant, then the false discovery rate is approximately 10%. If everything from Compound 079 and above is declared significant, then the false discovery rate is approximately 2.5%.

[Figure 2: q-values]

Table 1. Example of q-value interpretation.
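The following Python sketch shows one way q-values can be computed from a list of p-values. It uses the Benjamini-Hochberg step-up estimate, which corresponds to the Storey-Tibshirani q-value with the proportion of true null hypotheses fixed at 1. That simplification is an assumption made here for brevity and is not necessarily what the pipeline does. The p-values are hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg-style q-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that q-values
    # never decrease as the p-value increases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        # Estimated FDR if everything up to this rank is declared significant.
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = bh_q_values(p_values)
for p, q in zip(p_values, q_values):
    print(f"p = {p:.3f}  q = {q:.5f}")
```

Each q-value is read the same way as in the table: it estimates the false discovery rate for the list formed by that compound and every compound with a smaller p-value.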

Instrument and Process Variability

Instrument variability is determined by calculating the median relative standard deviation (RSD) for the internal standards that are added to each sample prior to injection into the mass spectrometers. Overall process variability is determined by calculating the median RSD for all endogenous metabolites (i.e., non-instrument standards) present in 100% of the Client Matrix samples, which are technical replicates of pooled client samples. RSD values can be found within the Heatmap Excel file, downloaded from the “Data & Integration” tab of the portal.
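A minimal sketch of the median RSD calculation is shown below. It assumes that RSD is the sample standard deviation divided by the mean, expressed as a percentage and computed on non-log-transformed values. The compound names, the values, and the use of `None` for a missing measurement are hypothetical.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


def median_rsd(measurements):
    """Median RSD over compounds detected in every technical replicate."""
    # Keep only compounds present in 100% of the replicates.
    complete = {
        name: values
        for name, values in measurements.items()
        if all(v is not None for v in values)
    }
    return statistics.median([rsd_percent(v) for v in complete.values()])


# Hypothetical raw values across three technical replicates.
client_matrix = {
    "compound_1": [90.0, 100.0, 110.0],   # RSD 10%
    "compound_2": [95.0, 100.0, 105.0],   # RSD 5%
    "compound_3": [80.0, 100.0, 120.0],   # RSD 20%
    "compound_4": [50.0, None, 55.0],     # excluded: missing in one replicate
}
print(median_rsd(client_matrix))
```

The same function applies to instrument variability if the input holds the internal standards instead of the endogenous metabolites.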

References

  1. Storey J and Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci USA 2003;100(16):9440-9445.

