Metabolomic Profiling and Population-scale Metabogenomics

The discovery of DNA’s structure triggered the genomic revolution. Technologies that could sequence DNA quickly and cost-effectively were rapidly developed, with the goal of enhancing our understanding of the molecular basis of disease and improving overall human health.1 Parallel advances in RNA and protein profiling fueled the focus on how genes, transcripts, and proteins influence disease causation, diagnosis, and therapeutic development. This focus made sense because the central dogma of molecular biology is the foundation of functional genomics, which connects genes to phenotypes. However, the central dogma does not account for the integral role of small molecules from endogenous, microbiome, and environmental sources, which exert significant influence on gene, transcript, and protein expression and function. Small molecules represent the missing “omic” between “cause” and “effect” in functional genomic studies. This information gap was first described as an important “missing link” in the central dogma,2 spurring significant investment in metabolite profiling and the use of metabolomics in scientific and clinical research.

Metabolites Fill Critical Information Gaps in Functional Genomics

Small molecules are prerequisites for many fundamental functions in biological systems; they are the building blocks for DNA, RNA, and protein synthesis. They drive energy production and influence the modification of DNA (epigenetic), RNA (epitranscriptomic), and proteins through post-translational modifications (PTMs; epiproteomic). These modifications have a significant impact on biological response, broadly influencing the phenotype. Metabolites generated within biochemical pathways represent the intermediate phenotype3 and link the gene, transcript, protein, microbiota, and external environmental factors to an observed phenotype and/or clinically relevant endpoint. Put simply, metabolites and lipids have a symbiotic relationship with DNA and RNA in shaping host biology.

Furthermore, evidence suggests that the presence and concentration of metabolites affect whether genes within a sequence are active or dormant.4 Adding metabolomics to functional genomic studies fills critical information gaps by linking observed genomic changes to phenotypic outcomes, i.e., gaps in the “cause” and “effect” information continuum.

Metabolomics is Ready for Utilization

Over the past decade, technologies supporting large-scale metabolomic profiling have matured to the point of scalable implementation, with utility ranging from discovery research to longitudinal monitoring of health, wellness, disease onset, and outcomes in population health cohorts.5-7 Despite these advances, the utility of metabolomic profiling remains underappreciated, and its adoption within major scientific investigations is often overlooked. Metabolomics is an important tool that, when incorporated into functional genomic investigations, fills information gaps not readily addressed by the other molecular modalities.8-10

Metabogenomic analysis enables the use of well-curated metabolites and their associated biochemical pathways for functional gene annotation.

Gene Annotation is Based on Metabolomics

A compelling rationale for including metabolomic profiling in molecular and functional genomic studies is found in the construction of KEGG, MetaCyc, and related databases that support the functional annotation of genes. Some of the earliest gene annotations were the result of linking newly discovered genes to an established framework of biochemical pathways associated with interpretable functional endpoints (phenotypes). The pathways for glycolysis and the tricarboxylic acid cycle (TCA, Krebs cycle) were among the earliest characterized biochemical pathways in the metabolic process network created for functional annotation of genes.

The current framework of the biochemical process network is founded on nearly a century of work by biochemists and clinical chemists focused on the metabolism of parent molecules into their derivatives. Each of the small metabolites within these complex enzymatic networks has a broad spectrum of influences on biological functions. This historical repository of information connecting the enzymatic knowledge base to proteins and eventually to genes was the basis for the establishment of the Kyoto Encyclopedia of Genes and Genomes (KEGG) in 1995, enabling biological interpretation of sequenced genomes.11 These databases are supported by information on the human metabolome; the most recent update of the Human Metabolome Database (HMDB) contains comprehensive information on over 200,000 metabolite entries.12

Metabolomics Advances

Rapid advances in gene sequencing technology enabled the Human Genome Project (HGP) to sequence the first human genome in a span of 13 years (1990 – 2003), outpacing the discovery and structural elucidation of new metabolites and associated biochemical pathways. This divergence between genetics and biochemistry, sustained over two decades, resulted in a perceptible information gap in associating metabolism-based functional attributes with newly identified genes. Metabolomics technologies have matured significantly in the past decade and now include comprehensive untargeted metabolomic profiling of known metabolites and a significant number of predicted metabolites with unknown structures.5 The remarkable success of metabolomic profiling in the diagnosis of inborn errors of metabolism and rare diseases with no known genomic cause provides a convincing argument for the use of untargeted metabolomic profiling in these cases.13-15

Rapid advances in computational capabilities and data analytics, improved statistical modeling, and machine learning have gradually narrowed the gap in linking genes to metabolites within metabolic pathway networks and in predicting their potential role in health and disease.6,7,16 These advances are highlighted by the most recent updates to databases such as KEGG, MetaCyc, and others.11,17 The Gene Ontology (GO) remains the most widely used computational resource for linking genes to phenotypes.18 Despite these advancements, an estimated 30% of identified genes either have unknown functions or are annotated based on predicted function and lack experimental validation.19 Furthermore, the automation of standard annotation processes has major limitations in accurately capturing the biological complexity, multimodality, and redundancies within systems networks.20 Since a significant number of unknown, predicted, or mis-annotated genes encode potential enzymes, untargeted metabolomic profiling serves as a powerful tool for elucidating their activity and relationship to physiological function.21

Population Scale Metabogenomics: Mapping Genetic Determinants of Metabotypes to Understand Mechanism of Action

The value of genome-wide association studies (GWAS) lies in the ability of well-designed studies to detect the genetic variants underlying diseases for potential risk prediction, i.e., to identify “cause” factors within large populations.22 GWAS have successfully identified disease-associated variants over the last 15 years as sample sizes and study design scope have increased and statistical methods have advanced.22,23 However, genetic markers associated with disease-specific phenotypic traits, such as those represented by polygenic risk scores, have not proven significantly useful in population screening and risk prediction.24,25

Small molecule metabolite panels have served as useful functional markers of good health or as harbingers of impending system dysfunction and disease, e.g., blood creatinine and/or blood urea nitrogen as markers of kidney function. Small molecules reflect the cumulative consequence of biological activity, making them excellent candidates to serve as surrogate markers of the “effect”, i.e., an observable functional phenotype. Thus, metabolomics provides the opportunity for improved diagnosis, screening, and prediction, specifically when combined with genomics and proteomics.26-28

Population-scale metabogenomic analysis has successfully identified gene variants associated with metabolic phenotypes and accounted for observed variance within a cohort.6 A functional genomic study that included metabolomics was instrumental in identifying the functionality of “silent” mutations in genes that lacked an overt phenotype.29 Metabolomics-linked GWAS have identified genetic determinants influencing blood metabolites in distinct ethnic populations.30-32 In addition, a metabogenomic study used genotype-metabotype associations to help identify associated metabolites.33 Most importantly, large metabogenomic studies continue to provide insight into the genetic basis of metabolic individuality and the effects of these associations on conditions ranging from cardiovascular disease and metabolic diseases such as obesity and diabetes to metabolic drivers of cancer, as well as on profiles associated with human health and wellbeing.7,15,34,35 Thus, population-scale metabogenomic studies will continue to add to the existing knowledge base and provide new insights into the genetic underpinnings of biological functions in disease and health.
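As a rough illustration of what a single genotype-metabotype association test involves, the sketch below regresses a metabolite level on allele dosage under an additive model and reports the per-allele effect and the fraction of variance explained. It is a minimal sketch and not the method of any study cited here: the function name, the `Association` type, and the toy cohort values are all hypothetical, and real analyses typically also adjust for covariates such as age, sex, and ancestry and correct for testing many variants against many metabolites.

```python
from dataclasses import dataclass


@dataclass
class Association:
    beta: float       # estimated change in metabolite level per copy of the effect allele
    r_squared: float  # fraction of metabolite variance explained by the variant


def additive_association(dosages, levels):
    """Ordinary least-squares fit of metabolite level on allele dosage (0, 1, or 2)."""
    n = len(dosages)
    if n != len(levels) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_g = sum(dosages) / n
    mean_m = sum(levels) / n
    # Centered sums of squares and cross-products.
    s_gg = sum((g - mean_g) ** 2 for g in dosages)
    s_mm = sum((m - mean_m) ** 2 for m in levels)
    s_gm = sum((g - mean_g) * (m - mean_m) for g, m in zip(dosages, levels))
    if s_gg == 0 or s_mm == 0:
        raise ValueError("genotype and metabolite must both vary in the cohort")
    beta = s_gm / s_gg
    r_squared = (s_gm ** 2) / (s_gg * s_mm)
    return Association(beta=beta, r_squared=r_squared)


# Hypothetical toy cohort: made-up values, not data from any study.
dosages = [0, 0, 1, 1, 2, 2]
levels = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]  # e.g. log-scaled metabolite abundance

result = additive_association(dosages, levels)
print(f"beta per allele: {result.beta:.3f}, variance explained: {result.r_squared:.3f}")
```

In a population-scale study, a test of this kind would be repeated for each variant-metabolite pair, which is why stringent significance thresholds and replication cohorts matter.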


  1. Nurk S, Koren S, Rhie A, et al. The complete sequence of a human genome. Science. Apr 2022;376(6588):44-53. doi:10.1126/science.abj6987
  2. Schreiber SL. Small molecules: the missing link in the central dogma. Nat Chem Biol. Jul 2005;1(2):64-6. doi:10.1038/nchembio0705-64
  3. Fiehn O. Metabolomics–the link between genotypes and phenotypes. Plant Mol Biol. Jan 2002;48(1-2):155-71.
  4. Li X, Egervari G, Wang Y, Berger SL, Lu Z. Regulation of chromatin and gene expression by metabolic enzymes and metabolites. Nat Rev Mol Cell Biol. Sep 2018;19(9):563-578. doi:10.1038/s41580-018-0029-7
  5. Kell DB, Oliver SG. The metabolome 18 years on: a concept comes of age. Metabolomics. 2016;12(9):148. doi:10.1007/s11306-016-1108-4
  6. Gieger C, Geistlinger L, Altmaier E, et al. Genetics meets metabolomics: a genome-wide association study of metabolite profiles in human serum. PLoS Genet. Nov 2008;4(11):e1000282. doi:10.1371/journal.pgen.1000282
  7. Surendran P, Stewart ID, Au Yeung VPW, et al. Rare and common genetic determinants of metabolic individuality and their effects on human health. Nat Med. Nov 2022;28(11):2321-2332. doi:10.1038/s41591-022-02046-0
  8. Rhee K. Minding the gaps: metabolomics mends functional genomics. EMBO Rep. Nov 2013;14(11):949-50. doi:10.1038/embor.2013.155
  9. Bino RJ, Hall RD, Fiehn O, et al. Potential of metabolomics as a functional genomics tool. Trends Plant Sci. Sep 2004;9(9):418-25. doi:10.1016/j.tplants.2004.07.004
  10. Griffin JL. The Cinderella story of metabolic profiling: does metabolomics get to go to the functional genomics ball? Philos Trans R Soc Lond B Biol Sci. Jan 29 2006;361(1465):147-61. doi:10.1098/rstb.2005.1734
  11. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. Jan 4 2017;45(D1):D353-D361. doi:10.1093/nar/gkw1092
  12. Wishart DS, Guo A, Oler E, et al. HMDB 5.0: the Human Metabolome Database for 2022. Nucleic Acids Res. Jan 7 2022;50(D1):D622-D631. doi:10.1093/nar/gkab1062
  13. Liu N, Xiao J, Gijavanekar C, et al. Comparison of Untargeted Metabolomic Profiling vs Traditional Metabolic Screening to Identify Inborn Errors of Metabolism. JAMA Netw Open. Jul 1 2021;4(7):e2114155. doi:10.1001/jamanetworkopen.2021.14155
  14. Jans JJM, Broeks MH, Verhoeven-Duif NM. Metabolomics in diagnostics of inborn metabolic disorders. Curr Opin Syst Biol. Mar 2022;29:100409.
  15. Yin X, Chan LS, Bose D, et al. Genome-wide association studies of metabolites in Finnish men identify disease-relevant loci. Nat Commun. Mar 28 2022;13(1):1644. doi:10.1038/s41467-022-29143-5
  16. Ferreira EA, Veenvliet ARJ, Engelke UFH, et al. Diagnosing, discarding, or de-VUSsing: A practical guide to (un)targeted metabolomics as variant-transcending functional tests. Genet Med. Jan 2023;25(1):125-134. doi:10.1016/j.gim.2022.10.002
  17. Caspi R, Billington R, Keseler IM, et al. The MetaCyc database of metabolic pathways and enzymes – a 2019 update. Nucleic Acids Res. Jan 8 2020;48(D1):D445-D453. doi:10.1093/nar/gkz862
  18. Attrill H, Gaudet P, Huntley RP, et al. Annotation of gene product function from high-throughput studies using the Gene Ontology. Database (Oxford). Jan 1 2019;2019. doi:10.1093/database/baz007
  19. Xue B, Rhee SY. Status of genome function annotation in model organisms and crops. Plant Direct. Jul 2023;7(7):e499. doi:10.1002/pld3.499
  20. Danchin A, Ouzounis C, Tokuyasu T, Zucker JD. No wisdom in the crowd: genome annotation in the era of big data – current status and future prospects. Microb Biotechnol. Jul 2018;11(4):588-605. doi:10.1111/1751-7915.13284
  21. Prosser GA, Larrouy-Maumus G, de Carvalho LP. Metabolomic strategies for the identification of new enzyme functions and metabolic pathways. EMBO Rep. Jun 2014;15(6):657-69. doi:10.15252/embr.201338283
  22. Abdellaoui A, Yengo L, Verweij KJH, Visscher PM. 15 years of GWAS discovery: Realizing the promise. Am J Hum Genet. Feb 2 2023;110(2):179-194. doi:10.1016/j.ajhg.2022.12.011
  23. Hindorff LA, Sethupathy P, Junkins HA, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. Jun 9 2009;106(23):9362-7. doi:10.1073/pnas.0903103106
  24. Hingorani AD, Gratton J, Finan C, et al. Performance of polygenic risk scores in screening, prediction, and risk stratification: secondary analysis of data in the Polygenic Score Catalog. BMJ Med. 2023;2(1):e000554. doi:10.1136/bmjmed-2023-000554
  25. Wald NJ, Old R. The illusion of polygenic disease risk prediction. Genet Med. Aug 2019;21(8):1705-1707. doi:10.1038/s41436-018-0418-5
  26. Saigusa D, Matsukawa N, Hishinuma E, Koshiba S. Identification of biomarkers to diagnose diseases and find adverse drug reactions by metabolomics. Drug Metab Pharmacokinet. Apr 2021;37:100373. doi:10.1016/j.dmpk.2020.11.008
  27. Gowda GA, Zhang S, Gu H, Asiago V, Shanaiah N, Raftery D. Metabolomics-based methods for early disease diagnostics. Expert Rev Mol Diagn. Sep 2008;8(5):617-33. doi:10.1586/14737159.8.5.617
  28. Pietzner M, Stewart ID, Raffler J, et al. Plasma metabolites to profile pathways in noncommunicable disease multimorbidity. Nat Med. Mar 2021;27(3):471-479. doi:10.1038/s41591-021-01266-0
  29. Raamsdonk LM, Teusink B, Broadhurst D, et al. A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nat Biotechnol. Jan 2001;19(1):45-50. doi:10.1038/83496
  30. Shin SY, Fauman EB, Petersen AK, et al. An atlas of genetic influences on human blood metabolites. Nat Genet. Jun 2014;46(6):543-550. doi:10.1038/ng.2982
  31. Yu B, Zheng Y, Alexander D, Morrison AC, Coresh J, Boerwinkle E. Genetic determinants influencing human serum metabolome among African Americans. PLoS Genet. Mar 2014;10(3):e1004212. doi:10.1371/journal.pgen.1004212
  32. Feofanova EV, Brown MR, Alkis T, et al. Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations. Nat Commun. May 30 2023;14(1):3111. doi:10.1038/s41467-023-38800-2
  33. Krumsiek J, Suhre K, Evans AM, et al. Mining the unknown: a systems approach to metabolite identification combining genetic and metabolic information. PLoS Genet. 2012;8(10):e1003005. doi:10.1371/journal.pgen.1003005
  34. Smith CJ, Sinnott-Armstrong N, Cichońska A, et al. Integrative analysis of metabolite GWAS illuminates the molecular basis of pleiotropy and genetic correlation. Elife. Sep 8 2022;11. doi:10.7554/eLife.79348
  35. Danzi F, Pacchiana R, Mafficini A, et al. To metabolomics and beyond: a technological portfolio to investigate cancer metabolism. Signal Transduct Target Ther. Mar 22 2023;8(1):137. doi:10.1038/s41392-023-01380-0

Ranga Sarangarajan, Ph.D.
Ranga leads Metabolon’s R&D teams to deliver metabolomics data and insights that expand and accelerate the impact of life sciences research in all its applications, including biopharma and diagnostics.

