Thursday, January 15, 2026
Research Roundup: Insights on Declining Neighborhoods, Electronic Health Records, and Gun Trafficking

Revolutionizing Genetic Risk Prediction: Yale Researchers Unveil EEPRS Using Electronic Health Records

Yale Researchers Unveil Groundbreaking Method to Enhance Genetic Risk Prediction Using Electronic Health Records

New Haven, CT – Researchers at the Yale School of Public Health have introduced a pioneering approach to genetic risk prediction that leverages the extensive, often underutilized data found in electronic health records (EHRs). This innovative method, known as Electronic Health Record Embedding Enhanced Polygenic Risk Scores (EEPRS), aims to provide more accurate and clinically relevant predictions of disease risk by integrating advanced embedding techniques with traditional genome-wide association study (GWAS) data.

The Need for Enhanced Risk Prediction

Current polygenic risk scores (PRS) are typically built on a binary view of disease status (case or control) and so fail to capture the intricate, multidimensional patterns present in EHRs. These records hold thousands of diagnoses, symptoms, and clinical encounters, much of which conventional risk assessment methods overlook. The EEPRS framework addresses this limitation by employing natural language processing (NLP) tools, such as Word2Vec and large language models like GPT, to create numerical representations of clinical phenotypes.
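
To make the contrast with a binary case/control label concrete, here is a minimal sketch of turning a patient's diagnosis codes into a numerical representation by averaging per-code embedding vectors. The vectors, their two-dimensional size, and the helper name `patient_embedding` are hypothetical; in practice the vectors would be learned by a tool such as Word2Vec, and the article does not describe the authors' exact pipeline.

```python
from statistics import fmean

# Hypothetical 2-dimensional embeddings for three ICD-10 diagnosis codes.
# Real embeddings would be learned from EHR data and have many more dimensions.
CODE_VECTORS = {
    "I10": [0.9, 0.1],  # essential hypertension
    "I25": [0.8, 0.3],  # chronic ischemic heart disease
    "J45": [0.1, 0.9],  # asthma
}

def patient_embedding(codes):
    """Average the embeddings of a patient's known diagnosis codes."""
    vectors = [CODE_VECTORS[c] for c in codes if c in CODE_VECTORS]
    if not vectors:
        raise ValueError("no known diagnosis codes for this patient")
    # zip(*vectors) groups the values of each dimension together.
    return [fmean(dim) for dim in zip(*vectors)]

# A binary label records only case (1) or control (0) for one disease,
# while the embedding keeps a graded profile across related conditions.
patient_codes = ["I10", "I25"]
binary_label = int("I25" in patient_codes)
embedding = patient_embedding(patient_codes)
```

The averaged vector is only one simple way to summarize a record; it is used here because it needs nothing beyond the standard library.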

How EEPRS Works

EEPRS uses modern embedding techniques to convert complex health data into numerical vectors that reflect nuanced relationships among diagnoses, symptoms, and clinical encounters. These embeddings are then integrated into the risk score calculation, which requires only GWAS summary statistics rather than individual-level records. This approach allows for a more comprehensive analysis of genetic risk factors, moving beyond the binary classifications of traditional PRS.
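
For reference, a conventional polygenic risk score is a weighted sum of an individual's effect-allele counts, with the weights taken from GWAS effect-size estimates. The sketch below shows that standard calculation only. The variant IDs and effect sizes are made up, and `polygenic_score` is an illustrative helper, not part of any EEPRS software; the article does not spell out how the embeddings enter the calculation, so that step is left out.

```python
# Hypothetical GWAS summary statistics: variant ID -> per-allele effect size.
EFFECT_SIZES = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.30}

def polygenic_score(dosages, effect_sizes=EFFECT_SIZES):
    """Sum effect size times allele dosage over variants present in both inputs."""
    return sum(
        beta * dosages[variant]
        for variant, beta in effect_sizes.items()
        if variant in dosages
    )

# Allele dosages (0, 1, or 2 copies of the effect allele) for one person.
person = {"rs0001": 2, "rs0002": 1, "rs0003": 0}
score = polygenic_score(person)
```

Variants missing from a person's genotype data are skipped here for simplicity; a real pipeline would have to decide how to handle missing genotypes.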

In evaluations conducted across 41 traits in the UK Biobank, EEPRS demonstrated superior performance compared to single-trait PRS methods. Notably, the most significant improvements were observed in cardiovascular-related phenotypes, underscoring the method’s potential in identifying subtle genetic signals that could inform early-risk identification.

Advancements in the EEPRS Framework

The research team also introduced two enhancements to the EEPRS methodology: EEPRS-optimal and MTAG-EEPRS. The EEPRS-optimal variant employs cross-validation techniques to select the most effective embedding strategy for each trait, optimizing prediction accuracy. Meanwhile, MTAG-EEPRS serves as a multi-trait extension, further enhancing the robustness of the predictions.
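
The selection step described for EEPRS-optimal can be sketched as picking, for each trait, the strategy with the best average score across cross-validation folds. This is an assumption-laden illustration: the strategy names, the scores, and the function `select_embedding_strategy` are invented, and the article says only that cross-validation is used to choose the embedding strategy.

```python
from statistics import fmean

def select_embedding_strategy(fold_scores):
    """Return the strategy with the highest mean validation score.

    fold_scores maps a strategy name to its per-fold validation scores.
    """
    if not fold_scores:
        raise ValueError("no strategies to compare")
    return max(fold_scores, key=lambda name: fmean(fold_scores[name]))

# Hypothetical per-fold prediction accuracy (for example, R^2) for one trait.
scores_for_trait = {
    "word2vec": [0.041, 0.038, 0.044],
    "llm": [0.046, 0.043, 0.040],
    "single_trait_baseline": [0.035, 0.037, 0.036],
}
best = select_embedding_strategy(scores_for_trait)
```

Running the same selection separately for every trait would let different traits end up with different embedding strategies, which matches the per-trait choice the article describes.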

Publication and Expert Insights

The findings were published in The American Journal of Human Genetics, with Hongyu Zhao, PhD, serving as the corresponding author. Zhao, who holds the Ira V. Hiscock Professorship of Biostatistics and is a Professor of Genetics and of Statistics and Data Science, emphasized the transformative potential of this research.

Lead author Leqi Xu, a doctoral candidate in biostatistics, remarked, “By capturing the nuanced relationships embedded in electronic health records, EEPRS allows us to build more powerful and interpretable genetic risk models that reflect the true complexity of human health.” This statement highlights the framework’s ability to enhance the granularity of genetic risk assessments.

Implications for Precision Medicine

If widely adopted, the EEPRS framework could significantly accelerate progress in precision medicine. By uncovering subtler genetic signals and improving early-risk identification across a diverse array of diseases, the approach could change how healthcare providers assess and manage patient risk.

As the healthcare landscape continues to evolve, the integration of advanced data analytics and genetic research will be crucial in developing more personalized treatment strategies. The EEPRS method represents a significant step forward in this endeavor, paving the way for a future where genetic risk assessments are not only more accurate but also more reflective of the complexities of human health.

Journal Reference

Xu, Leqi et al. (2025). Improving polygenic risk prediction performance by integrating electronic health records through phenotype embedding. The American Journal of Human Genetics. DOI: 10.1016/j.ajhg.2025.11.006.

For further information, visit the Yale School of Public Health website or access the full study through the DOI listed above.
