Research Applications with Harmonized Variables from the Framingham, MESA, ARIC, and REGARDS Studies | Duke Clinical and Translational Science Institute

November 6, 2024

12:00 pm to 1:00 pm

Virtual

Event sponsored by:

AI Health

+DataScience (+DS)

Biostatistics and Bioinformatics

Computer Science

CTSI CREDO

Duke Clinical and Translational Science Award (CTSA)

Electrical and Computer Engineering (ECE)

Speaker:

Presented by:

Chuan Hong, PhD; Assistant Professor of Biostatistics & Bioinformatics; Duke University School of Medicine

Pratheek Mallya, MS; Product Development Manager, Data Science; American Heart Association

Matt Engelhard, PhD, MD; Assistant Professor of Biostatistics & Bioinformatics; Duke University School of Medicine

Moderated by:

Michael Pencina, PhD; Vice Dean for Data Science, Duke University School of Medicine; Chief Data Scientist for Duke Health; Director of Duke AI Health

Jennifer Hall, PhD, FAHA; Chief of Data Science and Co-Director of the Institute for Precision Cardiovascular Medicine; American Heart Association

Research in stroke risk prediction and prevention is enhanced by the inclusion of a broad range of data from different patient cohorts. Integrating and harmonizing multiple data sources increases generalizability, sample size, and representation of understudied populations, strengthening the evidence for the scientific questions being addressed.

In an AI Health Virtual Seminar presented earlier this year, researchers from Duke AI Health and the American Heart Association (AHA) shared the open metadata repository they developed for the harmonization of stroke risk prediction variables from four large, National Institutes of Health (NIH)-funded cohort studies: REGARDS (Reasons for Geographic and Racial Differences in Stroke), FHS (Framingham Heart Study), MESA (Multi-Ethnic Study of Atherosclerosis), and ARIC (Atherosclerosis Risk in Communities).

In this follow-up seminar, leading researchers from Duke AI Health and AHA will present new methodologies and results from studies conducted with the harmonized dataset. Chuan Hong, Assistant Professor of Biostatistics & Bioinformatics at Duke University School of Medicine, will present a learning network for cohort-to-EHR variable harmonization based on semantic learning. Pratheek Mallya, Product Development Manager, Data Science, at the American Heart Association, will introduce a technique that uses natural language processing (NLP) models to automatically harmonize and standardize variable descriptions from three different stroke data cohorts, and will compare the performance of the proposed method with a baseline logistic regression model. Matt Engelhard, Assistant Professor of Biostatistics & Bioinformatics at Duke University School of Medicine, will present an AI model for stroke risk prediction designed to make predictions more similar.