DunedinPACNI estimates the longitudinal Pace of Aging from a single brain image to track health and disease
This research complies with all relevant ethical regulations; all study protocols were approved by the relevant ethical review boards. The specific ethical review boards are detailed in the description of each dataset. The premise and analysis plan for this study were preregistered. All analyses and code were checked for accuracy by an independent analyst. Analyses were conducted on data collected through the Dunedin Study, HCP, ADNI, UKB and BrainLat. The details of each study and dataset are described below.
Data sources
Dunedin Study
Participants are members of the Dunedin Study, a longitudinal investigation of health and behavior in a population-representative birth cohort. The 1,037 participants (91% of eligible births, 48% female) were all people born between April 1972 and March 1973 in Dunedin, New Zealand, who were residents in the province and who participated in the first assessment at age 3 years19. The cohort represented the full range of socioeconomic status in the general population of New Zealand’s South Island and, as adults, matched the New Zealand National Health and Nutrition Survey on key adult health indicators (for example, BMI, smoking and general practitioner visits) and the New Zealand Census of citizens of the same age on educational attainment19,83. Study members are primarily of New Zealand European ethnicity; 8.6% reported Māori ethnicity at age 45 years.
General assessments were performed at birth as well as ages 3, 5, 7, 9, 11, 13, 15, 18, 21, 26, 32 and 38 years, and most recently (completed April 2019) at age 45 years, when 938 of the 997 living study members (94.1%) participated. At each assessment, study members were brought to the Dunedin Study Research Unit at the University of Otago for interviews and examinations. In addition, staff provided standardized ratings, informant questionnaires were sent to people whom the study members nominated as knowing them well, and administrative records were searched. The Dunedin Study was approved by the University of Otago Ethics Committee and study members gave written informed consent before participating.
MRI
As a component of the assessments at age 45 years, study members were scanned using a Siemens MAGNETOM Skyra (Siemens Healthineers) 3T scanner equipped with a 64-channel head/neck coil at the Pacific Radiology Group imaging center in Dunedin, New Zealand. High-resolution T1-weighted images were obtained using an MP-RAGE sequence with the following parameters: repetition time (TR) = 2,400 ms; echo time (TE) = 1.98 ms; 208 sagittal slices; flip angle = 9°; field of view (FOV) = 224 mm; matrix = 256 × 256 px; slice thickness = 0.9 mm with no gap (voxel size 0.9 × 0.875 × 0.875 mm); and total scan time = 6 min and 52 s. Three-dimensional (3D) fluid-attenuated inversion recovery (FLAIR) images were obtained with the following parameters: TR = 8,000 ms; TE = 399 ms; 160 sagittal slices; FOV = 240 mm; matrix = 232 × 256 px; slice thickness = 1.2 mm (voxel size 0.9 × 0.9 × 1.2 mm); and total scan time = 5 min and 38 s. Additionally, a gradient echo field map was acquired with the following parameters: TR = 712 ms; TE = 4.92 and 7.38 ms; 72 axial slices; FOV = 200 mm; matrix = 100 × 100 px; slice thickness = 2.0 mm (voxel size 2 mm isotropic); and total scan time = 2 min and 25 s. Of the 938 study members seen at phase 45, 63 declined to participate in MRI scanning; that is, 875 study members completed the MRI scanning protocol. Scanned study members did not differ from other living participants in terms of childhood neurocognitive functioning or childhood socioeconomic status (see the attrition analysis in Extended Data Figs. 1 and 2). Of these 875 study members for whom data were available, four were excluded because of major incidental findings or previous injuries (for example, large tumors or extensive damage to the brain or skull), nine because of missing FLAIR or field map scans, one because of poor surface mapping and one because of missing the Pace of Aging variable. This yielded a final training sample of 860 study members (see Supplementary Fig. 4 for the inclusion details).
Structural MRI data were processed using FreeSurfer v.6.0 (ref. 36). Specifically, T1-weighted images were processed and refined with 3D FLAIR images using the recon-all pipeline.
Pace of Aging
Participants’ pace of biological aging was measured as changes in 19 biomarkers of study members’ cardiovascular, metabolic, pulmonary, kidney, immune and dental systems across ages 26, 32, 38 and 45 years. This measure quantifies participants’ rate of aging in year-equivalent units of physiological decline per chronological year. The average participant experienced 1 year of physiological decline per year, that is, a mean (s.d.) Pace of Aging of 1 (0.3)2. See the ‘Statistical analysis’ section for more details.
Physical functioning
One-legged balance was measured using the unipedal stance test as the maximum time achieved across three trials of the test with eyes closed84,85,86. Gait speed (meters per second) was assessed with the 6-m-long GAITRite Electronic Walkway (CIR Systems) with 2-m acceleration and 2-m deceleration before and after the walkway, respectively. Gait speed was assessed under three walking conditions: usual gait speed (walk at a normal pace from a standing start, measured as a mean of two walks) and two challenge paradigms, dual-task gait speed (walk at a normal pace while reciting alternate letters of the alphabet out loud, starting with the letter ‘A’, measured as a mean of two walks) and maximum gait speed (walk as fast as safely possible, measured as a mean of three walks). Gait speed was correlated across the three walk conditions87. To increase reliability and take advantage of the variation in all three walk conditions (usual gait and the two challenge paradigms), we calculated the mean of the three highly correlated individual walk conditions to generate our primary measure of composite gait speed. The step in place test was measured as the number of times the right knee was lifted to mid-thigh height (measured as the height half-way between the knee cap and the iliac crest) in 2 min at a self-directed pace88,89. Chair rises were measured as the number of stands with no hands completed in 30 s from a seated position88,90. Handgrip strength was measured for each hand (elbow held at 90°, upper arm held tight against the trunk) as the maximum value achieved across three trials using a Jamar digital dynamometer91,92. Analyses using handgrip strength controlled for BMI. Visuomotor coordination was measured as the time to completion of the grooved pegboard test. Scores were reversed so that higher values corresponded to better performance. Physical limitations were measured with the RAND 36-Item Health Survey 1.0 physical functioning scale. 
Responses (‘limited a lot’, ‘limited a little’, ‘not limited at all’) assessed difficulty with completing several activities (for example, climbing several flights of stairs, walking more than 1 km, participating in strenuous sports). Scores were reversed to reflect physical limitations so that a high score indicates more limitations.
Subjective health and age appearance
We obtained reports about study members’ health and age appearance from three sources: self-reports; informant impressions; and staff impressions. For self-reports, we asked study members about their own impressions of how old they looked: ‘Do you think you look older, younger, or about your actual age?’ Response options were younger than their age, about their actual age, or older than their age. We also asked study members to rate their age perceptions in years: ‘How old do you feel?’ For informant impressions, informants who knew a study member well (94% response rate) were asked: ‘Compared to others their age, do you think they (the study member) look younger or older than others their age?’ Response options were: ‘much younger’, ‘a bit younger’, ‘about the same’, ‘a bit older’ or ‘much older’. For staff impressions, four members of the Dunedin Study unit staff completed a brief questionnaire describing each study member. To assess age appearance, staff used a seven-item scale to assign a ‘relative age’ to each study member (1 = young-looking, 7 = old-looking). Correlations between self-ratings, informant ratings and staff ratings ranged from r = 0.34 to r = 0.52. All reporters rated the study member’s general health using the following response options: excellent, very good, good, fair or poor. Correlations between self-ratings, informant ratings and staff ratings ranged from r = 0.48 to r = 0.55.
Cognitive functioning
The Wechsler Adult Intelligence Scale, Fourth Edition was administered at age 45 years, yielding the adult IQ. In addition to full-scale IQ, the Wechsler Adult Intelligence Scale, Fourth Edition yields indexes of four specific cognitive functional domains: processing speed; working memory; perceptual reasoning; and verbal comprehension. The Wechsler Intelligence Scale for Children-Revised was administered at ages 7, 9 and 11 years. To increase the baseline reliability, the three scores were averaged, yielding the childhood IQ. We measured cognitive decline by studying adult IQ scores after controlling for childhood IQ scores. We focused on change in overall IQ given evidence that aging-related slopes are correlated across all cognitive functions, indicating that research on cognitive decline may be best focused on a highly reliable summary index, rather than focused on individual functions93.
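As an illustration of what "adult IQ after controlling for childhood IQ" can mean in practice, the sketch below treats cognitive decline as the residual from a simple least-squares regression of adult IQ on childhood IQ. This is a simplified reading, not the study's actual model: the function name `residualize` and the toy scores are hypothetical, and other covariates are omitted.

```python
from statistics import mean

def residualize(y, x):
    """Return residuals of y after a simple least-squares regression on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical toy data: childhood IQ (mean of the ages 7, 9 and 11 scores)
# and adult IQ at age 45 for five people.
child_iq = [95.0, 100.0, 105.0, 110.0, 120.0]
adult_iq = [92.0, 101.0, 103.0, 112.0, 117.0]

# Negative values mean adult IQ is lower than expected from childhood IQ.
decline = residualize(adult_iq, child_iq)
```

Under this reading, a negative residual marks someone whose adult score falls below what their childhood score predicts.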
Facial age
Facial age was based on two measurements of perceived age by an independent panel of eight people. First, age range was assessed by an independent panel of four raters, who were presented with standardized (non-smiling) digital facial photographs of study members when they were 45 years old. Raters, who were kept blind to the actual age of study members, used a Likert scale to categorize each study member into a 5-year age range (that is, from 20 to 24 years old and up to 70+ years old). Interrater reliability was 0.77. The scores for each study member were averaged across all raters. Second, relative age was assessed by a different panel of four raters, who were told that all photos were of people aged 45 years old. These raters then used a 7-item Likert scale to assign a ‘relative age’ to each participant (that is, 1 = ‘young-looking’ to 7 = ‘old-looking’). Interrater reliability was 0.79. The measure of perceived age at 45 years (that is, facial age) was derived by standardizing and averaging age range and relative age scores.
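The final step, standardizing and averaging the two panel scores, can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation is an assumption; the text does not specify which estimator was used.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def facial_age(age_range_scores, relative_age_scores):
    """Average the two z-scored panel means, one value per study member."""
    z_range = zscores(age_range_scores)
    z_relative = zscores(relative_age_scores)
    return [(a + r) / 2 for a, r in zip(z_range, z_relative)]

# Hypothetical rater-averaged scores for three study members.
composite = facial_age([3.0, 4.0, 5.0], [2.0, 4.0, 6.0])
```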
HCP
The HCP is a publicly available dataset that includes 1,206 participants with extensive MRI data49. HCP data access is managed by the WU-Minn HCP consortium. All participants provided written informed consent. We used data from 45 participants who completed the scan protocol a second time (with a mean interval between scans of approximately 140 days), allowing for the calculation of test–retest reliability. All participants were free of current psychiatric or neurological illness and were between 22 and 35 years of age. The mean age of the HCP test–retest sample analyzed was 30.3 years (s.d. = 3.3 years, range = 22–35 years) at the first time point.
MRI
Structural MRI data were analyzed using the HCP minimal preprocessing pipeline94. Briefly, T1-weighted images were processed using a custom FreeSurfer recon-all pipeline optimized for structural MRI with a higher resolution than 1 mm isotropic. Details on HCP MRI data acquisition have been described elsewhere94.
ADNI
The primary goal of ADNI is to test whether serial MRI, positron emission tomography, other biological markers and clinical and neuropsychological assessments can be combined to measure the progression of neurodegeneration in participants with MCI or AD and in CN older adults (adni.loni.usc.edu)95. Cognitive and diagnostic data were downloaded on 12 June 2022. MRI data curated from the Alzheimer’s Disease Sequencing Project collection were downloaded on 7 December 2023. ADNI was approved by the institutional review boards of all the participating institutions. All participants provided written informed consent. The ADNI sample demographic information can be found in Supplementary Table 17.
MRI
T1-weighted scans were collected using either 1.5T or 3T scanners. MRI acquisition parameters varied across ADNI sites and waves; however, the targets for acquisition were isotropic 1-mm³ voxels96. Raw T1-weighted images were processed using longitudinal FreeSurfer v.6.0. Scans were excluded for low quality if they did not have a quality control rating of ‘pass’ from the ADNI investigators or if segmentation failed visual inspection. Scans were also excluded if participants were missing demographic data, such as age, sex or diagnosis (Supplementary Fig. 5). Further details on the MRI methods in ADNI can be found at adni.loni.usc.edu.
Cognitive and behavioral functioning
ADNI participants completed several cognitive and behavioral assessments at the time of scanning. The ADAS-Cog is a structured scale that evaluates memory, reasoning, language, orientation, ideational praxis and constructional praxis97. Delayed word recall and number cancellation are included in addition to the 11 standard ADAS items98. The test is scored for errors, ranging from 0 (best performance) to 85 (worst performance). The MMSE is a screening instrument that evaluates orientation, memory, attention, concentration, naming, repetition and comprehension, and ability to create a sentence and to copy two overlapping pentagons99. The MMSE is scored as the number of correctly completed items ranging from 0 (worst performance) to 30 (best performance). The MoCA is designed to detect people at the MCI stage of cognitive dysfunction100. The scale ranges from 0 (worst performance) to 30 (best performance). The RAVLT is a list learning task that assesses learning and memory. On each of five learning trials, 15 unrelated nouns are presented orally at the rate of one word per second and immediate free recall of the words is elicited. After a 30-min delay filled with unrelated testing, free recall of the original 15-word list is elicited. Both immediate recall and percentage forgotten are used. The LogMem tests I and II (delayed paragraph recall) are from the Wechsler Memory Scale-Revised. Free recall of one short story is elicited immediately after being read aloud to the participant and again after a 30-min delay. The total bits of information recalled after the delay interval (maximum score = 25) are analyzed. The Trail Making Test, Part B, consists of 25 circles, either numbered (one through 13) or containing letters (A through L). Participants connect the circles while alternating between numbers and letters (for example, A to 1, 1 to B, B to 2, 2 to C). Time to complete (300 s maximum) is the primary measure of interest. 
The FAQ is a self-report measure of instrumental ADLs, such as preparing meals, performing chores, keeping a schedule and traveling outside of one’s neighborhood101. Each unique cognitive testing measure was paired with the participant’s most temporally proximate brain scan within 6 months of cognitive testing.
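The pairing rule for cognitive measures and scans can be sketched as a nearest-date lookup with a cutoff. The function name, the toy dates and the conversion of "6 months" to 183 days are assumptions made for illustration only.

```python
from datetime import date

MAX_GAP_DAYS = 183  # assumption: "6 months" taken as about 183 days

def nearest_scan(test_date, scan_dates, max_gap_days=MAX_GAP_DAYS):
    """Return the scan date closest to test_date, or None if none is close enough."""
    if not scan_dates:
        return None
    best = min(scan_dates, key=lambda d: abs((d - test_date).days))
    if abs((best - test_date).days) <= max_gap_days:
        return best
    return None

# Hypothetical scan history for one participant.
scans = [date(2010, 1, 15), date(2011, 2, 1), date(2012, 1, 20)]
paired = nearest_scan(date(2011, 3, 10), scans)    # closest scan is 37 days away
unpaired = nearest_scan(date(2014, 6, 1), scans)   # no scan within the window
```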
Cognitive status
ADNI participants were classified into CN, MCI or dementia groups by ADNI study physicians based on subjective memory complaints, multiple neurocognitive and behavioral assessment scores, and level of impairment in ADLs. Complete diagnostic criteria can be found at adni.loni.usc.edu. Each individual scan was categorized according to the most temporally proximate cognitive diagnosis received by that participant.
Education
Education level was measured according to self-reported years of education. For the purposes of visualization in Fig. 5e, participants were grouped according to the following thresholds: less than high school: <12 years; high school: 12 years; some college: 13–15 years; college: 16 years; more than college: >16 years.
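A minimal sketch of this grouping, assuming years of education are reported as whole numbers and that exactly 12 years is assigned to the high school group (the function name is hypothetical):

```python
def education_group(years):
    """Map self-reported years of education to a visualization group."""
    if years < 12:
        return "less than high school"
    if years == 12:
        return "high school"
    if years < 16:
        return "some college"
    if years == 16:
        return "college"
    return "more than college"

groups = [education_group(y) for y in (10, 12, 14, 16, 20)]
```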
UKB
UKB is a UK population-based prospective study of 502,486 participants between the ages of 40 and 69 at baseline assessment102. We analyzed data from 42,583 participants who underwent brain MRI. The data used in these analyses were downloaded in April 2023. UKB was approved by the North West Multi-centre Research Ethics Committee. All participants provided written informed consent. UKB sample demographic information can be found in Supplementary Table 17.
MRI
MRI methods for UKB have been described in detail elsewhere103. Briefly, MRI data were collected using three identical 3T Siemens Skyra scanners with a 32-channel Siemens head coil. T1-weighted images were obtained using a 3D MP-RAGE with the following parameters: TR = 2,000 ms; inversion time = 880 ms; 208 sagittal slices, matrix = 256 × 256 px; slice thickness = 1 mm with no gap; and total scan time = 4 min and 52 s. Our study made use of imaging-derived phenotypes generated using an image-processing pipeline developed and run on behalf of UKB103. As part of this pipeline, raw T1-weighted images were processed using the cross-sectional FreeSurfer v.6.0. All brain measures used in the cross-sectional analyses presented in this study were derived from the outputs of this FreeSurfer pipeline. We excluded UKB participants with a very low signal-to-noise ratio and highly unusual summary morphometrics indicative of low-quality reconstruction (Supplementary Fig. 6).
To measure change in hippocampal volume in the subset of UKB participants with longitudinal MRI data, we reprocessed all T1-weighted images for this subset of participants using the longitudinal FreeSurfer v.6.0 pipeline104. This allowed us to avoid the known biases that can be introduced when different processing stages of the longitudinal pipeline are run in different hardware and software environments. Specifically, we reprocessed both time points of each participant’s T1-weighted scans with the cross-sectional recon-all pipeline105. Then, we built an unbiased within-participant template106 using robust, inverse-consistent registration107 and reprocessed each T1-weighted scan through the automated longitudinal pipeline104.
Cognitive functioning
UKB participants completed a battery of cognitive tests at the time of MRI. We investigated cognitive functioning using the following measures: Reaction Time (field ID = 20023), Fluid Intelligence (field ID = 20016), Numeric Memory (field ID = 4282), Trail A (field ID = 6348), Trail B (field ID = 6350), Symbol Digit Substitution (field ID = 23324), Tower Rearranging (field ID = 21004) and Matrix Completion (field ID = 6373). The details of these cognitive tests have been described elsewhere108.
Frailty and self-reported health
To further investigate aging-related health, we used the Fried Frailty Index53. Briefly, the Fried Frailty Index is based on meeting the criteria for declining functioning across five domains: unintentional weight loss; exhaustion; weakness; physical inactivity; and slow walking speed. Index scores range from 0 to 5, with higher scores indicating greater frailty109. During their imaging visit, UKB participants were also asked to rate their overall health as ‘poor’, ‘fair’, ‘good’ or ‘excellent’. We used these ratings to investigate self-reported overall health (field ID = 2178).
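The index is a count of criteria met, which can be sketched as below. The dictionary keys are hypothetical labels for the five domains; how each criterion is operationalized from UKB fields is not covered here.

```python
# Hypothetical labels for the five Fried frailty domains named in the text.
FRAILTY_CRITERIA = (
    "weight_loss",
    "exhaustion",
    "weakness",
    "physical_inactivity",
    "slow_walking",
)

def frailty_index(criteria_met):
    """Count how many of the five criteria are met (score from 0 to 5)."""
    return sum(1 for name in FRAILTY_CRITERIA if criteria_met.get(name, False))

score = frailty_index({"exhaustion": True, "weakness": True, "slow_walking": False})
```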
Disease and mortality records
To assess the influence of DunedinPACNI on aging-related disease and mortality risk in UKB, we used variables from algorithmically defined health outcomes. Briefly, algorithmically defined outcomes are generated by combining information from baseline assessments (self-reported medical conditions, operations and medications) with linked data from hospital admissions and death registries. Because of the relatively small number of aging-related disease diagnoses at the follow-up, we defined aging-related morbidity as being diagnosed with myocardial infarction (field ID = 42000), chronic obstructive pulmonary disease (field ID = 42016), dementia (field ID = 42018) or stroke (field ID = 42006). Furthermore, we defined the risk of chronic disease as the emergence of one or more of these diagnoses among participants who were healthy at the time of scanning (that is, baseline). Mortality was quantified during the follow-up from death records (field ID = 40000).
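The definition of emergent chronic disease can be sketched as a small rule over diagnosis dates. The function name and inputs are hypothetical, and treating a diagnosis dated on the day of scanning as pre-existing is an assumption not stated in the text.

```python
from datetime import date

def incident_disease(scan_date, diagnosis_dates):
    """Classify one participant for the chronic disease analysis.

    diagnosis_dates holds the first-diagnosis date of each of the four
    conditions the participant has ever received (possibly empty).
    Returns None if the participant was not healthy at scanning,
    otherwise a tuple (event_occurred, first_onset_date_or_None).
    """
    # Assumption: a diagnosis on or before the scan date counts as pre-existing.
    if any(d <= scan_date for d in diagnosis_dates):
        return None
    if not diagnosis_dates:
        return (False, None)
    return (True, min(diagnosis_dates))

scan = date(2015, 1, 1)
healthy = incident_disease(scan, [])
excluded = incident_disease(scan, [date(2012, 6, 1)])
incident = incident_disease(scan, [date(2018, 5, 1), date(2017, 3, 1)])
```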
Education, income and ethnicity
To test the association between DunedinPACNI and socioeconomic gradients of health, we tested whether UKB participants differed in DunedinPACNI scores as a function of educational attainment and household income. We grouped participants into three categories according to their self-reported educational qualifications (field ID = 6138) following prior work110. Specifically, these groups were: high (college or university degree); medium (A/AS level or equivalent or O level/GCSE or equivalent); and low (none of these). We also tested whether UKB participants differed in DunedinPACNI scores as a function of household income (field ID = 738).
We also conducted sensitivity analyses while restricting the UKB sample to either only low-income or only non-White participants. We considered participants as having a low income if they reported making less than £18,000 per year in household income. We considered participants to be non-White if they did not report their ethnic background (field ID = 21000) as ‘any other White background’, ‘British’, ‘do not know’, ‘Irish’, ‘prefer not to answer’ or ‘White’.
BrainLat
BrainLat is a multimodal neuroimaging dataset of patients with neurodegenerative diseases and healthy adult controls collected in Argentina, Chile, Colombia, Mexico and Peru63. We analyzed neuroimaging data from 368 individuals who were either cognitively healthy or diagnosed with AD or behavioral variant FTD. The BrainLat study was approved by the institutional ethical boards of each recruitment site. All participants, or their legal representatives, provided written informed consent. The BrainLat demographic data can be found in Supplementary Table 17.
MRI
The MRI methods for BrainLat have been described in detail elsewhere63. Briefly, T1-weighted MP-RAGE scans were collected on either 1.5T or 3T scanners. Acquisition parameters varied across sites, but scans most frequently had isotropic 1-mm³ voxels. Scans were downloaded and then processed using FreeSurfer v.6.0. Participants were excluded if they failed the automated FreeSurfer quality metrics or a visual quality check of segmentation output (Supplementary Fig. 7).
Diagnostic classification
All participants included could speak fluent Spanish and had adequate visual and auditory capacity for testing. Participants were classified as CN if they had a modified clinical dementia rating of 0, an MMSE score above 25 and no history of substance abuse or of neurological or psychiatric disorders. Patients were classified into the AD or FTD groups according to the National Institute of Neurological Disorders and Stroke–Alzheimer Disease and Related Disorders working group criteria for probable AD or probable behavioral variant FTD. Diagnosis was supported using appropriate MRI or positron emission tomography imaging when needed63.
Cognitive status
BrainLat participants were evaluated with the MoCA. The MoCA is designed to detect people at the MCI stage of cognitive dysfunction100. The scale ranges from 0 (worst performance) to 30 (best performance).
Statistical analyses
Pace of Aging
The derivation of Pace of Aging has been described elsewhere1,2. Briefly, we measured a panel of the following 19 biomarkers (Fig. 1a) at ages 26, 32, 38, and 45 years: BMI, waist/hip ratio, HbA1c, leptin, blood pressure (mean arterial pressure), cardiorespiratory fitness (VO2max), forced vital capacity ratio (FEV1/FVC), FEV1, total cholesterol, triglycerides, HDL, lipoprotein(a), apolipoprotein B100/A1 ratio, eGFR, blood urea nitrogen, hsCRP, white blood cell count, mean periodontal AL and the number of dental-caries-affected tooth surfaces (tooth decay). To calculate each study member’s Pace of Aging, we first transformed the biomarker values to a standardized scale. For each biomarker at each wave, we standardized values according to the age 26 distribution. Next, we calculated each study member’s slope for each of the 19 biomarkers using a mixed-effects growth model that regressed the biomarker’s level on age. Finally, we combined information from the 19 slopes of the biomarkers using a unit-weighting scheme. We calculated each study member’s Pace of Aging as the sum of age-dependent annual changes in biomarker z-scores. Biomarker standardization was performed separately for men and women.
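The three steps above (standardize to the age-26 distribution, estimate a per-person slope for each biomarker, sum the slopes) can be sketched as follows. This is a simplified illustration under stated assumptions: an ordinary least-squares slope per person stands in for the mixed-effects growth model, sex-specific standardization is omitted, the coding direction of each biomarker is not handled, and all names and toy values are hypothetical.

```python
from statistics import mean, stdev

AGES = [26, 32, 38, 45]  # assessment ages used for the biomarker panel

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def pace_of_aging(data):
    """data[biomarker][person] is a list of values at AGES.

    Returns person -> sum over biomarkers of the annual change in z-score,
    where z-scores use the age-26 mean and standard deviation.
    """
    totals = {}
    for by_person in data.values():
        baseline = [values[0] for values in by_person.values()]
        m, s = mean(baseline), stdev(baseline)
        for person, values in by_person.items():
            z = [(v - m) / s for v in values]
            totals[person] = totals.get(person, 0.0) + ols_slope(AGES, z)
    return totals

# Hypothetical toy data: two biomarkers, three people.
toy = {
    "biomarker_1": {"p1": [10.0] * 4, "p2": [12.0, 13.0, 14.0, 15.0], "p3": [14.0] * 4},
    "biomarker_2": {"p1": [5.0] * 4, "p2": [6.0, 7.0, 8.0, 9.0], "p3": [7.0] * 4},
}
pace = pace_of_aging(toy)
```

In this toy example only `p2` changes over time, so only `p2` receives a non-zero summed slope.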
DunedinPACNI
A schematic of DunedinPACNI model development can be found in Fig. 1. We trained an elastic net regression model to estimate the Pace of Aging from structural neuroimaging phenotypes in 860 Dunedin Study members at age 45 years (for the attrition analysis and inclusion criteria see Extended Data Figs. 1 and 2 and Supplementary Fig. 4). We selected 315 FreeSurfer measures as predictors from the following categories: regional CT, regional cortical SA, regional cortical GMV, regional cortical GWR and ‘ASEG’ volumes (that is, regional subcortical GMV, ventricular volumes and bilateral volume of white matter hypointensities). All cortical data were parcellated according to the Desikan–Killiany Atlas111. Note that although many ADNI scans do not pass quality control (Supplementary Fig. 5), FreeSurfer is a robust segmentation method, especially in healthy individuals112. Four phenotypes from the ‘ASEG’ volumes were excluded because of insufficient variance in the Dunedin Study (left and right white matter hypointensities, left and right non-white matter hypointensities). Model training was performed using the caret package in R. We conducted a grid search across a range of α and λ values. We used 100 repetitions of tenfold cross-validation to estimate model performance in held-out participants. The effect of sex was regressed from the Pace of Aging before model training. To prevent information leak during cross-validation, we regressed sex from each training set and applied the resulting β weights to each test set. This approach ensured that our model only used information from the training set, including covariate regression, when calculating predictions in each test set. We selected optimal tuning parameters according to the highest variance explained and lowest mean absolute error. The optimal tuning parameters were α = 0.214 and λ = 0.100. Using these parameters, we fitted the model to the entire n = 860 sample. 
The raw elastic net regression model weights can be found in Supplementary Table 18.
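The leakage-free covariate step can be illustrated for a single cross-validation fold: the regression of Pace of Aging on sex is fitted in the training fold only, and the fitted coefficients are then applied to the held-out fold. The sketch below is a Python illustration with hypothetical names and toy values; the study itself used the caret package in R, and the elastic net fit is not shown.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope of y on a single covariate x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

def residualize_fold(train_x, train_y, test_x, test_y):
    """Remove the covariate effect using coefficients from the training fold only."""
    intercept, slope = fit_line(train_x, train_y)
    train_res = [y - (intercept + slope * x) for x, y in zip(train_x, train_y)]
    # The test fold never contributes to the fitted coefficients.
    test_res = [y - (intercept + slope * x) for x, y in zip(test_x, test_y)]
    return train_res, test_res

# Hypothetical fold: sex coded 0/1, outcome is Pace of Aging.
train_sex, train_pace = [0, 0, 1, 1], [0.9, 1.1, 1.0, 1.2]
test_sex, test_pace = [0, 1], [1.3, 0.8]
train_res, test_res = residualize_fold(train_sex, train_pace, test_sex, test_pace)
```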
To generate DunedinPACNI scores in HCP, ADNI, UKB and BrainLat participants, we applied the regression weights from the DunedinPACNI model to FreeSurfer-derived phenotypes in each dataset and summed the products and model intercept. In ADNI, UKB and BrainLat, DunedinPACNI scores were correlated with chronological age (ADNI: r = 0.37; UKB: r = 0.50; BrainLat: r = 0.37; Supplementary Fig. 8).
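Scoring a new scan is therefore a weighted sum plus an intercept. A minimal sketch, with hypothetical feature names and weights (the real weights are those in Supplementary Table 18):

```python
def dunedinpacni_score(features, weights, intercept):
    """Return intercept + sum of weight * feature over all model features."""
    missing = [name for name in weights if name not in features]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return intercept + sum(w * features[name] for name, w in weights.items())

# Hypothetical weights and FreeSurfer-derived values for illustration only.
example_weights = {"feature_a": 0.5, "feature_b": -0.25}
example_features = {"feature_a": 2.0, "feature_b": 4.0}
example_score = dunedinpacni_score(example_features, example_weights, intercept=1.0)
```

Raising an error on missing features, rather than silently treating them as zero, is a design choice of this sketch so that incomplete FreeSurfer outputs are not scored by accident.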
In addition, we conducted the same procedure again without GWR because this measure is not always distributed in public datasets. We observed slightly reduced model accuracy when GWR was not included. DunedinPACNI estimates without GWR phenotypes showed excellent test–retest reliability in HCP. DunedinPACNI estimates were similar with and without GWR phenotypes in ADNI, UKB and BrainLat (see Supplementary Figs. 2 and 9 for more details).
Brain age gap
We submitted raw T1-weighted images from ADNI, UKB and BrainLat to the publicly available brainageR algorithm (v.2.1). This model, which has been described in detail elsewhere113, is trained to predict chronological age in a sample of healthy, cognitively unimpaired individuals aged 18–92 years. This algorithm was selected because it generates highly reliable estimates among published algorithms64. Briefly, brainageR is estimated by first segmenting and normalizing T1-weighted images using SPM12. Next, coefficients derived from a Gaussian process regression model predicting chronological age in a training dataset (n = 2,001) are applied to morphometric features from brain segmentations to predict participants’ chronological age. Brain age gap was subsequently estimated by subtracting actual chronological age from predicted age113. Notably, 15 ADNI scans failed the brain age gap pipeline (14 failed visual inspection of segmentation, one error when computing the predicted age). These scans were excluded from all brain age gap analyses, including comparative analyses with DunedinPACNI.
Dunedin Study validation analyses
To first test the validity of DunedinPACNI in the Dunedin Study training sample, we tested for linear associations between DunedinPACNI scores and one-legged balance, gait speed, step in place, chair stands, grip strength, visuomotor coordination, subjective physical limitations, subjective health, cognitive function, child-to-adult cognitive decline and facial aging while controlling for sex. We compared these effect sizes to associations between each of these measures and the original, longitudinal Pace of Aging.
Test–retest reliability
We used the HCP dataset to assess the test–retest reliability of DunedinPACNI. Reliability was quantified using a two-way mixed-effects intraclass correlation coefficient (3,1) with session modeled as a fixed effect, participant as a random effect and the test–retest interval as an effect of no interest114.
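For reference, ICC(3,1) can be computed from the two-way ANOVA mean squares as (MS_participants − MS_error) / (MS_participants + (k − 1) × MS_error), where k is the number of sessions. The sketch below implements this standard formula; it does not model the test–retest interval as a covariate, and the function name and toy scores are hypothetical.

```python
def icc_3_1(scores):
    """ICC(3,1) for scores[i][j] = value for participant i in session j."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Hypothetical scores: every participant shifts by the same amount at retest,
# so rank order and spacing are perfectly preserved.
icc = icc_3_1([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
```

Because session is treated as a fixed effect, a constant shift between sessions does not lower this coefficient.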
Cognitive and physical functioning
We first used linear regression models to test for associations between DunedinPACNI and scores on tests of cognition, physical function and health in ADNI and UKB. All analyses controlled for age and sex. In ADNI, we calculated robust standard errors to account for nonindependence from repeated observations. We also tested the standardized differences in DunedinPACNI scores between three groups based on cognitive status: CN, MCI and dementia. All group difference comparisons controlled for age and sex. We again calculated robust standard errors to account for nonindependence and conducted a sensitivity analysis while controlling for APOE ε4 carriership. We repeated these analyses with brain age gap. Notably, when conducting analyses on the combined effects of DunedinPACNI and brain age gap on cognitive outcomes in ADNI, we restricted the sample to the first time point of each measure. These analyses included only one observation per participant, allowing us to more easily combine effect sizes and CIs.
Dementia survival analysis
We conducted a Cox proportional hazards regression using the baseline DunedinPACNI scores of ADNI participants to predict their probability of cognitive decline or clinical conversion to dementia during the follow-up window. Conversion in CN participants was defined as having a diagnosis of CN at baseline but a diagnosis of MCI or dementia at the end of the follow-up. Conversion in participants with MCI was defined as having a diagnosis of MCI at baseline and a diagnosis of dementia by the end of the follow-up. Participants who had a baseline diagnosis of dementia or transitioned from MCI to CN were not included in this analysis. The analysis controlled for sex, age at baseline and length of the observation window. We investigated the influence of AD genetic risk on these results by conducting all analyses while additionally controlling for APOE ε4 carriership. We repeated these analyses for brain age gap.
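The conversion and exclusion rules can be written as a small lookup on baseline and end-of-follow-up diagnoses. The function name and the diagnosis labels are hypothetical; the Cox model itself is not shown.

```python
def conversion_status(baseline_dx, final_dx):
    """Return True (converted), False (did not convert) or None (excluded)."""
    if baseline_dx == "CN":
        # CN participants convert if they end follow-up with MCI or dementia.
        return final_dx in ("MCI", "Dementia")
    if baseline_dx == "MCI":
        if final_dx == "CN":
            return None  # transitions from MCI to CN are excluded
        return final_dx == "Dementia"
    return None  # baseline dementia is excluded

statuses = {
    ("CN", "CN"): conversion_status("CN", "CN"),
    ("CN", "MCI"): conversion_status("CN", "MCI"),
    ("MCI", "Dementia"): conversion_status("MCI", "Dementia"),
    ("MCI", "CN"): conversion_status("MCI", "CN"),
    ("Dementia", "Dementia"): conversion_status("Dementia", "Dementia"),
}
```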
Prediction of hippocampal atrophy rates
We used repeated MRI measurements from ADNI (n = 1,302) and UKB (n = 4,601) to generate estimates of change in hippocampal GMV. We used longitudinal ComBat on ADNI MRI data to remove differential scanner effects115. Next, using all available time points for each participant, we generated multilevel linear models for bilateral hippocampal volume with random effects for both participant and age. Using these models, we derived trajectories to track change in hippocampal GMV for each participant. We then tested whether each participant’s baseline DunedinPACNI scores could predict their subsequent rate of hippocampal atrophy. These analyses controlled for age, sex and length of the observation period. We investigated the influence of AD genetic risk on these results by conducting these analyses while additionally controlling for APOE ε4 carriership. We repeated these analyses for brain age gap.
Morbidity and mortality survival analyses
To investigate the association between DunedinPACNI and morbidity, we used UKB data to calculate the standardized differences in DunedinPACNI scores between three groups based on the number of lifetime chronic disease diagnoses (0, 1, 2+). Next, we conducted a Cox proportional hazards regression using UKB participants’ baseline DunedinPACNI scores to predict the onset of a chronic aging-related disease (n = 827 emergent diagnoses: myocardial infarction, chronic obstructive pulmonary disease, dementia or stroke) in participants who had never previously received any of these diagnoses at the time of scanning (n = 40,753). Similarly, to investigate the association between DunedinPACNI and mortality, we conducted a Cox proportional hazards regression using UKB participants’ baseline DunedinPACNI scores to predict death (n = 757 deaths). Both models controlled for baseline age, time to onset and sex. We repeated these analyses for brain age gap.
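For readers unfamiliar with Cox proportional hazards regression, the following minimal sketch estimates the log hazard ratio for a single predictor by maximizing the partial likelihood with Newton-Raphson steps (Breslow handling of tied event times). It is an assumption-laden illustration: the covariates used in the models above are omitted, the function name and toy data are hypothetical, and production analyses would use an established survival package.

```python
import math
from typing import Sequence, Tuple


def cox_one_covariate(
    times: Sequence[float],
    events: Sequence[int],
    x: Sequence[float],
    max_iter: int = 50,
    tol: float = 1e-10,
) -> Tuple[float, float]:
    """Return (beta, hazard ratio) for a single-covariate Cox model.

    events[i] is 1 if the outcome was observed at times[i], 0 if censored.
    """
    beta = 0.0
    for _ in range(max_iter):
        grad = 0.0  # first derivative of the partial log-likelihood
        info = 0.0  # negative second derivative (observed information)
        for i, (t_i, e_i) in enumerate(zip(times, events)):
            if not e_i:
                continue  # censored observations contribute only via risk sets
            # Risk set: everyone still under observation at time t_i.
            risk_x = [x[j] for j, t_j in enumerate(times) if t_j >= t_i]
            weights = [math.exp(beta * xj) for xj in risk_x]
            s0 = sum(weights)
            s1 = sum(w * xj for w, xj in zip(weights, risk_x))
            s2 = sum(w * xj * xj for w, xj in zip(weights, risk_x))
            grad += x[i] - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        if info == 0.0:
            break  # no variation in the predictor within any risk set
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    return beta, math.exp(beta)


# Toy data: a binary predictor where exposed participants tend to fail earlier.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_events = [1, 1, 1, 1, 1, 1]
toy_x = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0]
beta_hat, hazard_ratio = cox_one_covariate(toy_times, toy_events, toy_x)
```

A hazard ratio above 1 in this toy example indicates that higher values of the predictor are associated with earlier events.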
Socioeconomic inequality analyses
To investigate whether DunedinPACNI reflected gradients of socioeconomic inequality57, we first tested for linear relationships between DunedinPACNI and years of education in ADNI and UKB. We also tested for a linear relationship between DunedinPACNI and household income in UKB. These analyses controlled for sex and age. In ADNI, we included only the first MRI observation per participant.
Replication in a Latin American sample
To investigate whether DunedinPACNI generalizes to samples of individuals who are underrepresented in neuroimaging research59, we tested whether the degree of acceleration in DunedinPACNI among ADNI participants with dementia was similar in BrainLat participants with dementia. We tested for standardized differences by comparing the AD and FTD groups to the CN group, respectively. We also tested for linear associations between DunedinPACNI and MoCA scores. All analyses in this sample controlled for age and sex. We then compared the magnitude of acceleration among BrainLat participants with dementia to the previously identified acceleration among ADNI participants with dementia. Lastly, we compared the strength of the linear association between DunedinPACNI and MoCA scores in BrainLat participants to the previously identified association in ADNI participants.
Comparison with hippocampal and ventricular volume
We investigated how DunedinPACNI differs from two commonly used MRI-based measures of brain aging: hippocampal volume and ventricular volume. We calculated hippocampal volume as the sum of left and right hippocampal GMV measures derived from FreeSurfer. Likewise, we calculated ventricular volume as the sum of the left and right lateral ventricular volume measures. We first repeated cross-sectional associations with cognition, frailty and poor health in UKB, replacing DunedinPACNI with hippocampal volume or ventricular volume. Next, we applied the same substitution to our Cox proportional hazards regression models of chronic disease and mortality risk in UKB, and of cognitive decline risk among CN ADNI participants. Lastly, we compared coefficients for all analyses while including DunedinPACNI and either hippocampal volume or ventricular volume in the respective models. All analyses controlled for age and sex.
All visualizations were generated using the R package ggplot2 (ref. 116).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
