In [1]:
import pandas as pd
from IPython.display import Image

Advancement of aging in HIV+ individuals revealed by epigenomic profiling

Data Introduction Processing

In this study, we sought to determine whether HIV-infected patients have altered signs of molecular aging. Samples of whole blood were collected from 137 HIV-infected, HAART-treated but otherwise generally healthy Caucasian males and 48 Caucasian male controls (Supplementary Table, Extended Data Fig. 1). Genome-wide DNA methylation profiles were determined using the Illumina Infinium HumanMethylation450 BeadChip assay. Data were normalized using standard techniques (Methods); quality control measures resulted in 2 controls being removed.

Standard Filters, Quality Control

  • Number of cases: 137
  • Number of controls: 46
  • Discarded samples: 2 controls

Model Concordance

  • Number of cases: 134
  • Number of controls: 44
  • Discarded samples: 2 controls, 3 cases

Cellular Composition Adjustment

Methylation profiles from whole blood can be skewed by differences in cellular composition. To correct for these differences, we used a method described by Jaffe and Irizarry to estimate the cellular composition of our whole blood samples using methylation profiles from flow-sorted cellular populations. The estimated cellular compositions of our whole blood samples were highly concordant with clinically measured values (Extended Data Fig. 2), thereby validating this approach. This analysis highlighted certain markers that were highly correlated with cell type composition. To limit the effects of cell-type-specific epigenetic markers in downstream analysis, methylation levels were adjusted to remove the expected differences resulting from cohort-wide differences in cellular composition (Methods, Extended Data Fig. 3).

Naive Model of HIV and Aging

We next used unsupervised methods to identify a panel of age-associated methylation sites. This panel was identified using a nested validation design to analyze methylation profiles using two cohorts of HIV-uninfected patients. In a genome-wide screen of 538 subjects from Hannum et al. [Hannum 2013], we found 11,920 probes associated with age at a 5% FWER (likelihood ratio test in a multivariate regression model). Screening these probes in a second control cohort from the EPIC study (n = 662) confirmed 5,088 probes (Fig. 1a). Among these validated age-associated probes, we found a very high association with differential methylation in HIV-positive patients relative to control patients (Fig. 1b). Furthermore, in an unsupervised principal component analysis of these probes across our cohort of 137 HIV-positive and 46 HIV-negative samples, we found a positive additive association of the first principal component with both age and HIV status (Fig. 1c; association assessed in a multivariate linear model, $P < 10^{-10}$; additivity assessed by likelihood ratio test with an interaction term, P = 0.2). These findings suggest a link between HIV and aging through either a direct effect of HIV infection on biological aging or an indirect effect through a common mechanism such as cellular stress or inflammation (Fig. 1d).

  • These numbers are from before filtering with model concordance. I may also generate Figure 1 post-filtering for the supplement.
  • The models for this study are run in this notebook
  • I use cellular composition and gender as covariates
  • The statistic is a likelihood ratio test with and without age; the p-values are probably conservative because age is a bit redundant with cellular composition, but the effect is not large
  • The claim of additivity is determined by failure to reject a drop-1 test for an interaction term between age and HIV status
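The two tests described in these notes (a likelihood ratio test with and without age, and a drop-1 test for the age by HIV interaction) can be sketched as follows. This is a minimal illustration on synthetic data, not the notebook's actual pipeline: the helper names `rss` and `lr_test_one_term`, the simulated covariates, and the Gaussian-error assumption are all introduced here for illustration only.

```python
import math
import numpy as np

def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def lr_test_one_term(X_reduced, X_full, y):
    """Likelihood ratio test for nested Gaussian linear models that differ
    by exactly one column. Returns (statistic, p_value)."""
    n = len(y)
    stat = n * math.log(rss(X_reduced, y) / rss(X_full, y))
    # Chi-square survival function for 1 degree of freedom.
    p = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p

# Synthetic data (ASSUMPTION: all values below are made up for illustration).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(25, 68, n)
hiv = rng.integers(0, 2, n).astype(float)
sex = rng.integers(0, 2, n).astype(float)
cell_comp = rng.normal(size=n)  # stand-in for one cell composition covariate
ones = np.ones(n)

# One probe's methylation level: additive in age and HIV, no interaction.
meth = 0.3 + 0.004 * age + 0.02 * hiv + 0.01 * cell_comp + rng.normal(0, 0.02, n)

# Age association: compare models with and without the age column.
without_age = np.column_stack([ones, cell_comp, sex])
with_age = np.column_stack([ones, cell_comp, sex, age])
age_stat, age_p = lr_test_one_term(without_age, with_age, meth)

# Additivity: drop-1 test for the age x HIV interaction term.
additive = np.column_stack([ones, cell_comp, sex, age, hiv])
interaction = np.column_stack([additive, age * hiv])
int_stat, int_p = lr_test_one_term(additive, interaction, meth)
```

Failing to reject the interaction term (a large `int_p`) is what the additivity claim rests on; it is an absence of evidence for an interaction rather than proof of additivity.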

Methylation Age Models

Our group and later work by Horvath have used supervised learning to devise models of chronological age based on a subset of DNA methylation marks. Although these two models used different methods and were based on different training data, we found that their predictions were very similar (Pearson r = 0.9). A post-hoc analysis revealed that much of the variation between the two models is attributable to measurement error (Figure 2a). To further compare these two models, we tested two independent datasets and found that the models have similar performance on whole blood samples (Table 1). Furthermore, a consensus of the two models outperforms either individual model, and is thus used throughout the rest of the study (Methods, Table 1, Figure 2b-c).
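The benchmarking logic in this paragraph can be illustrated with a small synthetic example. This is only a sketch under stated assumptions: the consensus is taken to be a simple average of the two predictions, and error is taken to be mean absolute error in years. Neither definition is given in this section (both are deferred to Methods), and `model_a`, `model_b`, and `benchmark` are hypothetical names.

```python
import numpy as np

def benchmark(predicted, actual):
    """Return (Pearson r, mean absolute error in years) for an age predictor."""
    r = float(np.corrcoef(predicted, actual)[0, 1])
    mae = float(np.mean(np.abs(predicted - actual)))
    return r, mae

rng = np.random.default_rng(1)
age = rng.uniform(25, 68, 500)

# Two hypothetical predictors with independent measurement error.
model_a = age + rng.normal(0, 4.0, age.size)
model_b = age + rng.normal(0, 5.0, age.size)

# ASSUMPTION: the consensus is the plain average of the two predictions.
consensus = (model_a + model_b) / 2

results = {
    name: benchmark(pred, age)
    for name, pred in [("model_a", model_a),
                       ("model_b", model_b),
                       ("consensus", consensus)]
}
```

When the disagreement between two predictors is mostly independent measurement error, averaging cancels part of that error, which is one way a consensus can outperform both of its inputs.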

Age Advancement in HIV+ Patients

This consensus aging model was then applied to our cohort of HIV patients and healthy controls (Table 1). The predictions for the control patients have a very high concordance with chronological age, with 95% agreement and an average error of 3.1 years (Figure 2d). In contrast, the HIV-infected patients showed an increased biological age across the cohort with an average effect of 5.0 years (t-test $P < 10^{-8}$, Figure 2c-d). Consistent with our previous findings using an unsupervised approach, these findings also suggest that HIV infection leads either directly or indirectly to increased aging.

There are a number of possible mechanisms to explain age advancement in HIV-infected patients. The effect could occur early in the course of disease, perhaps as a consequence of acute infection or in response to drug treatment. Alternatively or in addition, HIV infection could accelerate the aging process through the cumulative effects of the virus, changes in blood composition, and/or cumulative effects of therapeutic intervention. Notably, we found that patients recently infected with HIV showed a similar age advancement to those patients with chronic infection (P > 0.5, Mann-Whitney U test; Figure 2d). Using a linear model, we found that the slope of chronological versus molecular time since infection did not differ from one (0.98 +/- 0.06), whereas the y-intercept was clearly positive (5.3 +/- 0.9). These data suggest that the observed increase in molecular age in HIV-infected patients occurs early in the course of infection and/or treatment, as opposed to reflecting an accelerated aging rate.
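The distinction between a constant advancement and an accelerated rate comes down to the intercept and slope of a fitted line. The sketch below illustrates this on synthetic data; the variables `chron_time` and `mol_time` are stand-ins for the chronological and molecular time axes, whose exact definitions are not given in this section, and the simulated offset and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
chron_time = rng.uniform(0, 30, 137)  # synthetic chronological time, in years

# Synthetic "constant advancement": molecular time is shifted up by 5 years
# but advances at the same rate as chronological time.
mol_time = chron_time + 5.0 + rng.normal(0, 3.0, chron_time.size)

# Ordinary least-squares line plus the covariance matrix of (slope, intercept).
(slope, intercept), cov = np.polyfit(chron_time, mol_time, 1, cov=True)
slope_se, intercept_se = np.sqrt(np.diag(cov))

# A slope within about two standard errors of one is compatible with a
# constant offset; a slope clearly above one would point to acceleration.
slope_differs_from_one = abs(slope - 1.0) > 2 * slope_se
intercept_is_positive = intercept - 2 * intercept_se > 0
```

Under an accelerated-aging scenario the same fit would instead return a slope above one, with the gap between molecular and chronological time widening over the time axis.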

Covariates of HIV in Age Advancement

While the direct effects of cell composition changes were mitigated by the adjustment of methylation levels as described previously, this does not rule out downstream, indirect changes in the methylome in response to changes in cellular composition. To assess this, we constructed a multivariate linear model of age advancement, with cell composition variables alongside HIV status (Table 2). In this model, the presence of HIV was associated with an age advancement of 3.5 +/- 1.1 years, while the presence of natural killer cells also accounted for additional increases in apparent methylation age. In an even more conservative test, we modeled the age advancement with cellular composition variables alone, and found that the unexplained variation in this model had a significant association with HIV infection (P = 0.03). While the direction of causality in the relationships between HIV, cell composition and aging remains unclear, this analysis shows that in the limiting case, HIV still has some independent association with advanced molecular aging.
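The two modeling strategies in this paragraph (a joint model, and a two-step model that fits cell composition first and then examines the residuals) can be compared in a short sketch. Everything here is synthetic and illustrative: the group sizes, the two simulated cell composition covariates, the effect sizes, and the helper `ols` are assumptions, not the study's data or code.

```python
import numpy as np

def ols(X, y):
    """Ordinary least-squares coefficients of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(3)
n = 180
hiv = (np.arange(n) < 135).astype(float)  # synthetic: 135 cases, 45 controls

# Two synthetic cell composition covariates, shifted in cases so that they
# are correlated with HIV status.
cells = rng.normal(size=(n, 2)) + 0.8 * hiv[:, None]
age_adv = 3.5 * hiv + 1.5 * cells[:, 0] + rng.normal(0, 3.0, n)

ones = np.ones(n)

# Model 1: cell composition and HIV status fitted together.
X_joint = np.column_stack([ones, cells, hiv])
hiv_effect_joint = ols(X_joint, age_adv)[-1]

# Model 2 (two-step): fit cell composition alone, then ask whether the
# residuals still differ between cases and controls.
X_cells = np.column_stack([ones, cells])
resid = age_adv - X_cells @ ols(X_cells, age_adv)
resid_gap = resid[hiv == 1].mean() - resid[hiv == 0].mean()
```

In least-squares terms the two-step gap has the same sign as the joint HIV coefficient but is never larger in magnitude, because any variation shared between HIV status and cell composition is credited to cell composition first. That is the sense in which the second test is the more conservative one.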

HIV Specific Methylation

We next sought to assess whether the signatures of increased aging in HIV-infected patients differed from normal aging in uninfected patients. We found very high concordance between the molecular markers associated with advanced aging in the HIV cohort and our compilation of healthy controls (Pearson r = .68). Furthermore, this agreement of age advancement between HIV cases and the control data is higher than the agreement between technical batches in the run (Table). Although it is clear that HIV alters the epigenome through molecular aging, we see no evidence that this process is different than normal environmental factors that lead to age advancement.

Data Processing for Controls

Here we are limiting patients to those between 25 and 68.

Hannum

  • Start with 538
  • Model concordance filter down to 497

EPIC

  • Start with 662 after standard QC
  • Model concordance filters down to 637

Univariate linear models

  • Odds ratio of agreement in sign between HIV and age correlations: 2.3
  • Correlation between HIV and age correlations: 0.35
  • Number of probes tested 485,512
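One way to compute the two summary statistics listed above is from a 2x2 table of correlation signs. The sketch below uses synthetic per-probe correlations; the probe count, the spread of the simulated values, and the 2x2 odds-ratio definition itself are assumptions, since the notes do not say how the odds ratio was calculated.

```python
import numpy as np

rng = np.random.default_rng(4)
n_probes = 20000

# Synthetic per-probe correlations with age and with HIV status, built so
# that the two sets are themselves correlated at about 0.35.
age_r = rng.normal(0, 0.1, n_probes)
hiv_r = 0.35 * age_r + np.sqrt(1 - 0.35**2) * rng.normal(0, 0.1, n_probes)

pos_age, pos_hiv = age_r > 0, hiv_r > 0
both_pos = int(np.sum(pos_age & pos_hiv))
both_neg = int(np.sum(~pos_age & ~pos_hiv))
age_only = int(np.sum(pos_age & ~pos_hiv))
hiv_only = int(np.sum(~pos_age & pos_hiv))

# Odds ratio from the 2x2 table of signs; values above 1 mean the signs
# agree more often than expected under independence.
odds_ratio = (both_pos * both_neg) / (age_only * hiv_only)

# Correlation between the two sets of per-probe correlations.
corr_of_corrs = float(np.corrcoef(age_r, hiv_r)[0, 1])
```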

Multivariate linear models

  • Fraction of probes best fit by combination of age and HIV: 7%
  • Fraction of these probes with sign of HIV and age in agreement: 83%, $P<10^{-16}$
  • Average (mean) partial effect of HIV infection: ~30 years

Multivariate linear models (v2)

  • Fraction of probes best fit by combination of age and HIV: 1.3%
  • Fraction of these probes with sign of HIV and Age in agreement: 88%, $P=10^X$
  • Average (median) partial effect of HIV infection: ~23.5 years

Benchmarking Methylation Age Models

We then sought to more directly quantify the effect of HIV infection on human aging. For this, we applied two distinct models of molecular aging to the data (Hannum, Horvath, Methods). While these models used slightly different methodology and different training data in their formulations, in practice we find that their predictions are very similar (r = 0.9), and a post-hoc analysis found that much of the variation between the models is likely attributable to measurement error. To benchmark these two models, we incorporated independent datasets not used for training of either model, and found that the models have comparable performance on whole blood samples. Furthermore, a consensus of the two predictions outperforms either individual model, and is thus used throughout the rest of the study (Methods).

Model Performance

                          EPIC                               Hannum                             Primary Cohort
                          n    r     error (years)  % error  n    r     error (years)  % error  n    r     error (years)  % error
Combined, Filtered        637  0.82  3.65           0.07     497  0.86  4.06           0.08     175  0.81  4.58           0.11
Predicted Age (Combined)  662  0.82  3.69           0.07     538  0.87  4.12           0.08     187  0.81  4.60           0.11
Predicted Age (Hannum)    662  0.82  3.69           0.07     538  0.87  4.19           0.08     187  0.80  4.72           0.11
Predicted Age (Horvath)   662  0.75  4.64           0.09     538  0.82  5.14           0.10     187  0.78  5.36           0.13

In [34]:
# Summary of model agreement for the internal controls and the HIV cohort.
model_agreement = pd.DataFrame([[46, .95, 3.3, .074],
                                [137, .86, 3.6, .072]],
                               index=['Internal Controls', 'HIV Cohort'],
                               columns=['n', 'r', 'error (years)', '% error'])
model_agreement


Out[34]:
                     n     r  error (years)  % error
Internal Controls   46  0.95            3.3    0.074
HIV Cohort         137  0.86            3.6    0.072

  • Try to quantify the effect of measurement error
  • Out of study benchmarks
    • Need to gather numbers for these
    • Show that performance is comparable and that consensus is slightly better

Application of Aging Model

We applied this consensus aging model to our cohort of HIV patients and healthy controls (Table 1). The predictions for the control patients have very high concordance with subject chronological age, with 94% agreement and an average error of 3.3 years (Figure 2a). Consistent with the age advancement hypothesis, HIV-infected patients show an increase in biological age across the cohort as compared to the controls, with an average effect of 5.2 years (t-test $P < 10^{-7}$, Figure 2b-c).

  • Table 1: of all models and fits.
  • Model fit of controls: 95% agreement, average error of 3.3 years
  • Increase of HIV compared to controls: ANCOVA P < X
  • Average effect size of increase: 5.2 years

Interestingly, we find that patients with recent HIV infection have similar age advancement to those patients with chronic infection (Mann-Whitney U, P > 0.5, Figure 2c). Furthermore, in a linear model we find that the slope of chronological versus molecular time since infection does not differ from one (slope = 0.98 +/- 0.06), whereas the intercept term is significantly positive (b = 5.66 +/- 0.96). These data suggest that the aging of patients with HIV reflects a constant advancement in molecular age that arises relatively early in the course of the infection, as opposed to an acceleration of the rate of aging.

  • Age advancement in HIV short vs. HIV long: Mann-Whitney U, P > 0.5
  • Linear model of HIV vs. Controls
    • slope: 0.98 +/- 0.06
    • Intercept: 5.66 +/- 0.96

Multivariate Aging Models

Indeed, we do find associations of cellular composition with patient age in our case and control datasets. Furthermore, we find the ratio of CD8 T-cells to CD4 T-cells to be about 15 times higher in HIV-infected patients. While direct effects of such composition changes were mitigated by the adjustment of methylation levels as described in the Methods, this does not rule out downstream, indirect changes in the methylome due to cellular composition. To assess this, we constructed a multivariate linear model of age advancement, with cell composition as well as the HIV indicator variable as a constant effect. In this model, the HIV term remains significant, with an average age advancement of 4.4 +/- 1.1 years, while the presence of natural killer T-cells as well as the absence of CD4 T-cells also account for additional increases in apparent methylation age. In an even more conservative test, we modeled the age advancement with cellular composition alone, and found that the unexplained variation in this model had a significant association with HIV infection (P = 0.02). While the direction of causality in the relationships between HIV, cell composition and aging remains unclear, this analysis shows that in the limiting case, HIV still has some independent association with advanced molecular aging.

Confounding

  • Association of cellular composition with patient age: (Supplemental Figure?)
  • Ratio of CD8 to CD4 T-cells: ~15

Multivariate Model 1: Grab bag

  • HIV effect: 4.4 +/- 1.1 years
  • Other contributions: CD4T and NK cells
  • Supplemental Figure?

Multivariate Model 2: Two Step

  • Should we compare fits?
  • Association of residuals with HIV
  • Tack on to Supplemental Figure

HIV Aging vs. Healthy Aging

We next sought to assess whether the epigenetic remodeling responsible for the accelerated age phenotype differed within HIV-infected patients as compared to the controls. Surprisingly, we found very high concordance of the molecular markers associated with this advanced aging in the HIV cohort and our compilation of healthy controls (r = 0.68). Furthermore, this agreement of age advancement between HIV cases and the control data is higher than the agreement between technical batches in the run (Table). Thus, while it is clear that HIV affects the epigenome through molecular aging, we see no evidence that this process is different from the normal environmental factors that lead to age advancement.

  • Correlates of age advancement in healthy tissues

    • Literature review
    • Show in control data-sets
  • Comparison of HIV+ to controls

    • Internal vs. external controls
    • Show effect of data-set size (downsample?)
    • Compare to reproducibility across batches

Figures

Figure 1

  • Panel a

    • Age vs. HIV linear model r-values
    • Hexbin + marginals
  • Panel b

    • Sample linear model
  • Panel c
    • Distribution of HIV effect size

Figure 2

  • Panel a
    • Age vs. bio-age, control samples
    • r-value, error matching text
  • Panel b
    • time since onset vs. bio-time since onset, HIV patients
    • r-value, slope, intercept
  • Panel c
    • Violin-plot, residual on controls, short-HIV, and long-HIV

Supplemental Figures

Cell Composition Adjustment


In [7]:
Image('/cellar/users/agross/Downloads_Old/CC_Adjustment.jpg', width=600)


Out[7]:

Cell Composition Benchmarking


In [5]:
Image('/cellar/users/agross/figures/figS1.png', width=800)


Out[5]:

Cellular Composition and Age

Multivariate Models

Age advancement in healthy tissue

  • Control data-sets
  • Top probes, genes
  • Correlation between datasets