Read in Methylation Aging Models

  • These are linear models generated in different studies.
  • To reconstruct the age predictor from these models we must compute a linear combination of the input probes, weighted by each model's coefficients.
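
As a minimal sketch of what "linear combination of the input probes" means here, the toy example below uses made-up probe names, coefficients, and an intercept (none of these come from the published models). Beta values are laid out with probes as rows and samples as columns, which is the layout the model functions in this notebook assume.

```python
import pandas as pd

# Hypothetical toy model: three probes with made-up coefficients.
coefficients = pd.Series({'cg_a': 2.0, 'cg_b': -1.0, 'cg_c': 0.5})
intercept = 10.0

# Made-up beta values: probes as rows, samples as columns.
betas = pd.DataFrame(
    {'sample_1': [0.5, 0.2, 0.8], 'sample_2': [0.1, 0.9, 0.4]},
    index=['cg_a', 'cg_b', 'cg_c'],
)

# Prediction = intercept + sum over probes of (coefficient * beta value).
# Transposing puts probes in the columns so they align with the coefficients.
prediction = betas.T.dot(coefficients) + intercept
print(prediction)
```

The dot product aligns on probe labels, so the order of rows in the input does not matter as long as every model probe is present.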

In [4]:
import NotebookImport
from Imports import path


importing IPython notebook from Imports
Populating the interactive namespace from numpy and matplotlib

In [5]:
import numpy as np
import pandas as pd

Horvath's Transfer Functions

  • He trains his model on a non-linear transform of age as the output
  • These are helper functions to go back and forth between the original age and transformed age spaces

In [6]:
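def tranfer_fx(x, adult_age=20):
    # Map chronological age to the transformed age used as model output.
    x = float(x)  # np.float was removed from NumPy; the builtin is equivalent
    x = (x + 1) / (1 + adult_age)
    # Logarithmic up to adult_age, linear above it.
    y = np.log(x) if x <= 1 else x - 1
    return y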
    

def anti_tranfer_fx(x, adult_age=20):
    # Inverse of tranfer_fx: map a transformed age back to years.
    if x < 0:
        return (1+adult_age)*np.exp(x)-1
    else:
        return (1+adult_age)*x+adult_age

Horvath Model

  • Relies on data normalized by Horvath's normalization function; do not plug in unnormalized beta values
  • Horvath applies a non-linear transform to age prior to modeling, so after our predictions are made we must run the anti-transfer function (anti_tranfer_fx) to get age predictions
  • Here I am using the model trained with shrunken coefficients, as I expect it to generalize better

In [7]:
horvath_model = pd.read_table(path + 'data/Horvath_Model.csv', index_col=0, 
                              skiprows=[0,1])
horvath_intercept = horvath_model.CoefficientTraining['(Intercept)']
horvath_model = horvath_model.iloc[1:]

def run_horvath_model(df):
    '''
    Uses global variables horvath_model and horvath_intercept.  At some point I should
    move this to a class.
    
    Input data-frame should be normalized using Horvath's normalization script.
    '''
    df = df.T.fillna(horvath_model.medianByCpG).T
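    # .ix was removed in pandas 1.0; reindex keeps the label-based selection
    # and leaves probes absent from df as NaN rows.
    df = df.reindex(horvath_model.CoefficientTraining.dropna().index)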
    pred_age = df.T.dot(horvath_model.CoefficientTraining.dropna()) + horvath_intercept
    pred_age = pred_age.apply(anti_tranfer_fx)
    pred_age.name = 'predicted age (Horvath)'
    return pred_age
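
The imputation step in `run_horvath_model` is easy to misread, so here is a small sketch of it on made-up data. The `median_by_probe` series is a hypothetical stand-in for `horvath_model.medianByCpG`, and the probe names and values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical per-probe medians, standing in for horvath_model.medianByCpG.
median_by_probe = pd.Series({'cg_a': 0.3, 'cg_b': 0.7})

# Probes as rows, samples as columns; sample_2 has no value for cg_b.
betas = pd.DataFrame(
    {'sample_1': [0.5, 0.2], 'sample_2': [0.1, np.nan]},
    index=['cg_a', 'cg_b'],
)

# DataFrame.fillna with a Series matches on column labels, so we transpose
# to put probes in the columns, fill, and transpose back.
filled = betas.T.fillna(median_by_probe).T
print(filled)
```

Only missing entries are replaced; measured values are left untouched. Note that this fills gaps within probes that are present in the input and does not add rows for probes that are absent altogether.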

Hannum Model

  • Hannum does not supply an intercept with his model
  • In the paper, there is talk about gender and BMI covariates, but I cannot find these anywhere

In [8]:
hannum_model = pd.read_csv(path + 'data/Hannum_All.csv', index_col=0)

def run_hannum_model(df):
    # .ix was removed in pandas 1.0; reindex keeps the label-based selection.
    df = df.reindex(hannum_model.Coefficient.index)
    pred_age = df.T.dot(hannum_model.Coefficient)
    pred_age.name = 'predicted age (Hannum)'
    return pred_age