Supplemental Information:

"Clonal heterogeneity influences the fate of new adaptive mutations"

Ignacio Vázquez-García, Francisco Salinas, Jing Li, Andrej Fischer, Benjamin Barré, Johan Hallin, Anders Bergström, Elisa Alonso-Pérez, Jonas Warringer, Ville Mustonen, Gianni Liti

Figure 3 (+ Supp. Figs.)

This IPython notebook is provided for reproduction of Figures 2, S3, S4 and S7 of the paper. It can be viewed by copying its URL to nbviewer and it can be run by opening it in binder.



In [19]:

    
# Load external dependencies
from setup import *
# Load internal dependencies
import config,gmm,plot,utils

%load_ext autoreload
%autoreload 2

%matplotlib inline









    



The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload



In [20]:

    
ids = pd.read_csv(dir_data+'seq/sample_ids_merged_dup.csv')

ids.loc[ids.clone.isnull(),'type'] = 'population'
ids.loc[(ids.clone.notnull()) & (ids.time==0),'type'] = 'ancestral clone'
ids.loc[(ids.clone.notnull()) & (ids.time==32),'type'] = 'evolved clone'

for seq_type, seq_id in ids.groupby('type'):
    print('{0} sequencing coverage\nBottom quartile: {1:.2f}x, Top quartile: {2:.2f}x, Min: {3:.2f}x, Max: {4:.2f}x, Median: {5:.2f}x\n'\
    .format(seq_type.capitalize(),
            seq_id['coverage'].quantile(.25), \
            seq_id['coverage'].quantile(.75), \
            seq_id['coverage'].min(), \
            seq_id['coverage'].max(), \
            seq_id['coverage'].median()))









    



Ancestral clone sequencing coverage
Bottom quartile: 15.45x, Top quartile: 26.13x, Min: 9.06x, Max: 36.60x, Median: 23.37x

Evolved clone sequencing coverage
Bottom quartile: 28.48x, Top quartile: 31.96x, Min: 26.06x, Max: 36.01x, Median: 30.38x

Population sequencing coverage
Bottom quartile: 84.10x, Top quartile: 103.98x, Min: 24.66x, Max: 150.10x, Median: 94.81x

Data import

Top panels - Import subclonal frequency



In [21]:

    
# Load data
seq_st_df = pd.read_csv(dir_data+'seq/subclonality/seq_subclonality.csv', encoding='utf-8')

# Compute cumulative haplotype frequencies for major subclones
seq_st_df['clonal'] = seq_st_df.apply(
    lambda x: 
    x[['subclone A','subclone B','subclone C','subclone D']].fillna(0).sum(), 
    axis=1
)
# Calculate the remaining bulk fraction
seq_st_df['bulk'] = 1.0 - seq_st_df['clonal']

seq_st_df.head()









    Out[21]:







  
    
      
      background
      cross
      cross_rep
      selection
      selection_rep
      subclone A
      subclone B
      subclone C
      subclone D
      time
      population
      sample
      clonal
      bulk
    
  
  
    
      0
      WAxNA
      F12
      2
      RM
      2
      0.0200
      NaN
      NaN
      NaN
      2
      WAxNA_F12_2_RM_2
      WAxNA_F12_2_RM_2_T2
      0.0200
      0.9800
    
    
      1
      WAxNA
      F12
      2
      RM
      2
      0.0202
      NaN
      NaN
      NaN
      4
      WAxNA_F12_2_RM_2
      WAxNA_F12_2_RM_2_T4
      0.0202
      0.9798
    
    
      2
      WAxNA
      F12
      2
      RM
      2
      0.0705
      NaN
      NaN
      NaN
      8
      WAxNA_F12_2_RM_2
      WAxNA_F12_2_RM_2_T8
      0.0705
      0.9295
    
    
      3
      WAxNA
      F12
      2
      RM
      2
      0.1098
      NaN
      NaN
      NaN
      16
      WAxNA_F12_2_RM_2
      WAxNA_F12_2_RM_2_T16
      0.1098
      0.8902
    
    
      4
      WAxNA
      F12
      2
      RM
      2
      0.1097
      NaN
      NaN
      NaN
      32
      WAxNA_F12_2_RM_2
      WAxNA_F12_2_RM_2_T32
      0.1097
      0.8903

Middle panels - Import mutation counts



In [22]:

    
# Load data
seq_dn_df = pd.read_csv(dir_data+'seq/de-novo/seq_de_novo_snv_indel.csv', encoding='utf-8', keep_default_na=False)

print(seq_dn_df.shape)
seq_dn_df.head()









    



(572, 83)






    Out[22]:







  
    
      
      ac
      af
      alt
      an
      allele
      amino_acids
      biotype
      baseqranksum
      cds_position
      chr_arabic
      ...
      population
      coverage
      substitution
      assignment
      mutation_type
      chr_length
      chr_roman
      pos_cum
      variant_type
      consequence_short
    
  
  
    
      0
      1
      0.1
      [A]
      2.0
      
      
      non-coding
      1.093
      
      7
      ...
      NA_HU_1
      57.370612
      G>A
      bulk
      passenger
      1090940
      VII
      4141136.0
      snv
      non-coding
    
    
      1
      0
      0.1
      [A]
      2.0
      A
      K/M
      protein_coding
      -0.027
      341
      7
      ...
      NA_HU_1
      57.370612
      T>A
      bulk
      driver
      1090940
      VII
      4594951.0
      snv
      non-synonymous
    
    
      2
      0
      0.1
      [A]
      2.0
      A
      R/I
      protein_coding
      -1.198
      101
      7
      ...
      NA_HU_1
      57.370612
      C>A
      bulk
      driver
      1090940
      VII
      4595191.0
      snv
      non-synonymous
    
    
      3
      1
      0.1
      [T]
      2.0
      
      
      non-coding
      0.395
      
      12
      ...
      NA_HU_1
      57.370612
      C>T
      bulk
      passenger
      1078177
      XII
      8063245.0
      snv
      non-coding
    
    
      4
      1
      0.1
      [A]
      2.0
      A
      G
      protein_coding
      0.956
      861
      14
      ...
      NA_HU_1
      57.370612
      C>A
      bulk
      passenger
      784333
      XIV
      9384437.0
      snv
      synonymous
    
  

5 rows × 83 columns

The tally of SNVs and indels across whole-population genome sequences is:



In [23]:

    
seq_dn_df[(seq_dn_df.clone!='')].groupby(['selection','population','time','variant_type']).size()









    Out[23]:





selection  population        time  variant_type
HU         WAxNA_F12_1_HU_1  32    snv             11
           WAxNA_F12_1_HU_2  32    snv             10
           WAxNA_F12_1_HU_3  32    snv             17
           WAxNA_F12_2_HU_1  32    snv              8
           WAxNA_F12_2_HU_2  32    snv             17
           WAxNA_F12_2_HU_3  32    indel            1
                                   snv             19
RM         WAxNA_F12_1_RM_1  32    indel            1
                                   snv             10
           WAxNA_F12_1_RM_2  32    snv             14
           WAxNA_F12_1_RM_3  32    snv             15
           WAxNA_F12_2_RM_1  32    snv              9
           WAxNA_F12_2_RM_2  32    snv              6
           WAxNA_F12_2_RM_3  32    snv              2
YPD        WAxNA_F12_1_YPD   0     indel            2
                                   snv             38
           WAxNA_F12_2_YPD   0     indel            7
                                   snv             55
dtype: int64



In [24]:

    
seq_dn_df[(seq_dn_df.time==0) & (seq_dn_df.clone!='') & (seq_dn_df.ploidy=='haploid')].groupby(['selection','mutation_type','variant_type']).size()









    Out[24]:





selection  mutation_type  variant_type
YPD        passenger      indel            9
                          snv             82
dtype: int64



In [25]:

    
seq_dn_df[(seq_dn_df.time==32) & (seq_dn_df.clone!='')].groupby(['selection','mutation_type','variant_type']).size()









    Out[25]:





selection  mutation_type  variant_type
HU         driver         snv             22
           passenger      indel            1
                          snv             60
RM         driver         snv             13
           passenger      indel            1
                          snv             43
dtype: int64

Bottom panels - Import phenotype evolution



In [26]:

    
# Load data
pheno_df = pd.read_csv(dir_data+'pheno/populations/pheno_populations.csv.gz', encoding='utf-8', keep_default_na=False, na_values='NaN')

# Filter out strains used for spatial control
pheno_df = pheno_df[(pheno_df.group == 'ancestral')|\
                    (pheno_df.group == 'evolved')]

groups_ph = pheno_df.groupby(['group','cross','cross_rep','selection','selection_rep'])
pheno_df = pheno_df[pheno_df.selection_rep != '']

for (ii,((group,cross,cross_rep,selection,selection_rep),g1)) in enumerate(groups_ph):
    
    if group=='evolved':
        
        df = groups_ph.get_group(('ancestral',cross,cross_rep,selection,''))        
        df.loc[:,'selection_rep'] = df.selection_rep.replace([''],[selection_rep])
        df.loc[:,'population'] = df['background']+'_'+df['cross']+'_'+df['cross_rep'].apply(str)+'_'+df['selection']+'_'+df['selection_rep'].apply(str)

        pheno_df = pheno_df.append(df)
        
pheno_df = pheno_df.reset_index(drop=True)

# Set reference as mean phenotype of the ancestral hybrid
def normalize_phenotype(df, param_abs='norm_growth_rate', param_rel='rel_growth_rate'):
    df[param_rel] = df[param_abs] - df[df.group=='ancestral'][param_abs].mean()
    return df

pheno_df = pheno_df.groupby(['selection','environment','population'], as_index=False).apply(normalize_phenotype, param_abs='norm_growth_rate', param_rel='rel_growth_rate')
pheno_df = pheno_df.groupby(['selection','environment','population'], as_index=False).apply(normalize_phenotype, param_abs='norm_doubling_time', param_rel='rel_doubling_time')

# # Filter out measurement replicates with >5% measurement error
# pheno_df['pct'] = pheno_df.groupby(['selection','environment','population','group','isolate','gene','genotype_long'])['rel_growth_rate']\
# .apply(lambda x: (x-x.mean())/float(x.mean()))
# pheno_df = pheno_df[abs(pheno_df['pct'])<10]

pheno_df.head()  # show dataframe header to stdout









    Out[26]:







  
    
      
      selection
      environment
      run
      plate
      row
      column
      group
      population
      background
      ploidy
      ...
      genotype_long
      amino_acids
      mating
      auxotrophy
      abs_growth_rate
      abs_doubling_time
      norm_growth_rate
      norm_doubling_time
      rel_growth_rate
      rel_doubling_time
    
  
  
    
      0
      HU
      HU
      1
      1
      1
      3
      evolved
      WAxNA_F12_1_HU_3
      WAxNA
      diploid
      ...
      
      
      MATa/α
      
      0.065577
      3.930664
      0.650017
      0.621451
      -0.079774
      0.158258
    
    
      1
      HU
      HU
      1
      1
      1
      4
      evolved
      WAxNA_F12_2_HU_3
      WAxNA
      diploid
      ...
      
      
      MATa/α
      
      0.058038
      4.106870
      0.610609
      0.711680
      -0.097435
      0.203138
    
    
      2
      HU
      HU
      1
      1
      1
      7
      evolved
      WAxNA_F12_1_HU_3
      WAxNA
      diploid
      ...
      
      
      MATa/α
      
      0.105026
      3.251181
      0.797564
      0.326327
      0.067773
      -0.136866
    
    
      3
      HU
      HU
      1
      1
      1
      8
      evolved
      WAxNA_F12_2_HU_3
      WAxNA
      diploid
      ...
      
      
      MATa/α
      
      0.085085
      3.554953
      0.737843
      0.438614
      0.029800
      -0.069928
    
    
      4
      HU
      HU
      1
      1
      1
      11
      evolved
      WAxNA_F12_1_HU_3
      WAxNA
      diploid
      ...
      
      
      MATa/α
      
      0.108511
      3.204089
      0.793288
      0.334084
      0.063496
      -0.129109
    
  

5 rows × 28 columns

Figure 3 - Subclonal heterogeneity



In [28]:

    
param = 'rel_growth_rate'

panels = {
    'HU': {
        'WAxNA_F12_1_HU_2':0,
        'WAxNA_F12_1_HU_3':1,
        'WAxNA_F12_2_HU_3':2
    },
    'RM': {
        'WAxNA_F12_1_RM_3':0,
        'WAxNA_F12_1_RM_4':1,
        'WAxNA_F12_2_RM_2':2
    }
}

populations = panels['HU'].keys()+panels['RM'].keys()
groups_st = seq_st_df[seq_st_df.population.isin(populations)]
groups_dn = seq_dn_df[(seq_dn_df.population.isin(populations))& \
                      (seq_dn_df.clone=='')& \
                      (seq_dn_df.gene!='non-coding')]
groups_ph = pheno_df[pheno_df.population.isin(populations)& \
                     np.isfinite(pheno_df[param])] # Take rows where param is finite
        
groups_st = groups_st.groupby('selection')
groups_dn = groups_dn.groupby('selection')
groups_ph = groups_ph.groupby(['selection','environment'])

for (ii, environment) in enumerate(['HU','RM']):
    
    fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(6, 4), sharey='row')
    
    fig.subplots_adjust(left=0.07,bottom=0.07,right=0.85,top=0.95,hspace=0.3,wspace=0.1)
    
    # Set scales
    for ax in axes[0]:
        ax.set_xlim(0, 32)
        ax.set_ylim(0, 1)
    for ax in axes[1]:
        if environment=='HU':
            ax.set_xlim(-0.3, 0.5)
            ax.set_ylim(0, 0.15)
        elif environment=='RM':
            ax.set_xlim(-0.5, 1.9)
            ax.set_ylim(0, 0.12)

    ### Top panels ###
    # De novo mutations #
    for (jj, (population, gdn)) in enumerate(groups_dn.get_group(environment).groupby('population')):
        
        # Retrieve axes
        ax1 = axes[0][panels[environment][population]]
        
        for (gene, cds_pos, sub, protein_pos, amino_acids, consequence), gdx in \
    gdn.groupby(['gene','cds_position','substitution','protein_position','amino_acids','consequence_short']):
        
            assignment = gdx.assignment.unique()[0]
            mutation_type = gdx.mutation_type.unique()[0]
            
            gdx.time = gdx.time.apply(int)
            gdx = gdx.sort_values('time').reset_index(drop=True)
            gdx = gdx.sort_index()
            
            ax1.plot(
                gdx.index.values, gdx.frequency.values,
                color=config.lineages[assignment]['fill'], 
                **utils.merge_two_dicts(config.mutation_type[mutation_type],
                                        config.consequence_short[consequence])
            )
            
            if mutation_type=='driver':
                index = np.argmax(gdx.frequency)
                ax1.annotate(gene, xy=(index,gdx.frequency[index]), style='italic', fontsize=6,
                             textcoords='offset points', xytext=(0, 13), ha = 'center', va = 'top',
                             path_effects=[path_effects.withStroke(linewidth=0.5, foreground="w")], zorder=3)
                ax1.annotate(amino_acids.split('/')[0]+protein_pos+amino_acids.split('/')[1], 
                             xy=(index,gdx.frequency[index]), fontsize=5,
                             textcoords='offset points', xytext=(0, 7), ha = 'center', va = 'top',
                             path_effects=[path_effects.withStroke(linewidth=0.4, foreground="w")], zorder=3)
    
    # Subclonal frequency #
    for (jj, (population,gst)) in enumerate(groups_st.get_group(environment).groupby('population')):

        # Retrieve axes
        ax2 = axes[0][panels[environment][population]]

        # Set title
        ax2.set_title(population.replace('_',' '), fontsize=7, weight='bold')
        # 
        gst.set_index('time', inplace=True)
        colors=[config.lineages[x]['fill'] for x in ['subclone A','subclone B','bulk']]
        gst[['subclone A','subclone B','bulk']].plot(
            ax=ax2, kind='bar', legend=False, 
            stacked=True, rot=0, width=0.75, position=0.5, color=colors
        )
        
        # Rotate the x-axis ticks
        ax2.set_xlabel('', rotation=0)

    ### Bottom panels ###
    for (jj, (population, gph)) in enumerate(groups_ph.get_group((environment,environment)).groupby('population')):

        # Retrieve axes
        ax3 = axes[1][panels[environment][population]]
        utils.simple_axes(ax3)
                                    
        for (kk, (time, gt)) in enumerate(gph.groupby('group')):
            
            print(environment, population, time)
            
            x, y = plot.histogram_binned_data(ax, gt[param], bins=34)

            ax3.plot(x, y, color=config.population['color'][time], linewidth=0.75)
            
            ax3.fill_between(x, 0, y, label=config.population['long_label'][time], 
                             alpha=0.45, facecolor=config.population['color'][time])
            
            # Mean of all isolates
            gt_all = gt.groupby(['isolate','gene','genotype_long','assignment'])
            gt_all = gt_all[param].agg(np.mean)#.mean()
            # Mean of random isolates
            gt_random = gt[(gt['assignment']=='')].groupby(['isolate','gene','genotype_long','assignment'])
            gt_random = gt_random[param].agg(np.mean)#.mean()
            # Mean of targeted isolates
            gt_target = gt[(gt['assignment']!='')].groupby(['isolate','gene','genotype_long','assignment'])
            gt_target = gt_target[param].agg(np.mean)#.mean()
            
            # Gaussian mixture model
            X = gt_random[:, np.newaxis]
            N = np.arange(1, 4)
                        
            models = gmm.gmm_fit(X, N)
            
            # Compute the AIC and the BIC
            AIC = [m.aic(X) for m in models]
            BIC = [m.bic(X) for m in models]
            M_best = models[np.argmin(BIC)]
            print BIC
            # Mean of the distribution
            for m, v in zip(abs(M_best.means_.ravel()), M_best.covariances_.ravel()):
                print('Mean: %.6f, Variance: %.6f' % (m, v,))
                ax3.plot([m,m], ax3.get_ylim(), 
                         color=config.population['color'][time], 
                         linestyle='--', dashes=(4,3), linewidth=1)
                pos = ax3.get_ylim()[0] * 0.75 + ax3.get_ylim()[1] * 0.25
                trans = ax3.get_xaxis_transform() # x in data units, y in axes fraction                    
                ax3.annotate(
                    np.around(m, 2), xy=(m, 0.85), xycoords=trans, fontsize=6,
                    color='k', va='center', ha=('right' if time=='ancestral' else 'left'),
                    xytext=((-5 if time=='ancestral' else 5),0), textcoords='offset points',
                    path_effects=[path_effects.withStroke(linewidth=0.5, foreground="w")]
                )
                        
            x_data = np.array(gt_all)
            y_data = np.repeat([0.03*(ax3.get_ylim()[1]-ax3.get_ylim()[0])], len(x_data))
            
            markerline, stemlines, baseline = ax3.stem(x_data, y_data)
                            
            plt.setp(markerline, 'markerfacecolor', config.population['color'][time], markersize = 0)
            plt.setp(stemlines, linewidth=1, color=config.population['color'][time], 
                     path_effects=[path_effects.withStroke(linewidth=0.75, foreground="w")])
            plt.setp(baseline, 'color', 'none')
            
            if len(gt_target)>0:
            
                x_data = np.array(gt_target)
                y_data = np.repeat([0.2*(ax3.get_ylim()[1]-ax3.get_ylim()[0])], len(x_data))
                
                markerline, stemlines, baseline = ax3.stem(x_data, y_data)
                
                plt.setp(markerline, 'color', config.population['color'][time], 
                         markersize = 2.75, markeredgewidth=.75, markeredgecolor='k', zorder=3)
                plt.setp(stemlines, linewidth=.75, color=config.population['color'][time],
                         path_effects=[path_effects.withStroke(linewidth=1.25, foreground='k')], zorder=2)  
                plt.setp(baseline, 'color', 'none', zorder=1)
                    
                for (isolate, gene, genotype, assignment), mean in gt_target.iteritems():
                    ax3.annotate(
                        gene, xy = (mean, 0.2), xycoords=('data','axes fraction'), 
                        xytext = (0, 8), textcoords = 'offset points', 
                        ha = 'center', va = 'top', fontsize = 6, style = 'italic',
                        path_effects=[path_effects.withStroke(linewidth=0.5, foreground="w")]
                    )
    
    # Set axes labels
    axes[0, 1].set_xlabel(r'Time, $t$ (days)')
    axes[0, 0].set_ylabel('Cumulative subclone\n frequency, $f_j$ (bars)')
    axes[0, 2].twinx().set_ylabel('Allele frequency (lines)', rotation=270, va='baseline')
    axes[1, 1].set_xlabel(r'Rel. growth rate, $\lambda_{k}(t)$')
    axes[1, 0].set_ylabel('Density')

    # Set legends
    leg1 = axes[0, 2].legend(bbox_to_anchor=(1.3, 0.75), frameon=False,
                             loc='center left', borderaxespad=0.,
                             handlelength=0.75, title='Lineage', prop={'size':6})
    
    driver_artist = lines.Line2D((0,1),(0,0), color=config.lineages['bulk']['fill'], 
                                 **config.mutation_type['driver'])
    passenger_artist = lines.Line2D((0,1),(0,0), color=config.lineages['bulk']['fill'], 
                                    **config.mutation_type['passenger'])
    nonsyn_artist = lines.Line2D((0,1),(0,0), mfc=config.lineages['bulk']['fill'],
                                 linestyle='', linewidth=1.5,
                                 path_effects=[path_effects.withStroke(linewidth=2, foreground="k")],
                                 **config.consequence_short['non-synonymous'])
    syn_artist = lines.Line2D((0,1),(0,0), mfc=config.lineages['bulk']['fill'],
                              linestyle='', linewidth=1.5,
                              path_effects=[path_effects.withStroke(linewidth=2, foreground="k")], 
                              **config.consequence_short['synonymous'])
    leg2 = axes[0, 2].legend([driver_artist,passenger_artist,nonsyn_artist,syn_artist], 
                             ['driver','passenger','non-synonymous','synonymous'], 
                             bbox_to_anchor=(1.3, 0.25), ncol=1,
                             frameon=False, loc='lower left',
                             borderaxespad=0, handlelength=1.75, 
                             title='Mutation', prop={'size':6})
    axes[0, 2].add_artist(leg1)
    
    axes[0, 2].get_legend().get_title().set_fontsize('7')
    
    leg3 = axes[1, 2].legend(bbox_to_anchor=(1.3, 0.5), frameon=False, 
                             loc='center left', borderaxespad=0., framealpha=1,
                             handlelength=0.75, title='Time', prop={'size':6})
    
    axes[1, 2].get_legend().get_title().set_fontsize('7')
    
    for leg in [leg1,leg2]:
        plt.setp(leg.get_title(), fontsize=7)
    
    # Set panel labels
    axes[0,0].text(-0.24, 1.1,  chr(2*ii + ord('A')), transform=axes[0,0].transAxes,
                   fontsize=9, fontweight='bold', va='top', ha='right')
    axes[0,1].text(0.5, 1.2, 'Selection: %s' % config.selection['long_label'][environment], 
                   transform=axes[0,1].transAxes, fontsize=8, va='center', ha='center')
    axes[1,0].text(-0.24, 1.1,  chr(2*ii + ord('B')), transform=axes[1,0].transAxes,
                   fontsize=9, fontweight='bold', va='top', ha='right')

    # Axes limits
    for ax in fig.get_axes():
        
        ax.xaxis.label.set_size(6)
        ax.yaxis.label.set_size(6)
        ax.tick_params(axis='both', which='major', size=2, labelsize=6)
        ax.tick_params(axis='both', which='minor', size=0, labelsize=0)
        
        plt.setp(ax.get_xticklabels(), fontsize=6)
        plt.setp(ax.get_yticklabels(), fontsize=6)
        
        for loc in ['top','bottom','left','right']:
            ax.spines[loc].set_linewidth(0.75)
        
        if ax.is_last_row():
            if environment=='HU':
                ax.xaxis.set_major_locator(ticker.MaxNLocator(nbins=5))
                ax.yaxis.set_major_locator(ticker.MaxNLocator(nbins=5))
            elif environment=='RM':
                ax.xaxis.set_major_locator(ticker.MaxNLocator(nbins=5))
                ax.yaxis.set_major_locator(ticker.MaxNLocator(nbins=4))
    
    plot.save_figure(dir_paper+'figures/figure3/figure3_%s' % environment)
    plt.show()









    



('HU', u'WAxNA_F12_1_HU_2', u'ancestral')
[-284.23225508271184, -272.21007639285102, -258.19741532980646]
Mean: 0.000036, Variance: 0.002758
('HU', u'WAxNA_F12_1_HU_2', u'evolved')
[-186.94223493879264, -185.11558777103389, -173.39666344572879]
Mean: 0.054732, Variance: 0.007596
('HU', u'WAxNA_F12_1_HU_3', u'ancestral')
[-284.23225508271184, -272.21007639285102, -258.1974153298064]
Mean: 0.000036, Variance: 0.002758
('HU', u'WAxNA_F12_1_HU_3', u'evolved')
[-131.31986323703262, -149.25143821279616, -137.15645452825072]
Mean: 0.028203, Variance: 0.003428
Mean: 0.254325, Variance: 0.004882
('HU', u'WAxNA_F12_2_HU_3', u'ancestral')
[-263.64658590072952, -249.61261745149946, -235.20187429120534]
Mean: 0.000059, Variance: 0.003417
('HU', u'WAxNA_F12_2_HU_3', u'evolved')
[-196.21637022666482, -183.80029162667867, -175.41868584307633]
Mean: 0.137476, Variance: 0.006896






    












    



('RM', u'WAxNA_F12_1_RM_3', u'ancestral')
[-206.80324105953684, -211.82110430248329, -205.25484454861623]
Mean: 0.006682, Variance: 0.004395
Mean: 0.295974, Variance: 0.000290
('RM', u'WAxNA_F12_1_RM_3', u'evolved')
[143.66527693589273, 87.284098812658186, 86.31216015433526]
Mean: 0.969834, Variance: 0.018774
Mean: 1.363481, Variance: 0.005851
Mean: 0.087430, Variance: 0.016931
('RM', u'WAxNA_F12_1_RM_4', u'ancestral')
[-206.80324105953684, -211.82110430248329, -205.25484454861623]
Mean: 0.295974, Variance: 0.000290
Mean: 0.006682, Variance: 0.004395
('RM', u'WAxNA_F12_1_RM_4', u'evolved')
[104.93036222867757, 31.444643123539485, 33.778211560167328]
Mean: 0.878288, Variance: 0.016522
Mean: 0.122248, Variance: 0.016378
('RM', u'WAxNA_F12_2_RM_2', u'ancestral')
[-234.32468248365427, -219.98715046624565, -222.66055750224211]
Mean: 0.000176, Variance: 0.004637
('RM', u'WAxNA_F12_2_RM_2', u'evolved')
[65.971647630509167, -61.248773611046211, -58.14314643222103]
Mean: 0.071637, Variance: 0.009230
Mean: 0.942282, Variance: 0.024275

Fig. 3: Reconstruction of subclonal dynamics. (A and C), Competing subclones evolved in (A) hydroxyurea and (C) rapamycin experienced a variety of fates. Time is on the $x$-axis, starting after crossing when the population has no competing subclones. Cumulative haplotype frequency of subclones (bars) and allele frequency of de novo mutants (lines) are on the $y$-axis. Most commonly, selective sweeps were observed where a spontaneous mutation arose and increased in frequency. Driver mutations are solid lines and passenger mutations are dashed lines, colored by subclone assignment; circles and squares denote non-synonymous and synonymous mutations, respectively. (B and D) Variability in intra-population growth rate , estimated by random sampling of 96 individuals at initial ($t = 0$ days, blue) and final ($t = 32$ days, red) time points ($n = 32$ technical replicates per individual). Mean growth rates by individual are shown at the foot of the histogram (Fig. S7). The posterior means of the distribution modes fitted by a Gaussian mixture model are indicated as dashed lines. The fitter individuals (pins) carry driver mutations, measured by targeted sampling and sequencing.

Figure S3 - Sequence evolution of WAxNA founders



In [29]:

    
panels = {
    'HU': {
        'WAxNA_F12_1_HU_1':(0,1), 
        'WAxNA_F12_1_HU_2':(0,2), 
        'WAxNA_F12_1_HU_3':(0,3), 
        'WAxNA_F12_2_HU_1':(1,1), 
        'WAxNA_F12_2_HU_2':(1,2), 
        'WAxNA_F12_2_HU_3':(1,3)
    },
    'RM': {
        'WAxNA_F2_1_RM_1':(0,0),
        'WAxNA_F12_1_RM_1':(0,1), 
        'WAxNA_F12_1_RM_2':(0,2), 
        'WAxNA_F12_1_RM_3':(0,3), 
        'WAxNA_F12_1_RM_4':(0,4),
        'WAxNA_F2_1_RM_2':(1,0),
        'WAxNA_F12_2_RM_1':(1,1), 
        'WAxNA_F12_2_RM_2':(1,2), 
        'WAxNA_F12_2_RM_3':(1,3), 
        'WAxNA_F12_2_RM_4':(1,4)
    }
}

populations = panels['HU'].keys()+panels['RM'].keys()
groups_st = seq_st_df[seq_st_df.population.isin(populations)].groupby(['selection','population'])
groups_dn = seq_dn_df[(seq_dn_df.population.isin(populations))&\
                      (seq_dn_df.clone=='')&\
                      (seq_dn_df.gene!='non-coding')].groupby(['selection','population'])

# Create a figure with subplots
fig = plt.figure(figsize=(10, 10))

grid = gridspec.GridSpec(2, 1)

gs = {}

for (ii, e) in enumerate(['HU','RM']):
    
    nrows = 2
    ncols = 5
    gs[e] = gridspec.GridSpecFromSubplotSpec(nrows, ncols,
                                             subplot_spec=grid[ii],
                                             hspace=0.3, wspace=0.15)
    
    for (jj, p) in enumerate(panels[e]):
        
        # Retrieve axes
        ax1 = plt.subplot(gs[e][panels[e][p]])
        ax2 = ax1.twinx()
                    
        ### Subclone frequency ###
        gst = groups_st.get_group((e,p))
        
        # Set title
        ax1.set_title(p.replace('_',' '), fontsize=7, weight='bold')

        # Bar plot
        gst = gst.set_index('time')
        gst = gst[['subclone A','subclone B','subclone C','subclone D','bulk']]
        gst.plot(ax=ax1, kind='bar',
                 legend=False, stacked=True, width=0.75, position=0.5,
                 color=[config.lineages[c]['fill'] for c in gst.columns])
        
        ### De novo mutations ###
        if (e,p) in groups_dn.groups.keys():

            gdn = groups_dn.get_group((e,p))
            
            for (gene, pos, cds, sub, protein_pos, amino_acids, consequence), gdx \
            in gdn.groupby(['gene','pos','cds_position','substitution',\
                            'protein_position','amino_acids','consequence_short']):
                
                assignment = gdx.assignment.unique()[0]
                mutation_type = gdx.mutation_type.unique()[0]
            
                gdx = gdx.sort_values('time').reset_index(drop=True)
                gdx = gdx.sort_index()

                ax2.plot(gdx.index.values, gdx.frequency.values,
                         color=config.lineages[assignment]['line'],
                         **utils.merge_two_dicts(config.mutation_type[mutation_type],
                                                 config.consequence_short[consequence]))

                if mutation_type=='driver':
                    index = np.argmax(gdx.frequency)
                    ax2.annotate(
                        gene, xy=(index,gdx.frequency[index]), style='italic', fontsize=6,
                        textcoords='offset points', xytext=(0, 13), ha = 'center', va = 'top',
                        path_effects=[path_effects.withStroke(linewidth=0.5, foreground="w")], zorder=3
                    )
                    ax2.annotate(
                        amino_acids.split('/')[0]+protein_pos+amino_acids.split('/')[1], 
                        xy=(index,gdx.frequency[index]), fontsize=5,
                        textcoords='offset points', xytext=(0, 7), ha = 'center', va = 'top',
                        path_effects=[path_effects.withStroke(linewidth=0.4, foreground="w")], zorder=3
                    )
        
        # Set legends
        if (e,p) in [('HU','WAxNA_F12_1_HU_3'),('RM','WAxNA_F12_1_RM_4')]:
            
            leg1 = ax1.legend(bbox_to_anchor=(1.3, -0.125), ncol=1,
                              frameon=False, loc='lower left', 
                              borderaxespad=0., handlelength=0.7, 
                              title='Lineage', prop={'size':6})
    
        if (e,p) in [('HU','WAxNA_F12_2_HU_3'),('RM','WAxNA_F12_2_RM_4')]:
            
            driver_artist = lines.Line2D((0,1),(0,0), color=config.lineages['bulk']['fill'], 
                                         **config.mutation_type['driver'])
            passenger_artist = lines.Line2D((0,1),(0,0), color=config.lineages['bulk']['fill'], 
                                            **config.mutation_type['passenger'])
            nonsyn_artist = lines.Line2D((0,1),(0,0), mfc=config.lineages['bulk']['fill'], linestyle='',
                                         path_effects=[path_effects.withStroke(linewidth=2, foreground="k")],
                                         **config.consequence_short['non-synonymous'])
            syn_artist = lines.Line2D((0,1),(0,0), mfc=config.lineages['bulk']['fill'], linestyle='', 
                                      path_effects=[path_effects.withStroke(linewidth=2, foreground="k")],
                                      **config.consequence_short['synonymous'])
            leg2 = ax1.legend([driver_artist,passenger_artist,nonsyn_artist,syn_artist], 
                              ['driver','passenger','non-synonymous','synonymous'], 
                              bbox_to_anchor=(1.3, 1.125), ncol=1,
                              frameon=False, loc='upper left',
                              borderaxespad=0, handlelength=1.75, 
                              title='Mutation', prop={'size':6})
            
        for leg in [leg1,leg2]:
            plt.setp(leg.get_title(),fontsize=6)

        # Set axes labels
        if (e,p) in [('HU','WAxNA_F12_2_HU_2'),('RM','WAxNA_F12_2_RM_2')]:
            ax1.set_xlabel(r'Time, $t$ (days)')
        else:
            ax1.set_xlabel('')
        if (e,p) in [('HU','WAxNA_F12_1_HU_1'),('RM','WAxNA_F2_1_RM_1'),
                     ('HU','WAxNA_F12_2_HU_1'),('RM','WAxNA_F2_1_RM_2')]:
            ax1.set_ylabel('Cumulative subclone\n frequency, $f_j$ (bars)')
        else:
            ax1.set_yticklabels([])
        if (e,p) in [('HU','WAxNA_F12_1_HU_3'),('RM','WAxNA_F12_1_RM_4'),
                     ('HU','WAxNA_F12_2_HU_3'),('RM','WAxNA_F12_2_RM_4')]:
            ax2.set_ylabel('Allele frequency (lines)', rotation=270, va='baseline')
        else:
            ax2.set_yticklabels([])
        
        plt.setp(ax1.xaxis.get_majorticklabels(), rotation=0) # rotate the x-axis ticks

        # Set panel labels
        if (e,p) in [('HU','WAxNA_F12_1_HU_1'),('RM','WAxNA_F2_1_RM_1')]:
            ax1.text(-0.25, 1.2,  chr(ii + ord('A')), transform=ax1.transAxes,
                     fontsize=9, fontweight='bold', va='center', ha='right')
        if (e,p) in [('HU','WAxNA_F12_1_HU_2'),('RM','WAxNA_F12_1_RM_2')]:
            ax1.text(0.5, 1.2, 'Selection: %s' % config.selection['long_label'][e],
                     transform=ax1.transAxes, fontsize=8, va='center', ha='center')

for ax in fig.get_axes():
            
    ax.set_ylim(0, 1) # axes limits
    
    ax.xaxis.label.set_size(6)
    ax.yaxis.label.set_size(6)
    ax.tick_params(axis='both', which='major', size=2, labelsize=6)
    ax.tick_params(axis='both', which='minor', size=0, labelsize=0)
            
    plt.setp(ax.get_xticklabels(), fontsize=6)
    plt.setp(ax.get_yticklabels(), fontsize=6)

    for tick in ax.get_xticklabels():
        tick.set_visible(True)
        
    for loc in ['top','bottom','left','right']:
        ax.spines[loc].set_linewidth(.75)

plot.save_figure(dir_supp+'figures/supp_figure_seq_subclonal_dynamics/supp_figure_seq_subclonal_dynamics_cross')
plt.show()

Fig. S3: Subclonal dynamics in time for WAxNA founders evolved in (A) hydroxyurea and (B) rapamycin, measured by whole-population sequencing. Time is on the $x$-axis, starting after crossing when the population has no competing subclones. Cumulative haplotype frequency of subclones (bars) and allele frequency of de novo mutants (lines) are on the $y$-axis. Driver mutations are solid lines and passenger mutations are dashed lines, colored by subclone assignment; circles and squares denote non-synonymous and synonymous mutations, respectively. No macroscopic subclones or de novo mutations were detected in any of the control replicates in YPD.

Figure S4 - Sequence evolution of WA, NA founders



In [30]:

    
panels = {
    'HU': {
        'WA_HU_1':(0,0),
        'WA_HU_2':(0,1),
        'NA_HU_1':(0,2),
        'NA_HU_2':(0,3),
    },
    'RM': {
        'WA_RM_1':(0,0),
        'WA_RM_2':(0,1),
        'NA_RM_1':(0,2),
        'NA_RM_2':(0,3),
    }
}

populations = panels['HU'].keys()+panels['RM'].keys()
groups_dn = seq_dn_df[(seq_dn_df.population.isin(populations)) & \
                      (seq_dn_df.clone=='') & \
                      (seq_dn_df.gene!='non-coding')].groupby(['selection','population'])

# Get a figure with a lot of subplots
fig = plt.figure(figsize=(8, 5))

grid = gridspec.GridSpec(2, 1, hspace=0.5)

gs = {}

for (ii, e) in enumerate(['HU','RM']):
    
    nrows = 1
    ncols = 4
    gs[e] = gridspec.GridSpecFromSubplotSpec(nrows, ncols,
                                             subplot_spec=grid[ii],
                                             wspace=0.15)

    ### De novo mutations ###
    for (jj, p) in enumerate(panels[e].keys()):
        
        # Retrieve axes
        ax = plt.subplot(gs[e][panels[e][p]])
        
        # Set title
        ax.set_title(p.replace('_',' '), fontsize=7, weight='bold')
    
        # Set axes labels
        if (e,p) in [('HU','WA_HU_1'),('RM','WA_RM_1')]:
            ax.set_ylabel('Allele frequency')
            ax.text(-0.15, 1.2, chr(ii + ord('A')), transform=ax.transAxes,
                    fontsize=9, fontweight='bold', va='center', ha='right')
            ax.text(0., 1.2, 'Selection: %s' % config.selection['long_label'][e], 
                    transform=ax.transAxes, fontsize=8, va='center', ha='left')
            ax.set_yticklabels([0.0,0.2,0.4,0.6,0.8,1.0])
        else:
            ax.set_yticklabels([])

        ax.set_xlabel(r'Time, $t$ (days)')
        
        # Set legend
        if (e,p) in [('HU','NA_HU_2')]:
            driver_artist = lines.Line2D((0,1),(0,0), color=config.lineages['bulk']['fill'], 
                                         **config.mutation_type['driver'])
            passenger_artist = lines.Line2D((0,1),(0,0), color=config.lineages['bulk']['fill'], 
                                            **config.mutation_type['passenger'])
            nonsyn_artist = lines.Line2D((0,1),(0,0), mfc=config.lineages['bulk']['fill'], linestyle='',
                                         path_effects=[path_effects.withStroke(linewidth=2, foreground="k")],
                                         **config.consequence_short['non-synonymous'])
            syn_artist = lines.Line2D((0,1),(0,0), mfc=config.lineages['bulk']['fill'], linestyle='', 
                                      path_effects=[path_effects.withStroke(linewidth=2, foreground="k")],
                                      **config.consequence_short['synonymous'])
            leg1 = ax.legend([driver_artist,passenger_artist,nonsyn_artist,syn_artist], 
                             ['driver','passenger','non-synonymous','synonymous'], 
                             bbox_to_anchor=(1.1, -0.25), ncol=1,
                             frameon=False, loc='center left',
                             borderaxespad=0, handlelength=1.75, 
                             title='Mutation', prop={'size':6})
            plt.setp(leg1.get_title(),fontsize=6)
        
        # Set empty panels
        if (e,p) in groups_dn.groups.keys():
            gdn = groups_dn.get_group((e,p))
        else:
            ax.axvspan(8, 32, facecolor='w', edgecolor='0.5', alpha=0.5, hatch='//')
            ax.annotate('Extinct', xy=(16,0.5), fontsize=6, ha='center',
                        path_effects=[path_effects.withStroke(linewidth=0.5, foreground="w")])
            continue

        for (gene, cds_pos, sub, protein_pos, amino_acids, consequence), gdx in \
    gdn.groupby(['gene','cds_position','substitution','protein_position','amino_acids','consequence_short']):
        
            assignment = gdx.assignment.unique()[0]
            mutation_type = gdx.mutation_type.unique()[0]

            gdx.time = gdx.time.apply(int)
            gdx = gdx.sort_values('time').reset_index(drop=True)
            gdx = gdx.sort_index()
            gdx = gdx.set_index('time')
            
            ax.plot(gdx.index, gdx.frequency,
                    color=config.lineages['bulk']['line'], 
                    **utils.merge_two_dicts(config.mutation_type[mutation_type],
                                            config.consequence_short[consequence]))
            
            if mutation_type=='driver':
                index = np.argmax(gdx.frequency)
                ax.annotate(
                    gene, xy=(index,gdx.frequency[index]), style='italic', fontsize=6,
                    textcoords='offset points', xytext=(0, 13), ha = 'center', va = 'top',
                    path_effects=[path_effects.withStroke(linewidth=0.5, foreground="w")], zorder=3
                )
                ax.annotate(
                    amino_acids.split('/')[0]+protein_pos+amino_acids.split('/')[1], 
                    xy=(index,gdx.frequency[index]), fontsize=5,
                    textcoords='offset points', xytext=(0, 7), ha = 'center', va = 'top',
                    path_effects=[path_effects.withStroke(linewidth=0.4, foreground="w")], zorder=3
                )

for ax in fig.get_axes():
    
    ax.set_xlim(2, 32) # axes limits
    ax.set_ylim(0, 1)
    
    ax.xaxis.label.set_size(6)
    ax.yaxis.label.set_size(6)
    ax.tick_params(axis='both', which='major', size=2, labelsize=6)
    ax.tick_params(axis='both', which='minor', size=0, labelsize=0)
    
    plt.setp(ax.get_xticklabels(), fontsize=6)
    plt.setp(ax.get_yticklabels(), fontsize=6)
    
    ax.set_xscale('log', base=2)
    ax.set_xticks([2, 4, 8, 16, 32])
    ax.xaxis.set_major_formatter(ticker.ScalarFormatter())
    ax.yaxis.set_major_locator(ticker.MaxNLocator(nbins=5))
    
    for loc in ['top','bottom','left','right']:
        ax.spines[loc].set_linewidth(0.75)

plot.save_figure(dir_supp+'figures/supp_figure_seq_subclonal_dynamics/supp_figure_seq_subclonal_dynamics_parents')
plt.show()

Fig. S4: Subclonal dynamics in time for WA and NA founders evolved in (A) hydroxyurea and (B) rapamycin, measured by whole-population sequencing. WA founders evolved in hydroxyurea did not survive after 4 days. Driver mutations are solid lines and passenger mutations are dashed lines, colored by subclone assignment; circles and squares denote non-synonymous and synonymous mutations, respectively. No de novo mutations were detected in any of the control replicates in YPD.

Figure S9 - Phenotype evolution

We are inferring the model's components ($F, \lambda_1, \sigma_{\lambda_1}, \lambda_2, \sigma_{\lambda_2}$) using a Gaussian mixture model.



In [18]:

    
param='rel_growth_rate'

scatter_panels = {
    'WAxNA_F12_1_HU_2':0,
    'WAxNA_F12_1_HU_3':1,
    'WAxNA_F12_2_HU_3':2,
    'WAxNA_F12_1_RM_3':3,
    'WAxNA_F12_1_RM_4':4,
    'WAxNA_F12_2_RM_2':5,
}

data = pheno_df[pheno_df.population.isin(scatter_panels.keys())& \
                np.isfinite(pheno_df[param])] # Take rows where param is finite
data = pd.pivot_table(
    data, 
    index=['selection','population','group','isolate','gene','genotype_long','assignment'], 
    columns='environment', 
    values=param,
    aggfunc=np.mean
)

corr = pheno_df[pheno_df.population.isin(scatter_panels.keys())& \
                np.isfinite(pheno_df[param])] # Take rows where param is finite
corr = pd.pivot_table(
    corr, 
    index=['isolate','gene','genotype_long','assignment'], 
    columns=['selection','population','group','environment'], 
    values=param,
    aggfunc=np.mean
)
corr = corr.groupby(level=['selection','population','group'], axis=1, group_keys=False)
corr = corr.apply(lambda x: x.corr(method='spearman'))
corr = corr.query('environment==\'SC\'')
corr = pd.melt(corr).dropna()
corr = corr.pivot_table(columns=['group'], index=['selection','population','environment'], values='value')
    
fig = plt.figure(figsize=(7.5,5.25))
fig.subplots_adjust(left=0.02, right=0.98, bottom=0.02, top=0.98)

# Make outer gridspec
grid = gridspec.GridSpec(nrows=2, ncols=3, width_ratios=[2, 2, 2], hspace=.5, wspace=.25) 

gs = {}

for ii, ((s, p), gp) in enumerate(data.groupby(level=['selection','population'])):
    print(s, p)
    # Use gridspec to assign different formats to panels in one plot
    gs[(s,p)] = gridspec.GridSpecFromSubplotSpec(nrows=2, ncols=2, hspace=.05, wspace=.05, 
                                                 width_ratios=[4,1], height_ratios=[1,4], 
                                                 subplot_spec=grid[scatter_panels[p]])
    
    ax = plt.subplot(gs[(s,p)][:])
    ax_scatter = plt.subplot(gs[(s,p)][1,0])
    ax_x = plt.subplot(gs[(s,p)][0,0])
    ax_y = plt.subplot(gs[(s,p)][1,1])

    # Define plot ranges at beginning, since used often later
    x = gp['SC'].values
    y = gp[s].values
    
    if s=='HU':
        x_range = [-0.2, 0.45]
        y_range = [-0.175, 0.225]
        x_count_range = [0, 0.4]
        y_count_range = [0, 0.3]
    elif s=='RM':
        x_range = [-0.4, 1.6]
        y_range = [-0.2, 0.19]
        x_count_range = [0, 0.4]
        y_count_range = [0, 0.2]

    # Set title
    ax_x.set_title(p.replace('_',' '), fontsize=7, weight='bold')
    
    ax_scatter.annotate(
        'Ancestral (t = 0d)\n' r'$\rho$ = {:.2f}'.format(corr.ix[s, p, s]['ancestral']),
        xy=(1.25, 1.15), xycoords='axes fraction', fontsize=6,
        color=config.population['color']['ancestral'], ha='center', va='bottom'
    )
    ax_scatter.annotate(
        'Evolved (t = 32d)\n' r'$\rho$ = {:.2f}'.format(corr.ix[s, p, s]['evolved']),
        xy=(1.25, 1.025), xycoords='axes fraction', fontsize=6,
        color=config.population['color']['evolved'], ha='center', va='bottom'
    )
    ax_scatter.axvline(x=0, ls='--', lw=1.5, color='lightgray', zorder=0)
    ax_scatter.axhline(y=0, ls='--', lw=1.5, color='lightgray', zorder=0)

    for jj, (t, gt) in enumerate(gp.groupby(level='group')):
        
        gt_all = gt.groupby(level=['isolate','gene','genotype_long','assignment']).agg([np.mean])
        gt_random = gt.query('assignment==\'\'').groupby(level=['isolate','gene','genotype_long','assignment']).agg([np.mean])
        gt_target = gt.query('assignment!=\'\'').groupby(level=['isolate','gene','genotype_long','assignment']).agg([np.mean])
        print gt_target
        x_a = gt_all[s]
        y_a = gt_all['SC']
        
        x_r = gt_random[s]
        y_r = gt_random['SC']
        color = config.population['color'][t]
        
        # Scatter plot
        plot.scatter_plot(x_r, y_r, ax=ax_scatter, marker='.', color=color, ms=3)
        ax_scatter.set_xlim(x_range)
        ax_scatter.set_ylim(y_range)

#         ax_scatter.annotate(corr.ix[s, p, 'SC'][t],
#                             xy=(0.95, 0.05), xycoords='axes fraction', fontsize=8,
#                             color=color, ha='right', va='bottom')
        
        for (isolate, gene, genotype, assignment), data in gt_target.iterrows():
            x_t = gt_target[s]
            y_t = gt_target['SC']
            plot.scatter_plot(x_t, y_t, ax=ax_scatter, marker='o', ms=3, mec='k', mfc=color)
            ax_scatter.annotate(
                gene, xy = (data[s], data['SC']), xycoords='data', 
                xytext = (0, 8), textcoords = 'offset points', 
                ha = 'center', va = 'top', 
                fontsize = 6, style = 'italic', 
                path_effects=[path_effects.withStroke(linewidth=0.5, foreground="w")]
            )
        
        # x-axis
        plot.histogram_x(x_r, ax=ax_x, time=t)
        ax_x.set_xlim(x_range)
        ax_x.set_ylim(y_count_range)
        
        # Mean of sequenced isolates       
#         lollipops(x_s, ax_x)
        
        # y-axis
        plot.histogram_y(y_r, ax=ax_y, time=t)
        ax_y.set_xlim(x_count_range)
        ax_y.set_ylim(y_range)

# Set axes labels
ax = plt.subplot(gs[('HU','WAxNA_F12_1_HU_3')][1,0])
ax.set_xlabel('%s\nRel. growth rate, $\lambda_k(t)$' % config.environment['long_label']['HU'])
ax = plt.subplot(gs[('HU','WAxNA_F12_1_HU_2')][1,0])
ax.set_ylabel('Rel. growth rate, $\lambda_k(t)$\n%s' % config.environment['long_label']['SC'])

ax = plt.subplot(gs[('RM','WAxNA_F12_1_RM_4')][1,0])
ax.set_xlabel('%s\nRel. growth rate, $\lambda_k(t)$' % config.environment['long_label']['RM'])
ax = plt.subplot(gs[('RM','WAxNA_F12_1_RM_3')][1,0])
ax.set_ylabel('Rel. growth rate, $\lambda_k(t)$\n%s' % config.environment['long_label']['SC'])

# Set panel labels
ax = plt.subplot(gs[('HU','WAxNA_F12_1_HU_2')][0,0])
ax.text(-.2, 1.75, chr(ord('A')), transform=ax.transAxes,
        fontsize=9, fontweight='bold', va='center', ha='right')
ax = plt.subplot(gs[('HU','WAxNA_F12_1_HU_3')][0,0])
ax.text(0.5, 1.75, 'Selection: %s' % config.selection['long_label']['HU'], transform=ax.transAxes,
        fontsize=8, va='center', ha='center')
ax = plt.subplot(gs[('RM','WAxNA_F12_1_RM_3')][0,0])
ax.text(-.2, 1.75, chr(ord('B')), transform=ax.transAxes,
        fontsize=9, fontweight='bold', va='center', ha='right')
ax = plt.subplot(gs[('RM','WAxNA_F12_1_RM_4')][0,0])
ax.text(0.5, 1.75, 'Selection: %s' % config.selection['long_label']['RM'], transform=ax.transAxes,
        fontsize=8, va='center', ha='center')
    
# Axes limits
for ax in fig.get_axes():
    ax.xaxis.label.set_size(6)
    ax.yaxis.label.set_size(6)
    ax.tick_params(axis='both', which='major', size=2, labelsize=6)
    ax.tick_params(axis='both', which='minor', size=0, labelsize=6)
    
    for loc in ['top','bottom','left','right']:
        ax.spines[loc].set_linewidth(0.75)

plot.save_figure(dir_supp+'figures/supp_figure_pheno_evolution/supp_figure_pheno_evolution')
plt.show()









    



(u'HU', u'WAxNA_F12_1_HU_2')
Empty DataFrame
Columns: [(HU, mean), (RM, mean), (SC, mean)]
Index: []
                                             HU   RM        SC
                                           mean mean      mean
isolate gene genotype_long assignment                         
1       RNR4 RNR4*/RNR4*   bulk       -0.031631  NaN -0.050832
2       RNR4 RNR4*/RNR4*   bulk        0.058815  NaN -0.094872
(u'HU', u'WAxNA_F12_1_HU_3')
Empty DataFrame
Columns: [(HU, mean), (RM, mean), (SC, mean)]
Index: []
                                             HU   RM        SC
                                           mean mean      mean
isolate gene genotype_long assignment                         
1       RNR2 RNR2*/RNR2    subclone A  0.176067  NaN -0.030428
2       RNR2 RNR2*/RNR2    subclone A  0.209420  NaN -0.037551
3       RNR2 RNR2*/RNR2    subclone A  0.192393  NaN -0.056146
4       RNR2 RNR2*/RNR2    subclone A  0.203274  NaN -0.046559
5       RNR2 RNR2*/RNR2    subclone A  0.349191  NaN  0.032273
6       RNR2 RNR2*/RNR2    subclone A  0.384629  NaN  0.070329
(u'HU', u'WAxNA_F12_2_HU_3')
Empty DataFrame
Columns: [(HU, mean), (RM, mean), (SC, mean)]
Index: []
                                             HU   RM        SC
                                           mean mean      mean
isolate gene genotype_long assignment                         
1       RNR4 RNR4*/RNR4*   subclone A  0.186776  NaN -0.025375
2       RNR2 RNR2*/RNR2    bulk        0.167723  NaN  0.058833
3       RNR2 RNR2*/RNR2    bulk        0.173507  NaN -0.018978
4       RNR4 RNR4*/RNR4*   subclone A  0.174113  NaN -0.056191
5       RNR4 RNR4*/RNR4*   subclone A  0.153926  NaN -0.089085
6       RNR4 RNR4*/RNR4*   subclone A  0.158836  NaN -0.056436
(u'RM', u'WAxNA_F12_1_RM_3')
Empty DataFrame
Columns: [(HU, mean), (RM, mean), (SC, mean)]
Index: []
                                        HU        RM        SC
                                      mean      mean      mean
isolate gene genotype_long assignment                         
1       TOR1 TOR1*/TOR1    bulk        NaN  1.300394  0.057327
2       FPR1 FPR1*/FPR1*   subclone A  NaN  0.868465 -0.136658
3       FPR1 FPR1*/FPR1*   subclone A  NaN  0.842563 -0.123537
(u'RM', u'WAxNA_F12_1_RM_4')
Empty DataFrame
Columns: [(HU, mean), (RM, mean), (SC, mean)]
Index: []
Empty DataFrame
Columns: [(HU, mean), (RM, mean), (SC, mean)]
Index: []
(u'RM', u'WAxNA_F12_2_RM_2')
Empty DataFrame
Columns: [(HU, mean), (RM, mean), (SC, mean)]
Index: []
                                        HU        RM        SC
                                      mean      mean      mean
isolate gene genotype_long assignment                         
1                          bulk        NaN -0.010488 -0.006145
2                          bulk        NaN -0.018509 -0.055439
3                          bulk        NaN  0.017906 -0.028011
4                          bulk        NaN -0.045135 -0.069053
5                          bulk        NaN  0.118939  0.011753
6       TOR1 TOR1*/TOR1    bulk        NaN  0.740531 -0.033746

Fig. S9: Variability in intra-population growth rate and fitness correlations. Fitness correlations of ancestral and evolved populations across environments, estimated by random sampling of individuals at initial (0 days, green) and final time points (32 days, purple), before and after selection in (A) hydroxyurea and (B) rapamycin. The relative growth rate $\lambda_k(t)$ per individual $k$ is shown, calculated by averaging over ${n_r\,{=}\,32}$ technical replicates per individual. Relative growth rates are normalized with respect to the mean population growth rate $\langle\lambda_k\rangle_{t=0}$ at $t=0$ days (see Figures 3B and 3D). The relative growth rates $\lambda_k(t)$ in the stress environment ($x$-axis) are compared to the control environment ($y$-axis). Using a Gaussian mixture model, we found the posterior probability of the mixture modes of the the best-fit mixture (solid lines). The posterior means of the distribution modes are indicated as dashed lines. The fitter individuals carry driver mutations, as determined by targeted sampling and sequencing. Spearman's rank correlation, $\rho$, is shown on the top right of each panel, to assess the association between the growth rate of isolates in the stress and control environments at 0 and 32 days.

	background	cross	cross_rep	selection	selection_rep	subclone A	subclone B	subclone C	subclone D	time	population	sample	clonal	bulk
0	WAxNA	F12	2	RM	2	0.0200	NaN	NaN	NaN	2	WAxNA_F12_2_RM_2	WAxNA_F12_2_RM_2_T2	0.0200	0.9800
1	WAxNA	F12	2	RM	2	0.0202	NaN	NaN	NaN	4	WAxNA_F12_2_RM_2	WAxNA_F12_2_RM_2_T4	0.0202	0.9798
2	WAxNA	F12	2	RM	2	0.0705	NaN	NaN	NaN	8	WAxNA_F12_2_RM_2	WAxNA_F12_2_RM_2_T8	0.0705	0.9295
3	WAxNA	F12	2	RM	2	0.1098	NaN	NaN	NaN	16	WAxNA_F12_2_RM_2	WAxNA_F12_2_RM_2_T16	0.1098	0.8902
4	WAxNA	F12	2	RM	2	0.1097	NaN	NaN	NaN	32	WAxNA_F12_2_RM_2	WAxNA_F12_2_RM_2_T32	0.1097	0.8903

	ac	af	alt	an	allele	amino_acids	biotype	baseqranksum	cds_position	chr_arabic	...	population	coverage	substitution	assignment	mutation_type	chr_length	chr_roman	pos_cum	variant_type	consequence_short
0	1	0.1	[A]	2.0			non-coding	1.093		7	...	NA_HU_1	57.370612	G>A	bulk	passenger	1090940	VII	4141136.0	snv	non-coding
1	0	0.1	[A]	2.0	A	K/M	protein_coding	-0.027	341	7	...	NA_HU_1	57.370612	T>A	bulk	driver	1090940	VII	4594951.0	snv	non-synonymous
2	0	0.1	[A]	2.0	A	R/I	protein_coding	-1.198	101	7	...	NA_HU_1	57.370612	C>A	bulk	driver	1090940	VII	4595191.0	snv	non-synonymous
3	1	0.1	[T]	2.0			non-coding	0.395		12	...	NA_HU_1	57.370612	C>T	bulk	passenger	1078177	XII	8063245.0	snv	non-coding
4	1	0.1	[A]	2.0	A	G	protein_coding	0.956	861	14	...	NA_HU_1	57.370612	C>A	bulk	passenger	784333	XIV	9384437.0	snv	synonymous

	selection	environment	run	plate	row	column	group	population	background	ploidy	...	mating	abs_growth_rate	abs_doubling_time	norm_growth_rate	norm_doubling_time	rel_growth_rate	rel_doubling_time
0	HU	HU	1	1	1	3	evolved	WAxNA_F12_1_HU_3	WAxNA	diploid	...	MATa/α	0.065577	3.930664	0.650017	0.621451	-0.079774	0.158258
1	HU	HU	1	1	1	4	evolved	WAxNA_F12_2_HU_3	WAxNA	diploid	...	MATa/α	0.058038	4.106870	0.610609	0.711680	-0.097435	0.203138
2	HU	HU	1	1	1	7	evolved	WAxNA_F12_1_HU_3	WAxNA	diploid	...	MATa/α	0.105026	3.251181	0.797564	0.326327	0.067773	-0.136866
3	HU	HU	1	1	1	8	evolved	WAxNA_F12_2_HU_3	WAxNA	diploid	...	MATa/α	0.085085	3.554953	0.737843	0.438614	0.029800	-0.069928
4	HU	HU	1	1	1	11	evolved	WAxNA_F12_1_HU_3	WAxNA	diploid	...	MATa/α	0.108511	3.204089	0.793288	0.334084	0.063496	-0.129109