Supplemental Information:

"Clonal heterogeneity influences the fate of new adaptive mutations"

Ignacio Vázquez-García, Francisco Salinas, Jing Li, Andrej Fischer, Benjamin Barré, Johan Hallin, Anders Bergström, Elisa Alonso-Pérez, Jonas Warringer, Ville Mustonen, Gianni Liti

Figures S6, S7 and S8

This IPython notebook is provided for reproduction of Figures S6, S7 and S8 of the paper. It can be viewed by copying its URL to nbviewer and it can be run by opening it in binder.


In [1]:
# Load external dependencies
from setup import *
# Load internal dependencies
import config,plot,utils

%load_ext autoreload
%autoreload 2

%matplotlib inline

Data import


In [2]:
# Load data for genetic constructs
pheno_df = pd.read_csv(dir_data+'pheno/genetic-constructs/pheno_genetic_constructs.csv.gz', encoding='utf-8', keep_default_na=False, na_values='NaN')
pheno_df = pheno_df[pheno_df.group!='control'] # Filter out strains used for spatial control

# Load ancestral populations as control
control_df = pd.read_csv(dir_data+'pheno/populations/pheno_populations.csv.gz', encoding='utf-8', keep_default_na=False, na_values='NaN')
control_df = control_df[(control_df.group == 'ancestral')] # Keep only the ancestral populations

# Normalise by the ancestral population
def norm(results, control, param='growth_rate'):

    results['rel_' + param] = results['norm_' + param] - control['norm_' + param].mean()
    
    return results

pheno_df = pheno_df.groupby(['selection','environment'], as_index=False).apply(
    lambda results_df: norm(results_df, control_df, param='growth_rate')
)

pheno_df = pheno_df.groupby(['selection','environment'], as_index=False).apply(
    lambda results_df: norm(results_df, control_df, param='doubling_time')
)

# Match shared controls by candidate gene
groups_1 = pheno_df.groupby(['selection','environment','gene','background'])
for (env_evo, env_test, gene, background), g1 in groups_1:
    
    if gene!='':
        # Copy the shared control (gene == '') and relabel it with the candidate gene
        tmp = groups_1.get_group((env_evo, env_test, '', background)).copy()
        tmp['gene'] = tmp['gene'].replace('', gene)
        
        # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
        pheno_df = pd.concat([pheno_df, tmp])

pheno_df = pheno_df[pheno_df.gene != '']
pheno_df = pheno_df.reset_index(drop=True)

# # Filter out measurement replicates with >5% measurement error
# pheno_df['pct'] = pheno_df.groupby(['selection','environment','gene','background','genotype_long'])['growth_rate']\
# .apply(lambda x: (x-x.mean())/float(x.mean()))
# pheno_df = pheno_df[abs(pheno_df['pct'])<0.05]

pheno_df.head() # Show dataframe header to stdout


Out[2]:
   selection  environment  run  plate  row  column  group       population  background  ploidy   ...  genotype_long  amino_acids  mating  auxotrophy  abs_growth_rate  abs_doubling_time  norm_growth_rate  norm_doubling_time  rel_growth_rate  rel_doubling_time
0  HU         HU           1    1      10   35      constructs              WA          haploid  ...  rnr4Δ                       MATa                0.027016         5.210060           0.478112          1.064578            -0.350431        0.777033
1  HU         HU           1    1      10   39      constructs              NA          haploid  ...  rnr4Δ                       MATa                NaN              NaN                NaN               NaN                 NaN              NaN
2  HU         HU           1    1      10   43      constructs              WA/WA       diploid  ...  rnr4Δ/RNR4                  MATa/α              0.039645         4.656728           0.544028          0.878248            -0.284516        0.590703
3  HU         HU           1    1      10   47      constructs              NA/NA       diploid  ...  rnr4Δ/RNR4                  MATa/α              0.071257         3.810827           0.699604          0.515390            -0.128939        0.227844
4  HU         HU           1    1      12   35      constructs              WA          haploid  ...  rnr4Δ                       MATa                0.049038         4.349951           0.552197          0.856746            -0.276346        0.569200

5 rows × 28 columns
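
As a rough illustration of the two preprocessing steps in the cell above (subtracting the ancestral mean, then attaching the shared wild-type controls to each candidate gene), the sketch below runs on a toy dataframe. All values are invented, the names `constructs`, `ancestral`, `matched_controls` and `tidy` are hypothetical, and the control matching uses a `merge` instead of the loop in the notebook. It assumes only that shared controls are marked by an empty `gene` field, as in the data loaded above.

```python
import pandas as pd

# Toy stand-ins for pheno_df and control_df; every number here is invented.
constructs = pd.DataFrame({
    "selection":        ["HU", "HU", "HU", "HU"],
    "environment":      ["HU", "HU", "SC", "SC"],
    "gene":             ["RNR4", "", "RNR4", ""],  # '' marks a shared WT control
    "background":       ["WA", "WA", "WA", "WA"],
    "norm_growth_rate": [0.50, 0.80, 0.90, 1.00],
})
ancestral = pd.DataFrame({"norm_growth_rate": [0.70, 0.90]})

# Step 1: express growth rate relative to the mean of the ancestral control.
baseline = ancestral["norm_growth_rate"].mean()
constructs["rel_growth_rate"] = constructs["norm_growth_rate"] - baseline

# Step 2: copy each shared control under every candidate gene measured in the
# same selection/environment/background combination.
keys = ["selection", "environment", "background"]
shared = constructs[constructs["gene"] == ""].drop(columns="gene")
genes = constructs.loc[constructs["gene"] != "", keys + ["gene"]].drop_duplicates()
matched_controls = shared.merge(genes, on=keys, how="inner")

# Candidate rows plus their relabelled controls; unlabelled rows are dropped.
tidy = pd.concat(
    [constructs[constructs["gene"] != ""], matched_controls],
    ignore_index=True, sort=False,
)
print(tidy[keys + ["gene", "norm_growth_rate", "rel_growth_rate"]])
```

After this step every row carries a gene label, so grouping by `gene` returns each construct together with its own wild-type control.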

Figure S6 - Strategy for strain construction


In [3]:
from IPython.display import Image
Image(filename=dir_supp+'figures/supp_figure_schematic_constructs/supp_figure_schematic_constructs_publication.png')


Out[3]:

Fig. S6: Strategy for strain construction. (A-C) Gene deletion was mediated by homologous recombination between the termini of the PCR product and the corresponding genomic sequence encoding the gene to be deleted (‘target’). Blue and red lines indicate WA and NA chromosomes, respectively. Flanking regions in green indicate the two homologous sequences targeted for recombination, which are 30-40 bp long in S. cerevisiae. Genes of interest were deleted individually in both WA and NA haploids, resulting in rnr4Δ, fpr1Δ and tor1Δ strains in both parental backgrounds. A similar strategy was used to delete genes in WA and NA homozygous diploids. RNR2 and RNR4 were deleted in only one allele, leaving the wild-type gene in the other allele. The primer sequences used are listed in Extended Data Table 6. (C) Evolved segregants with de novo mutations were isolated from the WAxNA $\text{F}_{12}$ populations. Using the same strategy, either the wild-type allele or the mutated allele of RNR2 or TOR1 could be deleted. (D) We crossed the strain constructed in (A) with the parental strain carrying the wild-type gene to obtain strains with deleted genes in the WA and NA homozygous diploids and the WA/NA hybrid.

Figures S7 and S8 - Validation tests for driver and passenger mutations


In [4]:
from scipy import stats

param = 'rel_growth_rate'

# shape = pd.DataFrame({k: 
#  pd.pivot_table(x, values='norm_growth_rate', columns=['background','genotype_long']).shape
#  for k,x in pheno_df.groupby(['selection','environment','gene'])
# })

shape = pd.pivot_table(pheno_df, values='norm_growth_rate', columns=['selection','environment','gene'], index=['background','genotype_long']).count()

# Group by a plain column name so that the key is a scalar (e.g. 'HU'), not a 1-tuple
for (ii, ((env_evo), gph1)) in enumerate(pheno_df.groupby('selection')):

    # Remove NaNs
    gph1 = gph1[np.isfinite(gph1[param])]
    
    nrows=len(gph1.groupby(['background','gene','genotype_long']))
    ncols=len(gph1.groupby(['environment']))
            
    height, width = np.array([nrows*0.11, ncols*3], dtype=float)
    
    fig = plt.figure(figsize=(width, height))
    
    fig.subplots_adjust(left=0.07,bottom=0.01,right=0.85,top=0.99)

    grid = gridspec.GridSpec(1, 2, hspace=0., wspace=0.)
    
    gs = {}
        
    for (jj, ((env_test), gph2)) in enumerate(gph1.groupby('environment')):
        
        nrows = len(gph2['gene'].unique())
        ncols = 1
        height_ratios = shape[env_evo,env_test].values.squeeze()
        gs[env_test] = gridspec.GridSpecFromSubplotSpec(nrows, ncols, height_ratios=height_ratios, 
                                                        subplot_spec=grid[0,jj], hspace=0., wspace=0.)
    
        axes = {}
        
        for (kk, ((gene), gph3)) in enumerate(gph2.groupby('gene')):
                        
            # Work on a copy so that the assignments do not act on a slice of gph2
            gph3 = gph3.copy()
            gph3['rank_background'] = gph3['background'].map(config.construct_background['position'])
            gph3['rank_genotype'] = gph3['genotype_long'].map(config.construct_genotype[env_evo][gene])
            gph3 = gph3.sort_values(['rank_background','rank_genotype'], 
                                    ascending=[False,False])
      
            if kk == 0:
                axes[env_test] = plt.subplot(gs[env_test][kk])
                ax1 = axes[env_test]
                ax1.set_title(config.environment['long_label'][env_test], fontsize=7)
            else:
                ax1 = plt.subplot(gs[env_test][kk], sharex=axes[env_test])
            
            gph3 = gph3.set_index(['background','genotype_long'], append=True)[param]\
            .unstack(['background','genotype_long'])
                        
            # boxplot
            bp = gph3.plot(
                ax=ax1, kind='box', 
                widths=0.65, vert=False, return_type='dict',
                labels=gph3.columns.get_level_values('background')
            )
            
            colors = [config.construct_background['color']['wt'][b] if g=='WT' \
                      else config.construct_background['color']['mut'][b] \
                      for b,g in zip(gph3.columns.get_level_values('background'),\
                                     gph3.columns.get_level_values('genotype_long'))]
            
            plot.boxplot_custom(bp, ax1, colors=colors, hatches=['']*len(gph3.columns))
            
            for ll, x in enumerate(gph3.columns):
                
                if x in config.construct_tests[gene]:
                    
                    wt_data = gph3[config.construct_tests[gene][x]].dropna()
                    mut_data = gph3[x].dropna()
                    z_stat, p_val = stats.ranksums(wt_data,mut_data)
                                        
                    if p_val < 0.0001:
                    
                        x_min = min(wt_data.min(),mut_data.min())
                        x_max = max(wt_data.max(),mut_data.max())
                             
                        ax1.annotate('', xy=(x_max, ll+1), xycoords='data',
                                     xytext=(x_max, ll+2), textcoords='data',
                                     arrowprops=dict(arrowstyle="-", ec='#aaaaaa', linewidth=.75,
                                                     connectionstyle="bar,fraction=-0.3"))
                        ax1.annotate('*',#utils.stars(p_val), 
                                     xy=((x_max+0.02 if env_evo=='HU' else x_max+0.04), ll+1.5), xycoords='data',
                                     ha='center', va='center', fontsize=6, weight='bold', rotation=270)
                        
            if jj==0:
                ax1.annotate(gene, 
                             xy=(-0.2, 0.5), xycoords=("axes fraction", "axes fraction"),
                             ha='right', va='center', annotation_clip=False, rotation=90, 
                             fontsize=6, fontstyle='italic', fontweight='bold')

            # reset ticks
            ax1.set_yticks([])
            ax1.set_yticklabels([])
            
            ax1.set_axisbelow(False)
            
            ### vertical ###
            ax1.xaxis.grid(ls="-", lw=.75, color="0.9", zorder=0)
            ax1.axvline(x=0., color='k', ls="--", dashes=(7, 7), lw=.75, zorder=3)
            
            ### horizontal ###
            
            ## background
            ystart, yend, ylabels = plot.set_custom_labels(gph3.columns, 0)
            
            # grid
            ygrid=[yst+1.5 for yst in set(ystart.values())]
            for ypos in ygrid:
                ax1.axhline(ypos, lw=.75, ls="-", color="0.9", zorder=2)
            
            # labels
            if jj==0:
                # tick labels
                ax1.set_yticks([y+1 for y in ylabels.values()], minor=True)
                ax1.set_yticklabels(ylabels.keys(), minor=True)
                ax1.get_yaxis().tick_left()
                if kk==0:
                    # axis label
                    ax1.set_ylabel('Background', transform=ax1.transAxes, 
                                   weight='bold', rotation=0)
                    ax1.yaxis.set_label_coords(-0.1, 1.025)
            
            ## genotype
            # labels
            if jj==1:
                # tick labels
                ylabels = gph3.columns.get_level_values('genotype_long')
                ax1.set_yticks(np.arange(1, len(ylabels)+1), minor=True)
                ax1.set_yticklabels(['/'.join(y) if isinstance(y, tuple) else y for y in ylabels], minor=True)
                ax1.get_yaxis().tick_right()
                if kk==0:
                    # axis label
                    ax1.set_ylabel('Genotype', transform=ax1.transAxes, 
                                   weight='bold', rotation=0)
                    ax1.yaxis.set_label_coords(1.1, 1.025)
            
            # set axes labels
            ax1.set_xlabel(r'Rel. growth rate, $\lambda_{bg}$', fontsize=10)
            
            if jj==1 and kk==0:
                wa_artist = patches.Rectangle((0,0), width=1, height=1, edgecolor='k',
                                              facecolor=config.construct_background['color']['mut']['WA'])
                na_artist = patches.Rectangle((0,0), width=1, height=1, edgecolor='k',
                                              facecolor=config.construct_background['color']['mut']['NA'])
                wana_artist = patches.Rectangle((0,0), width=1, height=1, edgecolor='k',
                                                facecolor=config.construct_background['color']['mut']['WA/NA'])

                leg1 = ax1.legend([wa_artist,na_artist,wana_artist], 
                                  ['WA; WA/WA','NA; NA/NA','WA/NA'], 
                                  ncol=1, loc='upper right',
                                  borderaxespad=0, handlelength=0.75, 
                                  prop={'size':5}, title='Background', 
                                  labelspacing=.32)

                wt_artist = patches.Rectangle((0,0), width=1, height=1, facecolor='0.9', edgecolor='k')
                construct_artist = patches.Rectangle((0,0), width=1, height=1, facecolor='0.1', edgecolor='k')

                leg2 = ax1.legend([wt_artist,construct_artist], 
                                  ['WT','construct'], 
                                  ncol=1, loc='lower right',
                                  borderaxespad=0, handlelength=0.75, 
                                  prop={'size':5}, title='Genotype', 
                                  labelspacing=.32)
                        
                ax1.add_artist(leg1)
        
                for leg in [leg1,leg2]:
                    plt.setp(leg.get_title(),fontsize=6)
                    leg.set_zorder(2)
                    leg.get_frame().set_edgecolor('none')
                    leg.get_frame().set_facecolor('w')
                    
            if env_evo=='RM' and env_test=='RM' and gene=='FPR1':
                transform = transforms.blended_transform_factory(ax1.transAxes, ax1.transData)
                x0, x1, y0, y1 = (1.1, 1.9, .5, 7.5)
                im = ax1.imshow(plt.imread(dir_supp+'figures/supp_figure_pheno_constructs/FPR1_LOH_inset.png'),
                                aspect='auto', extent=(x0, x1, y0, y1), transform=transform, zorder=1)
            
            transform = transforms.blended_transform_factory(ax1.transAxes, ax1.transAxes)
            if env_evo=='HU':
                ax1.set_xlim(-0.45,0.49)
                ax1.xaxis.set_major_locator( ticker.MaxNLocator(nbins = 4) )
                ax1.xaxis.set_minor_locator( ticker.MaxNLocator(nbins = 4) )
                ax1.xaxis.set_major_formatter(ticker.FormatStrFormatter('%.2f'))
                if env_test=='HU':
                    ax1.spines['left'].set_visible(False)
                    ax1.spines['right'].set_visible(True)
                    patch = patches.Rectangle((-.18,0), width=.18, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w', 
                                              transform=transform, zorder=0)
                if env_test=='SC':
                    ax1.spines['left'].set_visible(True)
                    ax1.spines['right'].set_visible(False)
                    patch = patches.Rectangle((1,0), width=.4, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w', 
                                              transform=transform, zorder=0)
            elif env_evo=='RM':
                ax1.set_xlim(-0.65,2)
                ax1.xaxis.set_major_locator( ticker.MaxNLocator(nbins = 3) )
                ax1.xaxis.set_minor_locator( ticker.MaxNLocator(nbins = 3) )
                ax1.xaxis.set_major_formatter(ticker.FormatStrFormatter('%.2f'))
                if env_test=='RM':
                    ax1.spines['left'].set_visible(False)
                    ax1.spines['right'].set_visible(True)
                    patch = patches.Rectangle((-.18,0), width=.18, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w',  
                                              transform=transform, zorder=0)
                if env_test=='SC':
                    ax1.spines['left'].set_visible(True)
                    ax1.spines['right'].set_visible(False)
                    patch = patches.Rectangle((1,0), width=.525, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w',  
                                              transform=transform, zorder=0)
                    patch.set_clip_on(False)
            ax1.add_patch(patch)
            patch.set_clip_on(False)

        # Tweak axes
        for ax in fig.get_axes():
            ax.spines['top'].set_visible(True)
            ax.spines['bottom'].set_visible(True)
            
            ax.xaxis.label.set_size(6)
            ax.yaxis.label.set_size(6)
            
            ax.tick_params(axis='x', which='both', size=0, labelsize=6)
            ax.tick_params(axis='y', which='major', size=0, labelsize=6)
            ax.tick_params(axis='y', which='minor', size=0, labelsize=6)
            for sp in ax.spines.values():
                sp.set(color='k', linewidth=.5, linestyle='-')
    
    plot.save_figure(dir_supp+'figures/supp_figure_pheno_constructs/supp_figure_pheno_constructs_%s' % env_evo)
    
plt.show()
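
The panel heights in the cell above come from `shape`, which counts how many (background, genotype) boxes each gene contributes, so every box has the same height across panels. The sketch below reproduces that count on a toy dataframe. The values and the names `toy`, `boxes_per_gene` and `height_ratios` are hypothetical; only the `pivot_table(...).count()` pattern is taken from the notebook.

```python
import pandas as pd

# Toy measurements; the numbers are invented for illustration.
toy = pd.DataFrame({
    "selection":        ["RM"] * 5,
    "environment":      ["RM"] * 5,
    "gene":             ["FPR1", "FPR1", "TOR1", "TOR1", "TOR1"],
    "background":       ["WA", "WA", "WA", "WA", "NA"],
    "genotype_long":    ["WT", "fpr1Δ", "WT", "tor1Δ", "WT"],
    "norm_growth_rate": [1.0, 1.4, 1.0, 1.2, 0.9],
})

# One row per (background, genotype), one column per (selection, environment, gene).
# count() then gives the number of non-empty rows, i.e. boxes, in each column.
boxes_per_gene = pd.pivot_table(
    toy, values="norm_growth_rate",
    index=["background", "genotype_long"],
    columns=["selection", "environment", "gene"],
).count()

# Selecting one selection/environment pair leaves a count per gene, which can be
# passed as height_ratios to a nested GridSpec.
height_ratios = boxes_per_gene["RM", "RM"].values
print(boxes_per_gene)
```

With these ratios a gene with three boxes gets a panel 1.5 times as tall as a gene with two.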


Fig. S7: Validation tests for driver mutations in hydroxyurea, measured in SC+HU (left) and SC (right). Growth rate measurements, $\lambda_{bg}$, are shown for measurement replicates of each construct ($n = 64$) and grouped by candidate gene and by background of the construct, where the background $b$ can be WA, NA (haploid); WA/WA, NA/NA (diploid); WA/NA (hybrid), and the genotype $g$ can be wild-type for the gene, deleted or hemizygous. Medians and 25th/75th percentiles across groups are shown, with medians as vertical lines and outliers highlighted. The color of each box reflects the background, with WA and WA/WA (blue), NA and NA/NA (red) and WA/NA (purple). Lighter shades indicate the wild-type (WT) control for that specific background and darker shades indicate the candidate strains. Using a non-parametric Wilcoxon rank-sum test, we compared deletion strains against their respective WT control and hemizygous strains against each other. Significance tests with a $P$-value below $10^{-4}$ are highlighted.

Fig. S8: Validation tests for driver and passenger mutations in rapamycin, measured in SC+RM (left) and SC (right). Growth rate measurements, $\lambda_{bg}$, are shown for measurement replicates of each construct ($n = 64$) and grouped by candidate gene and by background of the construct, where the background $b$ can be WA, NA (haploid); WA/WA, NA/NA (diploid); WA/NA (hybrid), and the genotype $g$ can be wild-type for the gene, deleted or hemizygous. Medians and 25th/75th percentiles across groups are shown, with medians as vertical lines and outliers highlighted. The color of each box reflects the background, with WA and WA/WA (blue), NA and NA/NA (red) and WA/NA (purple). Lighter shades indicate the wild-type (WT) control for that specific background and darker shades indicate the candidate strains. Using a non-parametric Wilcoxon rank-sum test, we compared deletion strains against their respective WT control and hemizygous strains against each other. Significance tests with a $P$-value below $10^{-4}$ are highlighted. Visual inspection of FPR1 heterozygous deletions using a spot assay (inset) reveals the immediate loss of the wild-type allele by LOH, as validated by Sanger sequencing of colonies.
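
The significance criterion described in the captions can be sketched as follows. This is a minimal example on synthetic data: the means and spread are invented, $n = 64$ mirrors the replicate count quoted above, and `is_highlighted` is a hypothetical helper rather than a function from the notebook. It uses `scipy.stats.ranksums`, the same test called in the plotting cell.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic relative growth rates for a WT control and a deletion strain.
wt = rng.normal(loc=0.00, scale=0.05, size=64)
deletion = rng.normal(loc=-0.30, scale=0.05, size=64)

def is_highlighted(control, candidate, alpha=1e-4):
    """Return (significant, p_value) for a two-sided Wilcoxon rank-sum test."""
    _, p_value = stats.ranksums(control, candidate)
    return p_value < alpha, p_value

significant, p_value = is_highlighted(wt, deletion)
print(significant, p_value)
```

A comparison is marked with an asterisk in the figures only when the $P$-value falls below the threshold; comparing a sample against itself, for instance, would not be marked.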


In [5]:
from IPython.display import Image
Image(filename=dir_supp+'figures/supp_figure_pheno_constructs/FPR1_LOH_inset.png')


Out[5]: