Supplemental Information:

"Clonal heterogeneity influences the fate of new adaptive mutations"

Ignacio Vázquez-García, Francisco Salinas, Jing Li, Andrej Fischer, Benjamin Barré, Johan Hallin, Anders Bergström, Elisa Alonso-Pérez, Jonas Warringer, Ville Mustonen, Gianni Liti

Figures S6, S7 and S8

This IPython notebook reproduces Figures S6, S7 and S8 of the paper. It can be viewed by copying its URL into nbviewer, and it can be run by opening it in Binder.



In [1]:

    
# Load external dependencies
from setup import *
# Load internal dependencies
import config,plot,utils

%load_ext autoreload
%autoreload 2

%matplotlib inline

Data import
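
The import in the next cell passes `keep_default_na=False` and `na_values='NaN'` to `pd.read_csv`. A plausible reason, given that `NA` is also the label of one of the two parental backgrounds, is that pandas would otherwise read the string `NA` as a missing value. The sketch below illustrates the difference on a made-up three-row CSV; the column subset and values are illustrative only and are not read from the data files.

```python
import io
import pandas as pd

# Hypothetical miniature of the phenotype table (not the real file)
csv_text = (
    "background,norm_growth_rate\n"
    "WA,0.48\n"
    "NA,NaN\n"
    "WA/NA,0.54\n"
)

# Default parsing: the string 'NA' is on pandas' default list of missing-value markers
default = pd.read_csv(io.StringIO(csv_text))

# Parsing as in the notebook: only the literal 'NaN' is treated as missing
strict = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values='NaN')

print(default['background'].tolist())
print(strict['background'].tolist())
```

With the stricter settings the background label survives as a string, while the missing measurement is still parsed as a missing value.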



In [2]:

    
# Load data for genetic constructs
pheno_df = pd.read_csv(dir_data+'pheno/genetic-constructs/pheno_genetic_constructs.csv.gz', encoding='utf-8', keep_default_na=False, na_values='NaN')
pheno_df = pheno_df[pheno_df.group!='control'] # Filter out strains used for spatial control

# Load ancestral populations as control
control_df = pd.read_csv(dir_data+'pheno/populations/pheno_populations.csv.gz', encoding='utf-8', keep_default_na=False, na_values='NaN')
control_df = control_df[control_df.group == 'ancestral'] # Keep only the ancestral populations

# Normalise by the ancestral population
def norm(results, control, param='growth_rate'):

    results['rel_' + param] = results['norm_' + param] - control['norm_' + param].mean()
    
    return results

pheno_df = pheno_df.groupby(['selection','environment'], as_index=False).apply(
    lambda results_df: norm(results_df, control_df, param='growth_rate')
)

pheno_df = pheno_df.groupby(['selection','environment'], as_index=False).apply(
    lambda results_df: norm(results_df, control_df, param='doubling_time')
)

# Match shared controls by candidate gene
groups_1 = pheno_df.groupby(['selection','environment','gene','background'])
for (ii,((env_evo, env_test, gene, background),g1)) in enumerate(groups_1):
    
    if gene!='':
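        tmp = groups_1.get_group((env_evo, env_test, '', background)).copy()
        tmp['gene'] = tmp['gene'].replace('', gene) # Relabel the shared control with the candidate gene
        
        pheno_df = pd.concat([pheno_df, tmp]) # DataFrame.append was removed in pandas 2.0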

pheno_df = pheno_df[pheno_df.gene != '']
pheno_df = pheno_df.reset_index(drop=True)

# # Filter out measurement replicates with >5% measurement error
# pheno_df['pct'] = pheno_df.groupby(['selection','environment','gene','background','genotype_long'])['growth_rate']\
# .apply(lambda x: (x-x.mean())/float(x.mean()))
# pheno_df = pheno_df[abs(pheno_df['pct'])<0.05]

pheno_df.head() # Show dataframe header to stdout









    Out[2]:

       selection  environment  run  plate  row  column  group       population  background  ploidy   ...  genotype_long  amino_acids  mating  auxotrophy  abs_growth_rate  abs_doubling_time  norm_growth_rate  norm_doubling_time  rel_growth_rate  rel_doubling_time
    0  HU         HU           1    1      10   35      constructs              WA          haploid  ...  rnr4Δ                       MATa                0.027016         5.210060           0.478112          1.064578            -0.350431        0.777033
    1  HU         HU           1    1      10   39      constructs              NA          haploid  ...  rnr4Δ                       MATa                NaN              NaN                NaN               NaN                 NaN              NaN
    2  HU         HU           1    1      10   43      constructs              WA/WA       diploid  ...  rnr4Δ/RNR4                  MATa/α              0.039645         4.656728           0.544028          0.878248            -0.284516        0.590703
    3  HU         HU           1    1      10   47      constructs              NA/NA       diploid  ...  rnr4Δ/RNR4                  MATa/α              0.071257         3.810827           0.699604          0.515390            -0.128939        0.227844
    4  HU         HU           1    1      12   35      constructs              WA          haploid  ...  rnr4Δ                       MATa                0.049038         4.349951           0.552197          0.856746            -0.276346        0.569200

5 rows × 28 columns
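
The two preprocessing steps in the cell above (subtracting the mean of the ancestral control, then copying the shared wild-type controls to every candidate gene) can be sketched on toy data as follows. This is a minimal illustration under simplifying assumptions, not the notebook's own code: the helper names `relative_to_control` and `share_controls` are hypothetical, the values are made up, and grouping is reduced to `background` alone.

```python
import pandas as pd

# Toy stand-in for pheno_df: two constructs plus shared wild-type controls (gene == '')
pheno = pd.DataFrame({
    'background': ['WA', 'WA', 'WA', 'WA'],
    'gene': ['RNR4', 'RNR2', '', ''],
    'norm_growth_rate': [0.48, 0.55, 0.80, 0.84],
})
# Toy stand-in for control_df (ancestral populations)
control = pd.DataFrame({'norm_growth_rate': [0.80, 0.86]})

def relative_to_control(results, control, param='growth_rate'):
    """Return a copy with rel_<param> = norm_<param> - mean of the control's norm_<param>."""
    out = results.copy()
    out['rel_' + param] = out['norm_' + param] - control['norm_' + param].mean()
    return out

def share_controls(df, by=('background',)):
    """Duplicate rows with gene == '' once per candidate gene found in the same group."""
    pieces = [df[df['gene'] != '']]
    for _, group in df.groupby(list(by)):
        shared = group[group['gene'] == '']
        for gene in sorted(set(group['gene']) - {''}):
            pieces.append(shared.assign(gene=gene))
    return pd.concat(pieces, ignore_index=True)

rel = relative_to_control(pheno, control)
matched = share_controls(rel)
print(matched)
```

After matching, every candidate gene carries its own copy of the wild-type rows, so each gene panel can be plotted and tested against a control from the same background.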

Figure S6 - Strategy for strain construction



In [3]:

    
from IPython.display import Image
Image(filename=dir_supp+'figures/supp_figure_schematic_constructs/supp_figure_schematic_constructs_publication.png')









    Out[3]:

Fig. S6: Strategy for strain construction. (A-C) Gene deletion was mediated by homologous recombination between the termini of the PCR product and the corresponding genomic sequence where the gene to be deleted (‘target’) is encoded. Blue and red lines indicate WA and NA chromosomes, respectively. Flanking regions in green indicate two different homologous sequences targeted for recombination, which are 30-40 bp long in S. cerevisiae. Genes of interest were individually deleted in both WA and NA haploids, resulting in rnr4∆, fpr1∆ and tor1∆ strains in both parental backgrounds. A similar strategy was used to delete genes in WA and NA homozygous diploids. RNR2 and RNR4 were deleted at only one allele, leaving the wild-type gene at the other allele. The primer sequences used are listed in Extended Data Table 6. (C) Evolved segregants with de novo mutations were isolated from the WAxNA $\text{F}_\text{12}$ populations. Using the same strategy, either the wild-type allele or the mutated allele of RNR2 or TOR1 could be deleted. (D) We crossed the strain constructed in (A) with the parental strain carrying the wild-type gene to obtain gene deletions in the WA and NA homozygous diploids and in the WA/NA hybrid.

Figures S7 and S8 - Validation tests for driver and passenger mutations



In [4]:

    
from scipy import stats

param = 'rel_growth_rate'

# shape = pd.DataFrame({k: 
#  pd.pivot_table(x, values='norm_growth_rate', columns=['background','genotype_long']).shape
#  for k,x in pheno_df.groupby(['selection','environment','gene'])
# })

shape = pd.pivot_table(pheno_df, values='norm_growth_rate', columns=['selection','environment','gene'], index=['background','genotype_long']).count()

for (ii, (env_evo, gph1)) in enumerate(pheno_df.groupby('selection')): # Scalar key, so env_evo is a string rather than a 1-tuple

    # Remove NaNs
    gph1 = gph1[np.isfinite(gph1[param])]
    
    nrows=len(gph1.groupby(['background','gene','genotype_long']))
    ncols=len(gph1.groupby(['environment']))
            
    height, width = np.array([nrows*0.11, ncols*3], dtype=float)
    
    fig = plt.figure(figsize=(width, height))
    
    fig.subplots_adjust(left=0.07,bottom=0.01,right=0.85,top=0.99)

    grid = gridspec.GridSpec(1, 2, hspace=0., wspace=0.)
    
    gs = {}
        
    for (jj, (env_test, gph2)) in enumerate(gph1.groupby('environment')): # Scalar key, so env_test is a string rather than a 1-tuple
        
        nrows = len(gph2['gene'].unique())
        ncols = 1
        height_ratios = shape[env_evo,env_test].values.squeeze()
        gs[env_test] = gridspec.GridSpecFromSubplotSpec(nrows, ncols, height_ratios=height_ratios, 
                                                        subplot_spec=grid[0,jj], hspace=0., wspace=0.)
    
        axes = {}
        
        for (kk, ((gene), gph3)) in enumerate(gph2.groupby('gene')):
                        
            gph3 = gph3.copy() # Work on a copy rather than on a slice of the groupby result
            gph3['rank_background'] = gph3['background'].map(config.construct_background['position'])
            gph3['rank_genotype'] = gph3['genotype_long'].map(config.construct_genotype[env_evo][gene])
            gph3 = gph3.sort_values(['rank_background','rank_genotype'], 
                                    ascending=[False,False])
      
            if kk == 0:
                axes[env_test] = plt.subplot(gs[env_test][kk])
                ax1 = axes[env_test]
                ax1.set_title(config.environment['long_label'][env_test], fontsize=7)
            else:
                ax1 = plt.subplot(gs[env_test][kk], sharex=axes[env_test])
            
            gph3 = gph3.set_index(['background','genotype_long'], append=True)[param]\
            .unstack(['background','genotype_long'])
                        
            # boxplot
            bp = gph3.plot(
                ax=ax1, kind='box', 
                widths=0.65, vert=False, return_type='dict',
                labels=gph3.columns.get_level_values('background')
            )
            
            colors = [config.construct_background['color']['wt'][b] if g=='WT' \
                      else config.construct_background['color']['mut'][b] \
                      for b,g in zip(gph3.columns.get_level_values('background'),\
                                     gph3.columns.get_level_values('genotype_long'))]
            
            plot.boxplot_custom(bp, ax1, colors=colors, hatches=['']*len(gph3.columns))
            
            for ll, x in enumerate(gph3.columns):
                
                if x in config.construct_tests[gene]:
                    
                    wt_data = gph3[config.construct_tests[gene][x]].dropna()
                    mut_data = gph3[x].dropna()
                    z_stat, p_val = stats.ranksums(wt_data,mut_data)
                                        
                    if p_val < 0.0001:
                    
                        x_min = min(wt_data.min(),mut_data.min())
                        x_max = max(wt_data.max(),mut_data.max())
                             
                        ax1.annotate('', xy=(x_max, ll+1), xycoords='data',
                                     xytext=(x_max, ll+2), textcoords='data',
                                     arrowprops=dict(arrowstyle="-", ec='#aaaaaa', linewidth=.75,
                                                     connectionstyle="bar,fraction=-0.3"))
                        ax1.annotate('*',#utils.stars(p_val), 
                                     xy=((x_max+0.02 if env_evo=='HU' else x_max+0.04), ll+1.5), xycoords='data',
                                     ha='center', va='center', fontsize=6, weight='bold', rotation=270)
                        
            if jj==0:
                ax1.annotate(gene, 
                             xy=(-0.2, 0.5), xycoords=("axes fraction", "axes fraction"),
                             ha='right', va='center', annotation_clip=False, rotation=90, 
                             fontsize=6, fontstyle='italic', fontweight='bold')

            # reset ticks
            ax1.set_yticks([])
            ax1.set_yticklabels([])
            
            ax1.set_axisbelow(False)
            
            ### vertical ###
            ax1.xaxis.grid(ls="-", lw=.75, color="0.9", zorder=0)
            ax1.axvline(x=0., color='k', ls="--", dashes=(7, 7), lw=.75, zorder=3)
            
            ### horizontal ###
            
            ## background
            ystart, yend, ylabels = plot.set_custom_labels(gph3.columns, 0)
            
            # grid
            ygrid=[yst+1.5 for yst in list(set(ystart.values()))]
            [ax1.axhline(g, lw=.75, ls="-", color="0.9", zorder=2) for g in ygrid]
            
            # labels
            if jj==0:
                # tick labels
                ax1.set_yticks([y+1 for y in ylabels.values()], minor=True)
                ax1.set_yticklabels(ylabels.keys(), minor=True)
                ax1.get_yaxis().tick_left()
                if kk==0:
                    # axis label
                    ax1.set_ylabel('Background', transform=ax1.transAxes, 
                                   weight='bold', rotation=0)
                    ax1.yaxis.set_label_coords(-0.1, 1.025)
            
            ## genotype
            # labels
            if jj==1:
                # tick labels
                ylabels = gph3.columns.get_level_values('genotype_long')
                ax1.set_yticks(np.arange(1, len(ylabels)+1), minor=True)
                ax1.set_yticklabels(['/'.join(y) if isinstance(y, tuple) else y for y in ylabels], minor=True)
                ax1.get_yaxis().tick_right()
                if kk==0:
                    # axis label
                    ax1.set_ylabel('Genotype', transform=ax1.transAxes, 
                                   weight='bold', rotation=0)
                    ax1.yaxis.set_label_coords(1.1, 1.025)
            
            # set axes labels
            ax1.set_xlabel(r'Rel. growth rate, $\lambda_{bg}$', fontsize=10)
            
            if jj==1 and kk==0:
                wa_artist = patches.Rectangle((0,0), width=1, height=1, edgecolor='k',
                                              facecolor=config.construct_background['color']['mut']['WA'])
                na_artist = patches.Rectangle((0,0), width=1, height=1, edgecolor='k',
                                              facecolor=config.construct_background['color']['mut']['NA'])
                wana_artist = patches.Rectangle((0,0), width=1, height=1, edgecolor='k',
                                                facecolor=config.construct_background['color']['mut']['WA/NA'])

                leg1 = ax1.legend([wa_artist,na_artist,wana_artist], 
                                  ['WA; WA/WA','NA; NA/NA','WA/NA'], 
                                  ncol=1, loc='upper right',
                                  borderaxespad=0, handlelength=0.75, 
                                  prop={'size':5}, title='Background', 
                                  labelspacing=.32)

                wt_artist = patches.Rectangle((0,0), width=1, height=1, facecolor='0.9', edgecolor='k')
                construct_artist = patches.Rectangle((0,0), width=1, height=1, facecolor='0.1', edgecolor='k')

                leg2 = ax1.legend([wt_artist,construct_artist], 
                                  ['WT','construct'], 
                                  ncol=1, loc='lower right',
                                  borderaxespad=0, handlelength=0.75, 
                                  prop={'size':5}, title='Genotype', 
                                  labelspacing=.32)
                        
                ax1.add_artist(leg1)
        
                for leg in [leg1,leg2]:
                    plt.setp(leg.get_title(),fontsize=6)
                    leg.set_zorder(2)
                    leg.get_frame().set_edgecolor('none')
                    leg.get_frame().set_facecolor('w')
                    
            if env_evo=='RM' and env_test=='RM' and gene=='FPR1':
                transform = transforms.blended_transform_factory(ax1.transAxes, ax1.transData)
                x0, x1, y0, y1 = (1.1, 1.9, .5, 7.5)
                im = ax1.imshow(plt.imread(dir_supp+'figures/supp_figure_pheno_constructs/FPR1_LOH_inset.png'),
                                aspect='auto', extent=(x0, x1, y0, y1), transform=transform, zorder=1)
            
            transform = transforms.blended_transform_factory(ax1.transAxes, ax1.transAxes)
            if env_evo=='HU':
                ax1.set_xlim(-0.45,0.49)
                ax1.xaxis.set_major_locator( ticker.MaxNLocator(nbins = 4) )
                ax1.xaxis.set_minor_locator( ticker.MaxNLocator(nbins = 4) )
                ax1.xaxis.set_major_formatter(ticker.FormatStrFormatter('%.2f'))
                if env_test=='HU':
                    ax1.spines['left'].set_visible(False)
                    ax1.spines['right'].set_visible(True)
                    patch = patches.Rectangle((-.18,0), width=.18, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w', 
                                              transform=transform, zorder=0)
                if env_test=='SC':
                    ax1.spines['left'].set_visible(True)
                    ax1.spines['right'].set_visible(False)
                    patch = patches.Rectangle((1,0), width=.4, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w', 
                                              transform=transform, zorder=0)
            elif env_evo=='RM':
                ax1.set_xlim(-0.65,2)
                ax1.xaxis.set_major_locator( ticker.MaxNLocator(nbins = 3) )
                ax1.xaxis.set_minor_locator( ticker.MaxNLocator(nbins = 3) )
                ax1.xaxis.set_major_formatter(ticker.FormatStrFormatter('%.2f'))
                if env_test=='RM':
                    ax1.spines['left'].set_visible(False)
                    ax1.spines['right'].set_visible(True)
                    patch = patches.Rectangle((-.18,0), width=.18, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w',  
                                              transform=transform, zorder=0)
                if env_test=='SC':
                    ax1.spines['left'].set_visible(True)
                    ax1.spines['right'].set_visible(False)
                    patch = patches.Rectangle((1,0), width=.525, height=1,
                                              linewidth=.75, facecolor='0.9', edgecolor='w',  
                                              transform=transform, zorder=0)
                    patch.set_clip_on(False)
            ax1.add_patch(patch)
            patch.set_clip_on(False)

        # Tweak axes
        for ax in fig.get_axes():
            ax.spines['top'].set_visible(True)
            ax.spines['bottom'].set_visible(True)
            
            ax.xaxis.label.set_size(6)
            ax.yaxis.label.set_size(6)
            
            ax.tick_params(axis='x', which='both', size=0, labelsize=6)
            ax.tick_params(axis='y', which='major', size=0, labelsize=6)
            ax.tick_params(axis='y', which='minor', size=0, labelsize=6)
            for sp in ax.spines.values():
                sp.set(color='k', linewidth=.5, linestyle='-')
    
    plot.save_figure(dir_supp+'figures/supp_figure_pheno_constructs/supp_figure_pheno_constructs_%s' % env_evo)
    
plt.show()

Fig. S7: Validation tests for driver mutations in hydroxyurea, measured in SC+HU (left) and SC (right). Growth rate measurements, $\lambda_{bg}$, are shown for measurement replicates of each construct ($n = 64$) and grouped by candidate gene and by background of the construct, where the background $b$ can be WA, NA (haploid); WA/WA, NA/NA (diploid); WA/NA (hybrid), and the genotype $g$ can be wild-type for the gene, deleted or hemizygous. Medians and 25%/75% percentiles across groups are shown, with medians as horizontal lines and outliers highlighted. The color of each of the boxes reflects the background, with WA and WA/WA (blue), NA and NA/NA (red) and WA/NA (purple). Lighter shades indicate a wild-type (WT) control for that specific background and darker colors are the candidate strains. Using a non-parametric Wilcoxon rank-sum test, we compared deletion strains against their respective WT control and hemizygous strains against each other. Significance tests with a $P$-value below $10^{-4}$ are highlighted.
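
As a rough illustration of the summary statistics drawn in each box (median and 25%/75% percentiles per background and genotype), the following sketch computes them with a pandas group-by. The data frame is a made-up stand-in for `pheno_df`, and the output column names `q25`, `median` and `q75` are hypothetical labels chosen for this example.

```python
import pandas as pd

# Made-up replicate measurements for two backgrounds (illustrative values only)
df = pd.DataFrame({
    'background': ['WA'] * 4 + ['WA/NA'] * 4,
    'genotype_long': ['WT'] * 8,
    'rel_growth_rate': [0.0, 0.1, 0.2, 0.3, -0.4, -0.2, 0.0, 0.2],
})

# Quartiles per (background, genotype) group, using linear interpolation (pandas default)
summary = (
    df.groupby(['background', 'genotype_long'])['rel_growth_rate']
      .quantile([0.25, 0.5, 0.75])
      .unstack()
)
summary.columns = ['q25', 'median', 'q75']
print(summary)
```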

Fig. S8: Validation tests for driver and passenger mutations in rapamycin, measured in SC+RM (left) and SC (right). Growth rate measurements, $\lambda_{bg}$, are shown for measurement replicates of each construct ($n = 64$) and grouped by candidate gene and by background of the construct, where the background $b$ can be WA, NA (haploid); WA/WA, NA/NA (diploid); WA/NA (hybrid), and the genotype $g$ can be wild-type for the gene, deleted or hemizygous. Medians and 25%/75% percentiles across groups are shown, with medians as horizontal lines and outliers highlighted. The color of each of the boxes reflects the background, with WA and WA/WA (blue), NA and NA/NA (red) and WA/NA (purple). Lighter shades indicate a wild-type (WT) control for that specific background and darker colors are the candidate strains. Using a non-parametric Wilcoxon rank-sum test, we compared deletion strains against their respective WT control and hemizygous strains against each other. Significance tests with a $P$-value below $10^{-4}$ are highlighted. Visual inspection of FPR1 heterozygous deletions using a spot assay (inset) reveals the immediate loss of the wild-type allele by LOH, which was validated by colony Sanger sequencing.
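
The significance criterion described in the captions (a Wilcoxon rank-sum test with a $P$-value threshold of $10^{-4}$) can be sketched with `scipy.stats.ranksums`, the same function used in the plotting cell. The measurements below are simulated for illustration, and the helper `is_significant` is a hypothetical name introduced for this example.

```python
import numpy as np
from scipy import stats

# Simulated replicates (n = 64 each); not real measurements
rng = np.random.default_rng(0)
wt = rng.normal(loc=0.0, scale=0.05, size=64)    # wild-type control
mut = rng.normal(loc=-0.30, scale=0.05, size=64) # construct with a strong growth defect

def is_significant(wt_data, mut_data, alpha=1e-4):
    """Two-sided Wilcoxon rank-sum test; returns (flag, statistic, p-value)."""
    stat, p_val = stats.ranksums(wt_data, mut_data)
    return bool(p_val < alpha), stat, p_val

flag, stat, p_val = is_significant(wt, mut)
same_flag, same_stat, same_p = is_significant(wt, wt) # identical samples as a sanity check
print(flag, p_val)
print(same_flag, same_p)
```

A positive statistic indicates that the first sample (here the wild-type control) tends to rank higher than the second; comparing a sample with itself gives a statistic of zero and is never flagged.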



In [5]:

    
from IPython.display import Image
Image(filename=dir_supp+'figures/supp_figure_pheno_constructs/FPR1_LOH_inset.png')









    Out[5]:
