Here we will use the convenience class PSAIdentifier
to extract Hausdorff pair analysis data generated by PSA.
In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
# Suppress FutureWarning about element-wise comparison to None
# Occurs when calling PSA plotting functions
import warnings
warnings.filterwarnings('ignore')
In [2]:
from MDAnalysis import Universe
from MDAnalysis.analysis.psa import PSAnalysis
from pair_id import PairID
Initialize lists for the methods on which to perform PSA. PSA will be performed for four different simulations methods with three runs for each: DIMS, FRODA, rTMD-F, and rTMD-S. Also initialize a PSAIdentifier
object to keep track of the data corresponding to comparisons between pairs of simulations.
In [3]:
method_names = ['DIMS','FRODA','rTMD-F','rTMD-S']
labels = [] # Heat map labels (not plotted in this example)
simulations = [] # List of simulation topology/trajectory filename pairs
universes = [] # List of MDAnalysis Universes representing simulations
For each method, get the topology and each of three total trajectories (per method). Each simulation is represented as a (topology, trajectory)
pair of file names, which is appended to a master list of simulations.
In [4]:
for method in method_names:
# Note: DIMS uses the PSF topology format
topname = 'top.psf' if 'DIMS' in method or 'TMD' in method else 'top.pdb'
pathname = 'fitted_psa.dcd'
method_dir = 'methods/{}'.format(method)
if method is not 'LinInt':
for run in xrange(1, 4): # 3 runs per method
run_dir = '{}/{:03n}'.format(method_dir, run)
topology = '{}/{}'.format(method_dir, topname)
trajectory = '{}/{}'.format(run_dir, pathname)
labels.append(method + '(' + str(run) + ')')
simulations.append((topology, trajectory))
else: # only one LinInt trajectory
topology = '{}/{}'.format(method_dir, topname)
trajectory = '{}/{}'.format(method_dir, pathname)
labels.append(method)
simulations.append((topology, trajectory))
Generate a list of universes from the list of simulations.
In [5]:
for sim in simulations:
universes.append(Universe(*sim))
Initialize a PSA comparison from the universe list using a C$_\alpha$ trajectory representation.
In [6]:
psa_hpa = PSAnalysis(universes, path_select='name CA', labels=labels)
Generate PSA Path
s from the trajectories
In [7]:
psa_hpa.generate_paths()
Perform a Hausdorff pairs analysis on all of the Path
s
In [8]:
psa_hpa.run_pairs_analysis(hausdorff_pairs=True, neighbors=True)
In [9]:
identifier = PairID()
for name in method_names:
identifier.add_sim(name, [1,2,3])
Get the PSA ID for the second DIMS simulation (DIMS 2) and third rTMD-F simulation (rTMD-F 3)
In [10]:
pid = identifier.get_pair_id('DIMS 2', 'rTMD-F 3')
Get the indices of the frames for the DIMS 2 and rTMD-F 3 paths corresponding to the Hausdorff pair
In [11]:
psa_hpa.hausdorff_pairs[pid]
Out[11]:
In [12]:
psa_hpa.hausdorff_pairs[pid]['frames']
Out[12]:
Get the rmsd separating the Hausdorff pair (this is the Hausdorff distance!)
In [13]:
psa_hpa.hausdorff_pairs[pid]['distance']
Out[13]:
Get the indices of the nearest neighbor frames for the DIMS 2 and rTMD-F 3 paths
In [14]:
psa_hpa.nearest_neighbors[pid]['frames']
Out[14]:
Get the nearest neighbor rmsds for the paths
In [15]:
psa_hpa.nearest_neighbors[pid]['distances']
Out[15]:
Display the pandas
DataFrame
containing the set of simulations
In [16]:
df = identifier.data
df
Out[16]:
Get the simulation IDs for DIMS simulations 1, 2, and 3
In [17]:
df.loc[('DIMS',[1,2,3]), 'Sim ID']
Out[17]: