This is a set of basic examples of the usage and outputs of the various individual functions included in PyPDB. There are generally two types of functions:
The list of supported search types, and of the kinds of information that can be returned for a given PDB ID, is large (and growing) and is enumerated in full in the docstrings of pypdb.py. The PDB supports a very wide range of queries, so any option that is not currently available can likely be implemented by following the structure of the query types that already exist. I appreciate any feedback and pull requests.
Another notebook in this directory, advanced_demos.ipynb, includes more in-depth uses of multiple functions, including the tutorial on graphing the popularity of CRISPR that was originally included in this notebook.
In [1]:
%pylab inline
from IPython.display import HTML
from pypdb.pypdb import *
import pprint
In [2]:
search_dict = make_query('actin network')
found_pdbs = do_search(search_dict)
print(found_pdbs)
In [3]:
search_dict = make_query('3W3D',querytype='ModifiedStructuresQuery')
found_pdbs = do_search(search_dict)
print(found_pdbs)
In [4]:
search_dict = make_query('Perutz, M.F.',querytype='AdvancedAuthorQuery')
found_pdbs = do_search(search_dict)
print(found_pdbs)
In [5]:
search_dict = make_query('T[AG]AGGY',querytype='MotifQuery')
found_pdbs = do_search(search_dict)
print(found_pdbs)
In [6]:
search_dict = make_query('SOLID-STATE NMR',querytype='ExpTypeQuery')
found_pdbs = do_search(search_dict)
print(found_pdbs)
In [4]:
search_dict = make_query('', querytype='NoLigandQuery')
found_pdbs = do_search(search_dict)
print(found_pdbs[:10])
In [8]:
kk = do_protsym_search('C9', min_rmsd=0.0, max_rmsd=1.0)
print(kk[:5])
While the basic functions described in the previous section are useful for looking up and manipulating individual entries, these functions are intended to be more user-facing: they take search keywords and return lists of authors or dates.
In [10]:
top_authors = find_authors('crispr',max_results=100)
pprint.pprint(top_authors[:5])
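Functions like find_authors() turn a keyword into a ranked list. One plausible way to build such a ranking from per-entry author lists is a frequency count. The sketch below illustrates that idea only; it is an assumption, not a description of how pypdb implements find_authors(). The helper name rank_authors and the sample names are hypothetical.

```python
from collections import Counter


def rank_authors(author_lists, max_results=None):
    """Rank authors by how many entries they appear in (most frequent first)."""
    counts = Counter()
    for authors in author_lists:
        # Count each author once per entry, even if listed twice.
        counts.update(set(authors))
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return ranked[:max_results]


# Made-up placeholder names, one list per hypothetical PDB entry.
entries = [
    ["Author A", "Author B"],
    ["Author B", "Author C"],
    ["Author B", "Author A"],
]
print(rank_authors(entries, max_results=2))
```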
In [31]:
matching_papers = find_papers('crispr',max_results=3)
pprint.pprint(matching_papers)
In [24]:
pdb_file = get_pdb_file('4lza', filetype='cif', compression=True)
print(pdb_file[:200])
In [4]:
describe_pdb('4lza')
In [35]:
all_info = get_all_info('4lza')
print(all_info)
In [9]:
results = get_all_info('2F5N')
first_polymer = results['polymer'][0]
first_polymer['polymerDescription']
There are several options here. One function, get_blast(), returns a dict() like the other functions, but the metadata associated with a BLAST search makes that dictionary deeply nested. A simpler function, get_blast2(), parses the text of the raw output page and returns a tuple consisting of (1) a ranked list of the PDB IDs that were hits and (2) a list of the actual BLAST alignments and similarity scores.
In [11]:
blast_results = get_blast('2F5N', chain_id='A')
just_hits = blast_results['BlastOutput_iterations']['Iteration']['Iteration_hits']['Hit']
print(just_hits[50]['Hit_hsps']['Hsp']['Hsp_hseq'])
In [12]:
blast_results = get_blast2('2F5N', chain_id='A', output_form='HTML')
print('Total Results: ' + str(len(blast_results[0])) +'\n')
pprint.pprint(blast_results[1][0])
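Because get_blast() returns deeply nested dictionaries, a long key chain like the one in the get_blast() cell raises an error as soon as one level is missing. A small sketch of a defensive lookup helper follows. The name dig is hypothetical and not part of pypdb, and the sample dictionary is a made-up placeholder that only mimics the key path used above; it is not real BLAST output.

```python
def dig(data, path, default=None):
    """Follow `path` (a sequence of dict keys / list indices) into `data`.

    Returns `default` instead of raising if any step is missing.
    """
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Placeholder data shaped like the key path used above (not real output).
sample = {
    "BlastOutput_iterations": {
        "Iteration": {
            "Iteration_hits": {
                "Hit": [
                    {"Hit_hsps": {"Hsp": {"Hsp_hseq": "PLACEHOLDER"}}},
                ]
            }
        }
    }
}

hits_path = ["BlastOutput_iterations", "Iteration", "Iteration_hits", "Hit"]
first_seq = dig(sample, hits_path + [0, "Hit_hsps", "Hsp", "Hsp_hseq"])
missing = dig(sample, hits_path + [50, "Hit_hsps", "Hsp", "Hsp_hseq"], default="")
print(first_seq, repr(missing))
```

Returning a default keeps a loop over many hits running when one hit lacks a field, at the cost of hiding genuinely malformed responses, so the plain key chain remains the better choice when a missing key should stop the analysis.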
In [37]:
pfam_info = get_pfam('2LME')
print(pfam_info)
This function takes the name of the chemical, not a PDB ID.
In [39]:
chem_desc = describe_chemical('NAG')
pprint.pprint(chem_desc)
In [40]:
ligand_dict = get_ligands('100D')
pprint.pprint(ligand_dict)
In [45]:
gene_info = get_gene_onto('4Z0L')
pprint.pprint(gene_info['term'][0])
In [41]:
sclust = get_seq_cluster('2F5N.A')
pprint.pprint(sclust['pdbChain'][:10]) # Just look at the top 10
In [46]:
clusts = get_clusters('4hhb.A')
print(clusts)
In [13]:
crispr_query = make_query('crispr')
crispr_results = do_search(crispr_query)
pprint.pprint(list_taxa(crispr_results[:10]))
In [14]:
crispr_query = make_query('crispr')
crispr_results = do_search(crispr_query)
pprint.pprint(list_types(crispr_results[:5]))