Creating Weighted Moves

This notebook was created in August 2016 while exploring how to bias different types of moves in chemical space, in support of developing smirky and future chemical perception move proposal tools.

Authors

Christopher I. Bayly, OpenEye Scientific Software (on sabbatical with the Mobley Group, UC Irvine)
Commentary added by Caitlin C. Bannan (Mobley Group UC Irvine) in April 2017

Generate List of Moves

The end goal of this notebook was to generate files with a list of moves in chemical space. Each parameter type (VdW, Bonds, Angles, Proper or Improper torsions) has its own weighted decisions for how to make changes. These lists of moves are used in the smirksEnvMoves notebook in this directory.



In [1]:

    
# generic scientific/ipython header
from __future__ import print_function
from __future__ import division
import os, sys
import copy
import numpy as np

Biasing Moves:

Use odds to make some moves more likely than others within a class. For example, in atmOrBond the odds of changing an atom versus a bond are 10:1, and atomLabel sets the odds for which atom is changed.
Track the biasing probability of each choice so that, after a series of biased choices, we know the overall probability.
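
As a minimal sketch of the arithmetic behind these two points, the snippet below normalizes odds into probabilities and multiplies the probabilities along a short chain of choices. The helper name `odds_to_probs` is hypothetical and is not part of this notebook.

```python
import numpy as np

def odds_to_probs(odds):
    """Normalize a list of odds into probabilities (hypothetical helper)."""
    odds = np.asarray(odds, dtype=float)
    return odds / odds.sum()

# Odds of 10:1 for changing an atom versus a bond
atom_or_bond = odds_to_probs([10, 1])
# Odds of 3:1 for ORtype versus ANDtype
or_or_and = odds_to_probs([3, 1])

# The probability of the chain "atom, then ORtype" is the product
# of the probabilities of the individual biased choices
chain_prob = atom_or_bond[0] * or_or_and[0]
print(atom_or_bond, or_or_and, chain_prob)
```

Because each choice is made independently given the previous one, the overall probability is just the running product, which is what the later cells accumulate.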



In [2]:

    
# Parent dictionary of in-common Movetypes-with-Odds to be used as the basis for each parameter's moves
parentMovesWithOdds = {}
parentMovesWithOdds['atmOrBond'] = [ ('atom',10), ('bond',1)]
parentMovesWithOdds['actionChoices'] = [('add',1), ('swap',1), ('delete',1), ('joinAtom',1)]
parentMovesWithOdds['ORorANDType'] = [('ORtype',3), ('ANDtype',1)]

Make 'moves with weights' dictionaries specialized for each parameter type

Here we give the odds for performing a specific move within its class, so that some moves are less probable than others in the same class. For example, for the angle parameter, we set odds in the atomLabel moveType to make it less probable to change the central atom of the angle than the end atoms, and those in turn less probable than changing the attached (unindexed) substituent atoms.

The movesWithOdds data structure is a list of tuples so that it is easier for a human to read and modify. The function movesWithWeightsFromOdds then converts it into a probability-based format usable by numpy's random.choice() function.



In [3]:

    
def movesWithWeightsFromOdds( MovesWithOdds):
    '''Processes a dictionary of movesWithOdds (lists of string/integer tuples)
    into a dictionary of movesWithWeights usable to perform weighted
    random choices with numpy's random.choice() function.
    Argument:   a MovesWithOdds dictionary of lists of string/integer tuples
    Returns: a MovesWithWeights dictionary of pairs of a moveType-list with a
            probabilities-list, the latter used by numpy's random.choice() function.'''
    movesWithWeights = {}
    for key in MovesWithOdds.keys():
        moves = [ item[0] for item in MovesWithOdds[key] ]
        odds =  [ item[1] for item in MovesWithOdds[key] ]
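        # convert explicitly to a float array before normalizing so the division is element-wise
        weights = np.array(odds, dtype=float)/np.sum(odds)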
        #print( key, moves, odds, weights)
        movesWithWeights[key] = ( moves, weights)
    return movesWithWeights
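
The (moves, weights) pairs built this way have the shape that `np.random.choice` expects for its `p` argument. The following self-contained sketch uses an illustrative odds list in the same tuple format; the variable names and the fixed seed are assumptions added here for the example.

```python
import numpy as np

# Illustrative odds list in the same (name, odds) tuple format
odds_list = [('unIndexed', 10), ('atom1', 1), ('atom2', 1)]
moves = [name for name, _ in odds_list]
odds = np.array([o for _, o in odds_list], dtype=float)
weights = odds / odds.sum()

# Fixed seed only so that repeated runs give the same pick (an assumption)
np.random.seed(0)
pick = np.random.choice(moves, p=weights)
print(pick, weights)
```

`np.random.choice` raises an error if the probabilities do not sum to 1, which is why the odds are normalized first.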



In [4]:

    
# make 'moves with weights' dictionary for vdW
movesWithOddsVdW = copy.deepcopy( parentMovesWithOdds)
movesWithOddsVdW['atomLabel'] = [ ('unIndexed',10), ('atom1',1)]
movesWithOddsVdW['bondLabel'] = [ ('unIndexed',1)]
movesWithWeightsVdW = movesWithWeightsFromOdds( movesWithOddsVdW)



In [5]:

    
# make 'moves with weights' dictionary for bonds
movesWithOddsBonds = copy.deepcopy( parentMovesWithOdds)
movesWithOddsBonds['atomLabel'] = [ ('unIndexed',10), ('atom1',1),('atom2',1)]
movesWithOddsBonds['bondLabel'] = [ ('unIndexed',10), ('bond1',1)]
movesWithWeightsBonds = movesWithWeightsFromOdds( movesWithOddsBonds)



In [6]:

    
# make 'moves with weights' dictionary for angles
movesWithOddsAngles = copy.deepcopy( parentMovesWithOdds)
movesWithOddsAngles['atomLabel'] = [ ('unIndexed',20), ('atom1',10),('atom2',1), ('atom3',10)]
movesWithOddsAngles['bondLabel'] = [ ('unIndexed',10), ('bond1',1),('bond2',1)]
movesWithWeightsAngles = movesWithWeightsFromOdds( movesWithOddsAngles)



In [7]:

    
# make 'moves with weights' dictionary for torsions
movesWithOddsTorsions = copy.deepcopy( parentMovesWithOdds)
movesWithOddsTorsions['atomLabel'] = [ ('unIndexed',20), ('atom1',10),('atom2',1), ('atom3',1),('atom4',10)]
movesWithOddsTorsions['bondLabel'] = [ ('unIndexed',20), ('bond1',10),('bond2',1), ('bond3',10)]
movesWithWeightsTorsions = movesWithWeightsFromOdds( movesWithOddsTorsions)



In [8]:

    
# make 'moves with weights' dictionary for impropers
movesWithOddsImpropers = copy.deepcopy( parentMovesWithOdds)
movesWithOddsImpropers['atomLabel'] = [ ('unIndexed',20), ('atom1',10),('atom2',1), ('atom3',10),('atom4',10)]
movesWithOddsImpropers['bondLabel'] = [ ('unIndexed',20), ('bond1',1),('bond2',1), ('bond3',1)]
movesWithWeightsImpropers = movesWithWeightsFromOdds( movesWithOddsImpropers)



In [9]:

    
testWeights = movesWithWeightsImpropers
for key in testWeights.keys():
    print( key, testWeights[key][0], testWeights[key][1])




atomLabel ['unIndexed', 'atom1', 'atom2', 'atom3', 'atom4'] [ 0.39215686  0.19607843  0.01960784  0.19607843  0.19607843]
actionChoices ['add', 'swap', 'delete', 'joinAtom'] [ 0.25  0.25  0.25  0.25]
ORorANDType ['ORtype', 'ANDtype'] [ 0.75  0.25]
bondLabel ['unIndexed', 'bond1', 'bond2', 'bond3'] [ 0.86956522  0.04347826  0.04347826  0.04347826]
atmOrBond ['atom', 'bond'] [ 0.90909091  0.09090909]

Make master dict-of-dicts so that parameter type can choose the correct movesWithWeights dict



In [10]:

    
# 'VdW', 'Bond', 'Angle', 'Torsion', 'Improper'
movesWithWeightsMaster = {}
movesWithWeightsMaster['VdW'] = movesWithWeightsVdW
movesWithWeightsMaster['Bond'] = movesWithWeightsBonds
movesWithWeightsMaster['Angle'] = movesWithWeightsAngles
movesWithWeightsMaster['Torsion'] = movesWithWeightsTorsions
movesWithWeightsMaster['Improper'] = movesWithWeightsImpropers



In [11]:

    
def PickMoveItemWithProb( moveType, moveWithWeights):
    '''Picks a moveItem based on a moveType and a dictionary of moveTypes with associated probabilities
       Arguments:
         moveType: string corresponding to a key in the moveWithWeights dictionary, e.g. 'atomLabel'
         moveWithWeights: a dictionary keyed by moveType; each value is a pair of a list of moves
           and a list of probabilities, matched by position in the lists
       Returns:
         the randomly-chosen move from the list, based on the probability, together with its probability'''
    listOfIndexes = range(0, len( moveWithWeights[moveType][1]) )
    listIndex = np.random.choice(listOfIndexes, p= moveWithWeights[moveType][1])
    return moveWithWeights[moveType][0][listIndex], moveWithWeights[moveType][1][listIndex]

Test to see if actual picks by PickMoveItemWithProb match target probabilities



In [12]:

    
# NBVAL_SKIP
movesWithWeightsTest = movesWithWeightsMaster['Torsion']
# wrap keys() in list() so np.random.choice also works with Python 3 dict views
key = np.random.choice( list(movesWithWeightsTest.keys()) )
nSamples = 10000
print( nSamples, 'samples on moveType', key)
print( key, '  Moves:  ', movesWithWeightsTest[key][0])
print( key, '  Weights:', movesWithWeightsTest[key][1])
counts = [0]*len(movesWithWeightsTest[key][1])
for turn in range(0, nSamples):
    choice, prob = PickMoveItemWithProb( key, movesWithWeightsTest)
    idx = movesWithWeightsTest[key][0].index(choice)
    counts[ idx] += 1
print( key, '  Counts: ', counts)



10000 samples on moveType atomLabel
atomLabel   Moves:   ['unIndexed', 'atom1', 'atom2', 'atom3', 'atom4']
atomLabel   Weights: [ 0.47619048  0.23809524  0.02380952  0.02380952  0.23809524]
atomLabel   Counts:  [4768, 2419, 238, 234, 2341]
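
One rough way to judge whether these counts match the target weights is to compare each count with its expected value in units of the binomial standard deviation. The sketch below does this for the numbers printed above; treating each count as a binomial variable and the variable names `sigma` and `zScores` are assumptions made for this illustration.

```python
import numpy as np

nSamples = 10000
odds = np.array([20, 10, 1, 1, 10], dtype=float)   # Torsion atomLabel odds
counts = np.array([4768, 2419, 238, 234, 2341])    # counts printed above

probs = odds / odds.sum()
expected = nSamples * probs
# Standard deviation of each count under a binomial model
sigma = np.sqrt(nSamples * probs * (1.0 - probs))
# Deviation of each observed count from its expectation, in units of sigma
zScores = (counts - expected) / sigma
print(np.round(expected, 1))
print(np.round(zScores, 2))
```

Deviations of about one standard deviation or less, as found here, are what unbiased sampling from the target weights would typically produce.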



In [13]:

    
def PropagateMoveTree( moveType, movesWithWeights, accumMoves, accumProb):
    '''Expands a moveList by the input moveType, randomly picking a move
    of that type from the list in movesWithWeights, biased by the weight
    (probability) also from movesWithWeights. It incorporates that probability
    into the accumulated probability that was passed it with the existing list
    Arguments:
        moveType: a string which is a key in the movesWithWeights dictionary
        movesWithWeights: a dictionary of a list of allowed moves of a certain
            moveType paired with a list of a probability associated with each move.
        accumMoves: the list of moves (being strings) to be expanded by this function.
        accumProb: the accumulated probability so far of the moves in accumMoves
    Returns:
        accumMoves: the list of moves (being strings) expanded by this function
        accumProb: the revised accumulated probability of the moves in accumMoves
    '''
    choice, prob = PickMoveItemWithProb( moveType, movesWithWeights)
    #print( 'before', choice, prob, accumProb)
    accumMoves.append( choice)
    accumProb *= prob
    #print( 'after', choice, prob, accumProb)
    return accumMoves, accumProb



In [14]:

    
def GenerateMoveTree( parameterType):
    '''Generates a list of micro-moves describing how to attempt to change the chemical
    graph associated with a parameter type. Each micro-move makes a weighted random
    decision on some aspect of the overall move, which will be made by effecting each
    of the micro-moves in the list.
    Argument:
        parameterType: this string refers to a force-field parameter type (e.g. 'Torsion')
            and determines what kind of moveTypes, moves, and weights will be used in
            weighted random micro-moves
    Returns:
        moveTree: the list of micro-moves describing how to attempt to change the chemical
            graph associated with a parameter type.
        cumProb: the weights-biased probability of making the overall move, i.e. effecting
            the list of micro-moves.'''
    # start with an empty move list and unit probability
    cumProb = 1.0
    moveTree = []
    movesWithWeights = movesWithWeightsMaster[parameterType]
    moveFlow = ['actionChoices', 'atmOrBond', 'whichLabel', 'ORorANDType']
    for stage in moveFlow:
        if stage=='whichLabel' and moveTree[-1]=='atom':
            moveTree, cumProb = PropagateMoveTree( 'atomLabel', movesWithWeights, moveTree, cumProb)
        elif stage=='whichLabel' and moveTree[-1]=='bond':
            moveTree, cumProb = PropagateMoveTree( 'bondLabel', movesWithWeights, moveTree, cumProb)
        else:
            moveTree, cumProb = PropagateMoveTree( stage, movesWithWeights, moveTree, cumProb)
    return moveTree, cumProb

Test GenerateMoveTree



In [15]:

    
parameterType = 'Torsion'
nSamples = 10
# print a few randomly generated move trees with their cumulative probabilities
for i in range(0,nSamples):
    print( GenerateMoveTree( parameterType) )



(['swap', 'bond', 'unIndexed', 'ANDtype'], 0.0027716186252771621)
(['joinAtom', 'atom', 'atom1', 'ORtype'], 0.040584415584415577)
(['swap', 'atom', 'atom1', 'ORtype'], 0.040584415584415577)
(['delete', 'atom', 'atom1', 'ORtype'], 0.040584415584415577)
(['add', 'bond', 'unIndexed', 'ORtype'], 0.0083148558758314867)
(['joinAtom', 'atom', 'unIndexed', 'ORtype'], 0.081168831168831154)
(['add', 'atom', 'atom4', 'ORtype'], 0.040584415584415577)
(['joinAtom', 'atom', 'atom1', 'ORtype'], 0.040584415584415577)
(['add', 'atom', 'atom1', 'ORtype'], 0.040584415584415577)
(['joinAtom', 'atom', 'atom1', 'ORtype'], 0.040584415584415577)
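
The cumulative probabilities printed above can be checked by hand as the product of the four individual choice probabilities. The sketch below recomputes two of them exactly from the Torsion odds using fractions; the helper names `prob` and `tree_prob` are hypothetical and only serve this check.

```python
from fractions import Fraction

# Torsion odds copied from the cells above
odds = {
    'actionChoices': [('add', 1), ('swap', 1), ('delete', 1), ('joinAtom', 1)],
    'atmOrBond': [('atom', 10), ('bond', 1)],
    'atomLabel': [('unIndexed', 20), ('atom1', 10), ('atom2', 1),
                  ('atom3', 1), ('atom4', 10)],
    'bondLabel': [('unIndexed', 20), ('bond1', 10), ('bond2', 1), ('bond3', 10)],
    'ORorANDType': [('ORtype', 3), ('ANDtype', 1)],
}

def prob(move_type, move):
    """Exact probability of one move within its class (hypothetical helper)."""
    table = dict(odds[move_type])
    return Fraction(table[move], sum(table.values()))

def tree_prob(tree):
    """Product of the four choice probabilities along one move tree."""
    action, target, label, logic = tree
    label_type = 'atomLabel' if target == 'atom' else 'bondLabel'
    return (prob('actionChoices', action) * prob('atmOrBond', target)
            * prob(label_type, label) * prob('ORorANDType', logic))

p1 = tree_prob(['swap', 'atom', 'atom1', 'ORtype'])
p2 = tree_prob(['swap', 'bond', 'unIndexed', 'ANDtype'])
print(p1, float(p1))
print(p2, float(p2))
```

For the first tree this is 1/4 x 10/11 x 10/42 x 3/4, which agrees with the 0.0405844 printed above; the second is 1/4 x 1/11 x 20/41 x 1/4, in agreement with 0.0027716.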

Write a bunch of GenerateMoveTree moves to a file



In [16]:

    
parameterType = 'Torsion'
nSamples = 10000
# each line holds the cumulative probability followed by the micro-moves
with open('moveTrees.'+parameterType+'.txt','w') as ofs:
    for i in range(0,nSamples):
        moveTree, prob = GenerateMoveTree( parameterType)
        ofs.write( '%.6f ' % prob )
        for microMove in moveTree:
            ofs.write( '%s ' % microMove )
        ofs.write( '\n' )
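
Each line of the resulting file holds the probability followed by the space-separated micro-moves. A minimal sketch of reading such a file back is shown below. The function `read_move_trees` is hypothetical; how the smirksEnvMoves notebook actually parses these files is not shown here. The sample lines reuse values from the test output above and are written to a temporary file so the example is self-contained.

```python
import os
import tempfile

def read_move_trees(path):
    """Parse lines of the form '<prob> <move> <move> ...' (hypothetical reader)."""
    trees = []
    with open(path) as ifs:
        for line in ifs:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            trees.append((fields[1:], float(fields[0])))
    return trees

# Two sample lines in the format written by the cell above
sample = ('0.040584 swap atom atom1 ORtype \n'
          '0.002772 swap bond unIndexed ANDtype \n')
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample)

trees = read_move_trees(tmp.name)
os.remove(tmp.name)
print(trees)
```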


