This notebook was created in August 2016 while exploring how to bias different types of moves in chemical space, in support of developing smirky and future chemical perception move proposal tools.
Authors
Generate List of Moves
The end goal of this notebook was to generate files containing lists of moves in chemical space. Each parameter type (VdW, Bonds, Angles, Proper or Improper torsions) has its own weighted decisions for how to make changes. These lists of moves are used in the smirksEnvMoves notebook in this directory.
In [1]:
# generic scientific/ipython header
from __future__ import print_function
from __future__ import division
import os, sys
import copy
import numpy as np
In [2]:
# Parent dictionary of in-common Movetypes-with-Odds to be used as the basis for each parameter's moves
parentMovesWithOdds = {}
parentMovesWithOdds['atmOrBond'] = [ ('atom',10), ('bond',1)]
parentMovesWithOdds['actionChoices'] = [('add',1), ('swap',1), ('delete',1), ('joinAtom',1)]
parentMovesWithOdds['ORorANDType'] = [('ORtype',3), ('ANDtype',1)]
Here we give the odds of performing a specific move within its class, so that some moves within a class are less probable than others. For example, for the bond angle parameter, we set the odds in the atomLabel moveType so that changing the central atom of the angle is less probable than changing the end atoms, which in turn is less probable than changing attached (unindexed) substituent atoms.
The movesWithOdds data structure is a dictionary of lists of tuples, which makes it easier for a human to read and modify. The function movesWithWeightsFromOdds then converts it into a probability-based format usable by numpy's random.choice() function.
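As a minimal sketch of that conversion, assuming the angle atomLabel odds used later in this notebook: each odds value is divided by the sum of the odds, and the resulting probabilities are passed as `p` to `np.random.choice`. The variable names below are illustrative only and are not used elsewhere in the notebook.

```python
import numpy as np

# Odds as a human-editable list of (move, odds) tuples; values copied from
# the angle atomLabel entry defined later in this notebook.
angle_atom_label_odds = [('unIndexed', 20), ('atom1', 10), ('atom2', 1), ('atom3', 10)]

moves = [move for move, _ in angle_atom_label_odds]
odds = np.array([odd for _, odd in angle_atom_label_odds], dtype=float)

# Normalize so the weights sum to 1, as np.random.choice requires for p.
weights = odds / odds.sum()

# Draw one move; the central atom (atom2) has the smallest weight, 1/41.
picked = np.random.choice(moves, p=weights)
print(dict(zip(moves, weights)), picked)
```

With these odds the end atoms are each ten times more likely to be picked than the central atom, and an unindexed atom is twenty times more likely.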
In [3]:
def movesWithWeightsFromOdds( MovesWithOdds):
    '''Processes a dictionary of movesWithOdds (lists of string/integer tuples)
    into a dictionary of movesWithWeights usable to perform weighted
    random choices with numpy's random.choice() function.
    Argument: a MovesWithOdds dictionary of lists of string/integer tuples
    Returns: a MovesWithWeights dictionary of pairs of a moveType-list with a
      probabilities-list, the latter used by numpy's random.choice() function.'''
    movesWithWeights = {}
    for key in MovesWithOdds.keys():
        moves = [ item[0] for item in MovesWithOdds[key] ]
        odds = [ item[1] for item in MovesWithOdds[key] ]
        # convert to a float array so the normalization is explicitly element-wise
        weights = np.array( odds, dtype=float) / np.sum( odds)
        #print( key, moves, odds, weights)
        movesWithWeights[key] = ( moves, weights)
    return movesWithWeights
In [4]:
# make 'moves with weights' dictionary for vdW
movesWithOddsVdW = copy.deepcopy( parentMovesWithOdds)
movesWithOddsVdW['atomLabel'] = [ ('unIndexed',10), ('atom1',1)]
movesWithOddsVdW['bondLabel'] = [ ('unIndexed',1)]
movesWithWeightsVdW = movesWithWeightsFromOdds( movesWithOddsVdW)
In [5]:
# make 'moves with weights' dictionary for bonds
movesWithOddsBonds = copy.deepcopy( parentMovesWithOdds)
movesWithOddsBonds['atomLabel'] = [ ('unIndexed',10), ('atom1',1),('atom2',1)]
movesWithOddsBonds['bondLabel'] = [ ('unIndexed',10), ('bond1',1)]
movesWithWeightsBonds = movesWithWeightsFromOdds( movesWithOddsBonds)
In [6]:
# make 'moves with weights' dictionary for angles
movesWithOddsAngles = copy.deepcopy( parentMovesWithOdds)
movesWithOddsAngles['atomLabel'] = [ ('unIndexed',20), ('atom1',10),('atom2',1), ('atom3',10)]
movesWithOddsAngles['bondLabel'] = [ ('unIndexed',10), ('bond1',1),('bond2',1)]
movesWithWeightsAngles = movesWithWeightsFromOdds( movesWithOddsAngles)
In [7]:
# make 'moves with weights' dictionary for torsions
movesWithOddsTorsions = copy.deepcopy( parentMovesWithOdds)
movesWithOddsTorsions['atomLabel'] = [ ('unIndexed',20), ('atom1',10),('atom2',1), ('atom3',1),('atom4',10)]
movesWithOddsTorsions['bondLabel'] = [ ('unIndexed',20), ('bond1',10),('bond2',1), ('bond3',10)]
movesWithWeightsTorsions = movesWithWeightsFromOdds( movesWithOddsTorsions)
In [8]:
# make 'moves with weights' dictionary for impropers
movesWithOddsImpropers = copy.deepcopy( parentMovesWithOdds)
movesWithOddsImpropers['atomLabel'] = [ ('unIndexed',20), ('atom1',10),('atom2',1), ('atom3',10),('atom4',10)]
movesWithOddsImpropers['bondLabel'] = [ ('unIndexed',20), ('bond1',1),('bond2',1), ('bond3',1)]
movesWithWeightsImpropers = movesWithWeightsFromOdds( movesWithOddsImpropers)
In [9]:
testWeights = movesWithWeightsImpropers
for key in testWeights.keys():
    print( key, testWeights[key][0], testWeights[key][1])
In [10]:
# 'VdW', 'Bond', 'Angle', 'Torsion', 'Improper'
movesWithWeightsMaster = {}
movesWithWeightsMaster['VdW'] = movesWithWeightsVdW
movesWithWeightsMaster['Bond'] = movesWithWeightsBonds
movesWithWeightsMaster['Angle'] = movesWithWeightsAngles
movesWithWeightsMaster['Torsion'] = movesWithWeightsTorsions
movesWithWeightsMaster['Improper'] = movesWithWeightsImpropers
In [11]:
def PickMoveItemWithProb( moveType, moveWithWeights):
    '''Picks a moveItem based on a moveType and a dictionary of moveTypes with associated probabilities
    Arguments:
      moveType: string corresponding to a key in the moveWithWeights dictionary, e.g. atomLabel
      moveWithWeights: a dictionary keyed by moveType; each value pairs a list of moves
        with a list of probabilities, matched by position in the list
    Returns:
      the randomly chosen move from the list, together with its probability'''
    listOfIndexes = range(0, len( moveWithWeights[moveType][1]) )
    listIndex = np.random.choice(listOfIndexes, p= moveWithWeights[moveType][1])
    return moveWithWeights[moveType][0][listIndex], moveWithWeights[moveType][1][listIndex]
In [12]:
# NBVAL_SKIP
movesWithWeightsTest = movesWithWeightsMaster['Torsion']
# list() is needed because np.random.choice cannot sample from a dict keys view (Python 3)
key = np.random.choice( list( movesWithWeightsTest.keys()) )
nSamples = 10000
print( nSamples, 'samples on moveType', key)
print( key, ' Moves: ', movesWithWeightsTest[key][0])
print( key, ' Weights:', movesWithWeightsTest[key][1])
counts = [0]*len(movesWithWeightsTest[key][1])
for turn in range(0, nSamples):
    choice, prob = PickMoveItemWithProb( key, movesWithWeightsTest)
    idx = movesWithWeightsTest[key][0].index(choice)
    counts[ idx] += 1
print( key, ' Counts: ', counts)
In [13]:
def PropagateMoveTree( moveType, movesWithWeights, accumMoves, accumProb):
    '''Expands a moveList by the input moveType, randomly picking a move
    of that type from the list in movesWithWeights, biased by the weight
    (probability) also from movesWithWeights. It multiplies that probability
    into the accumulated probability that was passed in with the existing list.
    Arguments:
      moveType: a string which is a key in the movesWithWeights dictionary
      movesWithWeights: a dictionary pairing, for each moveType, a list of allowed
        moves with a list of the probabilities associated with each move.
      accumMoves: the list of moves (strings) to be expanded by this function.
      accumProb: the accumulated probability so far of the moves in accumMoves
    Returns:
      accumMoves: the list of moves (strings) expanded by this function
      accumProb: the revised accumulated probability of the moves in accumMoves
    '''
    choice, prob = PickMoveItemWithProb( moveType, movesWithWeights)
    #print( 'before', choice, prob, accumProb)
    accumMoves.append( choice)
    accumProb *= prob
    #print( 'after', choice, prob, accumProb)
    return accumMoves, accumProb
In [14]:
def GenerateMoveTree( parameterType):
    '''Generates a list of micro-moves describing how to attempt to change the chemical
    graph associated with a parameter type. Each micro-move makes a weighted random
    decision on some aspect of the overall move, which will be made by effecting each
    of the micro-moves in the list.
    Argument:
      parameterType: this string refers to a force-field parameter type (e.g. 'Torsion')
        and determines what kind of moveTypes, moves, and weights will be used in
        weighted random micro-moves
    Returns:
      moveTree: the list of micro-moves describing how to attempt to change the chemical
        graph associated with a parameter type.
      cumProb: the weights-biased probability of making the overall move, i.e. effecting
        the list of micro-moves.'''
    movesWithWeights = movesWithWeightsMaster[parameterType]
    cumProb = 1.0
    moveTree = []
    moveFlow = ['actionChoices', 'atmOrBond', 'whichLabel', 'ORorANDType']
    for stage in moveFlow:
        # the 'whichLabel' stage depends on the preceding atom-or-bond choice
        if stage=='whichLabel' and moveTree[-1]=='atom':
            moveTree, cumProb = PropagateMoveTree( 'atomLabel', movesWithWeights, moveTree, cumProb)
        elif stage=='whichLabel' and moveTree[-1]=='bond':
            moveTree, cumProb = PropagateMoveTree( 'bondLabel', movesWithWeights, moveTree, cumProb)
        else:
            moveTree, cumProb = PropagateMoveTree( stage, movesWithWeights, moveTree, cumProb)
    return moveTree, cumProb
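The cumulative probability returned for a move tree is the product of the probabilities of the choices made at each stage. The following sketch works through that arithmetic exactly for one torsion move tree, using Python's `fractions` module and the odds defined earlier in this notebook. The dictionary `torsion_odds` and the helper `stage_probability` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction

# Odds copied from the parent dictionary and the torsion atomLabel entry above.
torsion_odds = {
    'actionChoices': [('add', 1), ('swap', 1), ('delete', 1), ('joinAtom', 1)],
    'atmOrBond': [('atom', 10), ('bond', 1)],
    'atomLabel': [('unIndexed', 20), ('atom1', 10), ('atom2', 1), ('atom3', 1), ('atom4', 10)],
    'ORorANDType': [('ORtype', 3), ('ANDtype', 1)],
}

def stage_probability(stage, move):
    """Hypothetical helper: exact probability of `move` within `stage`."""
    total = sum(odd for _, odd in torsion_odds[stage])
    return Fraction(dict(torsion_odds[stage])[move], total)

# One possible move tree: add / atom / unIndexed / ORtype.
move_tree = [('actionChoices', 'add'), ('atmOrBond', 'atom'),
             ('atomLabel', 'unIndexed'), ('ORorANDType', 'ORtype')]

# Multiply the per-stage probabilities: 1/4 * 10/11 * 20/42 * 3/4.
cum_prob = Fraction(1)
for stage, move in move_tree:
    cum_prob *= stage_probability(stage, move)

print(cum_prob, float(cum_prob))
```

Exact fractions are used here only to make the product easy to check by hand; the notebook itself accumulates floating-point probabilities.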
In [15]:
parameterType = 'Torsion'
nSamples = 10
# print a few sampled move trees with their cumulative probabilities
for i in range(0,nSamples):
    print( GenerateMoveTree( parameterType) )
In [16]:
parameterType = 'Torsion'
nSamples = 10000
# write one move tree per line: probability first, then the micro-moves
with open('moveTrees.'+parameterType+'.txt','w') as ofs:
    for i in range(0,nSamples):
        moveTree, prob = GenerateMoveTree( parameterType)
        ofs.write( '%.6f ' % prob )
        for microMove in moveTree:
            ofs.write( '%s ' % microMove )
        ofs.write( '\n' )
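Each line of the output file holds the probability followed by the micro-moves, all separated by spaces, so a reader only needs to split on whitespace. A hedged sketch of such a parser follows. The function name `parse_move_tree_line` is hypothetical, and how the smirksEnvMoves notebook actually reads these files is not shown here.

```python
def parse_move_tree_line(line):
    """Hypothetical helper: split one line of a moveTrees file into
    (probability, list of micro-moves)."""
    fields = line.split()
    return float(fields[0]), fields[1:]

# Example line in the format written above: '%.6f ' then each move plus a space.
example = '0.081169 add atom unIndexed ORtype \n'
prob, move_tree = parse_move_tree_line(example)
print(prob, move_tree)
```

Note that the probability is stored with six decimal places, so very unlikely move trees lose precision when read back.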