This notebook reads and plots the data generated by the benchmarking runs recorded in the file 'raw_benchmark_data.txt'.
The aim of benchmarking is to find the node count and configuration that strike the best compromise between efficiency and speed: although more nodes usually give higher speed, they do so at the cost of efficiency.
This notebook lets you visualize the benchmarking data you generate. You still have to set the appropriate node configuration manually in the master_config_file and populate the directories. To generate representative benchmarking data, you first have to run your job to create some input files.
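The efficiency/speed trade-off above can be made concrete with a small, self-contained sketch. The node counts and throughput numbers here are made up for illustration; parallel efficiency is computed as the per-node throughput relative to the per-node throughput of the smallest run.

```python
# Hypothetical example: parallel efficiency relative to the smallest run.
# Efficiency = (ns/day per node) / (ns/day per node at the baseline count).
nodes = [1, 2, 4, 8]                    # node counts (made-up numbers)
ns_per_day = [10.0, 19.0, 34.0, 56.0]   # throughput at each count (made-up)

baseline = ns_per_day[0] / nodes[0]     # per-node throughput of smallest run
for n, r in zip(nodes, ns_per_day):
    eff = (r / n) / baseline
    print(f"{n:2d} nodes: {r:5.1f} ns/day, efficiency {eff:.0%}")
```

Throughput rises with node count while efficiency falls, which is exactly the pattern the plots below are meant to expose.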
In [84]:
%pylab inline
import pylab
import numpy as np
In [85]:
# Read in raw timing data from the benchmark runs.
# Raw data consists of comma-separated lines: nodes, ns/d, ns/d per node.
filename = "raw_benchmark_data.txt"
data = pylab.loadtxt(filename,delimiter=',', skiprows=2)
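For reference, the call above expects two header lines (hence `skiprows=2`) followed by comma-separated rows of three columns. A minimal sketch with a hypothetical sample file, to show the layout `loadtxt` assumes:

```python
import numpy as np

# Hypothetical sample in the expected layout: two header lines (skipped),
# then comma-separated rows of nodes, ns/d, ns/d per node.
sample = """\
# Benchmark results (example values)
# nodes, ns/d, ns/d per node
1, 10.0, 10.0
2, 19.0, 9.5
4, 34.0, 8.5
"""
with open("sample_benchmark_data.txt", "w") as f:
    f.write(sample)

d = np.loadtxt("sample_benchmark_data.txt", delimiter=",", skiprows=2)
print(d.shape)  # one row per run, one column per field
```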
In [88]:
# Find data ranges:
min_nodes = data[0,0]
min_ns = data[0,1]/min_nodes # ballpark efficient ns/d per node
max_nodes = data[:,0].max()
max_ns = float(data[:,1].max())
print (max_nodes, max_ns)
nl = data[:,0] # make node list
pylab.plot( data[:,0], data[:,1],"ro")
pylab.plot([0,max_nodes],[0,max_nodes*min_ns], '--' , label='linear scaling')
pylab.plot([0,max_nodes],[0,max_nodes*min_ns*0.80], '-.' , label='80% scaling')
pylab.plot([0,max_nodes],[0,max_nodes*min_ns*0.60], ':', label='60% scaling')
pylab.legend()
pylab.title("Benchmarking")
pylab.xlabel("Node count")
pylab.ylabel("nanosec/day")
pylab.xlim(0, 0.6*max_nodes)
pylab.ylim(0, 1.6*max_ns)
pylab.xticks(nl)
pylab.grid(True)
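Reading the node count off the plot can also be automated. A minimal sketch, assuming the same three-column layout as above: `best_node_count` is a hypothetical helper (not part of the notebook's workflow) that returns the largest node count whose efficiency, relative to the smallest run, stays at or above a threshold such as the 80% scaling line.

```python
import numpy as np

def best_node_count(nodes, ns_per_day, threshold=0.80):
    """Largest node count whose parallel efficiency >= threshold."""
    nodes = np.asarray(nodes, dtype=float)
    ns_per_day = np.asarray(ns_per_day, dtype=float)
    baseline = ns_per_day[0] / nodes[0]        # per-node rate of smallest run
    eff = (ns_per_day / nodes) / baseline      # efficiency of each run
    ok = nodes[eff >= threshold]
    return int(ok.max()) if ok.size else int(nodes[0])

# Made-up numbers: 8 nodes drops to 70% efficiency, so 4 is chosen.
print(best_node_count([1, 2, 4, 8], [10.0, 19.0, 34.0, 56.0]))
```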