This notebook shows how to use ROOT (a C++ library and the de facto standard in High Energy Physics) from inside IPython.
It is aimed at ROOT users.
ROOT-style event loops are very slow in Python and in most cases unnecessary.
We suggest using root_numpy instead, a very convenient Python library for working with ROOT data (root_numpy is included in the REP docker image, and it is also easy to install on its own).
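To illustrate why explicit per-event loops are discouraged, here is a minimal sketch comparing a Python loop with a single vectorised numpy expression. The data is a randomly generated stand-in (not read from a ROOT file), and the quantity computed is chosen only for illustration.

```python
import numpy

# Toy "events": one value per event (illustrative data, not from a ROOT file)
rng = numpy.random.RandomState(42)
values = rng.normal(size=10000)

# ROOT-style approach: visit the events one by one in a Python loop
loop_sum = 0.0
for value in values:
    if value > 0:
        loop_sum += value ** 2

# numpy approach: one vectorised expression over the whole column
vector_sum = numpy.sum(values[values > 0] ** 2)

# Both approaches give the same number up to floating-point rounding
print(loop_sum, vector_sum)
```

Timing the two variants (for example with the `%timeit` magic) typically shows the vectorised form to be much faster, since the loop body runs in the Python interpreter for every event.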
In [1]:
%matplotlib inline
There are two libraries for working with ROOT files from Python: rootpy and root_numpy.
Let's show how to use the second library, root_numpy.
In [2]:
import numpy
import root_numpy
# generating random data
data = numpy.random.normal(size=[10000, 2])
# adding names of columns
data = data.view([('first', float), ('second', float)])
# saving to file
root_numpy.array2root(data, filename='./toy_datasets/random.root', treename='tree', mode='recreate')
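The `view` call above reinterprets each pair of floats as one named record. A small sketch of what this does to the array shape, using only numpy; the variable names `raw`, `records` and `flat` are introduced here for illustration.

```python
import numpy

raw = numpy.random.normal(size=[10000, 2])

# Viewing two float64 columns as one record merges them along the last axis,
# so the result keeps a trailing axis of length 1
records = raw.view([('first', float), ('second', float)])
print(records.shape)

# Flatten to one record per row if a one-dimensional array is preferred
flat = records.ravel()
print(flat.shape)
print(flat.dtype.names)

# Each named field is the corresponding column of the original array
print(numpy.array_equal(flat['first'], raw[:, 0]))
```

Here `records` has shape `(10000, 1)` and `flat` has shape `(10000,)`; the field names are what end up as branch names when the array is written to a tree.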
In [3]:
!ls ./toy_datasets
In [4]:
from rootpy.io import root_open
with root_open('./toy_datasets/random.root', mode='a') as myfile:
    new_column = numpy.array(numpy.ones([10000, 1]), dtype=[('new', 'f8')])
    root_numpy.array2tree(new_column, tree=myfile.tree)
    myfile.write()
In [5]:
root_numpy.root2array('./toy_datasets/random.root', treename='tree')
Out[5]:
In [6]:
import ROOT
from rep.plotting import canvas
canvas = canvas('my_canvas')
function1 = ROOT.TF1( 'fun1', 'abs(sin(x)/x)', 0, 10)
canvas.SetGridx()
canvas.SetGridy()
function1.Draw()
# Drawing output (last line is considered as output of cell)
canvas
Out[6]:
In [7]:
File = ROOT.TFile("toy_datasets/random.root")
Tree = File.Get("tree")
Tree.Draw("first")
canvas
Out[7]:
In [8]:
# keep a reference to the histogram in a variable, otherwise it will be deleted automatically
h1 = ROOT.TH1F("h1","hist from tree",50, -0.25, 0.25)
Tree.Draw("first>>h1")
canvas
Out[8]:
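For comparison, a histogram with the same binning (50 bins between -0.25 and 0.25) can be computed on the numpy side. This is a sketch on randomly generated stand-in data rather than the branch read from the file; `first`, `counts` and `edges` are illustrative names.

```python
import numpy

rng = numpy.random.RandomState(0)
first = rng.normal(size=10000)   # stand-in for the 'first' branch

# 50 equal-width bins on [-0.25, 0.25]; values outside the range are ignored
counts, edges = numpy.histogram(first, bins=50, range=(-0.25, 0.25))

print(len(counts), len(edges))   # number of bins and number of bin edges
print(counts.sum())              # entries that fell inside the range
```

Unlike a ROOT histogram, `numpy.histogram` keeps no underflow or overflow bins, so entries outside the range are simply dropped.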
In [9]:
data = root_numpy.root2array("toy_datasets/random.root",
                             treename='tree',
                             branches=['first', 'second', 'sin(first) * exp(second)'],
                             selection='first > 0')
In the example above we selected three branches (one of which is an expression computed on the fly) and applied a selection.
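To make the meaning of the expression branch and the selection concrete, here is a rough numpy-only equivalent on stand-in data. It is a sketch of the semantics, not of how root_numpy evaluates them; the field name `expr` and the variable names are chosen here for illustration.

```python
import numpy

rng = numpy.random.RandomState(1)
first = rng.normal(size=1000)    # stand-in for the 'first' branch
second = rng.normal(size=1000)   # stand-in for the 'second' branch

# selection='first > 0' corresponds to a boolean mask over the entries
mask = first > 0

# Collect the surviving entries plus the computed expression in one structured array
selected = numpy.zeros(mask.sum(), dtype=[('first', float), ('second', float), ('expr', float)])
selected['first'] = first[mask]
selected['second'] = second[mask]
# the expression 'sin(first) * exp(second)' evaluated entry by entry
selected['expr'] = numpy.sin(first[mask]) * numpy.exp(second[mask])

print(len(selected), selected.dtype.names)
```

Doing the selection while reading, as in the cell above, avoids loading entries that would be thrown away immediately.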
In [10]:
# taking, e.g., the first 10 elements using Python slicing:
data2 = data[:10]
In [11]:
import pandas
dataframe = pandas.DataFrame(data)
# looking at first elements
dataframe.head()
Out[11]:
In [12]:
# taking elements that satisfy some condition, again showing only the first few
dataframe[dataframe['second'] > 0].head()
Out[12]:
In [13]:
# adding new column as result of some operation
dataframe['third'] = dataframe['first'] + dataframe['second']
dataframe.head()
Out[13]:
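If the modified table needs to go back to ROOT, the DataFrame can be converted to a structured array again. The sketch below uses stand-in data and only shows the conversion step with `DataFrame.to_records`; passing the result on to `root_numpy.array2root` is an assumption about the intended workflow and is not run here.

```python
import numpy
import pandas

# Stand-in structured array with the same field names as in this notebook
rng = numpy.random.RandomState(2)
source = numpy.zeros(100, dtype=[('first', float), ('second', float)])
source['first'] = rng.normal(size=100)
source['second'] = rng.normal(size=100)

frame = pandas.DataFrame(source)
frame['third'] = frame['first'] + frame['second']

# index=False drops the row index so that only the data columns become fields
records = frame.to_records(index=False)

print(records.dtype.names)
```

The resulting record array carries the column names as field names, the same layout that was written to the file earlier in the notebook.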
In [14]:
import matplotlib.pyplot as plt
plt.figure(figsize=(9, 7))
plt.hist(data['first'], bins=50)
plt.xlabel('first')
Out[14]:
In [15]:
plt.figure(figsize=(9, 7))
plt.hist(data['second'], bins=50)
plt.xlabel('second')
Out[15]:
root_numpy serves as a very nice bridge between the two worlds, ROOT and Python.