Ideally the main parameters of the CERN Accelerator complex (settings and acquisitions) are stored in CALS (CERN Accelerator Logging System, https://be-dep-co.web.cern.ch/content/cals-cern-accelerator-logging-service) for the long term (LDB) or the short term (MDB, presently 3 months as of July 2017).
A new CALS platform (NXCALS) is presently under development.
A good strategy for the MD community is to ask to complete the present logging, adding variables (typically in the MDB) if needed. In this way one does not need to set up manual logging sessions. Each machine should have an OP-CALS link person (e.g., O. Hans for the PS).
CALS can be queried from the internet or the GPN (General Purpose Network) using SWAN (Service for Web-based ANalysis, https://swan.web.cern.ch/). It is important to note that SWAN is not available (as of July 2017) on the TN (Technical Network).
To log parameters manually (e.g., those not already present in CALS) one can use different approaches (Matlab, Python, Mathematica; see the Scripting Tools wiki, https://wikis.cern.ch/display/ST/Scripting+Tools+Home). In addition to logging, a similar approach can be extended to modify the machine settings (provided a valid RBAC token, if required).
In the following we will show some examples of how to use pyTimber together with pandas dataframes. Before doing that, we would like to comment on the difference between the cycleStamp and the acquisitionStamp.
Fundamentally, CALS logs series in a two-column format (times, values). The time can be the timestamp of the cycle related to that acquisition (cycleStamp) or the acquisitionStamp. For the Injector Complex it is much more convenient to log by cycleStamp, because this allows one to compare quantities related to the same cycle. In general it is not interesting to compare observations related to different cycles, even if their acquisition stamps are very close. In machines with very long cycles (LHC) the cycleStamp concept is no longer interesting. In other words, the cycleStamp is useful only if the machine is intrinsically PPM. For PPM machines one can extract data from CALS using fundamental filters. This feature is not very attractive for the LHC.
As we will see, these observations have a strong impact on how the pandas dataframes are organized. For instance, if we want to extend an existing dataframe with an additional variable, this is somewhat trivial for the LHC, but for the other machines we have to maintain synchronization to the same cycleStamps. It is important to observe that the cycleStamps between different cycles in the same machine, or between machines, have fixed time offsets. One can therefore use a sort of cycleStamp arithmetic to follow the same beam through the different machines of the injector complex, or to investigate the effect of the supercycle (SC) composition on the beam performance of a machine.
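As a toy illustration of this cycleStamp arithmetic, here is a minimal plain-pandas sketch with made-up timestamps; the fixed 635 ms C0 offset between the PSB and the PS is the value quoted in the MTE use case later in this document.

```python
import datetime
import pandas as pnd

# Toy cycleStamps of three PS cycles (purely illustrative values)
ps_stamps = pnd.DatetimeIndex(['2017-07-01 12:00:00.635',
                               '2017-07-01 12:00:01.835',
                               '2017-07-01 12:00:03.035'])

# Fixed C0 offset between the PSB and the PS (635 ms)
offset = datetime.timedelta(milliseconds=635)

# cycleStamp arithmetic: the PSB cycleStamps that fed these PS cycles
psb_stamps = ps_stamps - offset
print(psb_stamps[0])  # 2017-07-01 12:00:00
```

Subtracting a fixed timedelta from a DatetimeIndex shifts all the cycleStamps at once, which is exactly what is needed to follow the same beam across machines.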
JAPC is the Java API for Parameters Control. See the presentation by W. Sliwinski at https://indico.cern.ch/event/404646/.
BE-CO chose to provide a Java API to control the machine parameters. On the other hand, Java is not very well suited for scientific computing without a major effort (at least in our opinion). Indeed, MATLAB is based on Java but is not open source. A first successful attempt to GET/SET/SUBSCRIBE in MATLAB was made in the past within the CTF3 community (main contributor: D. Gamba). More recently a similar approach was adopted for Python (pyJAPC, by T. Levens and M. Betz). In parallel, R. De Maria developed pyTimber (to access CALS) and pyLSA (together with M. Hostettler). In addition, pyLogbook was developed by S. Gessner. These tools naturally complement the JMAD package and all the Python code developed in BE-ABP (PyHEADTAIL, PyECLOUD, ...).
In the future we will describe how to use pyJAPC and pyLSA, respectively, to GET/SET/SUBSCRIBE data from/to the HW (or the last settings of the LSA database) and to GET the historical trims from LSA.
In [1]:
import sys
sys.path.append('/eos/user/s/sterbini/MD_ANALYSIS/public/')
from myToolbox import *
In [2]:
# Using pyTimber, we search for the variable names
varSet1=log.search('%TPS15%')  # recorded by acqStamp since it is not PPM
varSet2=log.search('CPS.TGM%') # recorded by cycleStamp
print(varSet1) # just to show what is inside
print(varSet2) # just to show what is inside
In [3]:
extractFromTime=myToolbox.time_1_hour_ago(hours=2)
extractToTime=myToolbox.time_1_hour_ago(hours=1)
# Heavily using PANDAS
# we cannot use a fundamental filter (varSet1 is recorded by acqStamp)!
myDataFrame1=myToolbox.cals2pnd(varSet1,extractFromTime,extractToTime)
# we can use a fundamental filter
myDataFrame2=myToolbox.cals2pnd(varSet2,extractFromTime,extractToTime,fundamental='%TOF')
In [4]:
# Now I can merge and create a postprocessing DF
rawDF=myToolbox.mergeDF(myDataFrame1,myDataFrame2)
# optionally we can define a postprocessing function
def postprocess(df):
    aux=pnd.DataFrame()
    aux['PE.TPS15.359.CONTROLLER:ANGLE filled']=df['PE.TPS15.359.CONTROLLER:ANGLE'].ffill()
    aux['PE.TPS15.359.CONTROLLER:ANGLE filled, doubled']=aux['PE.TPS15.359.CONTROLLER:ANGLE filled']*2+1
    return aux
postDF=postprocess(rawDF)
# I suggest not merging the rawDF with the postDF: this allows extending the raw data later.
In [5]:
# starting from the original DFs it is now trivial to extend them.
# Note that we have to remember that the second DF needs a fundamental filter.
myDataFrame1=myToolbox.cals2pnd(myDataFrame1.columns, rawDF.index[-1],
                                rawDF.index[-1]+datetime.timedelta(hours=1))
myDataFrame2=myToolbox.cals2pnd(myDataFrame2.columns, rawDF.index[-1],
                                rawDF.index[-1]+datetime.timedelta(hours=1),
                                fundamental='%TOF')
In [6]:
# and now we can iterate with the merging
aux=myToolbox.mergeDF(myDataFrame1,myDataFrame2)
# and with the concatenation
rawDF=myToolbox.concatDF(rawDF,aux)
postDF=myToolbox.concatDF(postDF,postprocess(aux))
# we suggest keeping the raw data, the postprocessing functions and the postprocessed data well separated.
In [7]:
# print the dataFrame head
rawDF.head()
Out[7]:
In [8]:
# describe the dataFrame
postDF.describe()
Out[8]:
In [9]:
# extract two columns as a dataframe
rawDF[['CPS.TGM:USER','CPS.TGM:DEST']].head()
Out[9]:
In [10]:
# extract the fourth and fifth rows
rawDF.iloc[3:5]
Out[10]:
In [11]:
# extract between time
rawDF.between_time('14:02','14:03')
Out[11]:
The pandas dataframes (DF) are a very flexible data type for postprocessing CALS data. They come with a rich set of methods and a very active community. Think of the DFs as LEGO bricks: play with them, merge and concatenate them. Keep the raw DFs (extracted from CALS), the postprocessing methods and the postprocessed DFs well separated. NB: numpy is of course more performant than pandas DFs, but I tend to trade performance for flexibility.
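To make the LEGO analogy concrete, here is a minimal self-contained sketch in plain pandas (toy data, no CALS access; the column names are invented) of merging two raw DFs on their time index and keeping the postprocessing separate from the raw data:

```python
import pandas as pnd

# Two "raw" DFs with partially overlapping time indexes,
# as one gets from two independent CALS queries
idx1 = pnd.to_datetime(['2017-07-01 14:00', '2017-07-01 14:02'])
idx2 = pnd.to_datetime(['2017-07-01 14:01', '2017-07-01 14:02'])
df1 = pnd.DataFrame({'A': [1.0, 2.0]}, index=idx1)
df2 = pnd.DataFrame({'B': [10.0, 20.0]}, index=idx2)

# Merge on the time index: the union of timestamps, NaN where a variable is absent
rawDF = df1.join(df2, how='outer')

# Keep the postprocessing in a separate function and a separate DF
def postprocess(df):
    aux = pnd.DataFrame(index=df.index)
    aux['A filled'] = df['A'].ffill()  # forward-fill the gaps
    return aux

postDF = postprocess(rawDF)
print(postDF['A filled'].tolist())  # [1.0, 1.0, 2.0]
```

The same pattern scales to the CALS case: re-extract a fresh raw DF, concatenate it to the stored one, and re-run the postprocessing function.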
In [12]:
CPSDF=myToolbox.cals2pnd(['CPS.TGM:USER'],
myToolbox.time_1_hour_ago(hours=1./6.),
myToolbox.time_now(),fundamental='%TOF')
CPSDF.head()
Out[12]:
In [16]:
# the important method to use is indexesFromCALS
cycleAfterDF=myToolbox.indexesFromCALS(CPSDF.index+datetime.timedelta(seconds=1.2),CPSDF.columns)
cycleAfterDF.head()
Out[16]:
Let us now consider another use case. We would like to extract the list of cycleStamps and some basic variables for the MTE cycles along the injector chain. This is an example of cycleStamp arithmetic. We have to remember that the offset between the C0 times of the PSB, PS and SPS is 635 ms.
In [27]:
CPSDF=myToolbox.cals2pnd(['CPS.TGM:USER','CPS.TGM:BATCH','PR.DCAFTINJ_1:INTENSITY'],myToolbox.time_1_hour_ago(hours=1./6.),myToolbox.time_now(),fundamental='%MTE%')
In [28]:
firstBatch=CPSDF['CPS.TGM:BATCH']==1
SPSDF=myToolbox.indexesFromCALS(CPSDF[firstBatch].index+datetime.timedelta(seconds=0.635), ['SPS.TGM:USER'])
PSBDF=myToolbox.indexesFromCALS(CPSDF.index-datetime.timedelta(seconds=0.635), ['PSB.TGM:USER']+log.search('%_BCT_ACC_INTENSITY'))
In [40]:
print('==================================')
print('PSB')
print(PSBDF['PSB.TGM:USER'].head(4))
print('==================================')
print('CPS')
print(CPSDF['CPS.TGM:USER'].head(4))
print('==================================')
print('SPS')
print(SPSDF['SPS.TGM:USER'].head(2))
In [79]:
postPSBDF=pnd.DataFrame()
# note the filtering with a regular expression and the sum across the columns (axis=1)
postPSBDF['Total Intensity']=PSBDF.filter(regex='BR*').sum(axis=1)
postPSBDF.head()
Out[79]:
In [78]:
postCPSDF=pnd.DataFrame()
# we now build a series using data from different DFs, adopting the indexing (cycleStamp) from the PS.
postCPSDF['transmission']=pnd.Series(CPSDF['PR.DCAFTINJ_1:INTENSITY'].values/postPSBDF['Total Intensity'].values,index=CPSDF.index)
plt.plot(postCPSDF['transmission'])
myToolbox.setXlabel(ax=plt.gca(), hours=1/30.)
Out[78]:
In [120]:
t1=myToolbox.time_1_hour_ago(hours=.1)
t2=myToolbox.time_now()
CPS=myToolbox.cals2pnd(log.search('CPS.TGM%'),t1,t2)
PSB=myToolbox.cals2pnd(log.search('PSB.TGM%'),t1,t2)
SPS=myToolbox.cals2pnd(log.search('SPS.TGM%'),t1,t2)
SPS.head(10)
Out[120]:
In [122]:
SCNUM=1
PSB[PSB['PSB.TGM:SCNUM']==SCNUM].head(1)
Out[122]:
In [123]:
CPS[CPS['CPS.TGM:SCNUM']==SCNUM].head(1)
Out[123]:
In [124]:
SPS[SPS['SPS.TGM:SCNUM']==SCNUM]
Out[124]:
In [125]:
CPS[CPS['CPS.TGM:SCNUM']==SCNUM].index[0]-PSB[PSB['PSB.TGM:SCNUM']==SCNUM].index[0]
Out[125]:
In [126]:
SPS[SPS['SPS.TGM:SCNUM']==SCNUM].index[0]-CPS[CPS['CPS.TGM:SCNUM']==SCNUM].index[0]
Out[126]:
Not all the machine data are recorded in CALS. To get and record data not present in CALS, one can subscribe to the data with JAPC (https://wikis.cern.ch/display/ST/Libraries+Available). I use the Matlab/JAPC interface a lot, but I would like to migrate to the Python/JAPC solution. In the following we assume you have some Matlab files in a folder and want to import them. As you will see, the approach is very similar to the CALS dataframes.
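The `japcMatlabImport` used below is part of the author's toolbox; as a rough, self-contained sketch of what reading such a file involves, one can use `scipy.io`. Here we first write a toy `.mat` file so the example runs anywhere; the variable name is invented.

```python
import os
import tempfile
import numpy as np
import scipy.io

# Write a toy .mat file standing in for a JAPC monitor dump
fname = os.path.join(tempfile.mkdtemp(), 'monitor.mat')
scipy.io.savemat(fname, {'I_MEAS': np.arange(5.0)})

# Read it back; loadmat returns a dict of numpy arrays
content = scipy.io.loadmat(fname)
i_meas = content['I_MEAS'].flatten()
print(i_meas)  # [0. 1. 2. 3. 4.]
```

The toolbox presumably wraps this kind of loading and adds the dot-style access to the nested JAPC structures seen in the cells below.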
In [80]:
# list all the files
myFiles=sorted(glob.glob('/eos/user/s/sterbini/MD_ANALYSIS/2016/MD1949/2016.11.17/Monitor1/I250/*mat'))
myFiles
Out[80]:
In [81]:
# check the content of a single file
myFileFormat=myToolbox.japcMatlabImport(myFiles[0])
plt.plot(myFileFormat.RPPBK_BA5_BBLR5177M.LOG_OASIS_I_MEAS.value.DATA)
plt.plot(myFileFormat.RPPBK_BA5_BBLR5177M.LOG_OASIS_I_REF.value.DATA)
plt.xlabel('time [ms]')
plt.ylabel('I [A]')
plt.axis('tight')
Out[81]:
In [82]:
myFileFormat.parameters
Out[82]:
In [83]:
# import all the selected files and variables into a dataframe
MD1949=myToolbox.fromMatlabToDataFrame(myFiles,['RPPBK_BA5_BBLR5177M.LOG_OASIS_I_MEAS.value.DATA'])
MD1949.head()
Out[83]:
In [84]:
# we can later add additional variables from the Matlab files
myToolbox.addToDataFrameFromMatlab(MD1949,['SPS_BCTDC_41435.Acquisition.value.totalIntensity'])
MD1949.head()
Out[84]:
Starting from this generic dataframe, we can use the methods described above to complete the information using CALS data.