Welcome to the first guided LHCb Open Data Portal project!
Before you start, you might find it helpful to find out more about matter antimatter asymmetries, what we hope to learn by studying them, and how we can detect them with experiments such as the LHCb experiment at CERN.
Here are some details that relate directly to this project:
Once you have tried the analysis please provide feedback to help us improve this project using this brief survey.
Feel free to post questions, feedback and discuss results with other users using the GitHub issue tracker.
The most important coding guidance we are providing you is in the form of an unrelated analysis. We have performed an analysis of Nobel prizes winners. That link provides you with the full code for this. The coding skills required for this Nobel analysis is very similar to that needed for the particle physics analysis. Hence by reading and understandign that analysis you can copy and adapt the lines of code to perform your particle physics analysis.
In order to get started and check the first code that you will be writing works correctly it is best to start by analysing simulated data rather than real data from the LHC. The real data contains not only the type of events that you wish to analyse, known as the 'signal', but also events that can fake these, known as 'background'. The real data measurements are also limited by the resolution of the detector. The simplified simulation data provided here contains only the signal events and provides the results that would be obtained for a perfect detector.
IMPORTANT: For every code box with code already in it, like the one below you must click in and press shift+enter to run the code.
If the In [x]:
to the left of a codebox changes to In [*]:
that means the code in that box is currently running
In [ ]:
from __future__ import print_function
from __future__ import division
%pylab inline
execfile('Data/setup_main_analysis.py')
If you want help with coding there is in addition to the example code, some hints within each section and a function reference list.
In [ ]:
# Let us now load the simulated data
sim_data = read_root('Data/PhaseSpaceSimulation.root')
The data contains information about 'events' that were observed in the detector. An event refers to the particles produced when an interaction took place when two proton are collided at the LHC. The data you have includes information about particles observed in the detector after each collision. If you think of the data as a table, each row of the table is the results from a different collision. The columns of the table are different quantities measured about the particles produced in the collision.
We are interested in analysing the decays of particles called B+ or B- mesons decaying into three other mesons called kaons (K+ or K-). The events you have been given are those in which this process may have occurred. The detector has been used to reconstruct tracks that may have come from the kaons. You are given the measured momenta, charge, and likelihood of the tracks being kaons. You are given information for three tracks in each event, the ones that could be the three kaons that a B+ or B- meson has decayed into. The following information is available about each event: information list
In [ ]:
# make a table of the data variables here
Creating a table - Use your head()
- remember to look at the example analysis to see how this was done there.
In [ ]:
# make a histogram of the H1_PX variable here
Momentum is a vector quantity, it has x,y, and z components. Try calculating the magnitude of the momentum of the first kaon candidate and plotting a histogram of this, you'll need the H1_PX
, H1_PY
and H1_PZ
variables.
In [ ]:
# calculate a variable for the magnitude of the momentum of the first kaon
# plot a histogram of this variable
Histogram plotting - You can use the hist() function. The parameters bins(n) and range(x,y) allow youto plot n bins over the range x to y.
Vector Magnitude The square magnitude of a magnitude of a vector is given by the sum of the square of its of its components in the x,y and z directions: $p^2 = p_x^2+p_y^2+p_z^2$, where $p$ is the magnitude of the momentum, and $p_x,p_y,p_z$ are the components of the momentum in the X,Y, and Z directions.
Einstein's theory of special relativity relates Energy, mass and momentum. We have measured the momentum of the kaon candidates in the detector, and have just plotted one of the components of the momentum of the kaon, and the magnitude of the momentum. The invariant mass of the kaon is well known and you can look this up. We wish to determine the energy of the kaons.
Here is a brief guide to the energy-momentum relation of special relativity. Further information can be found on wikipedia pages on Invariant Mass and the Energy-momentum relation.
Now, calculate the energy of the first kaon candidate using:
In [ ]:
# calculate the energy of the first kaon
In [ ]:
# plot a histogram of this variable
Energy calculation - Use the magnitude of momentum variable you calculated above and the known invariant mass of the kaon to work out the energy of the first hadron. Calculate the energy squared, and then the energy and plot this.
Kaon mass - you can find the kaon mass on wikipedia or in physics textbooks. There is also a reference used by particle physicists: all our knowledge of the properties of the particles are collected together by the particle data group here.
Calculate the momenta and energies of the second and third kaon candidates also.
In [ ]:
# calculate variables for the energy of the other two kaons
In this analysis we are looking for B+ or B- mesons (see B meson) that have decayed into the three charged kaons.
Energy is a conserved quantities. This means that you can use the energy of the three 'daughter' kaons, which you have calculated above, to calculate the energy that the B meson that decayed into them must have.
Momentum is also a conserved quantity. Hence you can also use the momenta of the 'daughter' kaons to calculate the momentum of the B meson. But be careful - momentum is a vector quantity.
Using the Energy of the B meson and the magnitude of the momentum of the B meson you can use the energy-momentum relationship again. This time you are applying it to the B meson. This will allow you to calculate the invariant mass of the B meson.
In [ ]:
# calculate the energy of the B meson
In [ ]:
# calculate the momentum components of the B meson
# and the magnitude of the momentum of the B meson
In [ ]:
# calculate the B meson invariant mass
# plot the B meson invariant mass in a histogram
You should have a graph that sharply peaks at the mass of the B+ meson. The mass of the B+ and B- meson are the same. Check that the peak of your graph is at the known mass of the B meson. Congratulations!
Recall that you have made this plot for simulated data. How might you expect the plots for real data to look different ? In the next section you will start to work with the real LHC data.
B Meson Energy - From energy conservation, the energy of the B meson will be the sum of the energies of the three kaons: $E_B=E_{K1}+E_{K2}+E_{K3}$, where $E_B$ is the energy of the B meson, $E_{K1}, E_{K2}, E_{K3}$ are the energies of each of the kaons.
B meson momentum - From momentum conservation, the X component of the momentum of the B meson will be the sum of the X momentum components of the three Kaons : $px_B=px_{K1}+px_{K2}+px_{K3}$, where $px$ is the X direction component of the momentum of the B meson, $px_{K1},px_{K2},px_{K3}$ are the X direction components of the momenta of the three kaons. You can then do the same with the Y and Z components. Having obtained the X,Y, and z components of the B momentum you can find the magnitude of the momentum of the B meson.
B meson invariant mass* - Rearrange the equation $E^2=p^2+m^2$ to find $m^2$. Using the values of the magnitude of the momentum of the B meson and the B meson Energy, find the mass of the B meson.
Histogram plotting - Take care that the range of your mass plot is set suitably that you can see the mass peak. Once you have found the peak you can set the range appropriately. You do not have to start your graph at a mass of 0.
Units - The data you are provided has energies in 'MeV' (106 electron volts). The mass of the B meson is often quoted in 'GeV/c2' (109 electron volts).
In the section above you have analysed the simulation data to determine the invariant mass of the B meson. Now, you can start applying the methods you have used to the real LHCb data. This data was collected by the LHCb detector at CERN during 2011, the first major year of LHC operations.
The data you are given has been filtered to select only events that are likely to have come from B+ or B- mesons decaying into three final state charged particles. You are interested in the case where these three final state paticles are charged kaons K+ or K-.
An introduction has been provided on the detector and data sample. As background information we also provide further information on the selection that has been applied to select this data sample.
You want to apply a preselection to the three final state tracks that
!H1_isMuon
where !
means not
, and similarly for H2
and H3
)H1_ProbPi < 0.5
)H1_ProbK > 0.5
)You need to find a balance between making cuts that are too loose and include too many background events and too tight and reject many of your signal events.
In order to now find the most suitable further selection cuts, make yourself familiar with how cuts can affect the significance of the final result. Feel free to come back to this stage later and adjust your cuts to see the impact.
The pre selection you create will be applied for you if give it the name 'preselection'.
We have provided an example preselection in the hints, so feel free to use that to get started if you wish. start with a loose preselection and then refine it after you have studied the plots.
In [ ]:
# Make your preselection here, this line applies no preselection
preselection = ""
This next line of code just loads the real data into a new DataFrame, this may take a few minutes. It also applies the preselection that you have created if you called it preselection.
In [ ]:
real_data = read_root(['Data/B2HHH_MagnetDown.root', 'Data/B2HHH_MagnetUp.root'], where=preselection)
Make histograms of the probability of a final state particle being a kaon or a pion. These will help guide you on suitable probability values at which to cut.
You can also consider more sophisticated options like 2-D plots of kaon and pion probabilities or different values of the cuts for the different final state particles.
In [ ]:
# plot the probability that a final state particle is a kaon
In [ ]:
# plot the probability that a final state particle is a pion
Now calculate the invariant mass of the B meson for the real data and plot a histogram of this. Compare it with the one you drew for the simulation data.
Can you explain the differences you observe?
In [ ]:
# draw a histogram for the B meson mass in the real data
Experiment with the cuts and see the impact of harsher or more lenient cuts on the invariant mass plot. You should select a set of cuts which makes the signal most prominent with respect to the background. Once you have finalised the selection on particle identification also make cuts on the reconstructed particle mass to select the events in the B meson mass peak, removing the background events which lie at lower and higher invariant masses.
This is an example string, showing the syntax, that you could use as a preselection starting point:
In [ ]:
preselection = "H1_ProbPi < 0.5 & H2_ProbPi < 0.5 & H3_ProbPi < 0.5 & H1_ProbK > 0.5 & H2_ProbK > 0.5 & H3_ProbK > 0.5 & !H1_isMuon & !H2_isMuon & !H3_isMuon"
In this section you will start to study matter antimatter differences (CP Violation). Here 'global' means that you are looking for differences across all ranges of energy and momentum (the kinematics) of the kaons into which the charge B mesons have decayed. Later we look at 'local' differences in different regions of the kinematics.
In order to quantify the matter antimatter asymmetry in this process we wish to compare the B+ and the B- particles. The B- is the anti-particle of the B+.
How can you distinguish between events that contain B+ and B- particles using H1_Charge
, H2_Charge
and H3_Charge
?
In [ ]:
# make a variable for the charge of the B mesons
Now count the numbers of events of each of the two types (N+ and N-). Also calculate the difference between these two numbers.
In [ ]:
# make variables for the numbers of positive and negative B mesons
In order to calculate the Asymmetry, you can make use of the formula:
(note you may need to run this box in order to see the image)
In [ ]:
# calculate the value of the asymmetry, by using the formula above, and then print it
Differentiating between N+ and N-
len(real_data.query('B_Charge == charge'))
to count the number of mesons, where B_Charge
is the variable you created and charge
is 1
or -1
.You will now need to calculate the statistical uncertainty of the asymmetry. You can do so using the formula:
The significance of the result, sigma, is found by dividing the value for asymmetry by its uncertainty. A value exceeding three sigma is considered "evidence" by particle physicists while a value of five sigma or more can be called an "observation" or "discovery".
In [ ]:
# calculate the statistical significance of your result and print it
Congratulations! You have performed your first search for a matter anti-matter difference.
Here you have only considered the statistical uncertainty. Your measurement will also have other sources of uncertainty known as systematic uncertainties which you have not considered at this stage.
In this stage we introduce you to an important technique for analysing decays of one particle (your charged B meson) into three bodies (the three kaons). This is known as a Dalitz plot.
The decay of the B meson can proceed either directly to the three-body final state or via an intermediate particle. For example, B+ → K+K+K−, could proceed through the decay B+ → K+R0, where R0 is a neutral particle resonance which can decay R0 → K+K-. Dalitz plots can be used to identify these resonances which are visible as bands on the Dalitz plot.
More information about these plots and why these are used in particle physics research can be found in Dalitz Plot Introduction.
The kinematics of a three-body decay can be fully described using only two variables. The energies and momenta of the three kaons are not independent of each other as they all come from the decay of a B meson and energy and momentum are conserved. The axes of the plots conventionally are the squared invariant masses of two pairs of the decay products. It is a 2D plot, the x and y axes are both squared masses and the density of points in the plot shows the structure.
Consider our decay B+ → K+1K+2K−3, where we have numbered the kaons 1,2,3 to distinguish them. We can calculate the invariant mass of three possible combinations that could correspond to intermediate resonances R++1 → K+1K+2, R02 → K+1K-3, and R03 → K+3K-3.
The potential R++1 would be a doubly charged resonance. We would not expect to see any resonances corresponding to this as mesons are composed of one quark and one anti-quark and their charges cannot add up to two units.
The potential R02 and R03 correspond to configurations in which we could see resonances. Hence you should compute the invariant mass combinations for these. The square of these masses should be used as the Dalitz variables.
We suggest you make these plots first for the simulation data. In the simulation there are no intermediate resonances and your plot should be of uniform density inside the range physically allowed by energy and momentum conservation.
In [ ]:
# calculate the invariant masses for each possible hadron pair combination
In [ ]:
# plot the invariant mass for one of these combinations
In [ ]:
# make a Dalitz plot with labelled axes for the simulation data
Calculating invariant mass - Use the same technique as you did above for the B meson, but now applying it to two-body invariant masses rather than three.
Plotting the Dalitz plot - You can use a scatter
plot from matplotlib
to plot a Dalitz plotm, see the example analysis. Remember to use the square of each two-body mass.
In [ ]:
# calculate the invariant masses for each possible hadron pair combination in the real data
In [ ]:
# make a Dalitz plot for the real data (with your preselection cuts applied)
You can make a further improvement to allow you to observe the resonances easier. Your resonances R02 and R03 are both composed of the same particle types, K+K-, and hence have the same distributions. It is useful to impose an ordering which distinguishes the resonances. We can call the resonances R0Low and R0High. In each event R0Low is the resonance with the lower mass and the other corresponds to the higher mass combination of kaons. You can now use the mass of these ordered resonances as your Dalitz plot variables, thus effectively "folding" your Dalitz plot so that one axis always has a higher value than the other.
In [ ]:
# make a new Dalitz plot with a mass ordering of the axes
Ordered Dalitz plot - You can find the maximum of the mass of R0Low vs R0High elementwise on one axis, and the minimum of on the other. You can use numpy.min(a,b)
and numpy.max(a,b)
to perform elementwise comparisons between two arrays a
and b
and return one array filled by either the individual min/max element from the elementwise comparisons.
In [ ]:
# plot a binned Dalitz Plot
# use colorbar() to make a legend for your plot at the side
You can now use your Dalitz plot to identify the intermediate resonances that you see in your plots. The resonances will have shown up as bands of higher density of points on the plots. You can use the particle data group tables of mesons to identify which particles these correspond to. The tables give the masses and widths of the particles and their decay modes. You are looking for mesons with the masses corresponding to where you see the bands and that decay into K+K-.
Congratulations! You have succesfully made a Dalitz plot and used it to observe the presence of intermediate particles in the decay of your charged B meson into three charged kaons.
In a section above you searched for global CP violation. You probably did not find a result with very high significance.
CP violation may arise from interference between decays through different resonances, and hence the magnitude and sign of the CP violation may vary across the Dalitz plot. We can apply the same equation as in the global CP violation study
The analysis performed here is to study the CP violation in the charmless B meson decays to kaons. "charmless" means that the decay does not proceed through a charm quark. However, the most frequent decay of the B mesons occur through the b quark decaying into a c quark. The majority of these events can be removed by rejecting the events that are proceeding through a D0 meson (which contains the charm quark).
In the section above you plotted a histogram of the invariant mass of the intermediate resonances and will have observed the D0 meson in this and in the Dalitz plot. You should now reject events that are around the mass range of the D0 meson to suppress this contribution. You can do this in your pre-selection on the data that you set-up earlier in the project.
This was also a simplification that we did not consider when we were calculating the global asymmetry. After you have applied this pre-selection your code will now recompute the global asymmetry with the D0 meson contribution rejected. We have not yet observed CP violation in charm mesons and searching for this is another active area of current research.
Make separate Dalitz plots for the B+ and the B- decays. Local CP Violation will show up as an asymmetry between the B+ and the B- plots.
In order that the statistical error on the asymmetry in each bin is not over large the bins need to contain a reasonable number of entries. Hence you will probably need larger bins than when you were looking for resonances in the section above. A suitable initial bin size might be $2.5~\text{GeV}^2/\text{c}^4 \times 2.5~\text{GeV}^2/\text{c}^4$.
In [ ]:
# make a Dalitz plot for the B+ events
In [ ]:
# make a Dalitz plot for the B- events
In [ ]:
# Make a plot showing the asymmetry between these two Daltz plots
# i.e. calculate the asymmetry between each bin of the B+ and B- Dalitz plots and show the result in another 2D plot
Observing a large asymmetry in some regions of the plot does not necessarily mean you have observed CP violation. If there are very few events in that region of the plot the uncertainty on that large asymmetry may be large. Hence, the value may still be compatible with zero.
You can calculate the statistical uncertainty on the asymmetry, for each bin of the plot, using the same formulas as you used in the global asymmetry section. You can then make a plot showing the uncertainty on the asymmetry.
Dividing the plot showing the asymmetry by the plot showing the statistical uncertainty you can then obtain the significance of the asymmetry in each bin. You can then plot the significance of the asymmetry to see if there is any evidence for CP violation.
In [ ]:
# Make a plot showing the uncertainty on the asymmetry
In [ ]:
# Make a plot showing the statistical significance of the asymmetry
From your studies of the asymmetry plot, and the plot of its significance, you will be able to identify regions in the Dalitz plots that show indications of sizeable and significant CP Violation. You may find you have several consecutive bins with significant positive, or negative, asymmetries. You may wish to try alternative binnings of the Dalitz plots to best isolate the regions in which the significant asymmetries occur.
You can select events that are in these regions of the Dalitz plots where you observe signs of CP Violation. You can then plot a simple 1D histogram of the invariant mass distribution of the B+ and the B- events, just as you did at the start of the project, but only for events that lie in the region of the Dalitz plot that you are interested in. Make the plots of the B+ and the B- events with the same scale, or superimpose the two plots, so that you can observe if the particle and anti-particle decay processes are occurring at the same rate.
In [ ]:
# Make a plot showing the invariant mass of the B+ meson particles
# using events from a region of the Dalitz plot showing sizeable CP asymmetries
In [ ]:
# Make a plot showing the invariant mass of the B- meson particles using events from the same region
Congratulations! You should now have succesfully observed significant evidence for CP Violation. You should have plots that clearly show that particle and anti-particle decay processes occur at different rates in local regions of the Dalitz plot. You may wish to comapre your results with those published by the LHCb collaboration in this paper.
Well Done you have succesfully completed your first particle physics analysis project. There are many more analyses that can be conducted witht the data set that you have been provided and the skills that you have gained. Ideas for some of these are explored in the section below. Maybe you can discover something new!
Now you've finished the analysis please provide feedback to help us improve this project using this brief survey.
The data set you have been provided is the full set of data recorded by LHCb preselected for decays of charged B mesons into three final state tracks. This data set has been used for two important publications, here and here.
We discuss here:
In this analysis you considered the statistical uncertainty on the result. This occurs as a result of having only a limited number of events. In addition there are systematic uncertainties, these arise from biases in your measurement. Here we discuss three sources of these for this analysis.
One source of 'background' events arises from random combinations of tracks in events that happen to fake the 'signal' characteristics. These events will not peak in the mass distribution at the mass of the B meson but rtaher will have a smoothly varying distribution. Looking at the number and distribution of of events away from the mass peak can allow you to estimate the number of background events under the mass peak.
The next level of sophistication in the analysis requires fitting the distributions of events that are observed in the B mass distribution in order to estimate the yield of signal events and background events. You can see how this is done in the LHCb paper on the analysis. Fitting can be performed using the CERN root framework.
The LHCb papers using this data set that you are using analysed four decay channels of the charged B mesons. You can perform any of these analyses.