Challenge 01: Particle Counting

The Timepix hybrid silicon pixel detector is an impressive bit of kit. Developed at CERN by the Medipix Collaboration, and used in the Large Hadron Collider (LHC), it can detect and visualise ionising radiation in real time. In this challenge, we will use code to count and identify different types of radiation emitted by a radioactive notebook at the Royal Institution of Great Britain.

Figure 1: A CERN@school MX-10 detector (left) mounted above one of William Crookes' notebooks at the Royal Institution of Great Britain. You can also see a Geiger counter to the right.

Ionising radiation, such as alphas (helium nuclei), betas (electrons and positrons) and gammas (photons), registers as different patterns of pixels in each frame of data. The pixel information is recorded in a text file with the x and y position of the hit on the sensor and the count value C of the hit, which is related to the energy of the radiation.

We have provided data from a frame recorded in one second from the above experimental setup. We can read this data into Python as follows:



In [1]:

    
# Import the data analysis tools we need from the Pandas toolkit.
import pandas as pd

## Our pixel data stored as a Panda DataFrame.
df000 = pd.read_table('./data/crookes/data000.txt', header=None, names=["x", "y", "C"])

# Show the first five rows of pixel data.
df000.head()

The DataFrame object provides a convenient wrapper for the pixel data read from the raw text file the detector produced. For example, we can find out how many pixels there are in total in the frame:



In [2]:

    
print("There are %d pixels in the data frame." % (len(df000)))









    



There are 735 pixels in the data frame.

Of course, what we're really interested in is what the pixels look like as a whole so we can see the patterns and start to identify what has been detected. We've written some code to do this for you:



In [3]:

    
# Import a function that extracts a pixel dictionary from the data frame.
from cernatschool.helpers import get_pixel_dictionary_from_dataframe

# Import a function that makes an image from a pixel dictionary.
from cernatschool.visualisation import make_frame_image

# (This line just allows us to show images as we go...)
%matplotlib inline

## The pixel dictionary - contains the pixel information in the format {X:C}.
pixel_dict = get_pixel_dictionary_from_dataframe(df000)

# Make the frame image from the pixel dictionary.
make_frame_image(pixel_dict)

And there you go! A 256 x 256 pixel image of the radiation detected in one second of recording data from above Crookes' notebook. The colours indicate how much energy was deposited in each pixel using the scale on the right of the image.



In [4]:

    
# Exercise 1: Manual particle counting
#
# How many particles can you count in the above image? Write your answer below.
#
#

Now, you should be able to make out three sorts of clusters - groups of pixels adjacent pixels - in the image, corresponding to the three types of radiation we'd expect to see from this measurement.

Firstly, there are the alphas. Alpha radiation consists of helium nuclei - two protons and two neutrons with an overall charge of 2+. These deposit large amounts of energy in the detector. To cut a long story short, this results in many pixels being activated over a large area in the image, resulting in "blobs" like these:



In [5]:

    
# Find the clusters - groups of adjacent pixels - in the data.
from cernatschool.kluster import KlusterFinder

# Get one of the alpha clusters.
alpha_cluster_1 = KlusterFinder(pixel_dict, 256, 256).getListOfKlusters()[4]

# Import a function for making an image of just one cluster.
from cernatschool.visualisation import make_kluster_image

# Make the image.
make_kluster_image(alpha_cluster_1)

Notice too how all of the energy is centred in the middle of the cluster.

Secondly, we have beta radiation - electrons and positrons - that scatters through the silicon of the detector leaving curly tracks of pixels like in the following cluster:



In [6]:

    
# An example of a beta cluster from the data.
beta_cluster_1 = KlusterFinder(pixel_dict, 256, 256).getListOfKlusters()[11]

# Make the image!
make_kluster_image(beta_cluster_1)

Finally, we have gamma radiation - photons that set off single pixels, or two, three of four when they hit the boundaries between pixels:



In [7]:

    
# An example of a tripixel gamma candidate.
gamma_cluster_1 = KlusterFinder(pixel_dict, 256, 256).getListOfKlusters()[25]

# Make the image!
make_kluster_image(gamma_cluster_1)

We can use the KlusterFinder Python object to find the clusters of pixels in the data, and the make_kluster_image function to make an image of the cluster that lets us look at it in more detail. Why not try looking at more clusters yourself?



In [8]:

    
# Exercise 2: Making your own cluster images
#
# Use code like the following to make your own cluster images from the data!
#
# First, let's make a Python list of the clusters:

klusters = KlusterFinder(pixel_dict, 256, 256).getListOfKlusters()

# We can now make an image by specifying the index in the list.
# Try changing the number 12 to another number to see the different
# clusters!

make_kluster_image(klusters[12])

It's fun to make the pictures of the clusters, but we can actually use code to do some important - but, dare we say, tedious - work for us. For example, how did you get on with the particle counting in Exercise 1? Our code makes tasks like this a lot easier...



In [9]:

    
print("Number of clusters (particle) in the data frame: %d" % (len(klusters)))









    



Number of clusters (particle) in the data frame: 34

Did you get it right?

Now, it turns out KlusterFinder is a bit cleverer than just counting the particles. It also does a bunch of calculations on the clusters that tell us something about the clusters themselves. For example, it calculates:

The size of the cluster (the number of pixels in the cluster);
The position of the centre of the cluster;
The radius of the cluster (the distance from the centre to the furthest pixel from the centre);
The density of the cluster (the number of pixels divided by the area of the circle defined by the cluster radius);
Whether or not the cluster is on the edge of the frame;
The fraction of pixels completely surrounded by other pixels in the cluster.

We can make a DataFrame of these properties of the clusters and present it as a table:



In [10]:

    
# Import a function that makes a DataFrame of kluster information.
from cernatschool.helpers import get_dataframe_of_klusters

## The cluster DataFrame.
kdf = get_dataframe_of_klusters(klusters)

# Show the whole table, representing all 34 clusters.
kdf









    Out[10]:






  
    
      
      size
      x_uw
      y_uw
      radius_uw
      density_uw
      isedgekluster
      innerfrac
    
  
  
    
      0
      104
      146.673077
      237.375000
      8.550607
      0.452782
      False
      0.451923
    
    
      1
      92
      59.771739
      250.130435
      6.772995
      0.638376
      False
      0.543478
    
    
      2
      73
      184.657534
      200.698630
      5.056112
      0.908949
      False
      0.520548
    
    
      3
      63
      132.571429
      208.142857
      20.033645
      0.049966
      False
      0.000000
    
    
      4
      50
      27.620000
      178.740000
      4.036335
      0.976890
      False
      0.440000
    
    
      5
      48
      112.750000
      116.000000
      3.881044
      1.014365
      False
      0.458333
    
    
      6
      44
      163.863636
      208.909091
      3.911469
      0.915426
      False
      0.431818
    
    
      7
      43
      105.651163
      49.325581
      3.712927
      0.992855
      False
      0.372093
    
    
      8
      39
      168.538462
      148.102564
      3.575981
      0.970788
      False
      0.384615
    
    
      9
      26
      96.192308
      253.384615
      3.683684
      0.609900
      True
      0.230769
    
    
      10
      19
      170.315789
      198.526316
      3.357301
      0.536566
      False
      0.000000
    
    
      11
      17
      120.764706
      161.058824
      5.246848
      0.196563
      False
      0.000000
    
    
      12
      16
      25.687500
      11.125000
      2.977126
      0.574613
      False
      0.062500
    
    
      13
      12
      43.333333
      213.083333
      3.779513
      0.267399
      False
      0.000000
    
    
      14
      12
      232.666667
      148.416667
      3.019612
      0.418918
      False
      0.000000
    
    
      15
      10
      40.600000
      13.100000
      1.941649
      0.844323
      False
      0.000000
    
    
      16
      8
      173.625000
      235.875000
      1.976424
      0.651899
      False
      0.000000
    
    
      17
      7
      28.714286
      69.714286
      2.622509
      0.323977
      False
      0.000000
    
    
      18
      6
      62.000000
      143.500000
      1.118034
      1.527887
      False
      0.000000
    
    
      19
      6
      63.166667
      159.166667
      1.178511
      1.375099
      False
      0.000000
    
    
      20
      6
      110.166667
      199.833333
      1.178511
      1.375099
      False
      0.000000
    
    
      21
      6
      48.333333
      13.833333
      1.863390
      0.550039
      False
      0.000000
    
    
      22
      4
      34.500000
      101.500000
      0.707107
      2.546479
      False
      0.000000
    
    
      23
      4
      170.000000
      186.500000
      1.118034
      1.018592
      False
      0.000000
    
    
      24
      3
      0.000000
      108.000000
      1.000000
      0.954930
      True
      0.000000
    
    
      25
      3
      131.666667
      243.333333
      0.745356
      1.718873
      False
      0.000000
    
    
      26
      3
      214.333333
      189.333333
      0.745356
      1.718873
      False
      0.000000
    
    
      27
      3
      134.666667
      14.666667
      0.745356
      1.718873
      False
      0.000000
    
    
      28
      2
      94.000000
      17.500000
      0.500000
      2.546479
      False
      0.000000
    
    
      29
      2
      200.500000
      37.000000
      0.500000
      2.546479
      False
      0.000000
    
    
      30
      1
      16.000000
      138.000000
      0.000000
      0.000000
      False
      0.000000
    
    
      31
      1
      166.000000
      91.000000
      0.000000
      0.000000
      False
      0.000000
    
    
      32
      1
      138.000000
      230.000000
      0.000000
      0.000000
      False
      0.000000
    
    
      33
      1
      97.000000
      135.000000
      0.000000
      0.000000
      False
      0.000000

That's quite a lot of information to take in - but you can use code to help. Want to find only the particles above a certain size? No problem.



In [11]:

    
## Let's look for particles with a minimum size of 50 pixels.
min_size = 50

# Apply a filter to the table using the `loc` method and a conditional statement.
kdf.loc[kdf['size'] >= min_size]









    Out[11]:






  
    
      
      size
      x_uw
      y_uw
      radius_uw
      density_uw
      isedgekluster
      innerfrac
    
  
  
    
      0
      104
      146.673077
      237.375000
      8.550607
      0.452782
      False
      0.451923
    
    
      1
      92
      59.771739
      250.130435
      6.772995
      0.638376
      False
      0.543478
    
    
      2
      73
      184.657534
      200.698630
      5.056112
      0.908949
      False
      0.520548
    
    
      3
      63
      132.571429
      208.142857
      20.033645
      0.049966
      False
      0.000000
    
    
      4
      50
      27.620000
      178.740000
      4.036335
      0.976890
      False
      0.440000

Which particles are on the edge of the frame?



In [12]:

    
kdf.loc[(kdf['isedgekluster']==True)]









    Out[12]:






  
    
      
      size
      x_uw
      y_uw
      radius_uw
      density_uw
      isedgekluster
      innerfrac
    
  
  
    
      9
      26
      96.192308
      253.384615
      3.683684
      0.60990
      True
      0.230769
    
    
      24
      3
      0.000000
      108.000000
      1.000000
      0.95493
      True
      0.000000

Which we can check with:



In [13]:

    
make_kluster_image(klusters[9])

We can also combine our search criteria by using boolean operators in the loc method like so:



In [14]:

    
## The minimum particle size.
min_size = 25

## The minimum density.
min_density = 0.5

# Find clusters that aren't on the edge, have a minimum size of 25 pixels and
# a minimum density of 0.8 using multiple conditions linked by AND & operators.
kdf.loc[(kdf['isedgekluster']==False ) & ((kdf['size'] >= min_size) & (kdf['density_uw'] >= min_density))]









    Out[14]:






  
    
      
      size
      x_uw
      y_uw
      radius_uw
      density_uw
      isedgekluster
      innerfrac
    
  
  
    
      1
      92
      59.771739
      250.130435
      6.772995
      0.638376
      False
      0.543478
    
    
      2
      73
      184.657534
      200.698630
      5.056112
      0.908949
      False
      0.520548
    
    
      4
      50
      27.620000
      178.740000
      4.036335
      0.976890
      False
      0.440000
    
    
      5
      48
      112.750000
      116.000000
      3.881044
      1.014365
      False
      0.458333
    
    
      6
      44
      163.863636
      208.909091
      3.911469
      0.915426
      False
      0.431818
    
    
      7
      43
      105.651163
      49.325581
      3.712927
      0.992855
      False
      0.372093
    
    
      8
      39
      168.538462
      148.102564
      3.575981
      0.970788
      False
      0.384615



In [15]:

    
# Exercise 3: Particle spotting
#
# Use the make_kluster_image function to examine these particles more closely.
#
# What do you notice?

Hmm... so we seem to be able to sort the particles into groups by requiring that they meet certain criteria. We can make use of this to automatically create lists of clusters that represent these groups, and then count them:



In [16]:

    
## A list of big particles!
big_clusters = []

# Loop over the clusters.
for kluster in klusters:
    # Apply the cut on cluster size.
    if kluster.getNumberOfPixels() > 50:
        # If it passes, add it to the list!
        big_clusters.append(kluster)
        
print("%d big clusters found!" % (len(big_clusters)))









    



4 big clusters found!



In [17]:

    
# Exercise 4: Checking our working
#
# Use the DataFrame of clusters to check that there are indeed four "big clusters"
# as defined here.

The other methods to get the cluster properties from a cluster are:

getNumberOfPixels() - the cluster size;
getRadiusUW() - the cluster radius;
getDensityUW() - the cluster density;
getInnerPixelFraction() - the fraction of pixels that are completely surrounded by other pixels;
isEdgeCluster() - returns True if the cluster is on the edge of the frame.

So we can use multiple if statements and these values to sort the clusters in our for loop.



In [18]:

    
# Exercise 5: GRAND CHALLENGE - COUNT THE ALPHAS, BETAS, AND GAMMAS
#
# Can you come up with a for loop containing a series of if statements
# that will sort the clusters into four lists?
#
# We've made a start for you here - good luck!

## A list of single alpha particles.
alphas = []

## A list of beta particles.
betas = []

## A list of gammas.
gammas = []

## A list of "other" particles.
others = []

# Add to the loop here:

for kluster in klusters:
    # Dummy if statement - you can do better!
    if kluster.getNumberOfPixels() >= 1:
        alphas.append(kluster)
        
print("Number of alphas   = % 3d" % (len(alphas)))
print("Number of betas    = % 3d" % (len(betas)))
print("Number of gammas   = % 3d" % (len(gammas)))
print("Number of 'others' = % 3d" % (len(others)))
print("------------------------")
print("TOTAL              = % 3d" % (len(alphas) + len(betas) + len(gammas) + len(others)))









    



Number of alphas   =  34
Number of betas    =   0
Number of gammas   =   0
Number of 'others' =   0
------------------------
TOTAL              =  34

	size	x_uw	y_uw	radius_uw	density_uw	isedgekluster	innerfrac
0	104	146.673077	237.375000	8.550607	0.452782	False	0.451923
1	92	59.771739	250.130435	6.772995	0.638376	False	0.543478
2	73	184.657534	200.698630	5.056112	0.908949	False	0.520548
3	63	132.571429	208.142857	20.033645	0.049966	False	0.000000
4	50	27.620000	178.740000	4.036335	0.976890	False	0.440000
5	48	112.750000	116.000000	3.881044	1.014365	False	0.458333
6	44	163.863636	208.909091	3.911469	0.915426	False	0.431818
7	43	105.651163	49.325581	3.712927	0.992855	False	0.372093
8	39	168.538462	148.102564	3.575981	0.970788	False	0.384615
9	26	96.192308	253.384615	3.683684	0.609900	True	0.230769
10	19	170.315789	198.526316	3.357301	0.536566	False	0.000000
11	17	120.764706	161.058824	5.246848	0.196563	False	0.000000
12	16	25.687500	11.125000	2.977126	0.574613	False	0.062500
13	12	43.333333	213.083333	3.779513	0.267399	False	0.000000
14	12	232.666667	148.416667	3.019612	0.418918	False	0.000000
15	10	40.600000	13.100000	1.941649	0.844323	False	0.000000
16	8	173.625000	235.875000	1.976424	0.651899	False	0.000000
17	7	28.714286	69.714286	2.622509	0.323977	False	0.000000
18	6	62.000000	143.500000	1.118034	1.527887	False	0.000000
19	6	63.166667	159.166667	1.178511	1.375099	False	0.000000
20	6	110.166667	199.833333	1.178511	1.375099	False	0.000000
21	6	48.333333	13.833333	1.863390	0.550039	False	0.000000
22	4	34.500000	101.500000	0.707107	2.546479	False	0.000000
23	4	170.000000	186.500000	1.118034	1.018592	False	0.000000
24	3	0.000000	108.000000	1.000000	0.954930	True	0.000000
25	3	131.666667	243.333333	0.745356	1.718873	False	0.000000
26	3	214.333333	189.333333	0.745356	1.718873	False	0.000000
27	3	134.666667	14.666667	0.745356	1.718873	False	0.000000
28	2	94.000000	17.500000	0.500000	2.546479	False	0.000000
29	2	200.500000	37.000000	0.500000	2.546479	False	0.000000
30	1	16.000000	138.000000	0.000000	0.000000	False	0.000000
31	1	166.000000	91.000000	0.000000	0.000000	False	0.000000
32	1	138.000000	230.000000	0.000000	0.000000	False	0.000000
33	1	97.000000	135.000000	0.000000	0.000000	False	0.000000