Inclusive B-tagging

Authors:

  • Tatiana Likhomanenko (contact)
  • Alexey Rogozhnikov
  • Denis Derkach

Data (from working group):

  • real data $B^{\pm} \to J/\psi K^{\pm}$ (RECO 14), 2012
  • real data $B_d \to J/\psi K^*$ (RECO 14), 2012 (use EPM for asymmetry estimation)

Apply sPlot to obtain sWeight ~ P(B)

Monte Carlo:

  • MC $B^{\pm} \to J/\psi K^{\pm}$ for training
  • MC for cross check
    • $B_d \to J/\psi K_S$
    • $B_d \to J/\psi K^*$

In [2]:
from IPython.display import Image
import pandas

Old tagging

https://github.com/tata-antares/tagging_LHCb/blob/master/old-tagging.ipynb

We first tested the current algorithm (OS taggers: muon, electron, kaon, vertex). The original TMVA method was compared with XGBoost.

  • isotonic symmetric calibration
  • use different train-test divisions to calculate $D^2$
  • compute mean and std
  • details (the same formulas) are given below

Data

Taggers: electron, muon, kaon and vertex


In [3]:
pandas.set_option('display.precision', 4)
pandas.read_csv('img/old-tagging-parts.csv').drop(['AUC, with untag', '$\Delta$ AUC, with untag'], axis=1)


Out[3]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$
0 vtx_xgboost 18.2008 0.0495 0.0499 0.0013 0.9084 0.0234
1 vtx_tmva 18.2008 0.0495 0.0425 0.0008 0.7727 0.0150
2 $K$_xgboost 19.2642 0.0509 0.0520 0.0009 1.0009 0.0173
3 $K$_tmva 19.2642 0.0509 0.0480 0.0011 0.9237 0.0214
4 $e$_xgboost 1.8382 0.0157 0.1674 0.0068 0.3077 0.0127
5 $e$_tmva 1.8382 0.0157 0.1609 0.0068 0.2957 0.0127
6 $\mu$_xgboost 5.7366 0.0278 0.1661 0.0038 0.9527 0.0224
7 $\mu$_tmva 5.7366 0.0278 0.1610 0.0032 0.9234 0.0191

MC

Taggers: electron, muon, kaon and vertex


In [4]:
pandas.set_option('display.precision', 4)
pandas.read_csv('img/old-tagging-parts-MC.csv').drop(['AUC, with untag', '$\Delta$ AUC, with untag'], axis=1)


Out[4]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$
0 vtx_xgboost 9.8330 0.0257 0.1150 8.7538e-05 1.1306 0.0031
1 vtx_tmva 9.8330 0.0257 0.1080 7.1570e-04 1.0618 0.0076
2 $K$_xgboost 17.7597 0.0345 0.1124 5.4507e-05 1.9958 0.0040
3 $K$_tmva 17.7597 0.0345 0.1064 3.0603e-05 1.8905 0.0037
4 $e$_xgboost 2.0230 0.0117 0.1303 2.1707e-04 0.2636 0.0016
5 $e$_tmva 2.0230 0.0117 0.1212 2.3217e-04 0.2452 0.0015
6 $\mu$_xgboost 5.0538 0.0184 0.1639 1.1755e-04 0.8283 0.0031
7 $\mu$_tmva 5.0538 0.0184 0.1597 7.0126e-05 0.8070 0.0030

Taggers combination

We then tested a combination with two calibrations for individual taggers:

  • isotonic regression
  • logistic regression.

The combination was calibrated using isotonic regression.


In [5]:
pandas.set_option('display.precision', 4)
pandas.read_csv('img/old-tagging.csv').drop(['$\Delta$ AUC, with untag'], axis=1)


Out[5]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 iso-xgb_combined 34.4404 0.0681 0.0683 0.0016 2.3531 0.0539 56.9479
1 iso-tmva_combined 34.4405 0.0681 0.0666 0.0019 2.2941 0.0643 56.8452
2 log-xgb_combined 34.4405 0.0681 0.0717 0.0008 2.4710 0.0289 56.9369
3 log-tmva_combined 34.4405 0.0681 0.0672 0.0009 2.3137 0.0319 56.8070

In [6]:
pandas.set_option('display.precision', 4)
pandas.read_csv('img/old-tagging-MC.csv').drop(['$\Delta$ AUC, with untag'], axis=1)


Out[6]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 mu 5.0538 0.0184 0.1581 0.0014 0.7993 0.0075 51.8453
1 vtx 12.2650 0.0287 0.0843 0.0004 1.0344 0.0053 53.2321
2 K 17.7597 0.0345 0.1055 0.0004 1.8743 0.0078 55.1437
3 e 2.0230 0.0117 0.1165 0.0013 0.2357 0.0029 50.6600
4 tmva combination 29.0435 0.0442 0.1092 0.0004 3.1703 0.0129 57.7786
5 mu 5.0538 0.0184 0.1627 0.0016 0.8221 0.0084 51.8434
6 vtx 12.2650 0.0287 0.1000 0.0005 1.2261 0.0065 53.2433
7 K 17.7597 0.0345 0.1120 0.0004 1.9883 0.0082 55.1665
8 e 2.0230 0.0117 0.1278 0.0013 0.2585 0.0031 50.6583
9 xgboost combination 29.0435 0.0442 0.1163 0.0005 3.3789 0.0147 57.8737
10 K* K 17.9499 0.0645 0.1135 0.0000 2.0373 0.0073 55.1643
11 K* e 2.0657 0.0219 0.1259 0.0000 0.2600 0.0028 50.6415
12 K* mu 5.0497 0.0342 0.1623 0.0000 0.8196 0.0056 51.6945
13 K* vtx 12.4130 0.0536 0.1002 0.0000 1.2440 0.0054 53.2801
14 K* combination 29.3082 0.0824 0.1170 0.0000 3.4291 0.0096 57.7637
15 Ks K 17.3906 0.1153 0.1150 0.0000 1.9997 0.0133 54.8647
16 Ks e 1.9698 0.0388 0.1307 0.0000 0.2575 0.0051 50.6315
17 Ks mu 4.9352 0.0614 0.1658 0.0000 0.8185 0.0102 51.8175
18 Ks vtx 11.9824 0.0957 0.1036 0.0000 1.2411 0.0099 52.8970
19 Ks combination 28.3874 0.1473 0.1190 0.0000 3.3776 0.0175 57.3558

Additional information

See details in the previous presentation: https://indico.cern.ch/event/369520/contribution/3/attachments/1178333/1704665/15.10.28.Tagging.pdf

$\epsilon_{tag}$ calculation

$$N (\text{B events, passed selection}) = \sum_{\text{B events, passed selection}} sw_i$$

$$N (\text{all B events}) = \sum_{\text{all B events}} sw_i,$$

where $sw_i$ is the sPlot weight.

$$\epsilon_{tag} = \frac{N (\text{passed selection})} {N (\text{all events})} \qquad \Delta\epsilon_{tag} = \frac{\sqrt{N (\text{passed selection})}} {N (\text{all events})}$$
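The two formulas above can be sketched directly in numpy (a minimal sketch; the function name is illustrative):

```python
import numpy as np

def tagging_efficiency(sweights, passed):
    """epsilon_tag = N(passed) / N(all), with N the sWeighted counts;
    Delta epsilon_tag = sqrt(N(passed)) / N(all)."""
    n_all = np.sum(sweights)
    n_passed = np.sum(sweights[passed])
    return n_passed / n_all, np.sqrt(n_passed) / n_all
```

With unit weights this reduces to the usual counting efficiency.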

Data for training

  • data_sw_passed - tracks/vertices with B-sWeight > 1; used for training
  • data_sw_not_passed - tracks/vertices with B-sWeight <= 1; tagged after training
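A minimal sketch of this split, assuming the sWeight column is called `N_sig_sw` (the actual column name may differ):

```python
import pandas as pd

# toy track table; only the sWeight column matters for the split
data = pd.DataFrame({'N_sig_sw': [1.5, 0.3, 2.0, 0.8],
                     'partPt':   [1.1, 0.7, 2.3, 0.5]})

data_sw_passed = data[data['N_sig_sw'] > 1]        # used for training
data_sw_not_passed = data[data['N_sig_sw'] <= 1]   # tagged after training
```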

Training

Track features (sig = signal, part = tagger track):

  • cos_diff_phi = $\cos(\phi^{sig} - \phi^{\rm part})$
  • diff_pt = $\max(p_T^{part}) - p_T^{part}$
  • partPt= $p_T^{part}$
  • max_PID_e_mu = $\max(PIDNN(e), PIDNN(\mu))^{part}$
  • partP = $p^{part}$
  • nnkrec = Number of reconstructed vertices
  • diff_eta = $(\eta^{sig} - \eta^{\rm part})$
  • EOverP = E/P (from CALO)
  • sum_PID_k_mu = $\sum\limits_{i\in part}(PIDNN(K)+PIDNN(\mu))$
  • ptB = $p_T^{sig}$
  • sum_PID_e_mu = $\sum\limits_{i\in part}(PIDNN(e)+PIDNN(\mu))$
  • sum_PID_k_e = $\sum\limits_{i\in part}(PIDNN(K)+PIDNN(e))$
  • proj = $(\vec{p}^{sig},\vec{p}^{part})$
  • PIDNNe = $PIDNN(e)$
  • PIDNNk = $PIDNN(K)$
  • PIDNNm = $PIDNN(\mu)$
  • phi = $\phi^{part}$
  • IP = impact parameter of the track
  • max_PID_k_mu = $\max(PIDNN(K)+PIDNN(\mu))$
  • IPerr = error of IP
  • IPs = IP/IPerr
  • veloch = dE/dx track charge from the VELO system
  • max_PID_k_e = $\max(PIDNN(K)+PIDNN(e))$
  • diff_phi = $(\phi^{sig} - \phi^{\rm part})$
  • ghostProb = ghost probability
  • IPPU = impact parameter with respect to any other reconstructed primary vertex.
  • eta = pseudorapidity of the track particle
  • partlcs = chi2PerDoF for a track

Vertex Selections

  • All selections are removed except the DaVinci probability cuts

Vertex Features:

  • mult = multiplicity in the event
  • nnkrec = number of reconstructed vertices
  • ptB = signal B transverse momentum
  • vflag = number of tracks in the vertex
  • ipsmean = mean of tracks IPs
  • ptmean = mean pt of the tracks
  • vcharge = charge of the vertex weighted by pt
  • svm = mass of the vertex
  • svp = momentum of the vertex
  • BDphiDir = angle between the B and the vertex
  • svtau = lifetime of the vertex
  • docamax = mean DOCA of the tracks

Classifier

Define the B sign from the track/vertex sign (i.e., determine whether they have the same or opposite signs).

target = signB * signTrack/signVertex > 0

  • classifier returns
$$P(\text{track/vertex same sign as B} \mid \text{B sign}) = P(\text{B same sign as track/vertex} \mid \text{track/vertex sign})$$

Calibration of $P(\text{track/vertex same sign as B| B sign})$

  • use 2-folding logistic/isotonic calibration for the track/vertex classifier's predictions
  • compare with isotonic/logistic calibration
  • compare with no calibration (bad: predictions are shifted)
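The 2-folding scheme can be sketched with scikit-learn's IsotonicRegression (a sketch, not the analysis code; the symmetrisation step is omitted):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def two_fold_calibrate(probs, labels, random_state=42):
    """2-fold isotonic calibration: each half is calibrated by a model
    fitted on the other half, so no event calibrates itself."""
    rng = np.random.RandomState(random_state)
    fold = rng.randint(2, size=len(probs))
    calibrated = np.empty_like(probs, dtype=float)
    for k in (0, 1):
        iso = IsotonicRegression(y_min=0, y_max=1, out_of_bounds='clip')
        iso.fit(probs[fold != k], labels[fold != k])
        calibrated[fold == k] = iso.predict(probs[fold == k])
    return calibrated
```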

Computation of $p(B^+)$ using $P(\text{track/vertex same sign as B| B sign})$

Compute $p(B^+)$ using this probabilistic model representation (similar to the previous tagging combination):

$$ \frac{P(B^+)}{P(B^-)} = \prod_{track,\, vertex} \frac{P(\text{track/vertex} \mid B^+)} {P(\text{track/vertex} \mid B^-)} = \alpha \qquad \Rightarrow \qquad P(B^+) = \frac {\alpha}{1+\alpha}, \qquad \qquad [1] $$

where

$$ \frac{P(B^+)}{P(B^-)} = \prod_{track,\, vertex} \begin{cases} \frac{P(\text{track/vertex same sign as } B \mid B)}{P(\text{track/vertex opposite sign to } B \mid B)}, \text{if track/vertex}^+ \\ \\ \frac{P(\text{track/vertex opposite sign to } B \mid B)}{P(\text{track/vertex same sign as } B \mid B)}, \text{if track/vertex}^- \end{cases} $$

$$p_{mistag} = \min(p(B^+), p(B^-))$$
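Formula [1] can be sketched as follows, combining per-track classifier outputs via log-likelihood ratios (hypothetical helper; track charges are ±1):

```python
import numpy as np

def combine_p_bplus(p_same_sign, charges):
    """P(B+) from per-track P(track same sign as B) via formula [1].
    A positive track contributes p/(1-p), a negative one (1-p)/p;
    summing log-ratios keeps the product numerically stable."""
    p = np.clip(p_same_sign, 1e-6, 1 - 1e-6)
    log_alpha = np.sum(charges * (np.log(p) - np.log(1 - p)))
    alpha = np.exp(log_alpha)
    p_bplus = alpha / (1 + alpha)
    return p_bplus, min(p_bplus, 1 - p_bplus)  # P(B+), p_mistag
```

A single positive track with p = 0.8 gives alpha = 4 and hence P(B+) = 0.8; two tracks of opposite charge with equal p cancel to 0.5.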

Intermediate estimation $ < D^2 > $ for tracking

Do calibration of $p(B^+)$ and compute $ < D^2 > $ :

  • use isotonic calibration (a generalization of fitting in bins): a piecewise-constant monotonic function
  • randomly divide events into two parts (1 - train, 2 - calibrate)
  • fit the symmetric isotonic regression on the train part and compute $ < D^2 > $ on the test part
  • take the mean and std of the computed $ < D^2 > $ values

$ < D^2 > $ formula for a sample: $$ < D^2 > = \frac{\sum_i [2(p^{mistag}_i - 0.5)]^2 \, sw_i}{\sum_i sw_i} = \frac{\sum_i [2(p_i(B^+) - 0.5)]^2 \, sw_i}{\sum_i sw_i}$$

The formula is symmetric under $p \to 1 - p$, so it is not necessary to compute the mistag probability explicitly.
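The weighted $ < D^2 > $ is a one-liner (sketch; function name is illustrative):

```python
import numpy as np

def mean_d2(p_bplus, sweights):
    """<D^2> = sum_i [2(p_i - 0.5)]^2 sw_i / sum_i sw_i.
    Symmetric under p -> 1 - p, so p(B+) works in place of the mistag."""
    d2 = (2.0 * (np.asarray(p_bplus) - 0.5)) ** 2
    return np.sum(d2 * sweights) / np.sum(sweights)
```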

Preliminary estimation

$\epsilon$ calculation

$$\epsilon = < D^2 > \cdot \, \epsilon_{tag}$$

$$\Delta \epsilon = \epsilon \sqrt{ \left(\frac{\Delta < D^2 > }{ < D^2 > }\right)^2 + \left(\frac{\Delta \epsilon_{tag} }{\epsilon_{tag}} \right)^2 }$$
  • Combine track-based and vertex-based tagging using formula [1]
  • symmetric isotonic calibration on random subsample with $D^2$ calculation
  • take mean and std for computed $ < D^2 > $
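The error propagation for $\epsilon$ can be sketched as (relative uncertainties added in quadrature):

```python
import numpy as np

def effective_efficiency(d2, delta_d2, eps_tag, delta_eps_tag):
    """eps = <D^2> * eps_tag; relative errors added in quadrature."""
    eps = d2 * eps_tag
    delta_eps = eps * np.sqrt((delta_d2 / d2) ** 2
                              + (delta_eps_tag / eps_tag) ** 2)
    return eps, delta_eps
```

For example, the first combination row above ($D^2 = 0.0683 \pm 0.0016$, $\epsilon_{tag} = 34.44 \pm 0.068$) gives $\epsilon \approx 2.352 \pm 0.055$, matching the table up to rounding of the inputs.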

Full estimation of systematic error

  • set random state
  • train the best model (track and vertex taggers with 2-folding with fixed random state)
  • do calibration for track and vertex taggers with 2-folding with fixed random state
  • compute $p(B^+)$
  • do calibration with isotonic 2-folding (random state is fixed)
  • compute $ < D^2 > $

This procedure is repeated (from scratch) for 30 different random states; we then compute the mean and std of these 30 values of $ < D^2 > $.

Check calibration of mistag

  • x axis: predicted mistag probability $$p_{mistag} = \min(p(B^+), p(B^-))$$
  • y axis: true mistag probability (computed per bin) $$p_{mistag} = \frac{N_{wrong}} {N_{wrong} + N_{right}} \qquad \Delta p_{mistag} = \frac{\sqrt{N_{wrong} N_{right}}} {(N_{wrong} + N_{right})^{1.5}}$$
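The per-bin quantities can be sketched as:

```python
import numpy as np

def true_mistag_in_bin(n_wrong, n_right):
    """True mistag and its binomial uncertainty for one bin."""
    n = n_wrong + n_right
    return n_wrong / n, np.sqrt(n_wrong * n_right) / n ** 1.5
```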

Stability of calibration

Add random noise after isotonic calibration of $p(B^+)$ for stability:

$$ 0.001 \cdot \mathcal{N}(0, 1)$$
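Isotonic calibration outputs a piecewise-constant function, so many events share exactly the same calibrated value; a tiny Gaussian jitter breaks these ties (toy values):

```python
import numpy as np

rng = np.random.RandomState(0)
p_calibrated = np.array([0.3, 0.3, 0.3, 0.7, 0.7])  # toy isotonic output
# 0.001 * N(0, 1) smearing for stability of the subsequent binning
p_smeared = p_calibrated + 0.001 * rng.normal(size=p_calibrated.size)
```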

Inclusive tagging (NEW)

  • Check "OS" and "SS" regions separately (to check that tagging includes "SS" and "OS")
  • Check dependences on lifetime, lifetime error, number of tracks, momentum, transverse momentum, mass
  • Asymmetry of charges in events: understanding of high tagging quality or what information we use

Tracking "OS" tagging

https://github.com/tata-antares/tagging_LHCb/blob/master/track-based-tagging-OS.ipynb

Take all possible tracks for all B-events.

Apply:

  • (IPs > 3) & ((abs(diff_eta) > 0.6) | (abs(diff_phi) > 0.825)) - geometrical cuts
  • (PIDNNp < 0.5) & (PIDNNpi < 0.5) & (ghostProb < 0.4)
  • ((PIDNNk > trk) | (PIDNNm > trm) | (PIDNNe > tre)), trk=0., trm=0., tre=0.
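The cuts above map directly onto boolean masks on the track table (toy table; thresholds trk = trm = tre = 0 as above):

```python
import pandas as pd

# toy track table with only the columns used by the OS selection
tracks = pd.DataFrame({
    'IPs': [5.0, 1.0], 'diff_eta': [0.8, 0.1], 'diff_phi': [0.1, 0.2],
    'PIDNNp': [0.2, 0.6], 'PIDNNpi': [0.1, 0.2], 'ghostProb': [0.1, 0.1],
    'PIDNNk': [0.5, 0.5], 'PIDNNm': [0.0, 0.0], 'PIDNNe': [0.0, 0.0],
})
trk, trm, tre = 0.0, 0.0, 0.0

sel = ((tracks.IPs > 3)
       & ((tracks.diff_eta.abs() > 0.6) | (tracks.diff_phi.abs() > 0.825))
       & (tracks.PIDNNp < 0.5) & (tracks.PIDNNpi < 0.5)
       & (tracks.ghostProb < 0.4)
       & ((tracks.PIDNNk > trk) | (tracks.PIDNNm > trm) | (tracks.PIDNNe > tre)))
selected = tracks[sel]
```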

B mass before sWeight cut

B mass after sWeight cut

Number of tracks in event

PIDNN distributions after selection


In [4]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/eff_OS.csv').drop(['$\Delta$ AUC, with untag'], axis=1)


Out[4]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 Inclusive tagging, PID less 86.70309 0.10803 0.02494 0.00033 2.16214 0.02875 57.20222

Check calibration of mistag

before calibration

Symmetric isotonic calibration + random noise * 0.001 (noise for stability of bins)

Tracking "SS" tagging

https://github.com/tata-antares/tagging_LHCb/blob/master/track-based-tagging-SS.ipynb

Take all possible tracks for all B-events.

Apply:

  • (IPs < 3) & (abs(diff_eta) < 0.6) & (abs(diff_phi) < 0.825) & (ghostProb < 0.4)
  • ((PIDNNk > {trk}) | (PIDNNm > {trm}) | (PIDNNe > {tre}) | (PIDNNpi > {trpi}) | (PIDNNp > {trp})), trk=0, trm=0, tre=0, trpi=0, trp=0

B mass before sWeight cut

B mass after sWeight cut

PIDNN distributions after selection


In [5]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/eff_tracking_SS.csv').drop(['$\Delta$ AUC, with untag'], axis=1)


Out[5]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 Inclusive tagging, PID less 72.39764 0.09872 0.03077 0.00035 2.22756 0.02573 57.419

Check calibration of mistag

before calibration

Symmetric isotonic calibration + random noise * 0.001 (noise for stability of bins)

Tracking inclusive tagging

https://github.com/tata-antares/tagging_LHCb/blob/master/track-based-tagging-PID-less.ipynb

Take all possible tracks for all B-events.

Apply:

  • (ghostProb < 0.4)
  • ((PIDNNk > {trk}) | (PIDNNm > {trm}) | (PIDNNe > {tre}) | (PIDNNpi > {trpi}) | (PIDNNp > {trp})), trk=0, trm=0, tre=0, trpi=0, trp=0

B mass before sWeight cut

B mass after sWeight cut

Number of tracks in event

PIDNN distributions after selection

Dependence on PIDNN cuts

  • (PIDNNp < 0.6) & (PIDNNpi < 0.6) & (ghostProb < 0.4)
  • ( (PIDNNk > 0.7) | (PIDNNm > 0.4) | (PIDNNe > 0.6) )

In [6]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/new-tagging.csv').drop(['$\Delta$ AUC, with untag'], axis=1)


Out[6]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 Inclusive tagging 77.78995 0.10233 0.03449 0.00046 2.68331 0.03576 57.92576
  • (PIDNNp < 0.6) & (PIDNNpi < 0.6) & (ghostProb < 0.4)
  • ( (PIDNNk > 0.1) | (PIDNNm > 0.1) | (PIDNNe > 0.1) )

In [7]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/new-tagging_relax1.csv')


Out[7]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 Inclusive tagging 97.0983 0.1143 0.0384 0.0003 3.7256 0.0306 60.5811
  • (PIDNNpi < 0.6) & (ghostProb < 0.4)
  • ( (PIDNNk > 0.) | (PIDNNm > 0.) | (PIDNNe > 0.) )

In [8]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/new-tagging_relax2.csv')


Out[8]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 Inclusive tagging 99.208 0.1156 0.0408 0.0004 4.05 0.0356 61.2362

In [9]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/new-tagging-PID-less.csv').drop(['$\Delta$ AUC, with untag'], axis=1)


Out[9]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 Inclusive tagging, PID less 99.98595 0.11601 0.05873 0.00043 5.87239 0.04359 64.08899

In [10]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/new-tagging_full_tracks.csv')


Out[10]:
name $\epsilon_{tag}, \%$ $\Delta \epsilon_{tag}, \%$ $D^2$ $\Delta D^2$ $\epsilon, \%$ $\Delta \epsilon, \%$ AUC, with untag
0 Inclusive tagging 99.98595 0.11601 0.06303 0.00051 6.30254 0.05125 64.43919

Checks on track: OS+SS, OS vertex model

Check calibration of mistag

for signal (B-like events)

for background

before calibration

Symmetric isotonic calibration + random noise * 0.001 (noise for stability of bins)

Tagging power dependence on ...

  • For B mass, B momentum, B transverse momentum, and B lifetime, use the sidebands as background and the peak region as signal:
    • mask_signal = ((Bmass > 5.27) & (Bmass < 5.3))
    • mask_bck = ((Bmass < 5.25) | (Bmass > 5.32))
  • For the B lifetime error and the number of tracks, use sWeights

Procedure:

  • divide variable into 5 percentile bins
  • for each bin plot mistag vs true mistag

Signal dependence

Background dependence

Why is the effective efficiency so high for this model (combining track probabilities to obtain the B probability)?

Let's look at the following characteristic of the event:

$$ -\sum_{track} charge_{track}$$

It seems that for a $B^+$ event it should be around $+1$ plus a constant (because we exclude the signal part)
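This characteristic is just the sign-flipped sum of track charges per event (sketch; column names `event_id` and `signTrack` are assumed):

```python
import pandas as pd

# toy table: non-signal tracks of two events
tracks = pd.DataFrame({'event_id':  [0, 0, 0, 1, 1],
                       'signTrack': [1, 1, -1, -1, -1]})
# - sum of track charges, computed per event
charge_sum = -tracks.groupby('event_id')['signTrack'].sum()
```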

Regions:

  • 'OS' region: (IP > 3) & ((abs(diff_eta) > 0.6) | (abs(diff_phi) > 0.825))
  • 'SS' region: (IP < 3) & (abs(diff_eta) < 0.6) & (abs(diff_phi) < 0.825)
  • full data

"OS" data

"SS" data

Full sample

Add signal track

"OS" data

"SS" data

Full sample

Means of distributions (with signal track and without it)


In [11]:
pandas.set_option('display.precision', 5)
pandas.read_csv('img/track_signs_assymetry_means.csv', index_col='name')


Out[11]:
$B^+$ $B^+$, with signal part $B^-$ $B^-$, with signal part ROC AUC ROC AUC, with signal part
name
full 0.44341 -0.55659 -0.57216 0.42784 0.57158 0.56915
OS 0.11117 -0.88883 -0.15727 0.84273 0.52953 0.68460
SS 0.17597 -0.82403 -0.17810 0.82190 0.56770 0.77769

This ROC AUC score is similar to that of the current tagging implementation.

Charge asymmetry checks on the MC sample

"OS" sample

"SS" sample

Full sample

Means of distributions for MC and data (with signal track and without it)


In [12]:
pandas.set_option('display.precision', 5)
pandas.concat([pandas.read_csv('img/track_signs_assymetry_means.csv', index_col='name'),
               pandas.read_csv('img/track_signs_assymetry_means_mc.csv', index_col='name')])


Out[12]:
$B^+$ $B^+$, with signal part $B^-$ $B^-$, with signal part ROC AUC ROC AUC, with signal part
name
full 0.44341 -0.55659 -0.57216 0.42784 0.57158 0.56915
OS 0.11117 -0.88883 -0.15727 0.84273 0.52953 0.68460
SS 0.17597 -0.82403 -0.17810 0.82190 0.56770 0.77769
full_mc 0.31778 -0.68222 -0.77006 0.22994 0.58490 0.57159
OS_mc 0.04728 -0.95272 -0.28497 0.71503 0.54030 0.69439
SS_mc 0.15520 -0.84480 -0.18414 0.81586 0.56807 0.78832

The algorithm uses this information when combining track probabilities:

$$ \frac{P(B^+)}{P(B^-)} = \prod_{track,\, vertex} \begin{cases} \frac{P(\text{track/vertex same sign as } B \mid B)}{P(\text{track/vertex opposite sign to } B \mid B)}, \text{if track/vertex}^+ \\ \\ \frac{P(\text{track/vertex opposite sign to } B \mid B)}{P(\text{track/vertex same sign as } B \mid B)}, \text{if track/vertex}^- \end{cases} $$
  • Can we indeed use this information?
  • What is the source of this asymmetry?

The current tagging algorithm also implicitly uses this information!

The asymmetry plays a discriminative role even if we choose a random track in the event!

Random is not random!

Checked

  • modify the loss function during vertex training to use the track tagging output as a baseline (try to correct the track predictions using vertex information): doesn't help, but perhaps all vertices per event are needed
  • different calibrations: bins, logistic, isotonic
  • normalize the numbers of positive and negative tracks for $B^+$ and $B^-$ separately to remove the track charge asymmetry; the quality drops:

    • ROC AUC: 0.62
    • $\epsilon$: 4.5
  • use the track sign as a feature during training (bad quality):

    • ROC AUC: 0.611
    • $\epsilon$: 3.6
  • check, as a discriminative variable, the sum of track charges weighted by $p_T$:
$$ -\frac{\sum_{track} charge_{track} Pt_{track}} {\sum_{track} Pt_{track}}$$

    • ROC AUC for all regions ("OS", "SS", full sample) < 0.5006
    • doesn't discriminate
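The $p_T$-weighted variant checked above is (sketch):

```python
import numpy as np

def pt_weighted_charge(charges, pts):
    """-sum(q_i * pT_i) / sum(pT_i) over the tracks of one event."""
    charges = np.asarray(charges, dtype=float)
    pts = np.asarray(pts, dtype=float)
    return -np.sum(charges * pts) / np.sum(pts)
```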

TODO

  • check the efficiency of inclusive tagging on other decays (please send us your tuples with flavour tagging checker info)
  • understand the asymmetry of the sum of charges (maybe somebody understands it already?)
  • get all vertices from DaVinci (to check whether the vertex is indeed needed to improve the tagger)