Learning-To-Rank

Datasets

SVM-Rank format

3 qid:1 1:1 2:1 3:0 4:0.2 5:0 # 1A
2 qid:1 1:0 2:0 3:1 4:0.1 5:1 # 1B 
1 qid:1 1:0 2:1 3:0 4:0.4 5:0 # 1C
1 qid:1 1:0 2:0 3:1 4:0.3 5:0 # 1D   
2 qid:2 1:1 2:0 3:1 4:0.4 5:0 # 2B

Metrics

DCG

$$DCG@T=\sum_{i=1}^{T}\frac{2^{rel_{i}} - 1}{log_{2}(i+1)}$$

NDCG

$$NDCG@T=\frac{DCG@T}{maxDCG@T}$$

ERR

$$R(rel) = \frac{2^{rel}-1}{2^{rel_{max}}}$$$$ERR@T=\sum_{r=1}^{T}\frac{1}{r}P(stop)=\sum_{r=1}^{T}\frac{1}{r}P_{r}\prod_{i=1}^{r-1}(1-P_{i})$$

Methods

Pointwise

Pairwise

RankNet

$$P_{ij}=P(U_{i}\prec U_{j})=\frac{1}{1+e^{-\sigma (s_{i}-s_{j})}}$$

$$C_{ij}=-\bar{P}_{ij}logP_{ij}-(1-\bar{P}_{ij})log(1-P_{ij})$$$$\bar{P}_{ij}=\frac{1}{2}(1+S_{ij})$$$$C_{ij}=\frac{1}{2}(1-S_{ij})\sigma(s_{i}-s_{j})+log(1+e^{-\sigma(s_{i}-s_{j})})$$$$\frac{\partial C}{\partial s_{i}}=\sigma(\frac{1}{2}(1-S_{ij})-\frac{1}{1+e^{\sigma(s_{i}-s_{j})}})=-\frac{\partial C}{\partial s_{j}}=\lambda _{ij}$$

Listwise

LambdaRank

$$\lambda _{ij}=\frac{-\sigma}{1+e^{\sigma(s_{i}-s_{j})}}|\Delta_{NDCG}|$$$$\lambda_{i}=\sum_{j: U_{i}\prec U_{j}}\lambda_{ij}-\sum_{j: U_{j}\prec U_{i}}\lambda_{ij}$$

LambdaMART

MART - Multiple Additive Regression Tree (gradient boosted regression tree) $$F_{N}(x)=\sum_{i=1}^{N}\alpha _{i}f_{i}(x)$$

Libraries