Osaka:
RankSVM: F1 (0.678, 0.037), pairsF1 (0.425, 0.058), Tau (0.646, 0.040), perfectF1 15/46, perfectPairsF1 13/46
RankSVM-New: F1 (0.656, 0.036), pairsF1 (0.389, 0.058), Tau (0.625, 0.039), perfectF1 13/46, perfectPairsF1 12/46
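For reference, the F1 and pairsF1 metrics reported throughout can be computed as sketched below. This is a minimal sketch, assuming F1 is taken over the sets of POIs and pairsF1 over ordered POI pairs; the function names are chosen for illustration only.
```python
def f1(y_pred, y_true):
    """F1 over the sets of POIs (visiting order ignored)."""
    pred, true = set(y_pred), set(y_true)
    n_match = len(pred & true)
    if n_match == 0:
        return 0.0
    precision, recall = n_match / len(pred), n_match / len(true)
    return 2 * precision * recall / (precision + recall)

def pairs_f1(y_pred, y_true):
    """F1 over ordered POI pairs, so the visiting order matters."""
    def ordered_pairs(seq):
        return {(seq[i], seq[j]) for i in range(len(seq)) for j in range(i + 1, len(seq))}
    pred, true = ordered_pairs(y_pred), ordered_pairs(y_true)
    if not pred or not true:
        return 0.0
    n_match = len(pred & true)
    if n_match == 0:
        return 0.0
    precision, recall = n_match / len(pred), n_match / len(true)
    return 2 * precision * recall / (precision + recall)

# perfectF1 / perfectPairsF1 count the queries whose F1 / pairsF1 equals 1.
```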
Multi-label SSVM with top-5 recommendation and new features (SSVM-T5-NEW) performs better than multi-label SSVM with top-5 recommendation without new features (SSVM-T5-OLD).
Run experiments on the datasets using both methods.
Glasgow:
SSVM-T5-OLD: F1 (0.856, 0.024), pairsF1 (0.700, 0.043), Tau (0.828, 0.029), perfectF1 39/64, perfectPairsF1 33/64
SSVM-T5-NEW: F1 (0.835, 0.025), pairsF1 (0.667, 0.044), Tau (0.809, 0.028), perfectF1 35/64, perfectPairsF1 30/64
SSVM-T5-NEW-NOLOG: F1 (0.841, 0.025), pairsF1 (0.688, 0.045), Tau (0.816, 0.029), perfectF1 37/64, perfectPairsF1 35/64
Osaka:
SSVM-T5-OLD: F1 (0.755, 0.037), pairsF1 (0.566, 0.059), Tau (0.728, 0.040), perfectF1 21/46, perfectPairsF1 20/46
SSVM-T5-NEW: F1 (0.729, 0.040), pairsF1 (0.543, 0.062), Tau (0.700, 0.044), perfectF1 21/46, perfectPairsF1 20/46
SSVM-T5-NEW-NOLOG: F1 (0.744, 0.038), pairsF1 (0.560, 0.060), Tau (0.714, 0.042), perfectF1 21/46, perfectPairsF1 20/46
The conjecture is probably FALSE.
Taking the log of POI popularity, the total number of visits and the visit duration when constructing POI features does NOT help RankSVM.
Run RankSVM with (+Log-RankSVM) and without (-Log-RankSVM) taking the log of POI popularity, the total number of visits and the visit duration when constructing POI features.
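A minimal sketch of the +Log / -Log feature construction; the exact feature set is in the notebooks, and using log1p (to handle zero counts) is an assumption.
```python
import numpy as np

def poi_count_features(popularity, n_visits, visit_duration, take_log=True):
    """Heavy-tailed count features of a POI; +Log-RankSVM log-transforms them,
    -Log-RankSVM uses the raw values."""
    f = np.array([popularity, n_visits, visit_duration], dtype=float)
    return np.log1p(f) if take_log else f
```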
Glasgow:
-Log-RankSVM: F1 (0.754, 0.027), pairsF1 (0.538, 0.047), Tau (0.726, 0.030), perfectF1 24/64, perfectPairsF1 24/64
+Log-RankSVM: F1 (0.767, 0.026), pairsF1 (0.553, 0.047), Tau (0.739, 0.030), perfectF1 25/64, perfectPairsF1 25/64
Osaka:
-Log-RankSVM: F1 (0.633, 0.033), pairsF1 (0.343, 0.055), Tau (0.603, 0.036), perfectF1 10/46, perfectPairsF1 10/46
+Log-RankSVM: F1 (0.678, 0.037), pairsF1 (0.433, 0.059), Tau (0.647, 0.040), perfectF1 15/46, perfectPairsF1 14/46
The conjecture is FALSE.
The performance of RankSVM does NOT change with or without the trajectory length in the features.
Run RankSVM with (+TrajLen) and without (-TrajLen) the trajectory length in the features.
Glasgow:
-TrajLen: F1 (0.764, 0.027), pairsF1 (0.550, 0.047), Tau (0.736, 0.030), perfectF1: 25/64, perfectPairsF1: 25/64
+TrajLen: F1 (0.767, 0.026), pairsF1 (0.553, 0.047), Tau (0.739, 0.030), perfectF1: 25/64, perfectPairsF1: 25/64
Toronto:
-TrajLen: F1 (0.759, 0.021), pairsF1 (0.512, 0.036), Tau (0.721, 0.024), perfectF1: 32/99, perfectPairsF1: 29/99
+TrajLen: F1 (0.759, 0.021), pairsF1 (0.512, 0.036), Tau (0.721, 0.024), perfectF1: 32/99, perfectPairsF1: 29/99
Osaka:
-TrajLen: F1 (0.678, 0.037), pairsF1 (0.425, 0.058), Tau (0.646, 0.040), perfectF1: 15/46, perfectPairsF1: 13/46
+TrajLen: F1 (0.678, 0.037), pairsF1 (0.433, 0.059), Tau (0.647, 0.040), perfectF1: 15/46, perfectPairsF1: 14/46
Trajectory length as a feature helps a tiny bit on some datasets, so keep it as a feature.
The performance of multi-label SSVM will improve a lot if the top-$k$ trajectories are recommended, given that performance is evaluated as $\max_{\bar{\mathbf{y}}} \max_{\mathbf{y}} \text{metric}(\bar{\mathbf{y}}, \mathbf{y})$, where $\bar{\mathbf{y}}$ ranges over the top-$k$ recommended trajectories and $\mathbf{y}$ over the ground-truth trajectories of the query.
The multi-label SSVM is trained using the list Viterbi algorithm with sub-tours, and the top-$k$ predictions ($k=1$: SSVM-ML-T1, $k=5$: SSVM-ML-T5) are computed using the list Viterbi algorithm without sub-tours.
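A minimal sketch of this top-$k$ evaluation, assuming `metric` is one of the scoring functions above (e.g. F1) and that a query may have several ground-truth trajectories.
```python
def best_of_topk(metric, topk_trajectories, true_trajectories):
    """Best metric value over all pairs of a recommended (top-k) trajectory
    and a ground-truth trajectory for the same query."""
    return max(metric(y_hat, y)
               for y_hat in topk_trajectories
               for y in true_trajectories)
```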
Glasgow:
RankSVM: F1 (0.764, 0.027), pairsF1 (0.550, 0.047), Tau (0.736, 0.030), perfectF1 25/64, perfectPairsF1 25/64
SSVM-SL-T5: F1 (0.739, 0.030), pairsF1 (0.516, 0.049), Tau (0.702, 0.034), perfectF1 26/64, perfectPairsF1 23/64
SSVM-ML-T1: F1 (0.770, 0.030), pairsF1 (0.576, 0.050), Tau (0.740, 0.034), perfectF1 30/64, perfectPairsF1 26/64
SSVM-ML-T5: F1 (0.856, 0.024), pairsF1 (0.700, 0.043), Tau (0.828, 0.029), perfectF1 39/64, perfectPairsF1 33/64
Toronto:
RankSVM: F1 (0.759, 0.021), pairsF1 (0.512, 0.036), Tau (0.721, 0.024), perfectF1 32/99, perfectPairsF1 29/99
SSVM-SL-T5: F1 (0.720, 0.024), pairsF1 (0.499, 0.038), Tau (0.670, 0.028), perfectF1 37/99, perfectPairsF1 33/99
SSVM-ML-T1: F1 (0.742, 0.022), pairsF1 (0.507, 0.037), Tau (0.703, 0.026), perfectF1 34/99, perfectPairsF1 31/99
SSVM-ML-T5: F1 (0.794, 0.021), pairsF1 (0.583, 0.036), Tau (0.755, 0.024), perfectF1 42/99, perfectPairsF1 37/99
Osaka:
RankSVM: F1 (0.678, 0.037), pairsF1 (0.425, 0.058), Tau (0.646, 0.040), perfectF1 15/46, perfectPairsF1 13/46
SSVM-SL-T5: F1 (0.746, 0.037), pairsF1 (0.543, 0.058), Tau (0.722, 0.040), perfectF1 20/46, perfectPairsF1 18/46
SSVM-ML-T1: F1 (0.669, 0.036), pairsF1 (0.414, 0.057), Tau (0.637, 0.040), perfectF1 13/46, perfectPairsF1 13/46
SSVM-ML-T5: F1 (0.755, 0.037), pairsF1 (0.566, 0.059), Tau (0.728, 0.040), perfectF1 21/46, perfectPairsF1 20/46
Edinburgh:
RankSVM: F1 (0.725, 0.017), pairsF1 (0.451, 0.026), Tau (0.663, 0.020), perfectF1 34/147, perfectPairsF1 28/147
SSVM-SL-T5: F1 (0.653, 0.019), pairsF1 (0.391, 0.027), Tau (0.576, 0.023), perfectF1 34/147, perfectPairsF1 27/147
SSVM-ML-T5: F1 (0.689, 0.018), pairsF1 (0.426, 0.027), Tau (0.618, 0.022), perfectF1 39/147, perfectPairsF1 30/147
The performance of multi-label SSVM is similar with (+Derived) or without (-Derived) the derived (node) features.
Evaluate multi-label SSVM (leave-one-out cross validation) with and without the derived (node) features turned on. Loss augmented inference is the list Viterbi algorithm (allow sub-tours); prediction inference is the list Viterbi algorithm (do NOT allow sub-tours).
Glasgow:
-Derived: F1 (0.665, 0.030), pairsF1 (0.413, 0.050), Tau (0.628, 0.034), perfectF1: 19/64, perfectPairsF1: 18/64
+Derived: F1 (0.770, 0.030), pairsF1 (0.576, 0.050), Tau (0.740, 0.034), perfectF1: 30/64, perfectPairsF1: 26/64
Osaka:
-Derived: F1 (0.562, 0.035), pairsF1 (0.230, 0.050), Tau (0.525, 0.037), perfectF1: 8/46, perfectPairsF1: 6/46
+Derived: F1 (0.670, 0.040), pairsF1 (0.433, 0.060), Tau (0.637, 0.044), perfectF1: 15/46, perfectPairsF1: 14/46
The conjecture is false.
The derived features help RankSVM a lot.
Unary features:
Derived features:
Evaluate RankSVM (leave-one-out cross validation) with and without turning on the derived features.
Glasgow:
Unary+Derived: F1 (0.764, 0.027), pairsF1 (0.550, 0.047), Tau (0.736, 0.030)
Unary: F1 (0.659, 0.030), pairsF1 (0.388, 0.049), Tau (0.624, 0.033)
Toronto:
Unary+Derived: F1 (0.759, 0.021), pairsF1 (0.512, 0.036), Tau (0.721, 0.024)
Unary: F1 (0.676, 0.021), pairsF1 (0.370, 0.033), Tau (0.629, 0.023)
Osaka:
Unary+Derived: F1 (0.678, 0.037), pairsF1 (0.425, 0.058), Tau (0.646, 0.040)
Unary: F1 (0.622, 0.033), pairsF1 (0.307, 0.053), Tau (0.590, 0.035)
The conjecture is true.
We would like to formulate a multi-user multi-label SSVM as
\begin{align}
\min_{\mathbf{w},\xi\ge0} ~& \frac{1}{2} \sum_{i=1}^m \|\mathbf{w}_i - \bar{\mathbf{w}}\|^2 + C \sum_i \sum_j \sum_k \xi_{ijk} \\
s.t. ~& \langle \mathbf{w}_i, \Psi(\mathbf{x}_{ij}, \mathbf{y}_{ijk}) \rangle - \langle \mathbf{w}_i, \Psi(\mathbf{x}_{ij}, \bar{\mathbf{y}}) \rangle \ge \Delta(\mathbf{y}_{ijk}, \bar{\mathbf{y}}) - \xi_{ijk},~ \bar{\mathbf{y}} \in \mathcal{Y} \setminus \{\mathbf{y}_{ijk}\}_{k=1}^{m_{ij}},~ \forall k,~\forall j,~\forall i.
\end{align}
where $i$ indexes users, $j$ indexes queries of the $i$-th user and $k$ indexes the (multiple) labels of the $j$-th query of the $i$-th user.
\begin{align}
& A x = b
\end{align}
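For concreteness, a sketch of evaluating the objective of this formulation for given per-user weights and slacks; the array layout is assumed for illustration and is not the notebook implementation.
```python
import numpy as np

def multiuser_ssvm_objective(W, w_bar, xi, C):
    """Objective of the multi-user multi-label SSVM formulation above.

    W:     (m, d) array, one weight vector w_i per user.
    w_bar: (d,) shared weight vector.
    xi:    flat array of slack variables xi_{ijk}.
    C:     regularisation constant.
    """
    return 0.5 * np.sum((W - w_bar) ** 2) + C * np.sum(xi)
```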
When the multi-label SSVM is trained using the greedy algorithm, it will perform better than when it is trained using the list Viterbi algorithm (allow sub-tours); the prediction inference is the list Viterbi algorithm (do NOT allow sub-tours) in both cases.
Train the multi-label SSVM using the list Viterbi algorithm with sub-tours (SSVM-LV-LV) and using the greedy algorithm (SSVM-GR-LV) for the loss augmented inference; the prediction inference is the list Viterbi algorithm without sub-tours in both cases.
Glasgow:
SSVM-GR-LV: F1 (0.728, 0.029), pairsF1 (0.496, 0.048), Tau (0.696, 0.032), perfectF1 22/64, perfectPairsF1 21/64
SSVM-LV-LV: F1 (0.770, 0.030), pairsF1 (0.576, 0.050), Tau (0.740, 0.034), perfectF1 30/64, perfectPairsF1 26/64
MEMM: F1 (0.699, 0.028), pairsF1 (0.436, 0.047), Tau (0.668, 0.031), perfectF1 19/64, perfectPairsF1 18/64
Toronto:
SSVM-GR-LV: F1 (0.752, 0.023), pairsF1 (0.520, 0.037), Tau (0.712, 0.026), perfectF1 37/99, perfectPairsF1 32/99
SSVM-LV-LV: F1 (0.742, 0.022), pairsF1 (0.507, 0.037), Tau (0.703, 0.026), perfectF1 34/99, perfectPairsF1 31/99
Osaka:
SSVM-GR-LV: F1 (0.609, 0.041), pairsF1 (0.348, 0.061), Tau (0.576, 0.045), perfectF1 13/46, perfectPairsF1 12/46
SSVM-LV-LV: F1 (0.670, 0.040), pairsF1 (0.433, 0.060), Tau (0.637, 0.044), perfectF1 15/46, perfectPairsF1 14/46
MEMM: F1 (0.652, 0.035), pairsF1 (0.387, 0.057), Tau (0.624, 0.038), perfectF1 12/46, perfectPairsF1 12/46
There is no clear conclusion as to whether the conjecture is true.
Training the multi-label SSVM using the list Viterbi algorithm with sub-tours allowed in the loss augmented inference will make it perform better than training without allowing sub-tours.
Train the multi-label SSVM with sub-tours allowed in the loss augmented inference (SSVM-ML-ST) and without sub-tours (SSVM-ML-NST).
Glasgow:
SSVM-ML-ST: F1 (0.770, 0.030), pairsF1 (0.576, 0.050), Tau (0.740, 0.034), perfectF1 30/64, perfectPairsF1 26/64
SSVM-ML-NST: F1 (0.752, 0.028), pairsF1 (0.532, 0.048), Tau (0.721, 0.031), perfectF1 25/64, perfectPairsF1 22/64
RankSVM: F1 (0.767, 0.026), pairsF1 (0.553, 0.047), Tau (0.739, 0.030), perfectF1 25/64, perfectPairsF1 25/64
Toronto:
SSVM-ML-ST: F1 (0.742, 0.022), pairsF1 (0.507, 0.037), Tau (0.703, 0.026), perfectF1 34/99, perfectPairsF1 31/99
RankSVM: F1 (0.759, 0.021), pairsF1 (0.512, 0.036), Tau (0.721, 0.024), perfectF1 32/99, perfectPairsF1 29/99
Edinburgh:
SSVM-ML-ST: F1 (0.629, 0.017), pairsF1 (0.337, 0.025), Tau (0.557, 0.020), perfectF1 22/147, perfectPairsF1 20/147
RankSVM: F1 (0.727, 0.016), pairsF1 (0.454, 0.026), Tau (0.664, 0.020), perfectF1 34/147, perfectPairsF1 28/147
Osaka:
SSVM-ML-ST: F1 (0.669, 0.036), pairsF1 (0.414, 0.057), Tau (0.637, 0.040), perfectF1 13/46, perfectPairsF1 13/46
SSVM-ML-NST: F1 (0.670, 0.040), pairsF1 (0.433, 0.060), Tau (0.637, 0.044), perfectF1 15/46, perfectPairsF1 14/46
RankSVM: F1 (0.678, 0.037), pairsF1 (0.433, 0.059), Tau (0.647, 0.040), perfectF1 15/46, perfectPairsF1 14/46
It seems the conjecture is neither clearly true nor clearly false. Nevertheless, allowing sub-tours in the loss augmented inference makes the training much more efficient.
SSVM can be trained better if the list Viterbi algorithm (instead of the Viterbi algorithm) is used to do the loss augmented inference.
SSVM-V-LV-SL: single-label SSVM, loss augmented inference: Viterbi, prediction inference: list Viterbi
SSVM-LV-LV-SL: single-label SSVM, loss augmented inference: list Viterbi, prediction inference: list Viterbi
SSVM-V-LV-SL: F1 (0.632, 0.041), pairsF1 (0.386, 0.061), Tau (0.603, 0.044), perfectF1: 13/46, perfectPairsF1: 13/46
SSVM-LV-LV-SL: F1 (0.557, 0.036), pairsF1 (0.249, 0.055), Tau (0.523, 0.038), perfectF1: 8/46, perfectPairsF1: 8/46
The conjecture is false.
Multi-label SSVM can compete with RankSVM.
Do leave-one-out cross validation of multi-label SSVM (SSVM-ML) and RankSVM (as well as single-label SSVM, SSVM-SL) on the Glasgow and Osaka datasets. Note that here the query is $(s, K)$ and we share parameters among POIs/transitions; besides, we do NOT take the log of the transition features (factorised probabilities).
Glasgow:
RankSVM: F1 (0.767, 0.026), pairsF1 (0.553, 0.047), Tau (0.739, 0.030), perfectF1: 25/64, perfectPairsF1: 25/64
SSVM-ML: F1 (0.752, 0.028), pairsF1 (0.532, 0.048), Tau (0.721, 0.031), perfectF1: 25/64, perfectPairsF1: 22/64
SSVM-SL: F1 (0.629, 0.031), pairsF1 (0.362, 0.047), Tau (0.589, 0.034), perfectF1: 15/64, perfectPairsF1: 15/64
Osaka:
RankSVM: F1 (0.678, 0.037), pairsF1 (0.433, 0.059), Tau (0.647, 0.040), perfectF1: 15/46, perfectPairsF1: 14/46
SSVM-ML: F1 (0.670, 0.040), pairsF1 (0.433, 0.060), Tau (0.637, 0.044), perfectF1: 15/46, perfectPairsF1: 14/46
SSVM-SL: F1 (0.599, 0.033), pairsF1 (0.292, 0.052), Tau (0.566, 0.036), perfectF1: 8/46, perfectPairsF1: 8/46
Only a small fraction of users have more than one label for a given query.
Design a new query $(s, K, u)$ where
s: the start POI of a trajectory
K: the number of desired POIs
u: the user ID
then check how many of these queries have multiple labels (a counting sketch is given after the statistics below).
When the query is $(s, K, u)$, the fraction (#queries with multi-label) / (total #queries) is:
Osaka: 4/181 ~2%
Glasgow: 18/325 ~6%
Edinburgh: 78/1260 ~6%
Toronto: 68/813 ~8%
Melbourne: 39/965 ~4%
If the query contains only $(s, K)$, the fraction of queries with multiple labels is:
Osaka: 30/47 ~64%
Glasgow: 41/64 ~64%
Edinburgh: 104/147 ~71%
Toronto: 69/99 ~70%
Melbourne: 139/280 ~50%
Lastly, if the query is $(s, u)$, the fraction of queries with multiple labels is:
Osaka: 8/177 ~5%
Glasgow: 29/309 ~9%
Edinburgh: 121/1165 ~10%
Toronto: 79/751 ~11%
Melbourne: 69/925 ~7%
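A minimal sketch of how these fractions can be counted, assuming the trajectories are available as (user, POI sequence) pairs; the actual data loading is in the notebooks.
```python
from collections import defaultdict

def multilabel_fraction(trajectories, query_fn):
    """Count queries mapped to more than one distinct trajectory (multi-label).

    trajectories: iterable of (user_id, poi_sequence) pairs.
    query_fn:     maps (user_id, poi_sequence) to a query key, e.g.
                  lambda u, seq: (seq[0], len(seq), u)   # query (s, K, u)
                  lambda u, seq: (seq[0], len(seq))      # query (s, K)
                  lambda u, seq: (seq[0], u)             # query (s, u)
    """
    labels = defaultdict(set)
    for user, seq in trajectories:
        labels[query_fn(user, seq)].add(tuple(seq))
    n_multi = sum(1 for ys in labels.values() if len(ys) > 1)
    return n_multi, len(labels)
```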
Experiment
ssvm_ml.ipynb
Conclusion
Tie-Log: share parameters among POIs and transitions, and take the log of (factorised) transition probabilities
Tie-NoLog: share parameters among POIs and transitions, but do NOT take the log of (factorised) transition probabilities
NoTie-Log: do NOT share parameters among POIs and transitions, and take the log of (factorised) transition probabilities
NoTie-NoLog: do NOT share parameters among POIs and transitions, and do NOT take the log of (factorised) transition probabilities
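A minimal sketch of the Log vs NoLog treatment of the (factorised) transition probabilities; the actual factors and feature layout are in the notebooks, and the epsilon guard against log(0) is an assumption.
```python
import numpy as np

def transition_features(prob_factors, take_log=True):
    """Per-factor transition features: the Log variants use log-probabilities,
    the NoLog variants use the raw (factorised) probabilities."""
    p = np.asarray(prob_factors, dtype=float)
    eps = 1e-12  # assumed guard against log(0)
    return np.log(p + eps) if take_log else p
```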
SSVM-MultiLabel-Tie-Log: F1 (0.624, 0.038), pairsF1 (0.363, 0.059), Tau (0.591, 0.042)
SSVM-MultiLabel-Tie-NoLog: F1 (0.659, 0.040), pairsF1 (0.423, 0.060), Tau (0.626, 0.044)
SSVM-MultiLabel-NoTie-NoLog: F1 (0.649, 0.035), pairsF1 (0.373, 0.057), Tau (0.619, 0.038)
SSVM-MultiLabel-NoTie-Log: F1 (0.655, 0.035), pairsF1 (0.377, 0.057), Tau (0.624, 0.037)
SSVM-Tie-Log: F1 (0.611, 0.037), pairsF1 (0.339, 0.058), Tau (0.580, 0.040)
SSVM-Tie-NoLog: F1 (0.613, 0.037), pairsF1 (0.330, 0.057), Tau (0.579, 0.041)
SSVM-NoTie-NoLog: F1 (0.637, 0.039), pairsF1 (0.378, 0.061), Tau (0.605, 0.043)
SSVM-NoTie-Log: F1 (0.626, 0.039), pairsF1 (0.361, 0.060), Tau (0.593, 0.042)
RankSVM: F1 (0.678, 0.037), pairsF1 (0.433, 0.059), Tau (0.647, 0.040)
Experiment
Conclusion
RankSVM: F1 (0.536, 0.017), pairsF1 (0.145, 0.020), Tau (0.484, 0.018)
SSVM: F1 (0.612, 0.029), pairsF1 (0.317, 0.047), Tau (0.573, 0.032)
SSVM-NoTransition: F1 (0.610, 0.029), pairsF1 (0.309, 0.044), Tau (0.571, 0.032)
RankSVM: F1 (0.705, 0.030), pairsF1 (0.414, 0.047), Tau (0.667, 0.033)
SSVM: F1 (0.629, 0.029), pairsF1 (0.334, 0.046), Tau (0.588, 0.032)
SSVM-NoTransition: F1 (0.588, 0.026), pairsF1 (0.265, 0.040), Tau (0.546, 0.029)
Experiment
numpy.random.randn()
generated_data.ipynb
Conclusion
SSVM: F1 (0.812, 0.016), pairsF1 (0.574, 0.028), Tau (0.757, 0.019)
SSVM: F1 (0.745, 0.018), pairsF1 (0.500, 0.029), Tau (0.697, 0.020)
SSVM: F1 (0.772, 0.019), pairsF1 (0.544, 0.030), Tau (0.726, 0.021)
Some thoughts:
Experiment
generated_data.ipynb
Conclusion
RankSVM: F1 (0.597, 0.011), pairsF1 (0.248, 0.012), Tau (0.507, 0.011)
SSVM: F1 (0.883, 0.014), pairsF1 (0.667, 0.027), Tau (0.843, 0.017)
SSVM-NoTransition: F1 (0.827, 0.015), pairsF1 (0.510, 0.021), Tau (0.775, 0.015)
RankSVM: F1 (0.903, 0.012), pairsF1 (0.612, 0.023), Tau (0.858, 0.013)
SSVM: F1 (0.865, 0.016), pairsF1 (0.607, 0.026), Tau (0.822, 0.017)
SSVM-NoTransition: F1 (0.869, 0.017), pairsF1 (0.589, 0.026), Tau (0.819, 0.018)
Experiment
Conclusion
Experiment
ssvm_ranksvm_weights.ipynb is created for this experiment.
Conclusion
As the POI features for RankSVM and SSVM are the same, we can use the inference procedure of SSVM with the RankSVM weights (all transition features are set to zero).
The weights of RankSVM are computed from leave-one-out cross validation, i.e., one weight vector for each query in the training set.
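A minimal sketch of the weight transfer, assuming the SSVM joint feature map concatenates node (POI) features followed by transition features; the real layout is in ssvm_ranksvm_weights.ipynb.
```python
import numpy as np

def ssvm_weights_from_ranksvm(ranksvm_node_weights, n_transition_features):
    """Reuse RankSVM's POI (node) weights inside SSVM inference and zero out
    all transition (edge) weights, so trajectories are scored by node scores only."""
    return np.concatenate([np.asarray(ranksvm_node_weights, dtype=float),
                           np.zeros(n_transition_features)])
```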
The dataset used here is Glasgow.
The performance of RankSVM:
F1 (0.767, 0.026), pairsF1 (0.553, 0.047), Tau (0.739, 0.030)
The performance of SSVM using the weights of the above RankSVM:
SSVM-Viterbi: F1 (0.713, 0.032), pairsF1 (0.575, 0.046), Tau (0.740, 0.031)
SSVM-listViterbi: F1 (0.765, 0.026), pairsF1 (0.550, 0.046), Tau (0.739, 0.030)
Experiment
generated_data.ipynb is created for this experiment.
Conclusion
SSVM: F1 (0.699, 0.015), pairsF1 (0.416, 0.021), Tau (0.643, 0.016)
RankSVM: F1 (0.881, 0.011), pairsF1 (0.572, 0.019), Tau (0.829, 0.012)