NOTE: Please view this page via IPython Notebook Viewer Service, otherwise the within-page links may not work properly.
Photos:
The photos were selected from the YFCC100M dataset. Melbourne's geo-coordinates are 37°48′49″S 144°57′47″E; photos taken within the rectangle from (39.5S, 140.9E) to (35.5S, 148.5E) with accuracy = 16
are used. The total number of photos is 87,362.
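The selection rule above can be sketched as a simple filter. This is only an illustration: the `Photo` record and its field names are hypothetical (the real YFCC100M records carry many more fields), and only the bounding box and accuracy value come from the text.

```python
from collections import namedtuple

# Hypothetical record layout, for illustration only.
Photo = namedtuple("Photo", ["photo_id", "lon", "lat", "accuracy"])

# Bounding box and accuracy level as stated in the text.
LON_MIN, LON_MAX = 140.9, 148.5
LAT_MIN, LAT_MAX = -39.5, -35.5
ACCURACY = 16

def in_melbourne_box(photo):
    """Return True if the photo lies inside the box and has the required accuracy."""
    return (LON_MIN <= photo.lon <= LON_MAX and
            LAT_MIN <= photo.lat <= LAT_MAX and
            photo.accuracy == ACCURACY)

photos = [
    Photo("p1", 144.963, -37.814, 16),  # central Melbourne, kept
    Photo("p2", 151.209, -33.868, 16),  # outside the box, dropped
    Photo("p3", 144.963, -37.814, 12),  # inside the box, coarser accuracy, dropped
]
selected = [p for p in photos if in_melbourne_box(p)]
```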
POIs:
POIs are from OpenStreetMap, e.g. by downloading the data from one of these mirrors
In [ ]:
$ wget ftp://ftp.spline.de/pub/openstreetmap/pbf/planet-latest.osm.pbf
then clipping a bounding box around Melbourne, [140.9, -38.7, 148.5, -35.5], e.g. using one of these tools
In [ ]:
$ osmconvert planet-latest.osm.pbf -b=140.9,-38.7,148.5,-35.5 -o=melbourne.osm
then filtering the POI tags of interest (listed in the table below) from the tag list. The total number of POIs is 3360.
The Python script for filtering POI tags from the clipped data, which uses this Python library, is filter_node.py, e.g.
In [ ]:
python2 filter_node.py Melb_tags.list
The file Melb_tags.list is available here.
key | values |
```amenity``` | ```college, library, school, university, arts_centre, cinema, fountain, planetarium, theatre, clock, place_of_worship, ranger_station, townhall``` |
```building``` | ```farm, cathedral, chapel, church, mosque, temple, synagogue, shrine, school, stadium, university, bridge``` |
```geological``` | ```_ALL_``` (indicating all values) |
```historic``` | ```_ALL_``` (indicating all values) |
```leisure``` | ```garden, nature_reserve, park, pitch, sports_centre, stadium, swimming_area, track, wildlife_hide``` |
```man_made``` | ```beacon, breakwater, bridge, communications_tower, embankment, dyke, groyne, lighthouse, pier, tower, windmill``` |
```natural``` | ```_ALL_``` (indicating all values) |
```tourism``` | ```attraction, artwork, gallery, museum, picnic_site, theme_park, viewpoint, zoo``` |
```waterway``` | ```river, riverbank, stream, dam, weir, waterfall``` |
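The matching rule implied by the table, including the `_ALL_` wildcard, might look like the sketch below. It is not the code of filter_node.py, and the format of Melb_tags.list is not assumed here; `TAG_FILTER` and `is_poi` are hypothetical names, and only a few rows of the table are reproduced.

```python
ALL = "_ALL_"  # wildcard used in the table: any value of the key matches

# A subset of the key/value table, as a mapping from key to allowed values.
TAG_FILTER = {
    "historic": {ALL},
    "natural": {ALL},
    "tourism": {"attraction", "artwork", "gallery", "museum",
                "picnic_site", "theme_park", "viewpoint", "zoo"},
    "waterway": {"river", "riverbank", "stream", "dam", "weir", "waterfall"},
}

def is_poi(tags, tag_filter=TAG_FILTER):
    """Return True if any (key, value) tag of an OSM node matches the filter."""
    for key, value in tags.items():
        allowed = tag_filter.get(key)
        if allowed is not None and (ALL in allowed or value in allowed):
            return True
    return False
```

For example, a node tagged `tourism=zoo` would be kept, while a node tagged only `tourism=hotel` would be dropped.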
Some basic statistics of the Melbourne data, together with those of the four other cities used in the IJCAI'15 paper, are summarised in the table below.
City | ΔLongitude (degree) | ΔLatitude (degree) | #POIs | #Users | #POI_Visits | #Travel_Sequences | Min_Distance_between_POI (km) | Max_Distance_between_POI (km) |
Edinburgh | 0.25 | 0.08 | 28 | 1,454 | 33,944 | 5,028 | 0.088 | 16.354 |
Toronto | 0.28 | 0.20 | 29 | 1,395 | 39,419 | 6,057 | 0.147 | 29.655 |
Glasgow | 0.39 | 0.37 | 27 | 601 | 11,434 | 2,227 | 0.182 | 45.344 |
Osaka | 4.34 | 1.07 | 27 | 450 | 7,747 | 1,115 | 0.216 | 410.46 |
Melbourne | 6.84 | 2.81 | 270 | 1,306 | 44,748 | 10,599 | 2.01 | 616.80 |
The distribution of sequence lengths for each city is shown below.
City | #Length 1 | #Length 2 | #Length 3 | #Length 4 | #Length 5 | #Length 6 | #Length 7 | #Length 8 | #Length 9 | #Length 10 | #Length 11 | #Length 12 | #Length 13 |
Edinburgh | 3616 | 778 | 300 | 146 | 76 | 48 | 30 | 15 | 7 | 5 | 0 | 5 | 2 |
Toronto | 5080 | 642 | 216 | 60 | 33 | 9 | 9 | 4 | 2 | 1 | 0 | 0 | 1 |
Glasgow | 1876 | 239 | 77 | 20 | 10 | 2 | 2 | 1 | 0 | 0 | 0 | 0 | 0 |
Osaka | 929 | 139 | 32 | 7 | 7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Melbourne | 9817 | 672 | 81 | 22 | 4 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
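As a quick consistency check, the counts in each row of the table above should add up to the #Travel_Sequences value in the city summary table. The sketch below copies both sets of numbers from the tables; the helper `share_of_length` is a hypothetical name introduced for illustration.

```python
# Sequence-length counts (lengths 1..13), copied from the table above.
length_counts = {
    "Edinburgh": [3616, 778, 300, 146, 76, 48, 30, 15, 7, 5, 0, 5, 2],
    "Toronto":   [5080, 642, 216, 60, 33, 9, 9, 4, 2, 1, 0, 0, 1],
    "Glasgow":   [1876, 239, 77, 20, 10, 2, 2, 1, 0, 0, 0, 0, 0],
    "Osaka":     [929, 139, 32, 7, 7, 1, 0, 0, 0, 0, 0, 0, 0],
    "Melbourne": [9817, 672, 81, 22, 4, 2, 1, 0, 0, 0, 0, 0, 0],
}

# #Travel_Sequences per city, copied from the city summary table.
travel_sequences = {"Edinburgh": 5028, "Toronto": 6057, "Glasgow": 2227,
                    "Osaka": 1115, "Melbourne": 10599}

def share_of_length(counts, length):
    """Fraction of sequences that have exactly `length` POIs."""
    return counts[length - 1] / float(sum(counts))

# Row sums should match the reported number of travel sequences.
consistent = {city: sum(counts) == travel_sequences[city]
              for city, counts in length_counts.items()}
```

With these numbers, roughly 93% of the Melbourne sequences contain a single POI, so only a small fraction of them is usable for sequence recommendation.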
Q: Picking POIs is a somewhat hard task
POIs picked manually from the photo scatter plot are much better than the results of k-means clustering/kernel density estimation, but still not good enough
A: With the help of OpenStreetMap and NationalMap/Google Maps, it is much easier to select and visualize POIs.
Further processing the POI data:
NOTES:
Q: How to deal with POIs that are too close to each other, e.g. 0-10m apart?
A:
Q: POIs are generally associated with multiple labels
How should these labels be defined? How should each POI be labelled?
A:
Q: Assigning a photo to a POI
when their distance is less than 200m, as done in the paper, does not seem to be a good idea, as
A: Assign a photo to the nearest POI if the distance between the two is less than, say, 500m?
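The nearest-POI rule proposed in the answer could be sketched as follows. The great-circle (haversine) distance is an assumption of this sketch, as the notes do not say which distance is used; `haversine_m` and `assign_photo` are hypothetical names, and the 500m threshold is the tentative value from the answer.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon, dlat = lon2 - lon1, lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def assign_photo(photo_lonlat, pois, max_dist_m=500.0):
    """Return the id of the nearest POI closer than max_dist_m, or None."""
    best_id, best_dist = None, float("inf")
    for poi_id, (lon, lat) in pois.items():
        d = haversine_m(photo_lonlat[0], photo_lonlat[1], lon, lat)
        if d < best_dist:
            best_id, best_dist = poi_id, d
    return best_id if best_dist < max_dist_m else None

# Made-up coordinates near central Melbourne, for illustration only.
pois = {"a": (144.9631, -37.8136), "b": (144.9800, -37.8200)}
```

A photo at `(144.9640, -37.8140)` would be assigned to `"a"`, while a photo far from every POI would be left unassigned.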
Q: Travel sequences independence assumption seems to be implausible
Users' travel sequences are generated by splitting each user's travel history wherever two consecutive POI visits are more than 8 hours apart. A typical trip, however, spans several days, so it is represented by several travel sequences that depend on each other (e.g. through user preference patterns such as beach-park-shopping, beach-beach-shopping etc.)
A:
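For reference, the 8-hour splitting rule described in the question can be sketched as below. The input format, a list of `(timestamp, poi_id)` pairs for one user, and the function name `split_sequences` are assumptions of this sketch.

```python
from datetime import datetime, timedelta

def split_sequences(visits, max_gap=timedelta(hours=8)):
    """Split one user's (timestamp, poi_id) visits into travel sequences.

    A new sequence starts whenever two consecutive visits are more than
    max_gap apart.
    """
    sequences, current, prev_time = [], [], None
    for time, poi in sorted(visits):
        if prev_time is not None and time - prev_time > max_gap:
            sequences.append(current)
            current = []
        current.append(poi)
        prev_time = time
    if current:
        sequences.append(current)
    return sequences

# Made-up visits: a two-day trip ends up as two separate sequences.
visits = [
    (datetime(2015, 1, 1, 9, 0), "beach"),
    (datetime(2015, 1, 1, 12, 0), "park"),
    (datetime(2015, 1, 2, 10, 0), "shopping"),
]
```

In this example the overnight gap separates the second-day visit from the first-day visits, which is exactly the situation where treating the resulting sequences as independent looks questionable.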
Settings: Melbourne, $\eta$ = 0.5 (time-based user interest and POI popularity), leave-one-out evaluation; 28/110 ≈ 25.5% of the solutions are suboptimal
Recall | Precision | F1-score |
0.735±0.177 | 0.735±0.177 | 0.735±0.177 |
Value(Recall/Precision/F1-score) | 1.0 | 0.75 | 0.67 | 0.60 | 0.57 | 0.50 | 0.40 |
Frequency | 30/110 | 7/110 | 54/110 | 2/110 | 1/110 | 14/110 | 2/110 |
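Recall, precision and F1-score coincide in the table above. One plausible reading, which is an assumption here and not stated in the notes, is that the metrics are computed on POI sets and that each recommended sequence has as many POIs as the actual one. A minimal sketch under that assumption (`precision_recall_f1` is a hypothetical name):

```python
def precision_recall_f1(recommended, actual):
    """Set-based precision, recall and F1 of a recommended POI sequence."""
    rec, act = set(recommended), set(actual)
    hits = float(len(rec & act))  # POIs present in both sequences
    precision = hits / len(rec)
    recall = hits / len(act)
    f1 = 0.0 if hits == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Two of three POIs agree, so all three metrics equal 2/3 (about 0.67).
p, r, f1 = precision_recall_f1(["a", "b", "c"], ["a", "d", "c"])
```

When both sequences have the same number of distinct POIs, precision and recall share the same denominator, which would explain the identical columns.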
Settings: Melbourne, $\eta$ = 0.0 (POI popularity only), leave-one-out evaluation; 29/110 ≈ 26.4% of the solutions are suboptimal
Recall | Precision | F1-score |
0.732±0.176 | 0.732±0.176 | 0.732±0.176 |
Value(Recall/Precision/F1-score) | 1.0 | 0.83 | 0.75 | 0.67 | 0.60 | 0.57 | 0.50 | 0.40 |
Frequency | 29/110 | 1/110 | 6/110 | 55/110 | 2/110 | 1/110 | 14/110 | 2/110 |
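The frequency tables can be used to cross-check the reported mean ± standard deviation. The sketch below assumes that the rounded values stand for simple fractions (0.83 = 5/6, 0.67 = 2/3, 0.60 = 3/5, 0.57 = 4/7, 0.40 = 2/5) and that the sample standard deviation (denominator n-1) is used; both are assumptions, and `mean_std` is a hypothetical helper.

```python
import math
from fractions import Fraction as F

def mean_std(freq):
    """Mean and sample standard deviation of a {value: count} table."""
    n = sum(freq.values())
    mean = sum(float(v) * c for v, c in freq.items()) / n
    var = sum(c * (float(v) - mean) ** 2 for v, c in freq.items()) / (n - 1)
    return mean, math.sqrt(var)

# Frequencies copied from the two tables (eta = 0.5 and eta = 0.0).
eta_05 = {F(1): 30, F(3, 4): 7, F(2, 3): 54, F(3, 5): 2,
          F(4, 7): 1, F(1, 2): 14, F(2, 5): 2}
eta_00 = {F(1): 29, F(5, 6): 1, F(3, 4): 6, F(2, 3): 55, F(3, 5): 2,
          F(4, 7): 1, F(1, 2): 14, F(2, 5): 2}

mean_05, std_05 = mean_std(eta_05)
mean_00, std_00 = mean_std(eta_00)
```

Under these assumptions the recomputed means round to the reported 0.735 and 0.732.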
Transition matrix for recommended sequences, $\eta$ = 0.5:
From \ To | Beach | Cultural | Education | Forest | Leisure | ManMade | Natural | Park | Religion | Shopping | WaterBody |
Beach | 0.176 | 0.020 | 0.078 | 0.000 | 0.020 | 0.000 | 0.000 | 0.196 | 0.216 | 0.294 | 0.000 |
Cultural | 0.429 | 0.143 | 0.000 | 0.000 | 0.143 | 0.000 | 0.000 | 0.000 | 0.143 | 0.143 | 0.000 |
Education | 0.353 | 0.000 | 0.118 | 0.000 | 0.000 | 0.000 | 0.000 | 0.059 | 0.176 | 0.294 | 0.000 |
Forest | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
Leisure | 0.100 | 0.000 | 0.000 | 0.000 | 0.100 | 0.000 | 0.100 | 0.000 | 0.100 | 0.600 | 0.000 |
ManMade | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
Natural | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 |
Park | 0.316 | 0.000 | 0.026 | 0.000 | 0.079 | 0.000 | 0.000 | 0.053 | 0.132 | 0.395 | 0.000 |
Religion | 0.240 | 0.160 | 0.000 | 0.040 | 0.040 | 0.040 | 0.000 | 0.080 | 0.120 | 0.280 | 0.000 |
Shopping | 0.186 | 0.010 | 0.039 | 0.010 | 0.059 | 0.000 | 0.010 | 0.167 | 0.069 | 0.451 | 0.000 |
WaterBody | 0.143 | 0.143 | 0.286 | 0.143 | 0.000 | 0.000 | 0.000 | 0.143 | 0.000 | 0.143 | 0.000 |
Transition matrix for recommended sequences, $\eta$ = 0.0:
From \ To | Beach | Cultural | Education | Forest | Leisure | ManMade | Natural | Park | Religion | Shopping | WaterBody |
Beach | 0.143 | 0.020 | 0.061 | 0.000 | 0.020 | 0.000 | 0.000 | 0.265 | 0.204 | 0.286 | 0.000 |
Cultural | 0.571 | 0.143 | 0.000 | 0.000 | 0.143 | 0.000 | 0.000 | 0.000 | 0.143 | 0.000 | 0.000 |
Education | 0.278 | 0.000 | 0.056 | 0.000 | 0.056 | 0.000 | 0.000 | 0.056 | 0.167 | 0.389 | 0.000 |
Forest | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
Leisure | 0.000 | 0.000 | 0.000 | 0.000 | 0.125 | 0.000 | 0.062 | 0.125 | 0.062 | 0.625 | 0.000 |
ManMade | 0.000 | 0.000 | 0.333 | 0.333 | 0.000 | 0.000 | 0.000 | 0.333 | 0.000 | 0.000 | 0.000 |
Natural | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 |
Park | 0.342 | 0.000 | 0.105 | 0.000 | 0.105 | 0.026 | 0.000 | 0.026 | 0.053 | 0.316 | 0.026 |
Religion | 0.174 | 0.174 | 0.000 | 0.043 | 0.043 | 0.043 | 0.000 | 0.087 | 0.174 | 0.261 | 0.000 |
Shopping | 0.219 | 0.010 | 0.042 | 0.000 | 0.094 | 0.000 | 0.010 | 0.115 | 0.083 | 0.427 | 0.000 |
WaterBody | 0.125 | 0.125 | 0.250 | 0.125 | 0.000 | 0.125 | 0.000 | 0.250 | 0.000 | 0.000 | 0.000 |
Transition matrix for actual sequences:
From \ To | Beach | Cultural | Education | Forest | Leisure | ManMade | Natural | Park | Religion | Shopping | WaterBody |
Beach | 0.454 | 0.008 | 0.008 | 0.000 | 0.042 | 0.000 | 0.000 | 0.042 | 0.092 | 0.353 | 0.000 |
Cultural | 0.027 | 0.027 | 0.054 | 0.000 | 0.108 | 0.000 | 0.000 | 0.189 | 0.189 | 0.378 | 0.027 |
Education | 0.038 | 0.000 | 0.170 | 0.000 | 0.057 | 0.000 | 0.000 | 0.113 | 0.038 | 0.528 | 0.057 |
Forest | 0.000 | 0.100 | 0.000 | 0.500 | 0.000 | 0.000 | 0.100 | 0.100 | 0.000 | 0.000 | 0.200 |
Leisure | 0.045 | 0.164 | 0.060 | 0.000 | 0.119 | 0.000 | 0.015 | 0.075 | 0.060 | 0.448 | 0.015 |
ManMade | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.500 | 0.500 | 0.000 |
Natural | 0.000 | 0.000 | 0.091 | 0.000 | 0.000 | 0.000 | 0.818 | 0.091 | 0.000 | 0.000 | 0.000 |
Park | 0.031 | 0.087 | 0.094 | 0.008 | 0.055 | 0.000 | 0.000 | 0.039 | 0.055 | 0.614 | 0.016 |
Religion | 0.091 | 0.073 | 0.073 | 0.018 | 0.127 | 0.000 | 0.000 | 0.073 | 0.073 | 0.436 | 0.036 |
Shopping | 0.137 | 0.045 | 0.033 | 0.002 | 0.073 | 0.000 | 0.000 | 0.104 | 0.043 | 0.545 | 0.017 |
WaterBody | 0.172 | 0.000 | 0.138 | 0.172 | 0.069 | 0.000 | 0.000 | 0.069 | 0.000 | 0.345 | 0.034 |
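Each row of the matrices above sums to 1 (up to rounding), which is consistent with row-normalised counts of transitions between the categories of consecutive POIs. A minimal sketch of such a computation, assuming each sequence is given as a list of POI categories (`transition_matrix` is a hypothetical name, and the notes do not confirm that the matrices were produced this way):

```python
def transition_matrix(sequences, categories):
    """Row-normalised transition frequencies between consecutive categories."""
    counts = {a: {b: 0 for b in categories} for a in categories}
    for seq in sequences:
        # Count each pair of consecutive categories (from a to b).
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    matrix = {}
    for a in categories:
        total = sum(counts[a].values())
        # Rows without any outgoing transition are left as all zeros.
        matrix[a] = {b: (counts[a][b] / float(total) if total else 0.0)
                     for b in categories}
    return matrix

# Made-up category sequences, for illustration only.
cats = ["Beach", "Park", "Shopping"]
seqs = [["Beach", "Shopping", "Beach"], ["Beach", "Park"]]
tm = transition_matrix(seqs, cats)
```

Here `tm["Beach"]["Shopping"]` and `tm["Beach"]["Park"]` are both 0.5, and the `Park` row stays at zero because no sequence leaves a park.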