The Kinetics Human Action Video Dataset
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset (CVPR 2017)
Contribution Points
- New Large Dataset
- New Video Model
## dataset
### New dataset
At least 160,000 clips
- 400 human action classes with at least 400 video clips for each action, so at least 400 x 400 = 160,000 clips
- Each clip lasts around 10s and is taken from a different YouTube video.
### Previous datasets
HMDB-51
UCF-101
- 13,320 clips in total: 101 actions x 25 groups x 4-7 clips of each action per group.
- The videos from the same group may share some common features, such as similar background, similar viewpoint, etc.
- The action classes fall into 5 types: 1) Human-Object Interaction, 2) Body-Motion Only, 3) Human-Human Interaction, 4) Playing Musical Instruments, 5) Sports.
- Reference: K. Soomro, A. R. Zamir, M. Shah. UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402, 2012.
- URL: http://crcv.ucf.edu/data/UCF101.php
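Because clips from the same UCF-101 group can share background and viewpoint, a train/test split is usually made per group rather than per clip. The following is a minimal sketch of that idea, assuming clip names of the form `v_<Action>_g<group>_c<clip>.avi`; the helper names and the choice of test groups are hypothetical, not the dataset's official split lists.

```python
import re

# Assumed UCF-101 naming pattern: v_<Action>_g<group>_c<clip>.avi
NAME_PATTERN = re.compile(r"v_(?P<action>\w+)_g(?P<group>\d+)_c(?P<clip>\d+)\.avi")


def parse_clip_name(name):
    """Return (action, group, clip) parsed from a clip file name."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected clip name: {name}")
    return match["action"], int(match["group"]), int(match["clip"])


def split_by_group(names, test_groups):
    """Send every clip of a group to the same side of the split."""
    train, test = [], []
    for name in names:
        _, group, _ = parse_clip_name(name)
        (test if group in test_groups else train).append(name)
    return train, test


# Hypothetical example: hold out group 1 for testing.
names = [
    "v_ApplyEyeMakeup_g01_c01.avi",
    "v_ApplyEyeMakeup_g01_c02.avi",
    "v_ApplyEyeMakeup_g08_c01.avi",
]
train, test = split_by_group(names, test_groups={1})
```

Splitting by group keeps near-duplicate clips from appearing in both the training and test sets, which would otherwise inflate the measured accuracy.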
## Previous Models
Optical Flow Frames
- The flow field is visualized as a color image: hue indicates the direction of motion and intensity indicates its magnitude.
- Bi-directional optical flow: backward (past) and forward (future).
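The hue/intensity encoding of a flow field can be sketched for a single flow vector as below. This is an illustrative sketch using only the Python standard library; the function name `flow_to_rgb` and the normalization constant `max_magnitude` are assumptions, and real pipelines typically apply the same mapping to every pixel of the flow field.

```python
import colorsys
import math


def flow_to_rgb(dx, dy, max_magnitude=1.0):
    """Map one flow vector to an RGB triple in [0, 1].

    Hue encodes the direction of motion, and the HSV value channel
    (intensity) encodes the magnitude, clipped at max_magnitude.
    """
    angle = math.atan2(dy, dx) % (2.0 * math.pi)  # direction in [0, 2*pi]
    hue = angle / (2.0 * math.pi)                 # normalized to [0, 1]
    magnitude = math.hypot(dx, dy)
    value = min(magnitude / max_magnitude, 1.0)   # no motion -> black
    return colorsys.hsv_to_rgb(hue, 1.0, value)


still = flow_to_rgb(0.0, 0.0)       # no motion
rightward = flow_to_rgb(1.0, 0.0)   # motion along +x
leftward = flow_to_rgb(-1.0, 0.0)   # motion along -x
```

With this mapping, static regions come out black and opposite directions receive complementary hues, so the direction and speed of motion can be read off the image at a glance.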
## New Model
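Bootstrapping 3D filters from 2D filters: each pretrained 2D filter is inflated into a 3D filter by repeating its weights N times along the time dimension and rescaling them by 1/N.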
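One way to picture bootstrapping 3D filters from 2D ones is to copy a 2D kernel along a new time axis and divide by the number of copies. The sketch below uses plain Python lists and hypothetical helper names; it is a toy illustration under those assumptions, not the paper's implementation. It checks that on a "boring" video (the same frame repeated over time) the inflated 3D filter gives the same response as the original 2D filter.

```python
def inflate_2d_kernel(kernel_2d, time_dim):
    """Repeat a 2D kernel time_dim times along time, rescaled by 1/time_dim."""
    return [
        [[w / time_dim for w in row] for row in kernel_2d]
        for _ in range(time_dim)
    ]


def response_2d(patch, kernel_2d):
    """Filter response at one location: elementwise product, then sum."""
    return sum(
        p * w
        for patch_row, kernel_row in zip(patch, kernel_2d)
        for p, w in zip(patch_row, kernel_row)
    )


def response_3d(clip, kernel_3d):
    """Sum the per-frame responses over the time axis."""
    return sum(response_2d(frame, k) for frame, k in zip(clip, kernel_3d))


kernel = [[1.0, 2.0], [3.0, 4.0]]
patch = [[0.5, 1.0], [1.5, 2.0]]
boring_clip = [patch] * 3            # the same frame repeated 3 times
kernel_3d = inflate_2d_kernel(kernel, time_dim=3)

out_2d = response_2d(patch, kernel)
out_3d = response_3d(boring_clip, kernel_3d)
```

The 1/N rescaling is what keeps the two responses equal, so weights pretrained on still images remain a sensible starting point for the 3D network.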