MMDetection Tutorial

Welcome to MMDetection! This is the official Colab tutorial for using MMDetection. In this tutorial, you will learn how to:

  • Perform inference with an MMDet detector.
  • Train a new detector with a new dataset.

Let's start!


In [1]:
# Check nvcc version
!nvcc -V
# Check GCC version
!gcc --version


nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


In [2]:
# Install dependencies (use cu101 because Colab has CUDA 10.1)
!pip install -U torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html

# Install mmcv-full so that we can use its CUDA operators
!pip install mmcv-full

# Install mmdetection
!rm -rf mmdetection
!git clone https://github.com/open-mmlab/mmdetection.git
%cd mmdetection

!pip install -e .

# Reinstall Pillow 7.0.0 to avoid a bug in Colab
!pip install Pillow==7.0.0


Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.5.1+cu101
  Downloading https://download.pytorch.org/whl/cu101/torch-1.5.1%2Bcu101-cp36-cp36m-linux_x86_64.whl (704.4MB)
     |████████████████████████████████| 704.4MB 26kB/s 
Collecting torchvision==0.6.1+cu101
  Downloading https://download.pytorch.org/whl/cu101/torchvision-0.6.1%2Bcu101-cp36-cp36m-linux_x86_64.whl (6.6MB)
     |████████████████████████████████| 6.6MB 60.4MB/s 
Requirement already satisfied, skipping upgrade: numpy in /usr/local/lib/python3.6/dist-packages (from torch==1.5.1+cu101) (1.19.5)
Requirement already satisfied, skipping upgrade: future in /usr/local/lib/python3.6/dist-packages (from torch==1.5.1+cu101) (0.16.0)
Requirement already satisfied, skipping upgrade: pillow>=4.1.1 in /usr/local/lib/python3.6/dist-packages (from torchvision==0.6.1+cu101) (7.0.0)
Installing collected packages: torch, torchvision
  Found existing installation: torch 1.7.0+cu101
    Uninstalling torch-1.7.0+cu101:
      Successfully uninstalled torch-1.7.0+cu101
  Found existing installation: torchvision 0.8.1+cu101
    Uninstalling torchvision-0.8.1+cu101:
      Successfully uninstalled torchvision-0.8.1+cu101
Successfully installed torch-1.5.1+cu101 torchvision-0.6.1+cu101
Collecting mmcv-full
  Downloading https://files.pythonhosted.org/packages/30/f6/763845494c67ec6469992c8196c2458bdc12ff9c749de14d20a000da765d/mmcv-full-1.2.6.tar.gz (226kB)
     |████████████████████████████████| 235kB 15.8MB/s 
Collecting addict
  Downloading https://files.pythonhosted.org/packages/6a/00/b08f23b7d7e1e14ce01419a467b583edbb93c6cdb8654e54a9cc579cd61f/addict-2.4.0-py3-none-any.whl
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from mmcv-full) (1.19.5)
Requirement already satisfied: Pillow in /usr/local/lib/python3.6/dist-packages (from mmcv-full) (7.0.0)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.6/dist-packages (from mmcv-full) (3.13)
Collecting yapf
  Downloading https://files.pythonhosted.org/packages/c1/5d/d84677fe852bc5e091739acda444a9b6700ffc6b11a21b00dd244c8caef0/yapf-0.30.0-py2.py3-none-any.whl (190kB)
     |████████████████████████████████| 194kB 53.0MB/s 
Building wheels for collected packages: mmcv-full
  Building wheel for mmcv-full (setup.py) ... done
  Created wheel for mmcv-full: filename=mmcv_full-1.2.6-cp36-cp36m-linux_x86_64.whl size=20243694 sha256=8742a849334b62e8e3f7b695fd546b033111501586a94fe5612aab54f7edebfa
  Stored in directory: /root/.cache/pip/wheels/40/39/64/7c5ab43621826eb41d31f1df14a8acabf74d879fdf33dc9d79
Successfully built mmcv-full
Installing collected packages: addict, yapf, mmcv-full
Successfully installed addict-2.4.0 mmcv-full-1.2.6 yapf-0.30.0
Cloning into 'mmdetection'...
remote: Enumerating objects: 50, done.
remote: Counting objects: 100% (50/50), done.
remote: Compressing objects: 100% (49/49), done.
remote: Total 15882 (delta 7), reused 5 (delta 1), pack-reused 15832
Receiving objects: 100% (15882/15882), 16.93 MiB | 33.41 MiB/s, done.
Resolving deltas: 100% (10915/10915), done.
/content/mmdetection
Obtaining file:///content/mmdetection
Requirement already satisfied: matplotlib in /usr/local/lib/python3.6/dist-packages (from mmdet==2.9.0) (3.2.2)
Collecting mmpycocotools
  Downloading https://files.pythonhosted.org/packages/99/51/1bc1d79f296347eeb2d1a2e0606885ab1e4682833bf275fd39c189952e26/mmpycocotools-12.0.3.tar.gz
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from mmdet==2.9.0) (1.19.5)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from mmdet==2.9.0) (1.15.0)
Collecting terminaltables
  Downloading https://files.pythonhosted.org/packages/9b/c4/4a21174f32f8a7e1104798c445dacdc1d4df86f2f26722767034e4de4bff/terminaltables-3.1.0.tar.gz
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->mmdet==2.9.0) (2.8.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->mmdet==2.9.0) (2.4.7)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->mmdet==2.9.0) (1.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/dist-packages (from matplotlib->mmdet==2.9.0) (0.10.0)
Requirement already satisfied: setuptools>=18.0 in /usr/local/lib/python3.6/dist-packages (from mmpycocotools->mmdet==2.9.0) (53.0.0)
Requirement already satisfied: cython>=0.27.3 in /usr/local/lib/python3.6/dist-packages (from mmpycocotools->mmdet==2.9.0) (0.29.21)
Building wheels for collected packages: mmpycocotools, terminaltables
  Building wheel for mmpycocotools (setup.py) ... done
  Created wheel for mmpycocotools: filename=mmpycocotools-12.0.3-cp36-cp36m-linux_x86_64.whl size=265912 sha256=1e5525c4339f76072ed09fecd12765fe7544e94745b91fb76fca95658e3dea7b
  Stored in directory: /root/.cache/pip/wheels/a2/b0/8d/3307912785a42bc80f673946fac676d5c596eee537af7a599c
  Building wheel for terminaltables (setup.py) ... done
  Created wheel for terminaltables: filename=terminaltables-3.1.0-cp36-none-any.whl size=15358 sha256=93fdde0610537c38e16b17f6df08bbc2be3c1b19e266b5d4e5fd7aef039bb218
  Stored in directory: /root/.cache/pip/wheels/30/6b/50/6c75775b681fb36cdfac7f19799888ef9d8813aff9e379663e
Successfully built mmpycocotools terminaltables
Installing collected packages: mmpycocotools, terminaltables, mmdet
  Running setup.py develop for mmdet
Successfully installed mmdet mmpycocotools-12.0.3 terminaltables-3.1.0
Requirement already satisfied: Pillow==7.0.0 in /usr/local/lib/python3.6/dist-packages (7.0.0)

In [3]:
# Check PyTorch installation
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())

# Check MMDetection installation
import mmdet
print(mmdet.__version__)

# Check mmcv installation
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print(get_compiling_cuda_version())
print(get_compiler_version())


1.5.1+cu101 True
2.9.0
10.1
GCC 7.5

Perform inference with an MMDet detector

MMDetection already provides high-level APIs for inference and training.


In [4]:
!mkdir checkpoints
!wget -c https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth \
      -O checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth


--2021-02-20 03:03:09--  https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth
Resolving download.openmmlab.com (download.openmmlab.com)... 47.252.96.35
Connecting to download.openmmlab.com (download.openmmlab.com)|47.252.96.35|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 177867103 (170M) [application/octet-stream]
Saving to: ‘checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth’

checkpoints/mask_rc 100%[===================>] 169.63M  8.44MB/s    in 21s     

2021-02-20 03:03:32 (8.19 MB/s) - ‘checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth’ saved [177867103/177867103]


In [5]:
from mmdet.apis import inference_detector, init_detector, show_result_pyplot

# Choose a config file
config = 'configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py'
# Set the checkpoint file to load
checkpoint = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
# Initialize the detector
model = init_detector(config, checkpoint, device='cuda:0')

In [6]:
# Use the detector to do inference
img = 'demo/demo.jpg'
result = inference_detector(model, img)


/content/mmdetection/mmdet/datasets/utils.py:66: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
  'data pipeline in your config file.', UserWarning)

In [7]:
# Let's plot the result
show_result_pyplot(model, img, result, score_thr=0.3)


/content/mmdetection/mmdet/apis/inference.py:205: UserWarning: "block" will be deprecated in v2.9.0,Please use "wait_time"
  warnings.warn('"block" will be deprecated in v2.9.0,'
/content/mmdetection/mmdet/apis/inference.py:207: UserWarning: "fig_size" are deprecated and takes no effect.
  warnings.warn('"fig_size" are deprecated and takes no effect.')
/content/mmdetection/mmdet/core/visualization/image.py:75: UserWarning: "font_scale" will be deprecated in v2.9.0,Please use "font_size"
  warnings.warn('"font_scale" will be deprecated in v2.9.0,'
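
For a mask model such as the one loaded above, inference_detector in MMDetection 2.x returns a (bbox_result, segm_result) tuple, where bbox_result holds one (n, 5) array per class with rows [x1, y1, x2, y2, score]. The sketch below assumes that layout and uses made-up arrays in place of a real result. It shows one way to keep only the detections above a score threshold, similar in spirit to the score_thr argument used above. The helper name filter_detections is hypothetical.

```python
import numpy as np


def filter_detections(bbox_result, score_thr=0.3):
    """Return (class_index, box, score) triples with score >= score_thr.

    Assumes bbox_result is a list with one (n, 5) array per class,
    each row being [x1, y1, x2, y2, score].
    """
    kept = []
    for class_index, dets in enumerate(bbox_result):
        for x1, y1, x2, y2, score in dets:
            if score >= score_thr:
                kept.append((class_index, (x1, y1, x2, y2), float(score)))
    return kept


# Made-up stand-in for the bbox part of a result with two classes.
fake_bbox_result = [
    np.array([[10, 20, 50, 80, 0.9], [5, 5, 15, 15, 0.1]], dtype=np.float32),
    np.zeros((0, 5), dtype=np.float32),
]
detections = filter_detections(fake_bbox_result, score_thr=0.3)
print(len(detections))
```

With a real result from a mask model, the same helper would be called on result[0].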

Train a detector on a customized dataset

To train a new detector, there are usually three things to do:

  1. Support a new dataset
  2. Modify the config
  3. Train a new detector

Support a new dataset

There are three ways to support a new dataset in MMDetection:

  1. Reorganize the dataset into COCO format.
  2. Reorganize the dataset into a middle format.
  3. Implement a new dataset.

We usually recommend the first two methods, which are generally easier than the third.

In this tutorial, we give an example of converting the data into the format of existing datasets such as COCO or VOC. Other methods and more advanced usage can be found in the documentation.
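
For reference, the first option (COCO format) stores boxes as [x, y, width, height] in a JSON file with images, annotations, and categories lists. The following is a minimal sketch of building such a structure from corner-style boxes. The helper name to_coco and the sample values are illustrative assumptions, not part of this tutorial's pipeline.

```python
import json


def to_coco(samples, class_names):
    """Build a COCO-style dict from samples with [x1, y1, x2, y2] boxes."""
    coco = {
        'images': [],
        'annotations': [],
        'categories': [
            {'id': i, 'name': name} for i, name in enumerate(class_names)
        ],
    }
    ann_id = 0
    for image_id, sample in enumerate(samples):
        coco['images'].append({
            'id': image_id,
            'file_name': sample['filename'],
            'width': sample['width'],
            'height': sample['height'],
        })
        for (x1, y1, x2, y2), label in zip(sample['bboxes'], sample['labels']):
            w, h = x2 - x1, y2 - y1
            coco['annotations'].append({
                'id': ann_id,
                'image_id': image_id,
                'category_id': label,
                'bbox': [x1, y1, w, h],  # COCO uses x, y, width, height
                'area': w * h,
                'iscrowd': 0,
            })
            ann_id += 1
    return coco


# Illustrative sample; the values are made up.
samples = [{
    'filename': 'a.jpg', 'width': 1280, 'height': 720,
    'bboxes': [[100.0, 50.0, 300.0, 250.0]], 'labels': [1],
}]
coco = to_coco(samples, ('Car', 'Pedestrian', 'Cyclist'))
print(json.dumps(coco['annotations'][0]))
```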

First, let's download a tiny dataset obtained from KITTI. We select the first 75 images and their annotations from the 3D object detection dataset (it is the same dataset as the 2D object detection dataset but has 3D annotations). We convert the original images from PNG to JPEG format with 80% quality to reduce the size of the dataset.


In [8]:
# Download and decompress the data
!wget https://download.openmmlab.com/mmdetection/data/kitti_tiny.zip
!unzip kitti_tiny.zip > /dev/null


--2021-02-20 03:04:04--  https://download.openmmlab.com/mmdetection/data/kitti_tiny.zip
Resolving download.openmmlab.com (download.openmmlab.com)... 47.252.96.35
Connecting to download.openmmlab.com (download.openmmlab.com)|47.252.96.35|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6918271 (6.6M) [application/zip]
Saving to: ‘kitti_tiny.zip’

kitti_tiny.zip      100%[===================>]   6.60M  8.44MB/s    in 0.8s    

2021-02-20 03:04:06 (8.44 MB/s) - ‘kitti_tiny.zip’ saved [6918271/6918271]


In [9]:
# Check the directory structure of the tiny data

# Install tree first
!apt-get -q install tree
!tree kitti_tiny


Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
  tree
0 upgraded, 1 newly installed, 0 to remove and 10 not upgraded.
Need to get 40.7 kB of archives.
After this operation, 105 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 tree amd64 1.7.0-5 [40.7 kB]
Fetched 40.7 kB in 0s (165 kB/s)
Selecting previously unselected package tree.
(Reading database ... 146442 files and directories currently installed.)
Preparing to unpack .../tree_1.7.0-5_amd64.deb ...
Unpacking tree (1.7.0-5) ...
Setting up tree (1.7.0-5) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
kitti_tiny
├── training
│   ├── image_2
│   │   ├── 000000.jpeg
│   │   ├── 000001.jpeg
│   │   ├── 000002.jpeg
│   │   ├── 000003.jpeg
│   │   ├── 000004.jpeg
│   │   ├── 000005.jpeg
│   │   ├── 000006.jpeg
│   │   ├── 000007.jpeg
│   │   ├── 000008.jpeg
│   │   ├── 000009.jpeg
│   │   ├── 000010.jpeg
│   │   ├── 000011.jpeg
│   │   ├── 000012.jpeg
│   │   ├── 000013.jpeg
│   │   ├── 000014.jpeg
│   │   ├── 000015.jpeg
│   │   ├── 000016.jpeg
│   │   ├── 000017.jpeg
│   │   ├── 000018.jpeg
│   │   ├── 000019.jpeg
│   │   ├── 000020.jpeg
│   │   ├── 000021.jpeg
│   │   ├── 000022.jpeg
│   │   ├── 000023.jpeg
│   │   ├── 000024.jpeg
│   │   ├── 000025.jpeg
│   │   ├── 000026.jpeg
│   │   ├── 000027.jpeg
│   │   ├── 000028.jpeg
│   │   ├── 000029.jpeg
│   │   ├── 000030.jpeg
│   │   ├── 000031.jpeg
│   │   ├── 000032.jpeg
│   │   ├── 000033.jpeg
│   │   ├── 000034.jpeg
│   │   ├── 000035.jpeg
│   │   ├── 000036.jpeg
│   │   ├── 000037.jpeg
│   │   ├── 000038.jpeg
│   │   ├── 000039.jpeg
│   │   ├── 000040.jpeg
│   │   ├── 000041.jpeg
│   │   ├── 000042.jpeg
│   │   ├── 000043.jpeg
│   │   ├── 000044.jpeg
│   │   ├── 000045.jpeg
│   │   ├── 000046.jpeg
│   │   ├── 000047.jpeg
│   │   ├── 000048.jpeg
│   │   ├── 000049.jpeg
│   │   ├── 000050.jpeg
│   │   ├── 000051.jpeg
│   │   ├── 000052.jpeg
│   │   ├── 000053.jpeg
│   │   ├── 000054.jpeg
│   │   ├── 000055.jpeg
│   │   ├── 000056.jpeg
│   │   ├── 000057.jpeg
│   │   ├── 000058.jpeg
│   │   ├── 000059.jpeg
│   │   ├── 000060.jpeg
│   │   ├── 000061.jpeg
│   │   ├── 000062.jpeg
│   │   ├── 000063.jpeg
│   │   ├── 000064.jpeg
│   │   ├── 000065.jpeg
│   │   ├── 000066.jpeg
│   │   ├── 000067.jpeg
│   │   ├── 000068.jpeg
│   │   ├── 000069.jpeg
│   │   ├── 000070.jpeg
│   │   ├── 000071.jpeg
│   │   ├── 000072.jpeg
│   │   ├── 000073.jpeg
│   │   └── 000074.jpeg
│   └── label_2
│       ├── 000000.txt
│       ├── 000001.txt
│       ├── 000002.txt
│       ├── 000003.txt
│       ├── 000004.txt
│       ├── 000005.txt
│       ├── 000006.txt
│       ├── 000007.txt
│       ├── 000008.txt
│       ├── 000009.txt
│       ├── 000010.txt
│       ├── 000011.txt
│       ├── 000012.txt
│       ├── 000013.txt
│       ├── 000014.txt
│       ├── 000015.txt
│       ├── 000016.txt
│       ├── 000017.txt
│       ├── 000018.txt
│       ├── 000019.txt
│       ├── 000020.txt
│       ├── 000021.txt
│       ├── 000022.txt
│       ├── 000023.txt
│       ├── 000024.txt
│       ├── 000025.txt
│       ├── 000026.txt
│       ├── 000027.txt
│       ├── 000028.txt
│       ├── 000029.txt
│       ├── 000030.txt
│       ├── 000031.txt
│       ├── 000032.txt
│       ├── 000033.txt
│       ├── 000034.txt
│       ├── 000035.txt
│       ├── 000036.txt
│       ├── 000037.txt
│       ├── 000038.txt
│       ├── 000039.txt
│       ├── 000040.txt
│       ├── 000041.txt
│       ├── 000042.txt
│       ├── 000043.txt
│       ├── 000044.txt
│       ├── 000045.txt
│       ├── 000046.txt
│       ├── 000047.txt
│       ├── 000048.txt
│       ├── 000049.txt
│       ├── 000050.txt
│       ├── 000051.txt
│       ├── 000052.txt
│       ├── 000053.txt
│       ├── 000054.txt
│       ├── 000055.txt
│       ├── 000056.txt
│       ├── 000057.txt
│       ├── 000058.txt
│       ├── 000059.txt
│       ├── 000060.txt
│       ├── 000061.txt
│       ├── 000062.txt
│       ├── 000063.txt
│       ├── 000064.txt
│       ├── 000065.txt
│       ├── 000066.txt
│       ├── 000067.txt
│       ├── 000068.txt
│       ├── 000069.txt
│       ├── 000070.txt
│       ├── 000071.txt
│       ├── 000072.txt
│       ├── 000073.txt
│       └── 000074.txt
├── train.txt
└── val.txt

3 directories, 152 files

In [10]:
# Let's take a look at the dataset image
import mmcv
import matplotlib.pyplot as plt

img = mmcv.imread('kitti_tiny/training/image_2/000073.jpeg')
plt.figure(figsize=(15, 10))
plt.imshow(mmcv.bgr2rgb(img))
plt.show()


After downloading the data, we need to implement a function that converts the KITTI annotation format into the middle format. In this tutorial, we do the conversion in the load_annotations function of a newly implemented KittiTinyDataset.

Let's take a look at the annotation txt file.


In [11]:
# Check the label of a single image
!cat kitti_tiny/training/label_2/000000.txt


Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01

According to KITTI's documentation, the first column indicates the class of the object, and the 5th to 8th columns give the bounding box. We need to read the annotations of each image and convert them into the middle format that MMDetection accepts, shown below:

[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray> (n, 4),
            'labels': <np.ndarray> (n, ),
            'bboxes_ignore': <np.ndarray> (k, 4), (optional field)
            'labels_ignore': <np.ndarray> (k, ) (optional field)
        }
    },
    ...
]
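
A minimal sketch of this conversion for a single label line is shown below. It assumes the column layout described above (class name in the first column, box corners in the 5th to 8th columns) and reuses the label line printed earlier. The helper name parse_kitti_line is hypothetical, and the image width and height are placeholders rather than values read from the file.

```python
import numpy as np

CLASSES = ('Car', 'Pedestrian', 'Cyclist')


def parse_kitti_line(line):
    """Split one KITTI label line into (class_name, [x1, y1, x2, y2])."""
    fields = line.strip().split(' ')
    return fields[0], [float(v) for v in fields[4:8]]


line = ('Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 '
        '1.89 0.48 1.20 1.84 1.47 8.41 0.01')
name, bbox = parse_kitti_line(line)

# Middle-format entry for one image; width and height are placeholders here.
data_info = {
    'filename': '000000.jpeg',
    'width': 1242,
    'height': 375,
    'ann': {
        'bboxes': np.array([bbox], dtype=np.float32).reshape(-1, 4),
        'labels': np.array([CLASSES.index(name)], dtype=np.int64),
    },
}
print(name, bbox)
```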

In [12]:
import copy
import os.path as osp

import mmcv
import numpy as np

from mmdet.datasets.builder import DATASETS
from mmdet.datasets.custom import CustomDataset

@DATASETS.register_module()
class KittiTinyDataset(CustomDataset):

    CLASSES = ('Car', 'Pedestrian', 'Cyclist')

    def load_annotations(self, ann_file):
        cat2label = {k: i for i, k in enumerate(self.CLASSES)}
        # load image list from file
        image_list = mmcv.list_from_file(ann_file)
    
        data_infos = []
        # convert annotations to middle format
        for image_id in image_list:
            filename = f'{self.img_prefix}/{image_id}.jpeg'
            image = mmcv.imread(filename)
            height, width = image.shape[:2]
    
            data_info = dict(filename=f'{image_id}.jpeg', width=width, height=height)
    
            # load annotations
            label_prefix = self.img_prefix.replace('image_2', 'label_2')
            lines = mmcv.list_from_file(osp.join(label_prefix, f'{image_id}.txt'))
    
            content = [line.strip().split(' ') for line in lines]
            bbox_names = [x[0] for x in content]
            bboxes = [[float(info) for info in x[4:8]] for x in content]
    
            gt_bboxes = []
            gt_labels = []
            gt_bboxes_ignore = []
            gt_labels_ignore = []
    
            # filter 'DontCare'
            for bbox_name, bbox in zip(bbox_names, bboxes):
                if bbox_name in cat2label:
                    gt_labels.append(cat2label[bbox_name])
                    gt_bboxes.append(bbox)
                else:
                    gt_labels_ignore.append(-1)
                    gt_bboxes_ignore.append(bbox)

            # np.int64 replaces the deprecated np.long alias
            data_anno = dict(
                bboxes=np.array(gt_bboxes, dtype=np.float32).reshape(-1, 4),
                labels=np.array(gt_labels, dtype=np.int64),
                bboxes_ignore=np.array(gt_bboxes_ignore,
                                       dtype=np.float32).reshape(-1, 4),
                labels_ignore=np.array(gt_labels_ignore, dtype=np.int64))

            data_info.update(ann=data_anno)
            data_infos.append(data_info)

        return data_infos
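
Before training, it can help to check that each entry returned by load_annotations matches the middle format. The sketch below is one way to do that under the format shown earlier. check_data_info is a hypothetical helper, not an MMDetection API, and the example entry is made up.

```python
import numpy as np


def check_data_info(info, num_classes):
    """Raise AssertionError if a middle-format entry looks malformed."""
    assert info['width'] > 0 and info['height'] > 0
    ann = info['ann']
    bboxes, labels = ann['bboxes'], ann['labels']
    assert bboxes.ndim == 2 and bboxes.shape[1] == 4
    assert labels.shape == (bboxes.shape[0],)
    assert ((labels >= 0) & (labels < num_classes)).all()
    # Boxes are [x1, y1, x2, y2], so the second corner must not precede the first.
    assert (bboxes[:, 2] >= bboxes[:, 0]).all()
    assert (bboxes[:, 3] >= bboxes[:, 1]).all()
    return True


example = dict(
    filename='000000.jpeg', width=1242, height=375,
    ann=dict(
        bboxes=np.array([[712.40, 143.00, 810.73, 307.92]], dtype=np.float32),
        labels=np.array([1], dtype=np.int64)))
print(check_data_info(example, num_classes=3))
```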

Modify the config

In the next step, we need to modify the config for training. To speed up the process, we fine-tune a detector starting from a pre-trained checkpoint.


In [13]:
from mmcv import Config
cfg = Config.fromfile('./configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')

Given a config that trains a Faster R-CNN on the COCO dataset, we need to modify some values to use it for training a Faster R-CNN on the KITTI dataset.


In [14]:
from mmdet.apis import set_random_seed

# Modify dataset type and path
cfg.dataset_type = 'KittiTinyDataset'
cfg.data_root = 'kitti_tiny/'

cfg.data.test.type = 'KittiTinyDataset'
cfg.data.test.data_root = 'kitti_tiny/'
cfg.data.test.ann_file = 'train.txt'
cfg.data.test.img_prefix = 'training/image_2'

cfg.data.train.type = 'KittiTinyDataset'
cfg.data.train.data_root = 'kitti_tiny/'
cfg.data.train.ann_file = 'train.txt'
cfg.data.train.img_prefix = 'training/image_2'

cfg.data.val.type = 'KittiTinyDataset'
cfg.data.val.data_root = 'kitti_tiny/'
cfg.data.val.ann_file = 'val.txt'
cfg.data.val.img_prefix = 'training/image_2'

# Modify the number of classes in the box head
cfg.model.roi_head.bbox_head.num_classes = 3
# We can still use the pre-trained Mask RCNN model though we do not need to
# use the mask branch
cfg.load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'

# Set up working dir to save files and logs.
cfg.work_dir = './tutorial_exps'

# The original learning rate (LR) is set for 8-GPU training.
# We divide it by 8 since we only use one GPU.
cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None
cfg.log_config.interval = 10

# Change the evaluation metric since we use a customized dataset.
cfg.evaluation.metric = 'mAP'
# We can set the evaluation interval to reduce how often evaluation runs
cfg.evaluation.interval = 12
# We can set the checkpoint saving interval to reduce the storage cost
cfg.checkpoint_config.interval = 12

# Set the seed so that the results are more reproducible
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)


# Have a look at the final config used for training
print(f'Config:\n{cfg.pretty_text}')


Config:
model = dict(
    type='FasterRCNN',
    pretrained='open-mmlab://detectron2/resnet50_caffe',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='caffe'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=3,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='L1Loss', loss_weight=1.0))),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=-1,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_across_levels=False,
            nms_pre=2000,
            nms_post=1000,
            max_num=1000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)),
    test_cfg=dict(
        rpn=dict(
            nms_across_levels=False,
            nms_pre=1000,
            nms_post=1000,
            max_num=1000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.05,
            nms=dict(type='nms', iou_threshold=0.5),
            max_per_img=100)))
dataset_type = 'KittiTinyDataset'
data_root = 'kitti_tiny/'
img_norm_cfg = dict(
    mean=[103.53, 116.28, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
                   (1333, 768), (1333, 800)],
        multiscale_mode='value',
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[103.53, 116.28, 123.675],
        std=[1.0, 1.0, 1.0],
        to_rgb=False),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[103.53, 116.28, 123.675],
                std=[1.0, 1.0, 1.0],
                to_rgb=False),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='KittiTinyDataset',
        ann_file='train.txt',
        img_prefix='training/image_2',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='Resize',
                img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
                           (1333, 768), (1333, 800)],
                multiscale_mode='value',
                keep_ratio=True),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(
                type='Normalize',
                mean=[103.53, 116.28, 123.675],
                std=[1.0, 1.0, 1.0],
                to_rgb=False),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        data_root='kitti_tiny/'),
    val=dict(
        type='KittiTinyDataset',
        ann_file='val.txt',
        img_prefix='training/image_2',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        data_root='kitti_tiny/'),
    test=dict(
        type='KittiTinyDataset',
        ann_file='train.txt',
        img_prefix='training/image_2',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[103.53, 116.28, 123.675],
                        std=[1.0, 1.0, 1.0],
                        to_rgb=False),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        data_root='kitti_tiny/'))
evaluation = dict(interval=12, metric='mAP')
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup=None,
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=12)
log_config = dict(interval=10, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
resume_from = None
workflow = [('train', 1)]
work_dir = './tutorial_exps'
seed = 0
gpu_ids = range(0, 1)
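
The learning-rate change above follows the common linear scaling heuristic: scale the base LR in proportion to the total batch size. The sketch below assumes the base schedule was tuned for 8 GPUs with 2 samples per GPU (matching samples_per_gpu=2 in the printed config). The helper name scale_lr is hypothetical.

```python
def scale_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Scale a learning rate linearly with the total batch size."""
    return base_lr * (num_gpus * samples_per_gpu) / base_batch_size


# 8 GPUs x 2 samples per GPU is the assumed reference setup.
lr = scale_lr(base_lr=0.02, base_batch_size=8 * 2, num_gpus=1, samples_per_gpu=2)
print(lr)
```

For one GPU with 2 samples per GPU this gives the same value as 0.02 / 8, which is the lr shown in the printed optimizer config.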

Train a new detector

Finally, let's initialize the dataset and detector, then train a new detector!


In [15]:
from mmdet.datasets import build_dataset
from mmdet.models import build_detector
from mmdet.apis import train_detector


# Build dataset
datasets = [build_dataset(cfg.data.train)]

# Build the detector
model = build_detector(
    cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
# Add an attribute for visualization convenience
model.CLASSES = datasets[0].CLASSES

# Create work_dir
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)


/content/mmdetection/mmdet/datasets/custom.py:155: UserWarning: CustomDataset does not support filtering empty gt images.
  'CustomDataset does not support filtering empty gt images.')
2021-02-20 03:04:44,198 - mmdet - INFO - load model from: open-mmlab://detectron2/resnet50_caffe
Downloading: "https://download.openmmlab.com/pretrain/third_party/resnet50_msra-5891d200.pth" to /root/.cache/torch/checkpoints/resnet50_msra-5891d200.pth
2021-02-20 03:04:57,872 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: conv1.bias


2021-02-20 03:04:58,180 - mmdet - INFO - load checkpoint from checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth
2021-02-20 03:04:58,313 - mmdet - WARNING - The model and loaded state dict do not match exactly

size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.bbox_head.fc_reg.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([12, 1024]).
size mismatch for roi_head.bbox_head.fc_reg.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([12]).
unexpected key in source state_dict: roi_head.mask_head.convs.0.conv.weight, roi_head.mask_head.convs.0.conv.bias, roi_head.mask_head.convs.1.conv.weight, roi_head.mask_head.convs.1.conv.bias, roi_head.mask_head.convs.2.conv.weight, roi_head.mask_head.convs.2.conv.bias, roi_head.mask_head.convs.3.conv.weight, roi_head.mask_head.convs.3.conv.bias, roi_head.mask_head.upsample.weight, roi_head.mask_head.upsample.bias, roi_head.mask_head.conv_logits.weight, roi_head.mask_head.conv_logits.bias

2021-02-20 03:04:58,316 - mmdet - INFO - Start running, host: root@f0e5be20007b, work_dir: /content/mmdetection/tutorial_exps
2021-02-20 03:04:58,317 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
2021-02-20 03:05:03,791 - mmdet - INFO - Epoch [1][10/25]	lr: 2.500e-03, eta: 0:02:34, time: 0.531, data_time: 0.222, memory: 2133, loss_rpn_cls: 0.0286, loss_rpn_bbox: 0.0177, loss_cls: 0.5962, acc: 80.5273, loss_bbox: 0.3859, loss: 1.0284
2021-02-20 03:05:06,998 - mmdet - INFO - Epoch [1][20/25]	lr: 2.500e-03, eta: 0:01:59, time: 0.321, data_time: 0.021, memory: 2133, loss_rpn_cls: 0.0214, loss_rpn_bbox: 0.0122, loss_cls: 0.1736, acc: 94.0332, loss_bbox: 0.3017, loss: 0.5089
2021-02-20 03:05:13,968 - mmdet - INFO - Epoch [2][10/25]	lr: 2.500e-03, eta: 0:01:44, time: 0.530, data_time: 0.221, memory: 2133, loss_rpn_cls: 0.0183, loss_rpn_bbox: 0.0148, loss_cls: 0.1515, acc: 94.8535, loss_bbox: 0.2882, loss: 0.4728
2021-02-20 03:05:17,195 - mmdet - INFO - Epoch [2][20/25]	lr: 2.500e-03, eta: 0:01:36, time: 0.323, data_time: 0.021, memory: 2133, loss_rpn_cls: 0.0115, loss_rpn_bbox: 0.0129, loss_cls: 0.1297, acc: 95.3516, loss_bbox: 0.1971, loss: 0.3512
2021-02-20 03:05:24,202 - mmdet - INFO - Epoch [3][10/25]	lr: 2.500e-03, eta: 0:01:29, time: 0.533, data_time: 0.221, memory: 2133, loss_rpn_cls: 0.0075, loss_rpn_bbox: 0.0107, loss_cls: 0.0982, acc: 96.3672, loss_bbox: 0.1558, loss: 0.2722
2021-02-20 03:05:27,479 - mmdet - INFO - Epoch [3][20/25]	lr: 2.500e-03, eta: 0:01:24, time: 0.327, data_time: 0.021, memory: 2133, loss_rpn_cls: 0.0071, loss_rpn_bbox: 0.0145, loss_cls: 0.1456, acc: 94.5801, loss_bbox: 0.2525, loss: 0.4197
2021-02-20 03:05:34,565 - mmdet - INFO - Epoch [4][10/25]	lr: 2.500e-03, eta: 0:01:18, time: 0.538, data_time: 0.222, memory: 2133, loss_rpn_cls: 0.0082, loss_rpn_bbox: 0.0143, loss_cls: 0.1099, acc: 95.8789, loss_bbox: 0.2154, loss: 0.3477
2021-02-20 03:05:37,889 - mmdet - INFO - Epoch [4][20/25]	lr: 2.500e-03, eta: 0:01:14, time: 0.332, data_time: 0.021, memory: 2133, loss_rpn_cls: 0.0056, loss_rpn_bbox: 0.0124, loss_cls: 0.1216, acc: 95.4492, loss_bbox: 0.2074, loss: 0.3470
2021-02-20 03:05:45,023 - mmdet - INFO - Epoch [5][10/25]	lr: 2.500e-03, eta: 0:01:08, time: 0.544, data_time: 0.221, memory: 2133, loss_rpn_cls: 0.0034, loss_rpn_bbox: 0.0104, loss_cls: 0.1065, acc: 95.8496, loss_bbox: 0.2072, loss: 0.3275
2021-02-20 03:05:48,367 - mmdet - INFO - Epoch [5][20/25]	lr: 2.500e-03, eta: 0:01:04, time: 0.334, data_time: 0.021, memory: 2133, loss_rpn_cls: 0.0043, loss_rpn_bbox: 0.0109, loss_cls: 0.0918, acc: 96.7285, loss_bbox: 0.1882, loss: 0.2952
2021-02-20 03:05:55,575 - mmdet - INFO - Epoch [6][10/25]	lr: 2.500e-03, eta: 0:00:59, time: 0.548, data_time: 0.222, memory: 2133, loss_rpn_cls: 0.0028, loss_rpn_bbox: 0.0085, loss_cls: 0.0843, acc: 97.1582, loss_bbox: 0.1765, loss: 0.2721
2021-02-20 03:05:58,963 - mmdet - INFO - Epoch [6][20/25]	lr: 2.500e-03, eta: 0:00:55, time: 0.339, data_time: 0.022, memory: 2133, loss_rpn_cls: 0.0037, loss_rpn_bbox: 0.0105, loss_cls: 0.0833, acc: 96.8359, loss_bbox: 0.1700, loss: 0.2675
2021-02-20 03:06:06,144 - mmdet - INFO - Epoch [7][10/25]	lr: 2.500e-03, eta: 0:00:50, time: 0.545, data_time: 0.221, memory: 2133, loss_rpn_cls: 0.0030, loss_rpn_bbox: 0.0095, loss_cls: 0.0806, acc: 96.9238, loss_bbox: 0.1642, loss: 0.2573
2021-02-20 03:06:09,550 - mmdet - INFO - Epoch [7][20/25]	lr: 2.500e-03, eta: 0:00:46, time: 0.340, data_time: 0.022, memory: 2133, loss_rpn_cls: 0.0019, loss_rpn_bbox: 0.0115, loss_cls: 0.0867, acc: 96.6602, loss_bbox: 0.1727, loss: 0.2728
2021-02-20 03:06:16,846 - mmdet - INFO - Epoch [8][10/25]	lr: 2.500e-03, eta: 0:00:41, time: 0.553, data_time: 0.223, memory: 2133, loss_rpn_cls: 0.0021, loss_rpn_bbox: 0.0087, loss_cls: 0.0701, acc: 96.9141, loss_bbox: 0.1364, loss: 0.2174
2021-02-20 03:06:20,318 - mmdet - INFO - Epoch [8][20/25]	lr: 2.500e-03, eta: 0:00:37, time: 0.347, data_time: 0.022, memory: 2133, loss_rpn_cls: 0.0008, loss_rpn_bbox: 0.0083, loss_cls: 0.0689, acc: 97.3926, loss_bbox: 0.1634, loss: 0.2414
2021-02-20 03:06:27,654 - mmdet - INFO - Epoch [9][10/25]	lr: 2.500e-04, eta: 0:00:32, time: 0.555, data_time: 0.221, memory: 2133, loss_rpn_cls: 0.0034, loss_rpn_bbox: 0.0080, loss_cls: 0.0632, acc: 97.5488, loss_bbox: 0.1285, loss: 0.2031
2021-02-20 03:06:31,136 - mmdet - INFO - Epoch [9][20/25]	lr: 2.500e-04, eta: 0:00:28, time: 0.348, data_time: 0.022, memory: 2133, loss_rpn_cls: 0.0008, loss_rpn_bbox: 0.0065, loss_cls: 0.0539, acc: 97.9004, loss_bbox: 0.1013, loss: 0.1625
2021-02-20 03:06:38,476 - mmdet - INFO - Epoch [10][10/25]	lr: 2.500e-04, eta: 0:00:23, time: 0.554, data_time: 0.221, memory: 2133, loss_rpn_cls: 0.0026, loss_rpn_bbox: 0.0082, loss_cls: 0.0621, acc: 97.6172, loss_bbox: 0.1304, loss: 0.2033
2021-02-20 03:06:41,997 - mmdet - INFO - Epoch [10][20/25]	lr: 2.500e-04, eta: 0:00:19, time: 0.352, data_time: 0.022, memory: 2133, loss_rpn_cls: 0.0011, loss_rpn_bbox: 0.0059, loss_cls: 0.0596, acc: 97.8223, loss_bbox: 0.1199, loss: 0.1866
2021-02-20 03:06:49,368 - mmdet - INFO - Epoch [11][10/25]	lr: 2.500e-04, eta: 0:00:14, time: 0.557, data_time: 0.221, memory: 2133, loss_rpn_cls: 0.0036, loss_rpn_bbox: 0.0064, loss_cls: 0.0631, acc: 97.5000, loss_bbox: 0.1242, loss: 0.1973
2021-02-20 03:06:52,881 - mmdet - INFO - Epoch [11][20/25]	lr: 2.500e-04, eta: 0:00:10, time: 0.351, data_time: 0.022, memory: 2133, loss_rpn_cls: 0.0016, loss_rpn_bbox: 0.0072, loss_cls: 0.0570, acc: 97.9199, loss_bbox: 0.1263, loss: 0.1921
2021-02-20 03:07:00,207 - mmdet - INFO - Epoch [12][10/25]	lr: 2.500e-05, eta: 0:00:05, time: 0.554, data_time: 0.222, memory: 2134, loss_rpn_cls: 0.0009, loss_rpn_bbox: 0.0063, loss_cls: 0.0606, acc: 97.6953, loss_bbox: 0.1232, loss: 0.1910
2021-02-20 03:07:03,655 - mmdet - INFO - Epoch [12][20/25]	lr: 2.500e-05, eta: 0:00:01, time: 0.345, data_time: 0.022, memory: 2134, loss_rpn_cls: 0.0010, loss_rpn_bbox: 0.0056, loss_cls: 0.0486, acc: 98.1641, loss_bbox: 0.0882, loss: 0.1433
2021-02-20 03:07:05,260 - mmdet - INFO - Saving checkpoint at 12 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 25/25, 11.4 task/s, elapsed: 2s, ETA:     0s
---------------iou_thr: 0.5---------------
2021-02-20 03:07:08,400 - mmdet - INFO - 
+------------+-----+------+--------+-------+
| class      | gts | dets | recall | ap    |
+------------+-----+------+--------+-------+
| Car        | 62  | 131  | 0.968  | 0.879 |
| Pedestrian | 13  | 58   | 0.846  | 0.747 |
| Cyclist    | 7   | 67   | 0.429  | 0.037 |
+------------+-----+------+--------+-------+
| mAP        |     |      |        | 0.555 |
+------------+-----+------+--------+-------+
2021-02-20 03:07:08,403 - mmdet - INFO - Epoch(val) [12][25]	AP50: 0.5550, mAP: 0.5545

Understand the log

From the log, we can get a basic understanding of the training process and see how well the detector is trained.
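If you want to track the losses programmatically rather than by eye, the per-iteration lines can be parsed with a regular expression. The snippet below is only a minimal sketch: `parse_train_line` is a hypothetical helper, not part of MMDetection, and it assumes the line layout printed in the log above. Integer fields such as `memory` and the `eta` timestamp are deliberately skipped.

```python
import re

# One training line copied from the log above (the tab after "]" is preserved).
line = ("2021-02-20 03:07:03,655 - mmdet - INFO - Epoch [12][20/25]\t"
        "lr: 2.500e-05, eta: 0:00:01, time: 0.345, data_time: 0.022, "
        "memory: 2134, loss_rpn_cls: 0.0010, loss_rpn_bbox: 0.0056, "
        "loss_cls: 0.0486, acc: 98.1641, loss_bbox: 0.0882, loss: 0.1433")


def parse_train_line(text):
    """Return a dict of epoch, iteration and decimal-valued fields, or None."""
    # Training lines look like "Epoch [12][20/25]"; validation lines do not match.
    header = re.search(r'Epoch \[(\d+)\]\[(\d+)/(\d+)\]', text)
    if header is None:
        return None
    record = {'epoch': int(header.group(1)), 'iter': int(header.group(2))}
    # Collect "name: value" pairs whose value contains a decimal point.
    pattern = r'(\w+): ([-+]?\d+\.\d+(?:e[-+]?\d+)?)'
    for key, value in re.findall(pattern, text):
        record[key] = float(value)
    return record


record = parse_train_line(line)
print(record['epoch'], record['iter'], record['loss'])
```

Applying the same helper to every line of the log gives a loss curve, which makes the steady decrease over the 12 epochs easier to see.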

First, the ResNet-50 backbone pre-trained on ImageNet is loaded. This is common practice, since training from scratch is more costly. The log shows that all the weights of the ResNet-50 backbone are loaded except conv1.bias, which has been merged into conv.weights.

Second, since the dataset we are using is small, we load a pre-trained Mask R-CNN model and finetune it for detection. Because the detector we are actually using is Faster R-CNN, the weights of the mask branch, e.g. roi_head.mask_head, are reported as unexpected keys in the source state_dict and are not loaded. The original Mask R-CNN was trained on the COCO dataset, which contains 80 classes, but the KITTI Tiny dataset has only 3 classes. Therefore, the last FC layers of the pre-trained Mask R-CNN for classification and box regression (fc_cls and fc_reg) have different weight shapes and are not used.
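The shapes in the size mismatch messages follow directly from the number of classes. The sketch below reproduces them under two assumptions that are consistent with the log: the classification layer has one extra output for the background class, and the regression layer predicts four box offsets per foreground class. `bbox_head_out_features` is a hypothetical helper written for illustration, not an MMDetection function.

```python
def bbox_head_out_features(num_classes):
    """Return (fc_cls outputs, fc_reg outputs) for a class-specific box head."""
    fc_cls = num_classes + 1  # one extra logit for the background class (assumed)
    fc_reg = num_classes * 4  # four box offsets per foreground class (assumed)
    return fc_cls, fc_reg


print(bbox_head_out_features(80))  # COCO checkpoint: (81, 320)
print(bbox_head_out_features(3))   # KITTI Tiny model: (4, 12)
```

These values match the 81 vs. 4 and 320 vs. 12 dimensions reported in the log, which is why these layers are initialized from scratch while the rest of the detector is loaded from the checkpoint.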

Third, after training, the detector is evaluated by the default VOC-style evaluation. The results show that the detector achieves 55.5 mAP (at an IoU threshold of 0.5) on the val dataset, not bad!
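The mAP in the last row of the evaluation table is the mean of the per-class AP values. As a quick sanity check, the sketch below averages the three rounded APs copied from the table; because the table rounds each AP to three decimals, the result differs slightly from the 0.5545 printed by the logger.

```python
# Per-class AP values copied from the evaluation table above (rounded to 3 decimals).
per_class_ap = {'Car': 0.879, 'Pedestrian': 0.747, 'Cyclist': 0.037}

# mAP is the unweighted mean over classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(f'mAP ~ {mean_ap:.3f}')  # mAP ~ 0.554
```

Note how the unweighted mean lets the Cyclist class, with only 7 ground-truth boxes, pull the overall score down considerably.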

Test the trained detector

After finetuning the detector, let's visualize the prediction results!


In [16]:
img = mmcv.imread('kitti_tiny/training/image_2/000068.jpeg')

model.cfg = cfg
result = inference_detector(model, img)
show_result_pyplot(model, img, result)


/content/mmdetection/mmdet/datasets/utils.py:66: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
  'data pipeline in your config file.', UserWarning)
/content/mmdetection/mmdet/apis/inference.py:205: UserWarning: "block" will be deprecated in v2.9.0,Please use "wait_time"
  warnings.warn('"block" will be deprecated in v2.9.0,'
/content/mmdetection/mmdet/apis/inference.py:207: UserWarning: "fig_size" are deprecated and takes no effect.
  warnings.warn('"fig_size" are deprecated and takes no effect.')
/content/mmdetection/mmdet/core/visualization/image.py:75: UserWarning: "font_scale" will be deprecated in v2.9.0,Please use "font_size"
  warnings.warn('"font_scale" will be deprecated in v2.9.0,'
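Besides plotting, you may want to work with the detections directly. The sketch below assumes the MMDetection 2.x result layout for a box-only detector: one array per class, where each row is `[x1, y1, x2, y2, score]`. `filter_detections` is a hypothetical helper and `toy_result` is made-up data used only to make the example runnable on its own; the 0.3 threshold is likewise just an illustrative choice.

```python
def filter_detections(result, class_names, score_thr=0.3):
    """Keep detections whose confidence is at least score_thr.

    result: one sequence per class; each row is [x1, y1, x2, y2, score].
    """
    kept = []
    for name, class_dets in zip(class_names, result):
        for x1, y1, x2, y2, score in class_dets:
            if score >= score_thr:
                box = (float(x1), float(y1), float(x2), float(y2))
                kept.append((name, float(score), box))
    return kept


# Made-up detections in the assumed layout, NOT real model output.
toy_result = [
    [[10.0, 20.0, 110.0, 90.0, 0.98], [5.0, 5.0, 30.0, 40.0, 0.12]],  # Car
    [[200.0, 50.0, 230.0, 120.0, 0.75]],                              # Pedestrian
    [],                                                               # Cyclist
]

for name, score, box in filter_detections(toy_result, ('Car', 'Pedestrian', 'Cyclist')):
    print(name, score, box)
```

With the trained model, the equivalent call would be `filter_detections(result, model.CLASSES)`, provided `result` follows the layout assumed above.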
