[Video Object Detection]: Training MEGA | DFF | FGFA on Your Own Dataset

jupiter
2023-03-28

1. Create the environment

  • Create a virtual environment
conda create --name MEGA -y python=3.7
source activate MEGA
  • Install the base packages
conda install ipython pip
pip install ninja yacs cython matplotlib tqdm opencv-python scipy
export INSTALL_DIR=$PWD
  • Install PyTorch

    The original authors install PyTorch like this:

    conda install pytorch=1.3.0 torchvision cudatoolkit=10.0 -c pytorch

In practice, CUDA 11.0 + PyTorch 1.7 also compiles and runs, so at this step we replace it with:

conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
  • Next, install the tooling for the COCO and Cityscapes datasets that the authors use:
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install

cd $INSTALL_DIR
git clone https://github.com/mcordts/cityscapesScripts.git
cd cityscapesScripts/
python setup.py build_ext install
  • Install apex (optional; I recommend skipping it, because I got runtime errors when I kept it):
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext

With CUDA 11.0 + PyTorch 1.7, compilation reports this error:

deform_conv_cuda.cu(155): error: identifier "AT_CHECK" is undefined

Fix:
Add the following code at the top of mega_core/csrc/cuda/deform_conv_cuda.cu and mega_core/csrc/cuda/deform_pool_cuda.cu:

#ifndef AT_CHECK
#define AT_CHECK TORCH_CHECK 
#endif
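
If you would rather not edit the two files by hand, the guard can be prepended with a few lines of Python. This is only a sketch: the helper name add_at_check_guard is my own, and it assumes you run it from the root of the mega.pytorch checkout. It skips files that already contain the define, so running it twice is harmless.

```python
from pathlib import Path

# The three guard lines from the fix above.
GUARD = "#ifndef AT_CHECK\n#define AT_CHECK TORCH_CHECK\n#endif\n"


def add_at_check_guard(path):
    """Prepend the AT_CHECK guard to a source file unless it is already there.

    Returns True if the file was modified, False if it was already patched.
    """
    source_file = Path(path)
    text = source_file.read_text(encoding="utf-8")
    if "#define AT_CHECK TORCH_CHECK" in text:
        return False  # already patched, keep the call idempotent
    source_file.write_text(GUARD + text, encoding="utf-8")
    return True


if __name__ == "__main__":
    # File names are the two named in the fix above.
    for name in ("deform_conv_cuda.cu", "deform_pool_cuda.cu"):
        target = Path("mega_core/csrc/cuda") / name
        if target.exists():
            status = "patched" if add_at_check_guard(target) else "already patched"
            print(target, status)
        else:
            print(target, "not found, run this from the mega.pytorch root")
```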

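In fact, the original authors do not use apex for mixed-precision training, so this step can be skipped. If you skip it, a few places in the code need to be changed.
First, in mega_core/engine/trainer.py, comment out the apex import at the top of the file.

Then change lines 108-109 to: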

losses.backward()

In tools/train_net.py, comment out lines 33-36:

# try:
#     from apex import amp
# except ImportError:
#     raise ImportError('Use APEX for multi-precision via apex.amp')

Also comment out line 50:

#model, optimizer = amp.initialize(model, optimizer, opt_level=amp_opt_level)

In mega_core/layers/nms.py, comment out line 5.

Change line 8 to:

nms = _C.nms

In mega_core/layers/roi_align.py, comment out lines 10 and 57.

In mega_core/layers/roi_pool.py, comment out lines 10 and 56.

That should be enough.
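
Most of the edits above just comment out specific lines, so they can be scripted. Below is a minimal sketch of such a helper. comment_out_lines is a name I made up, and the line numbers are the ones quoted in this post, which may not match your checkout. Print the lines first and confirm they are the apex-related ones before changing anything.

```python
from pathlib import Path


def comment_out_lines(path, line_numbers, prefix="# "):
    """Comment out the given 1-indexed lines of a Python file in place.

    Blank lines and lines that already start with '#' are left untouched.
    Returns the line numbers that were actually changed.
    """
    source_file = Path(path)
    lines = source_file.read_text(encoding="utf-8").splitlines(keepends=True)
    changed = []
    for number in sorted(set(line_numbers)):
        if not 1 <= number <= len(lines):
            raise IndexError(f"{path} has no line {number}")
        line = lines[number - 1]
        stripped = line.lstrip()
        if not stripped or stripped.startswith("#"):
            continue
        # Keep the original indentation in front of the comment marker.
        indent = line[: len(line) - len(stripped)]
        lines[number - 1] = indent + prefix + stripped
        changed.append(number)
    source_file.write_text("".join(lines), encoding="utf-8")
    return changed
```

For example, comment_out_lines("tools/train_net.py", range(33, 37)) would cover lines 33-36. The two "change line N to" edits (losses.backward() and nms = _C.nms) still have to be made by hand.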

2. Download and set up mega.pytorch

# install PyTorch Detection
cd $INSTALL_DIR
git clone https://github.com/Scalsol/mega.pytorch.git
cd mega.pytorch

# the following will install the lib with
# symbolic links, so that you can modify
# the files if you want and won't need to
# re-build it
python setup.py build develop
pip install 'pillow<7.0.0'
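
After the build, it is worth checking that the environment is what you expect. The sketch below is my own addition, not part of the upstream instructions. It reports the versions of torch, torchvision and mega_core (None if a module cannot be imported) plus the CUDA version PyTorch was built against. The function name describe_environment is hypothetical.

```python
import importlib


def describe_environment(modules=("torch", "torchvision", "mega_core")):
    """Report the version of each module, or None if it cannot be imported."""
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a broken compiled extension
            report[name] = None
            continue
        report[name] = getattr(module, "__version__", "unknown")
    if report.get("torch") is not None:
        import torch

        report["cuda_build"] = torch.version.cuda  # CUDA version PyTorch was built with
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```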

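3. Build your own dataset

See the CUSTOMIZE.md file provided by the authors.

3.1 Dataset format

Reference: https://github.com/Scalsol/mega.pytorch/blob/master/CUSTOMIZE.md

[Notes]

1. Image file names are 6-digit numbers starting from 0 (this is required unless you want to implement your own data loader).

2. The XML files under annotation correspond one-to-one with the files in train and val.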

datasets
├── vid_custom
|   |── train
|   |   |── video_snippet_1
|   |   |   |── 000000.JPEG
|   |   |   |── 000001.JPEG
|   |   |   |── 000002.JPEG
|   |   |   ...
|   |   |── video_snippet_2
|   |   |   |── 000000.JPEG
|   |   |   |── 000001.JPEG
|   |   |   |── 000002.JPEG
|   |   |   ...
|   |   ...
|   |── val
|   |   |── video_snippet_1
|   |   |   |── 000000.JPEG
|   |   |   |── 000001.JPEG
|   |   |   |── 000002.JPEG
|   |   |   ...
|   |   |── video_snippet_2
|   |   |   |── 000000.JPEG
|   |   |   |── 000001.JPEG
|   |   |   |── 000002.JPEG
|   |   |   ...
|   |   ...
|   |── annotation
|   |   |── train
|   |   |   |── video_snippet_1
|   |   |   |   |── 000000.xml
|   |   |   |   |── 000001.xml
|   |   |   |   |── 000002.xml
|   |   |   |   ...
|   |   |   |── video_snippet_2
|   |   |   |   |── 000000.xml
|   |   |   |   |── 000001.xml
|   |   |   |   |── 000002.xml
|   |   |   |   ...
|   |   ...
|   |   |── val
|   |   |   |── video_snippet_1
|   |   |   |   |── 000000.xml
|   |   |   |   |── 000001.xml
|   |   |   |   |── 000002.xml
|   |   |   |   ...
|   |   |   |── video_snippet_2
|   |   |   |   |── 000000.xml
|   |   |   |   |── 000001.xml
|   |   |   |   |── 000002.xml
|   |   |   |   ...
|   |   ...
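
The two notes above are easy to get wrong on a large dataset, so here is a minimal checker for the layout shown in the tree. It is a sketch under my own assumptions: the function name check_split is hypothetical, images use the .JPEG extension as in the tree, and frames in each video folder are expected to run 000000, 000001, ... without gaps.

```python
from pathlib import Path


def check_split(root, split):
    """Return a list of problems found in one split ('train' or 'val').

    root is the dataset folder (vid_custom in the tree above).
    """
    root = Path(root)
    problems = []
    video_dirs = sorted(p for p in (root / split).iterdir() if p.is_dir())
    for video_dir in video_dirs:
        stems = sorted(f.stem for f in video_dir.glob("*.JPEG"))
        # Note 1: six-digit names counting up from 0.
        expected = [f"{i:06d}" for i in range(len(stems))]
        if stems != expected:
            problems.append(f"{video_dir}: frame names are not consecutive from 000000")
        # Note 2: one XML file per frame, and no XML without a frame.
        anno_dir = root / "annotation" / split / video_dir.name
        xml_stems = {f.stem for f in anno_dir.glob("*.xml")} if anno_dir.is_dir() else set()
        for stem in stems:
            if stem not in xml_stems:
                problems.append(f"{anno_dir / (stem + '.xml')}: no annotation for this frame")
        for stem in sorted(xml_stems - set(stems)):
            problems.append(f"{anno_dir / (stem + '.xml')}: annotation without a frame")
    return problems
```

Calling check_split("datasets/vid_custom", "train") and check_split("datasets/vid_custom", "val") should both return an empty list before you move on.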

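3.2 Prepare your own txt files

For details, see the files provided under datasets/ILSVRC2015/ImageSets in the original MEGA code.

Format: each line has 4 columns, which are, in order: video folder, no meaning (just ignore it), frame number, video length.
  • Training set: VID_train.txt, corresponding to the vid_custom/train folder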
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 10 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 30 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 50 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 70 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 90 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 110 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 130 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 150 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 170 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 190 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 210 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 230 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 250 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 270 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 290 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00001000 1 1 48
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00001000 1 4 48
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00001000 1 8 48
···
  • Validation set: VID_val.txt, corresponding to the vid_custom/val folder
val/ILSVRC2015_val_00000000 1 0 464
val/ILSVRC2015_val_00000000 2 1 464
val/ILSVRC2015_val_00000000 3 2 464
val/ILSVRC2015_val_00000000 4 3 464
val/ILSVRC2015_val_00000000 5 4 464
val/ILSVRC2015_val_00000000 6 5 464
val/ILSVRC2015_val_00000000 7 6 464
val/ILSVRC2015_val_00000000 8 7 464
val/ILSVRC2015_val_00000000 9 8 464
val/ILSVRC2015_val_00000000 10 9 464
val/ILSVRC2015_val_00000000 11 10 464
val/ILSVRC2015_val_00000000 12 11 464
val/ILSVRC2015_val_00000000 13 12 464
val/ILSVRC2015_val_00000000 14 13 464
val/ILSVRC2015_val_00000000 15 14 464
val/ILSVRC2015_val_00000000 16 15 464
···
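
Writing these files by hand is tedious, so here is a rough generator for the 4-column format. Treat it as a sketch: build_index, prefix and stride are names I made up. It assumes frames are numbered from 0 with the .JPEG extension, and it fills the ignored second column with a running counter as in the validation example. What the first column must look like depends on how your img_dir and anno_path are set, so adjust prefix to match.

```python
from pathlib import Path


def build_index(split_dir, prefix="", stride=1):
    """Build index lines for every video folder under split_dir.

    Each line is '<folder> <counter> <frame number> <video length>'.
    stride > 1 keeps only every stride-th frame, e.g. to subsample training data.
    """
    lines = []
    counter = 0
    video_dirs = sorted(p for p in Path(split_dir).iterdir() if p.is_dir())
    for video_dir in video_dirs:
        length = len(list(video_dir.glob("*.JPEG")))
        for frame in range(0, length, stride):
            counter += 1
            lines.append(f"{prefix}{video_dir.name} {counter} {frame} {length}")
    return lines


if __name__ == "__main__":
    val_dir = Path("datasets/vid_custom/val")
    if val_dir.is_dir():
        index_path = Path("datasets/vid_custom/VID_val.txt")
        index_path.write_text("\n".join(build_index(val_dir)) + "\n", encoding="utf-8")
```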

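4. Modify the parameters

  • In mega_core/data/datasets/vid.py, modify classes and classes_map inside VIDDataset:
# classes = ['__background__',  # always index 0
#            'car']
# classes_map = ['__background__',  # always index 0
#                'n02958343']
# For a self-labeled dataset, just fill both lists with the same names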
classes = ['__background__',  # always index 0
                'BridgeVehicle', 'Person', 'FollowMe', 'Plane', 'LuggageTruck', 'RefuelingTruck', 'FoodTruck', 'Tractor']
classes_map = ['__background__',  # always index 0
                'BridgeVehicle', 'Person', 'FollowMe', 'Plane', 'LuggageTruck', 'RefuelingTruck', 'FoodTruck', 'Tractor']
  • mega_core/config/paths_catalog.py

    • Modify DatasetCatalog.DATASETS by appending the following entries at the end of the variable

      "vid_custom_train":{
          "img_dir":"vid_custom/train",
          "anno_path":"vid_custom/annotation",
          "img_index":"vid_custom/VID_train.txt"
      },
      "vid_custom_val":{
          "img_dir":"vid_custom/val",
          "anno_path":"vid_custom/annotation",
          "img_index":"vid_custom/VID_val.txt"
      }
    • Modify the if statement inside the function that looks up a dataset by name, adding the vid condition

      if ("DET" in name) or ("VID" in name) or ("vid" in name):
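  • Modify configs/BASE_RCNN_1gpu.yaml (which file to edit depends on how many GPUs you train with)
NUM_CLASSES: 9  # number of object classes + background
TRAIN: ("vid_custom_train",)  # remember the trailing ","
TEST: ("vid_custom_val",)  # remember the trailing ","
  • Modify configs/MEGA/vid_R_101_C4_MEGA_1x.yaml
DATASETS:
  TRAIN: ("vid_custom_train",)  # remember the trailing ","
  TEST: ("vid_custom_val",)  # remember the trailing ","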
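
Two of the settings above are easy to get wrong, so here is a small standalone check. It does not import mega_core. It only replays the class list and the name condition from this section. The "original" condition without the lowercase test is inferred from the modified line, so take it as an assumption.

```python
classes = ['__background__',  # always index 0
           'BridgeVehicle', 'Person', 'FollowMe', 'Plane',
           'LuggageTruck', 'RefuelingTruck', 'FoodTruck', 'Tractor']

# NUM_CLASSES in the yaml must count the background entry as well.
num_classes = len(classes)
print("NUM_CLASSES should be", num_classes)


def matches_original(name):
    """Condition as it presumably looked before the edit (assumption)."""
    return ("DET" in name) or ("VID" in name)


def matches_modified(name):
    """Condition after adding the lowercase 'vid' test."""
    return ("DET" in name) or ("VID" in name) or ("vid" in name)


# The substring test is case-sensitive, so the lowercase dataset names
# only match once the extra condition is added.
for name in ("vid_custom_train", "vid_custom_val"):
    print(name, matches_original(name), matches_modified(name))
```

The trailing comma in ("vid_custom_train",) matters for the same kind of reason: without it, Python reads the parentheses as plain grouping around a string rather than as a one-element tuple.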

5. Training and testing

5.1 Start training

python -m torch.distributed.launch \
    --nproc_per_node=1 \
    tools/train_net.py \
    --master_port=$((RANDOM + 10000)) \
    --config-file configs/BASE_RCNN_1gpu.yaml \
    OUTPUT_DIR training_dir/BASE_RCNN
python -m torch.distributed.launch \
    --nproc_per_node=1 \
    tools/train_net.py \
    --master_port=$((RANDOM + 10000)) \
    --config-file configs/DFF/vid_R_50_C4_DFF_1x.yaml \
    OUTPUT_DIR training_dir/vid_R_50_C4_DFF_1x

5.2 Start testing

python -m torch.distributed.launch \
    --nproc_per_node 1 \
    tools/test_net.py \
    --config-file configs/BASE_RCNN_1gpu.yaml \
    MODEL.WEIGHT training_dir/BASE_RCNN/model_0020000.pth
python tools/test_prediction.py \
    --config-file configs/BASE_RCNN_1gpu.yaml \
    --prediction ./
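
The test command above hard-codes model_0020000.pth. If your run stopped at a different iteration, a small helper can pick the newest checkpoint instead. This is a sketch: latest_checkpoint is a name I made up, and it assumes checkpoints are saved as model_<iteration>.pth in the output directory, as in the command above.

```python
import re
from pathlib import Path

CHECKPOINT_NAME = re.compile(r"^model_(\d+)\.pth$")


def latest_checkpoint(output_dir):
    """Return the model_<iteration>.pth path with the largest iteration, or None."""
    best = None
    for path in Path(output_dir).glob("model_*.pth"):
        match = CHECKPOINT_NAME.match(path.name)
        if match is None:
            continue  # skip names without an iteration number
        iteration = int(match.group(1))
        if best is None or iteration > best[0]:
            best = (iteration, path)
    return None if best is None else best[1]


if __name__ == "__main__":
    print(latest_checkpoint("training_dir/BASE_RCNN"))
```

The printed path can then be passed as MODEL.WEIGHT to tools/test_net.py.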
