1. Create the environment
- Create a virtual environment:
conda create --name MEGA -y python=3.7
source activate MEGA
- Install base packages:
conda install ipython pip
pip install ninja yacs cython matplotlib tqdm opencv-python scipy
export INSTALL_DIR=$PWD
- Install PyTorch
The original author installs it like this:
conda install pytorch=1.3.0 torchvision cudatoolkit=10.0 -c pytorch
In practice, CUDA 11.0 + PyTorch 1.7 also compiles and runs, so we replace that step with:
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
- Next, install the COCO API and cityscapesScripts used by the author (for the COCO and CityPersons datasets):
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
cd $INSTALL_DIR
git clone https://github.com/mcordts/cityscapesScripts.git
cd cityscapesScripts/
python setup.py build_ext install
- Install apex (optional; skipping it is recommended, since keeping it caused runtime errors):
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext
With CUDA 11.0 + PyTorch 1.7, this step fails with:
deform_conv_cuda.cu(155): error: identifier "AT_CHECK" is undefined
Fix: add the following at the top of mega_core/csrc/cuda/deform_conv_cuda.cu and mega_core/csrc/cuda/deform_pool_cuda.cu:
#ifndef AT_CHECK
#define AT_CHECK TORCH_CHECK
#endif
In fact, the original author does not use apex for mixed-precision training, so this step can be skipped entirely. If you skip it, a few places in the code must be changed:
First, in mega_core/engine/trainer.py, comment out the apex import at the top and change lines 108-109 to:
losses.backward()
Also comment out lines 33-36 of tools/train_net.py:
# try:
#     from apex import amp
# except ImportError:
#     raise ImportError('Use APEX for multi-precision via apex.amp')
Comment out line 50 as well:
#model, optimizer = amp.initialize(model, optimizer, opt_level=amp_opt_level)
In mega_core/layers/nms.py, comment out line 5 and change line 8 to:
nms = _C.nms
In mega_core/layers/roi_align.py, comment out lines 10 and 57.
In mega_core/layers/roi_pool.py, comment out lines 10 and 56.
With those changes, the code runs without apex.
2. Download and build mega.pytorch
# install PyTorch Detection
cd $INSTALL_DIR
git clone https://github.com/Scalsol/mega.pytorch.git
cd mega.pytorch
# the following will install the lib with
# symbolic links, so that you can modify
# the files if you want and won't need to
# re-build it
python setup.py build develop
pip install 'pillow<7.0.0'
3. Build your own dataset
Follow the CUSTOMIZE.md file provided by the author.
3.1 Dataset layout
参考:https://github.com/Scalsol/mega.pytorch/blob/master/CUSTOMIZE.md
[Notes]
1. Frame file names are 6-digit numbers starting from 0 (required unless you implement your own data loader);
2. The xml files under annotation correspond one-to-one with the image files under train and val.
datasets
├── vid_custom
| |── train
| | |── video_snippet_1
| | | |── 000000.JPEG
| | | |── 000001.JPEG
| | | |── 000002.JPEG
| | | ...
| | |── video_snippet_2
| | | |── 000000.JPEG
| | | |── 000001.JPEG
| | | |── 000002.JPEG
| | | ...
| | ...
| |── val
| | |── video_snippet_1
| | | |── 000000.JPEG
| | | |── 000001.JPEG
| | | |── 000002.JPEG
| | | ...
| | |── video_snippet_2
| | | |── 000000.JPEG
| | | |── 000001.JPEG
| | | |── 000002.JPEG
| | | ...
| | ...
| |── annotation
| | |── train
| | | |── video_snippet_1
| | | | |── 000000.xml
| | | | |── 000001.xml
| | | | |── 000002.xml
| | | | ...
| | | |── video_snippet_2
| | | | |── 000000.xml
| | | | |── 000001.xml
| | | | |── 000002.xml
| | | | ...
| | ...
| | |── val
| | | |── video_snippet_1
| | | | |── 000000.xml
| | | | |── 000001.xml
| | | | |── 000002.xml
| | | | ...
| | | |── video_snippet_2
| | | | |── 000000.xml
| | | | |── 000001.xml
| | | | |── 000002.xml
| | | | ...
| | ...
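The layout above can be sanity-checked with a short script before training. This is only a sketch under my own assumptions: the directory names are taken from the tree above, and `check_split` is a hypothetical helper, not part of MEGA.

```python
import re
from pathlib import Path

def check_split(dataset_root, split):
    """Verify 6-digit frame names and matching annotation xmls for one split."""
    root = Path(dataset_root)
    name_ok = re.compile(r"^\d{6}\.JPEG$")
    problems = []
    for img in sorted((root / split).rglob("*.JPEG")):
        if not name_ok.match(img.name):
            problems.append(f"bad frame name: {img}")
        # expected annotation: annotation/<split>/<snippet>/<frame>.xml
        xml = root / "annotation" / split / img.parent.name / (img.stem + ".xml")
        if not xml.exists():
            problems.append(f"missing annotation: {xml}")
    return problems
```

Run it as `check_split("datasets/vid_custom", "train")` (and again for "val") and fix anything it reports before moving on.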
3.2 Prepare your own txt index files
For reference, see the files provided under datasets\ILSVRC2015\ImageSets in the original MEGA code.
Format: each line has 4 columns, in order: video folder, an auxiliary index column (it can be ignored), frame number, video length;
- Training set
VID_train.txt, corresponding to the vid_custom/train folder:
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 10 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 30 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 50 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 70 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 90 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 110 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 130 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 150 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 170 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 190 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 210 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 230 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 250 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 270 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00000000 1 290 300
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00001000 1 1 48
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00001000 1 4 48
train/ILSVRC2015_VID_train_0000/ILSVRC2015_train_00001000 1 8 48
···
- Validation set
VID_val.txt, corresponding to the vid_custom/val folder (here every frame is listed, and the second column is a running frame index):
val/ILSVRC2015_val_00000000 1 0 464
val/ILSVRC2015_val_00000000 2 1 464
val/ILSVRC2015_val_00000000 3 2 464
val/ILSVRC2015_val_00000000 4 3 464
val/ILSVRC2015_val_00000000 5 4 464
val/ILSVRC2015_val_00000000 6 5 464
val/ILSVRC2015_val_00000000 7 6 464
val/ILSVRC2015_val_00000000 8 7 464
val/ILSVRC2015_val_00000000 9 8 464
val/ILSVRC2015_val_00000000 10 9 464
val/ILSVRC2015_val_00000000 11 10 464
val/ILSVRC2015_val_00000000 12 11 464
val/ILSVRC2015_val_00000000 13 12 464
val/ILSVRC2015_val_00000000 14 13 464
val/ILSVRC2015_val_00000000 15 14 464
val/ILSVRC2015_val_00000000 16 15 464
···
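The two index files above can be generated from the directory layout with a short script. This is a sketch under my own assumptions: the function names are mine, the train stride of 20 merely mimics the spacing in the example listing, and the paths should be adapted to your setup.

```python
from pathlib import Path

def train_index_lines(train_root, stride=20):
    """One line per sampled training frame: 'train/<folder> 1 <frame_no> <length>'."""
    lines = []
    for snippet in sorted(p for p in Path(train_root).iterdir() if p.is_dir()):
        length = len(list(snippet.glob("*.JPEG")))
        for frame_no in range(0, length, stride):
            lines.append(f"train/{snippet.name} 1 {frame_no} {length}")
    return lines

def val_index_lines(val_root):
    """Every frame is listed; the second column is a running frame index."""
    lines, running = [], 1
    for snippet in sorted(p for p in Path(val_root).iterdir() if p.is_dir()):
        length = len(list(snippet.glob("*.JPEG")))
        for frame_no in range(length):
            lines.append(f"val/{snippet.name} {running} {frame_no} {length}")
            running += 1
    return lines

# Example: write VID_train.txt next to the splits.
# Path("datasets/vid_custom/VID_train.txt").write_text(
#     "\n".join(train_index_lines("datasets/vid_custom/train")) + "\n")
```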
4. Parameter changes
- In mega_core/data/datasets/vid.py, modify classes and classes_map inside VIDDataset:
# Original example:
# classes = ['__background__',  # always index 0
#            'car']
# classes_map = ['__background__',  # always index 0
#                'n02958343']
# For a dataset you labeled yourself, fill both lists with the same names:
classes = ['__background__', # always index 0
'BridgeVehicle', 'Person', 'FollowMe', 'Plane', 'LuggageTruck', 'RefuelingTruck', 'FoodTruck', 'Tractor']
classes_map = ['__background__', # always index 0
'BridgeVehicle', 'Person', 'FollowMe', 'Plane', 'LuggageTruck', 'RefuelingTruck', 'FoodTruck', 'Tractor']
- In mega_core/config/paths_catalog.py, add the following entries at the end of DatasetCatalog.DATASETS:
"vid_custom_train": {
    "img_dir": "vid_custom/train",
    "anno_path": "vid_custom/annotation",
    "img_index": "vid_custom/VID_train.txt"
},
"vid_custom_val": {
    "img_dir": "vid_custom/val",
    "anno_path": "vid_custom/annotation",
    "img_index": "vid_custom/VID_val.txt"
},
In the same file, extend the if statement that dispatches on the dataset name so that it also matches vid:
if ("DET" in name) or ("VID" in name) or ("vid" in name):
- Modify configs/BASE_RCNN_1gpu.yaml (choose the file matching the number of GPUs you train on):
NUM_CLASSES: 9  # number of object classes + background
TRAIN: ("vid_custom_train",)  # remember the trailing comma
TEST: ("vid_custom_val",)  # remember the trailing comma
- Modify configs/MEGA/vid_R_101_C4_MEGA_1x.yaml:
DATASETS:
  TRAIN: ("vid_custom_train",)  # remember the trailing comma
  TEST: ("vid_custom_val",)  # remember the trailing comma
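A common pitfall is NUM_CLASSES drifting out of sync with the class lists in vid.py. A minimal consistency check (the list below repeats the example classes from this section; adapt it to your own):

```python
# NUM_CLASSES in the yaml must equal len(classes) in vid.py,
# i.e. the number of object classes + 1 for '__background__'.
classes = ['__background__',  # always index 0
           'BridgeVehicle', 'Person', 'FollowMe', 'Plane',
           'LuggageTruck', 'RefuelingTruck', 'FoodTruck', 'Tractor']
NUM_CLASSES = 9  # value set in configs/BASE_RCNN_1gpu.yaml
assert NUM_CLASSES == len(classes), "update NUM_CLASSES or the class lists"
print(len(classes))  # 9
```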
5. Training and testing
5.1 Training
python -m torch.distributed.launch \
--nproc_per_node=1 \
tools/train_net.py \
--master_port=$((RANDOM + 10000)) \
--config-file configs/BASE_RCNN_1gpu.yaml \
OUTPUT_DIR training_dir/BASE_RCNN
python -m torch.distributed.launch \
--nproc_per_node=1 \
tools/train_net.py \
--master_port=$((RANDOM + 10000)) \
--config-file configs/DFF/vid_R_50_C4_DFF_1x.yaml \
OUTPUT_DIR training_dir/vid_R_50_C4_DFF_1x
5.2 Testing
python -m torch.distributed.launch \
--nproc_per_node 1 \
tools/test_net.py \
--config-file configs/BASE_RCNN_1gpu.yaml \
MODEL.WEIGHT training_dir/BASE_RCNN/model_0020000.pth
python tools/test_prediction.py \
--config-file configs/BASE_RCNN_1gpu.yaml \
--prediction ./