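1. Datasets
1.1 Horizontal-Text Datasets
ICDAR 2003 (IC03):
- Introduction: It contains 509 images in total, 258 for training and 251 for testing. Specifically, it contains 1110 text instances in the training set and 1156 in the testing set. It provides word-level annotations. IC03 only considers English text instances.
- Link: IC03-download
ICDAR 2011 (IC11):
- Introduction: IC11 is an English dataset for text detection. It contains 484 images, 229 for training and 255 for testing. There are 1564 text instances in this dataset. It provides both word-level and character-level annotations.
- Link: IC11-download
ICDAR 2013 (IC13):
- Introduction: IC13 is almost the same as IC11. It contains 462 images in total, 229 for training and 233 for testing. Specifically, it contains 849 text instances in the training set and 1095 in the testing set.
- Link: IC13-download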
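1.2 Arbitrary-Quadrilateral-Text Datasets
USTB-SV1K:
- Introduction: USTB-SV1K is an English dataset. It contains 1000 street images from Google Street View with 2955 text instances in total. It only provides word-level annotations.
- Link: USTB-SV1K-download
SVT:
- Introduction: It contains 350 images with 725 English text instances in total. SVT has both character-level and word-level annotations. The images of SVT are harvested from Google Street View and have low resolution.
- Link: SVT-download
SVT-P:
- Introduction: It contains 639 cropped word images for testing. The images were selected from side-view snapshots in Google Street View, so most of them are heavily distorted by the non-frontal view angle. It is an improved dataset of SVT.
- Link: SVT-P-download (Password : vnis)
ICDAR 2015 (IC15):
- Introduction: It contains 1500 images in total, 1000 for training and 500 for testing. Specifically, it contains 17548 text instances. It provides word-level annotations. IC15 is the first incidental scene text dataset and it only considers English words.
- Link: IC15-download
COCO-Text:
- Introduction: It contains 63686 images in total, 43686 for training, 10000 for validation and 10000 for testing. Specifically, it contains 145859 cropped word images for testing, including handwritten and printed, clear and blurry, English and non-English.
- Link: COCO-Text-download
MSRA-TD500:
- Introduction: It contains 500 images in total. It provides text-line-level annotations rather than word-level ones, and uses polygon boxes rather than axis-aligned rectangles for text region annotation. It contains both English and Chinese text instances.
- Link: MSRA-TD500-download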
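MLT 2017:
- Introduction: It contains 18000 natural images in total. It provides word-level annotations. MLT 2017 covers 9 languages. It is a more realistic and complex dataset for scene text detection and recognition.
- Link: MLT-download
MLT 2019:
- Introduction: It contains 20000 images in total. It provides word-level annotations. Compared with MLT 2017, this dataset covers 10 languages. It is a more realistic and complex dataset for scene text detection and recognition.
- Link: MLT-2019-download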
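CTW:
- Introduction: It contains 32285 high-resolution street-view images of Chinese text, with 1018402 character instances in total. All images are annotated at the character level, including the underlying character type, the bounding box and 6 other attributes. These attributes indicate whether the background is complex, whether the character is raised, whether it is handwritten or printed, whether it is occluded, whether it is distorted, and whether it uses word art.
- Link: CTW-download
RCTW-17:
- Introduction: It contains 12514 images in total, 11514 for training and 1000 for testing. Most images in RCTW-17 were collected with cameras or mobile phones, and the rest are generated images. Text instances are annotated with parallelograms. It is the first large-scale Chinese dataset and was the largest published at the time.
- Link: RCTW-17-download
ReCTS:
- Introduction: This dataset is a large-scale Chinese street-view trademark dataset. It is labeled at the Chinese word level and the Chinese text-line level. The labeling method is arbitrary-quadrilateral labeling. It contains 20000 images in total.
- Link: ReCTS-download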
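The datasets in this subsection annotate text with quadrilaterals (or parallelograms and polygon boxes) rather than axis-aligned rectangles. The following is a minimal sketch of how such an annotation relates to its enclosing axis-aligned box. The vertex coordinates and the function names `quad_to_aabb` and `polygon_area` are illustrative assumptions and do not come from any of the dataset toolkits.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quad_to_aabb(quad: List[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) of the rectangle enclosing the quad."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return (min(xs), min(ys), max(xs), max(ys))


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon computed with the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical quadrilateral, vertices listed in order around the text region.
quad = [(10.0, 20.0), (110.0, 40.0), (105.0, 70.0), (5.0, 50.0)]
x_min, y_min, x_max, y_max = quad_to_aabb(quad)
box_area = (x_max - x_min) * (y_max - y_min)
# Fraction of the enclosing rectangle that the quadrilateral actually covers.
fill_ratio = polygon_area(quad) / box_area
print(quad_to_aabb(quad), polygon_area(quad), round(fill_ratio, 3))
```

For a tilted text region like this one, a noticeable part of the enclosing rectangle is background, which is one reason these datasets do not use axis-aligned boxes.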
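1.3 Irregular-Text Datasets
CUTE80:
- Introduction: It contains 80 high-resolution images taken in natural scenes. Specifically, it contains 288 cropped word images for testing. The dataset focuses on curved text. No lexicon is provided.
- Link: CUTE80-download
Total-Text:
- Introduction: It contains 1,555 images in total. Specifically, it contains 11459 cropped word images with more than three different text orientations: horizontal, multi-oriented and curved.
- Link: Total-Text-download
SCUT-CTW1500:
- Introduction: It contains 1500 images in total, 1000 for training and 500 for testing. Specifically, it contains 10751 cropped word images for testing. Annotations in CTW-1500 are polygons with 14 vertices. The dataset mainly consists of Chinese and English text.
- Link: CTW-1500-download
LSVT:
- Introduction: LSVT consists of 20,000 testing images, 30,000 fully annotated training images and 400,000 weakly annotated training images, which are referred to as partial labels. The labeled text regions demonstrate the diversity of text: horizontal, multi-oriented and curved.
- Link: LSVT-download
ArT:
- Introduction: ArT contains 10,166 images, 5,603 for training and 4,563 for testing. They were collected with text shape diversity in mind, and all text shapes are well represented in ArT.
- Link: ArT-download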
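1.4 Synthetic Datasets
Synth80k:
- Introduction: It contains 800,000 images with approximately 8 million synthetic word instances. Each text instance is annotated with its text string and with word-level and character-level bounding boxes.
- Link: Synth80k-download
SynthText:
- Introduction: It contains 6 million cropped word images. The generation process is similar to that of Synth90k. It is also annotated in horizontal style.
- Link: SynthText-download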
1.5 Comparison of Datasets
Dataset | Language | Images: Total | Images: Train | Images: Test | Text instances: Total | Text instances: Train | Text instances: Test | Text shape: Horizontal | Text shape: Arbitrary-Quadrilateral | Text shape: Multi-oriented | Annotation: Char | Annotation: Word | Annotation: Text-Line |
IC03 | English | 509 | 258 | 251 | 2266 | 1110 | 1156 | ✓ | ✕ | ✕ | ✕ | ✓ | ✕ |
IC11 | English | 484 | 229 | 255 | 1564 | ~ | ~ | ✓ | ✕ | ✕ | ✓ | ✓ | ✕ |
IC13 | English | 462 | 229 | 233 | 1944 | 849 | 1095 | ✓ | ✕ | ✕ | ✓ | ✓ | ✕ |
USTB-SV1K | English | 1000 | 500 | 500 | 2955 | ~ | ~ | ✓ | ✓ | ✕ | ✕ | ✓ | ✕ |
SVT | English | 350 | 100 | 250 | 725 | 211 | 514 | ✓ | ✓ | ✕ | ✓ | ✓ | ✕ |
SVT-P | English | 238 | ~ | ~ | 639 | ~ | ~ | ✓ | ✓ | ✕ | ✕ | ✓ | ✕ |
IC15 | English | 1500 | 1000 | 500 | 17548 | 12318 | 5230 | ✓ | ✓ | ✕ | ✕ | ✓ | ✕ |
COCO-Text | English | 63686 | 43686 | 20000 | 145859 | 118309 | 27550 | ✓ | ✓ | ✕ | ✕ | ✓ | ✕ |
MSRA-TD500 | English/Chinese | 500 | 300 | 200 | ~ | ~ | ~ | ✓ | ✓ | ✕ | ✕ | ✕ | ✓ |
MLT 2017 | Multi-lingual | 18000 | 7200 | 10800 | ~ | ~ | ~ | ✓ | ✓ | ✕ | ✕ | ✓ | ✕ |
MLT 2019 | Multi-lingual | 20000 | 10000 | 10000 | ~ | ~ | ~ | ✓ | ✓ | ✕ | ✕ | ✓ | ✕ |
CTW | Chinese | 32285 | 25887 | 6398 | 1018402 | 812872 | 205530 | ✓ | ✓ | ✕ | ✓ | ✓ | ✕ |
RCTW-17 | English/Chinese | 12514 | 11514 | 1000 | ~ | ~ | ~ | ✓ | ✓ | ✕ | ✕ | ✕ | ✓ |
ReCTS | Chinese | 20000 | ~ | ~ | ~ | ~ | ~ | ✓ | ✓ | ✕ | ✓ | ✓ | ✕ |
CUTE80 | English | 80 | ~ | ~ | ~ | ~ | ~ | ✕ | ✕ | ✓ | ✕ | ✓ | ✓ |
Total-Text | English | 1555 | 1255 | 300 | 9330 | ~ | ~ | ✓ | ✓ | ✓ | ✕ | ✓ | ✓ |
CTW-1500 | English/Chinese | 1500 | 1000 | 500 | 10751 | ~ | ~ | ✓ | ✓ | ✓ | ✕ | ✓ | ✓ |
LSVT | English/Chinese | 450000 | 430000 | 20000 | ~ | ~ | ~ | ✓ | ✓ | ✓ | ✕ | ✓ | ✓ |
ArT | English/Chinese | 10166 | 5603 | 4563 | ~ | ~ | ~ | ✓ | ✓ | ✓ | ✕ | ✓ | ✕ |
Synth80k | English | 80k | ~ | ~ | 8m | ~ | ~ | ✓ | ✕ | ✕ | ✓ | ✓ | ✕ |
SynthText | English | 800k | ~ | ~ | 6m | ~ | ~ | ✓ | ✓ | ✕ | ✕ | ✓ | ✕ |
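Because the table lists total, training and testing counts side by side, a quick arithmetic check can catch transcription slips. The sketch below copies a few image-count rows from the table above and flags any row where train plus test does not equal the total. The helper name `inconsistent_splits` and the deliberately wrong `Example` row are hypothetical and only illustrate the check.

```python
from typing import Dict, List, Tuple

# (total, train, test) image counts copied from the comparison table above.
IMAGE_SPLITS: Dict[str, Tuple[int, int, int]] = {
    "IC03": (509, 258, 251),
    "IC13": (462, 229, 233),
    "SVT": (350, 100, 250),
    "CTW": (32285, 25887, 6398),
}


def inconsistent_splits(splits: Dict[str, Tuple[int, int, int]]) -> List[str]:
    """Return the dataset names whose train + test counts differ from the total."""
    return [
        name
        for name, (total, train, test) in splits.items()
        if train + test != total
    ]


print(inconsistent_splits(IMAGE_SPLITS))

# Hypothetical row with a deliberate mismatch, to show what a flagged entry looks like.
with_error = dict(IMAGE_SPLITS, Example=(100, 60, 30))
print(inconsistent_splits(with_error))
```

For datasets that also ship a validation split (COCO-Text, for instance), the "Test" column in the table already folds validation and test together, so the same check applies.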
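2. Summary of Scene Text Detection Resources
2.1 Comparison of Methods
Scene text detection methods can be divided into four categories:
- (a) Traditional methods;
- (b) Segmentation-based methods;
- (c) Regression-based methods;
- (d) Hybrid methods.
Note:
(1) "Hori" stands for horizontal scene text datasets.
(2) "Quad" stands for arbitrary-quadrilateral text datasets.
(3) "Irreg" stands for irregular scene text datasets.
(4) "Traditional method" stands for methods that do not rely on deep learning.
2.1.1 Traditional Methods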
Method | Model | Code | Hori | Quad | Irreg | Source | Time | Highlight |
Yao et al. [1] | TD-Mixture | ✕ | ✓ | ✓ | ✕ | CVPR | 2012 | 1) A new dataset, MSRA-TD500, and a protocol for evaluation. 2) Equipped with a two-level classification scheme and two sets of feature extractors. |
Yin et al. [2] | ~ | ✕ | ✓ | ✕ | ✕ | TPAMI | 2013 | Extracting Maximally Stable Extremal Regions (MSERs) as character candidates and grouping them together. |
Le et al. [5] | HOCC | ✕ | ✓ | ✓ | ✕ | CVPR | 2014 | HOCC + MSERs |
Yin et al. [7] | ~ | ✕ | ✓ | ✓ | ✕ | TPAMI | 2015 | Presenting a unified distance metric learning framework for adaptive hierarchical clustering. |
Wu et al. [9] | ~ | ✕ | ✓ | ✓ | ✕ | TMM | 2015 | Exploring gradient directional symmetry at component level for smoothing edge components before text detection. |
Tian et al. [17] | ~ | ✕ | ✓ | ✕ | ✕ | IJCAI | 2016 | Scene text is first detected locally in individual frames and finally linked by an optimal tracking trajectory. |
Yang et al. [33] | ~ | ✕ | ✓ | ✓ | ✕ | TIP | 2017 | A text detector locates character candidates and extracts text regions, which are then linked by an optimal tracking trajectory. |
Liang et al. [8] | ~ | ✕ | ✓ | ✓ | ✓ | TIP | 2015 | Exploring maximally stable extremal regions along with the stroke width transform for detecting candidate text regions. |
Michal et al. [12] | FASText | ✕ | ✓ | ✓ | ✕ | ICCV | 2015 | Stroke keypoints are efficiently detected and then exploited to obtain stroke segmentations. |
2.1.2 Segmentation-Based Methods
Method | Model | Code | Hori | Quad | Irreg | Source | Time | Highlight |
Li et al. [3] | ~ | ✕ | ✓ | ✓ | ✕ | TIP | 2014 | (1) Developing three novel cues that are tailored for character detection and a Bayesian method for their integration; (2) designing a Markov random field model to exploit the inherent dependencies between characters. |
Zhang et al. [14] | ~ | ✕ | ✓ | ✓ | ✕ | CVPR | 2016 | Utilizing an FCN for salient map detection and for predicting the centroid of each character. |
Zhu et al. [16] | ~ | ✕ | ✓ | ✓ | ✕ | CVPR | 2016 | Performing a graph-based segmentation of connected components into words (Word-Graph). |
He et al. [18] | Text-CNN | ✕ | ✓ | ✓ | ✕ | TIP | 2016 | Developing a new learning mechanism to train the Text-CNN with multi-level and rich supervised information. |
Yao et al. [21] | ~ | ✕ | ✓ | ✓ | ✕ | arXiv | 2016 | Proposing to localize text in a holistic manner, by casting scene text detection as a semantic segmentation problem. |
Hu et al. [27] | WordSup | ✕ | ✓ | ✓ | ✕ | ICCV | 2017 | Proposing a weakly supervised framework that can utilize word annotations. The detected characters are then fed to a text structure analysis module. |
Wu et al. [28] | ~ | ✕ | ✓ | ✓ | ✕ | ICCV | 2017 | Introducing the border class to the text detection problem for the first time, and validating that the decoding process is largely simplified with the help of the text border. |
Tang et al. [32] | ~ | ✕ | ✓ | ✕ | ✕ | TIP | 2017 | A text-aware candidate text region (CTR) extraction model + CTR refinement model. |
Dai et al. [35] | FTSN | ✕ | ✓ | ✓ | ✕ | arXiv | 2017 | Detecting and segmenting the text instance jointly and simultaneously, leveraging merits from both the semantic segmentation task and the region-proposal-based object detection task. |
Wang et al. [38] | ~ | ✕ | ✓ | ✕ | ✕ | ICDAR | 2017 | Proposing a novel character candidate extraction method based on super-pixel segmentation and hierarchical clustering. |
Deng et al. [40] | PixelLink | ✓ | ✓ | ✓ | ✕ | AAAI | 2018 | Text instances are first segmented out by linking pixels within the same instance together. |
Liu et al. [42] | MCN | ✕ | ✓ | ✓ | ✕ | CVPR | 2018 | Stochastic Flow Graph (SFG) + Markov Clustering. |
Lyu et al. [43] | ~ | ✕ | ✓ | ✓ | ✕ | CVPR | 2018 | Detecting scene text by localizing corner points of text bounding boxes and segmenting text regions in relative positions. |
Chu et al. [45] | Border | ✕ | ✓ | ✓ | ✕ | ECCV | 2018 | Presenting a novel scene text detection technique that makes use of semantics-aware text borders and bootstrapping-based text segment augmentation. |
Long et al. [46] | TextSnake | ✕ | ✓ | ✓ | ✓ | ECCV | 2018 | Proposing TextSnake, which is able to effectively represent text instances in horizontal, oriented and curved forms based on the symmetry axis. |
Yang et al. [47] | IncepText | ✕ | ✓ | ✓ | ✕ | IJCAI | 2018 | Designing a novel Inception-Text module and introducing deformable PSROI pooling to deal with multi-oriented text detection. |
Yue et al. [48] | ~ | ✕ | ✓ | ✓ | ✕ | BMVC | 2018 | Proposing a general framework for text detection called Guided CNN to achieve the two goals simultaneously. |
Zhong et al. [53] | AF-RPN | ✕ | ✓ | ✓ | ✕ | arXiv | 2018 | Presenting AF-RPN as an anchor-free and scale-friendly region proposal network for the Faster R-CNN framework. |
Wang et al. [54] | PSENet | ✓ | ✓ | ✓ | ✓ | CVPR | 2019 | Proposing a novel Progressive Scale Expansion Network (PSENet), designed as a segmentation-based detector with multiple predictions for each text instance. |
Xu et al. [57] | TextField | ✕ | ✓ | ✓ | ✓ | arXiv | 2018 | Presenting a novel direction field which can represent scene texts of arbitrary shapes. |
Tian et al. [58] | FTDN | ✕ | ✓ | ✓ | ✕ | ICIP | 2018 | FTDN is able to segment text regions and simultaneously regress text boxes at pixel level. |
Tian et al. [83] | ~ | ✕ | ✓ | ✓ | ✓ | CVPR | 2019 | Constraining the embedding features of pixels inside the same text region to share similar properties. |
Huang et al. [4] | MSERs-CNN | ✕ | ✓ | ✕ | ✕ | ECCV | 2014 | Combining MSERs with CNN. |
Sun et al. [6] | ~ | ✕ | ✓ | ✕ | ✕ | PR | 2015 | Presenting a robust text detection approach based on color-enhanced CER and neural networks. |
Baek et al. [62] | CRAFT | ✕ | ✓ | ✓ | ✓ | CVPR | 2019 | Proposing CRAFT, which effectively detects text areas by exploring each character and the affinity between characters. |
Richardson et al. [87] | ~ | ✕ | ✓ | ✓ | ✕ | WACV | 2019 | Presenting an additional scale predictor to estimate a better scale of text regions for testing. |
Wang et al. [88] | SAST | ✕ | ✓ | ✓ | ✓ | ACM MM | 2019 | Presenting a context-attended multi-task learning framework for scene text detection. |
Wang et al. [90] | PAN | ✕ | ✓ | ✓ | ✓ | ICCV | 2019 | Proposing an efficient and accurate arbitrary-shaped text detector called Pixel Aggregation Network (PAN). |
2.1.3 Regression-Based Methods
Method | Model | Code | Hori | Quad | Irreg | Source | Time | Highlight |
Gupta et al. [15] | FCRN | ✓ | ✓ | ✕ | ✕ | CVPR | 2016 | (a) Proposing a fast and scalable engine to generate synthetic images of text in clutter; (b) FCRN. |
Zhong et al. [20] | DeepText | ✕ | ✓ | ✕ | ✕ | arXiv | 2016 | (a) Inception-RPN; (b) utilizing ambiguous text category (ATC) information and multilevel region-of-interest pooling (MLRP). |
Liao et al. [22] | TextBoxes | ✓ | ✓ | ✕ | ✕ | AAAI | 2017 | Mainly based on the SSD object detection framework. |
Liu et al. [25] | DMPNet | ✕ | ✓ | ✓ | ✕ | CVPR | 2017 | Quadrilateral sliding windows + shared Monte-Carlo method for fast and accurate computing of the polygonal areas + a sequential protocol for relative regression. |
He et al. [26] | DDR | ✕ | ✓ | ✓ | ✕ | ICCV | 2017 | Proposing an FCN that has bi-task outputs where one is pixel-wise classification between text and non-text, and the other is direct regression to determine the vertex coordinates of quadrilateral text boundaries. |
Jiang et al. [36] | R2CNN | ✕ | ✓ | ✓ | ✕ | arXiv | 2017 | Using the Region Proposal Network (RPN) to generate axis-aligned bounding boxes that enclose the texts with different orientations. |
Xing et al. [37] | ArbiText | ✕ | ✓ | ✓ | ✕ | arXiv | 2017 | Adopting circle anchors and incorporating a pyramid pooling module into the Single Shot MultiBox Detector framework. |
Zhang et al. [39] | FEN | ✕ | ✓ | ✕ | ✕ | AAAI | 2018 | Proposing a refined scene text detector with a novel Feature Enhancement Network (FEN) for Region Proposal and Text Detection Refinement. |
Wang et al. [41] | ITN | ✕ | ✓ | ✓ | ✕ | CVPR | 2018 | ITN is presented to learn the geometry-aware representation encoding the unique geometric configurations of scene text instances with in-network transformation embedding. |
Liao et al. [44] | RRD | ✕ | ✓ | ✓ | ✕ | CVPR | 2018 | The regression branch extracts rotation-sensitive features, while the classification branch extracts rotation-invariant features by pooling the rotation-sensitive features. |
Liao et al. [49] | TextBoxes++ | ✓ | ✓ | ✓ | ✕ | TIP | 2018 | Mainly based on the SSD object detection framework; it replaces the rectangular box representation in conventional object detectors with a quadrilateral or oriented rectangle representation. |
He et al. [50] | ~ | ✕ | ✓ | ✓ | ✕ | TIP | 2018 | Proposing a scene text detection framework based on a fully convolutional network with a bi-task prediction module. |
Ma et al. [51] | RRPN | ✓ | ✓ | ✓ | ✕ | TMM | 2018 | RRPN + RRoI Pooling. |
Zhu et al. [55] | SLPR | ✕ | ✓ | ✓ | ✓ | arXiv | 2018 | SLPR regresses multiple points on the edge of a text line and then utilizes these points to sketch the outline of the text. |
Deng et al. [56] | ~ | ✓ | ✓ | ✓ | ✕ | arXiv | 2018 | CRPN employs corners to estimate the possible locations of text instances. It also designs an embedded data augmentation module inside the region-wise subnetwork. |
Cai et al. [59] | FFN | ✕ | ✓ | ✕ | ✕ | ICIP | 2018 | Proposing a Feature Fusion Network to deal with text regions that differ enormously in size. |
Sabyasachi et al. [60] | RGC | ✕ | ✓ | ✓ | ✕ | ICIP | 2018 | Proposing a novel recurrent architecture to improve the learning of a feature map at a given time. |
Liu et al. [63] | CTD | ✓ | ✓ | ✓ | ✓ | PR | 2019 | CTD + TLOC + PNMS |
Xie et al. [79] | DeRPN | ✓ | ✓ | ✕ | ✕ | AAAI | 2019 | DeRPN utilizes an anchor string mechanism instead of anchor boxes in RPN. |
Wang et al. [82] | ~ | ✕ | ✓ | ✓ | ✓ | CVPR | 2019 | Text-RPN + RNN |
Liu et al. [84] | ~ | ✕ | ✓ | ✓ | ✓ | CVPR | 2019 | CSE mechanism |
He et al. [29] | SSTD | ✓ | ✓ | ✓ | ✕ | ICCV | 2017 | Proposing an attention mechanism, then developing a hierarchical inception module which efficiently aggregates multi-scale inception features. |
Tian et al. [11] | ~ | ✕ | ✓ | ✕ | ✕ | ICCV | 2015 | Cascade boosting detects character candidates, and the min-cost flow network model produces the final result. |
Tian et al. [13] | CTPN | ✓ | ✓ | ✕ | ✕ | ECCV | 2016 | 1) RPN + LSTM. 2) The RPN incorporates a new vertical anchor mechanism and the LSTM connects the regions to get the final result. |
He et al. [19] | ~ | ✕ | ✓ | ✓ | ✕ | ACCV | 2016 | An ER detector detects regions to get a coarse prediction of text regions. Then the local context is aggregated to classify the remaining regions to obtain a final prediction. |
Shi et al. [23] | SegLink | ✓ | ✓ | ✓ | ✕ | CVPR | 2017 | Decomposing text into segments and links. A link connects two adjacent segments. |
Tian et al. [30] | WeText | ✕ | ✓ | ✕ | ✕ | ICCV | 2017 | Proposing a weakly supervised scene text detection method (WeText). |
Zhu et al. [31] | RTN | ✕ | ✓ | ✕ | ✕ | ICDAR | 2017 | Mainly based on the CTPN vertical proposal mechanism. |
Ren et al. [34] | ~ | ✕ | ✓ | ✕ | ✕ | TMM | 2017 | Proposing a CNN-based detector. It contains a text structure component detector layer, a spatial pyramid layer, and a multi-input-layer deep belief network (DBN). |
Zhang et al. [10] | ~ | ✕ | ✓ | ✕ | ✕ | CVPR | 2015 | The proposed algorithm exploits the symmetry property of character groups and allows for direct extraction of text lines from natural images. |
Wang et al. [86] | DSRN | ✕ | ✓ | ✓ | ✕ | IJCAI | 2019 | Presenting a scale-transfer module and a scale relationship module to handle the problem of scale variation. |
Tang et al. [89] | Seglink++ | ✕ | ✓ | ✓ | ✓ | PR | 2019 | Presenting instance-aware component grouping (ICG) for arbitrary-shape text detection. |
Wang et al. [92] | ContourNet | ✓ | ✓ | ✓ | ✓ | CVPR | 2020 | 1. A scale-insensitive Adaptive Region Proposal Network (AdaptiveRPN); 2. a Local Orthogonal Texture-aware Module (LOTM). |
2.1.4 Hybrid Methods
Method | Model | Code | Hori | Quad | Irreg | Source | Time | Highlight |
Tang et al. [52] | SSFT | ✕ | ✓ | ✕ | ✕ | TMM | 2018 | Proposing a novel scene text detection method that involves superpixel-based stroke feature transform (SSFT) and deep learning based region classification (DLRC). |
Xie et al. [61] | SPCNet | ✕ | ✓ | ✓ | ✓ | AAAI | 2019 | Text Context module + Re-Score mechanism. |
Liu et al. [64] | PMTD | ✓ | ✓ | ✓ | ✕ | arXiv | 2019 | Performing "soft" semantic segmentation. It assigns a soft pyramid label (i.e., a real value between 0 and 1) to each pixel within a text instance. |
Liu et al. [80] | BDN | ✓ | ✓ | ✓ | ✕ | IJCAI | 2019 | Discretizing bounding boxes into key edges to address label confusion for text detection. |
Zhang et al. [81] | LOMO | ✕ | ✓ | ✓ | ✓ | CVPR | 2019 | DR + IRM + SEM |
Zhou et al. [24] | EAST | ✓ | ✓ | ✓ | ✕ | CVPR | 2017 | The pipeline directly predicts words or text lines of arbitrary orientations and quadrilateral shapes in full images with instance segmentation. |
Yue et al. [48] | ~ | ✕ | ✓ | ✓ | ✕ | BMVC | 2018 | Proposing a general framework for text detection called Guided CNN to achieve the two goals simultaneously. |
Zhong et al. [53] | AF-RPN | ✕ | ✓ | ✓ | ✕ | arXiv | 2018 | Presenting AF-RPN as an anchor-free and scale-friendly region proposal network for the Faster R-CNN framework. |
Xue et al. [85] | MSR | ✕ | ✓ | ✓ | ✓ | IJCAI | 2019 | Presenting a novel multi-scale regression network. |
Liao et al. [91] | DB | ✓ | ✓ | ✓ | ✓ | AAAI | 2020 | Presenting a differentiable binarization module to adaptively set the thresholds for binarization, which simplifies the post-processing. |
Xiao et al. [93] | SDM | ✕ | ✓ | ✓ | ✓ | ECCV | 2020 | 1. A novel sequential deformation method; 2. auxiliary character counting supervision. |
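The differentiable binarization listed for DB [91] in the table above can be illustrated with a short sketch. It assumes the commonly cited formulation in which a hard threshold is replaced by a steep sigmoid of the difference between a probability map value and a learned threshold value, with an amplifying factor of 50. Neither the formula nor the factor is stated in this document, and the function name `approx_binarize` is hypothetical.

```python
import math


def approx_binarize(prob: float, thresh: float, k: float = 50.0) -> float:
    """Soft, differentiable stand-in for the hard step `prob >= thresh`.

    Assumed formulation: 1 / (1 + exp(-k * (prob - thresh))).
    `k` controls how sharply the output switches between 0 and 1.
    """
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))


# Hypothetical per-pixel values: (probability, threshold).
for prob, thresh in [(0.9, 0.3), (0.3, 0.3), (0.1, 0.3)]:
    print(prob, thresh, round(approx_binarize(prob, thresh), 4))
```

Because the function is smooth, the threshold can be trained together with the probability map instead of being fixed by hand, which is the adaptivity the table entry refers to.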
2.2 Detection Results
2.2.1 Detection Results on Horizontal-Text Datasets
Method | Model | Source | Time | Method Category | IC11 [68] P | IC11 [68] R | IC11 [68] F | IC13 [69] P | IC13 [69] R | IC13 [69] F | IC05 [67] P | IC05 [67] R | IC05 [67] F |
Yao et al. [1] | TD-Mixture | CVPR | 2012 | Traditional | ~ | ~ | ~ | 0.69 | 0.66 | 0.67 | ~ | ~ | ~ |
Yin et al. [2] | ~ | TPAMI | 2013 | Traditional | 0.86 | 0.68 | 0.76 | ~ | ~ | ~ | ~ | ~ | ~ |
Yin et al. [7] | ~ | TPAMI | 2015 | Traditional | 0.838 | 0.66 | 0.738 | ~ | ~ | ~ | ~ | ~ | ~ |
Wu et al. [9] | ~ | TMM | 2015 | Traditional | ~ | ~ | ~ | 0.76 | 0.70 | 0.73 | ~ | ~ | ~ |
Liang et al. [8] | ~ | TIP | 2015 | Traditional | 0.77 | 0.68 | 0.71 | 0.76 | 0.68 | 0.72 | ~ | ~ | ~ |
Michal et al. [12] | FASText | ICCV | 2015 | Traditional | ~ | ~ | ~ | 0.84 | 0.69 | 0.77 | ~ | ~ | ~ |
Li et al. [3] | ~ | TIP | 2014 | Segmentation | 0.80 | 0.62 | 0.70 | ~ | ~ | ~ | ~ | ~ | ~ |
Zhang et al. [14] | ~ | CVPR | 2016 | Segmentation | ~ | ~ | ~ | 0.88 | 0.78 | 0.83 | ~ | ~ | ~ |
He et al. [18] | Text-CNN | TIP | 2016 | Segmentation | 0.91 | 0.74 | 0.82 | 0.93 | 0.73 | 0.82 | 0.87 | 0.73 | 0.79 |
Yao et al. [21] | ~ | arXiv | 2016 | Segmentation | ~ | ~ | ~ | 0.889 | 0.802 | 0.843 | ~ | ~ | ~ |
Hu et al. [27] | WordSup | ICCV | 2017 | Segmentation | ~ | ~ | ~ | 0.933 | 0.875 | 0.903 | ~ | ~ | ~ |
Tang et al. [32] | ~ | TIP | 2017 | Segmentation | 0.90 | 0.86 | 0.88 | 0.92 | 0.87 | 0.89 | ~ | ~ | ~ |
Wang et al. [38] | ~ | ICDAR | 2017 | Segmentation | 0.87 | 0.78 | 0.82 | 0.87 | 0.82 | 0.84 | ~ | ~ | ~ |
Deng et al. [40] | PixelLink | AAAI | 2018 | Segmentation | ~ | ~ | ~ | 0.886 | 0.875 | 0.881 | ~ | ~ | ~ |
Liu et al. [42] | MCN | CVPR | 2018 | Segmentation | ~ | ~ | ~ | 0.88 | 0.87 | 0.88 | ~ | ~ | ~ |
Lyu et al. [43] | ~ | CVPR | 2018 | Segmentation | ~ | ~ | ~ | 0.92 | 0.844 | 0.880 | ~ | ~ | ~ |
Chu et al. [45] | Border | ECCV | 2018 | Segmentation | ~ | ~ | ~ | 0.915 | 0.871 | 0.892 | ~ | ~ | ~ |
Wang et al. [54] | PSENet | CVPR | 2019 | Segmentation | ~ | ~ | ~ | 0.94 | 0.90 | 0.92 | ~ | ~ | ~ |
Huang et al. [4] | MSERs-CNN | ECCV | 2014 | Segmentation | 0.88 | 0.71 | 0.78 | ~ | ~ | ~ | 0.84 | 0.67 | 0.75 |
Sun et al. [6] | ~ | PR | 2015 | Segmentation | 0.92 | 0.91 | 0.91 | 0.94 | 0.92 | 0.93 | ~ | ~ | ~ |
Gupta et al. [15] | FCRN | CVPR | 2016 | Regression | 0.94 | 0.77 | 0.85 | 0.938 | 0.764 | 0.842 | ~ | ~ | ~ |
Zhong et al. [20] | DeepText | arXiv | 2016 | Regression | 0.87 | 0.83 | 0.85 | 0.85 | 0.81 | 0.83 | ~ | ~ | ~ |
Liao et al. [22] | TextBoxes | AAAI | 2017 | Regression | 0.89 | 0.82 | 0.86 | 0.89 | 0.83 | 0.86 | ~ | ~ | ~ |
Liu et al. [25] | DMPNet | CVPR | 2017 | Regression | ~ | ~ | ~ | 0.93 | 0.83 | 0.870 | ~ | ~ | ~ |
Jiang et al. [36] | R2CNN | arXiv | 2017 | Regression | ~ | ~ | ~ | 0.92 | 0.81 | 0.86 | ~ | ~ | ~ |
Xing et al. [37] | ArbiText | arXiv | 2017 | Regression | ~ | ~ | ~ | 0.826 | 0.936 | 0.877 | ~ | ~ | ~ |
Wang et al. [41] | ITN | CVPR | 2018 | Regression | 0.896 | 0.889 | 0.892 | 0.941 | 0.893 | 0.916 | ~ | ~ | ~ |
Liao et al. [49] | TextBoxes++ | TIP | 2018 | Regression | ~ | ~ | ~ | 0.92 | 0.86 | 0.89 | ~ | ~ | ~ |
He et al. [50] | ~ | TIP | 2018 | Regression | ~ | ~ | ~ | 0.91 | 0.84 | 0.88 | ~ | ~ | ~ |
Ma et al. [51] | RRPN | TMM | 2018 | Regression | ~ | ~ | ~ | 0.95 | 0.89 | 0.91 | ~ | ~ | ~ |
Zhu et al. [55] | SLPR | arXiv | 2018 | Regression | ~ | ~ | ~ | 0.90 | 0.72 | 0.80 | ~ | ~ | ~ |
Cai et al. [59] | FFN | ICIP | 2018 | Regression | ~ | ~ | ~ | 0.92 | 0.84 | 0.876 | ~ | ~ | ~ |
Sabyasachi et al. [60] | RGC | ICIP | 2018 | Regression | ~ | ~ | ~ | 0.89 | 0.77 | 0.83 | ~ | ~ | ~ |
Wang et al. [82] | ~ | CVPR | 2019 | Regression | ~ | ~ | ~ | 0.937 | 0.878 | 0.907 | ~ | ~ | ~ |
Liu et al. [84] | ~ | CVPR | 2019 | Regression | ~ | ~ | ~ | 0.937 | 0.897 | 0.917 | ~ | ~ | ~ |
He et al. [29] | SSTD | ICCV | 2017 | Regression | ~ | ~ | ~ | 0.89 | 0.86 | 0.88 | ~ | ~ | ~ |
Tian et al. [11] | ~ | ICCV | 2015 | Regression | 0.86 | 0.76 | 0.81 | 0.852 | 0.759 | 0.802 | ~ | ~ | ~ |
Tian et al. [13] | CTPN | ECCV | 2016 | Regression | ~ | ~ | ~ | 0.93 | 0.83 | 0.88 | ~ | ~ | ~ |
He et al. [19] | ~ | ACCV | 2016 | Regression | ~ | ~ | ~ | 0.90 | 0.75 | 0.81 | ~ | ~ | ~ |
Shi et al. [23] | SegLink | CVPR | 2017 | Regression | ~ | ~ | ~ | 0.877 | 0.83 | 0.853 | ~ | ~ | ~ |
Tian et al. [30] | WeText | ICCV | 2017 | Regression | ~ | ~ | ~ | 0.911 | 0.831 | 0.869 | ~ | ~ | ~ |
Zhu et al. [31] | RTN | ICDAR | 2017 | Regression | ~ | ~ | ~ | 0.94 | 0.89 | 0.91 | ~ | ~ | ~ |
Ren et al. [34] | ~ | TMM | 2017 | Regression | 0.78 | 0.67 | 0.72 | 0.81 | 0.67 | 0.73 | ~ | ~ | ~ |
Zhang et al. [10] | ~ | CVPR | 2015 | Regression | 0.84 | 0.76 | 0.80 | 0.88 | 0.74 | 0.80 | ~ | ~ | ~ |
Tang et al. [52] | SSFT | TMM | 2018 | Hybrid | 0.906 | 0.847 | 0.876 | 0.911 | 0.861 | 0.885 | ~ | ~ | ~ |
Xie et al. [61] | SPCNet | AAAI | 2019 | Hybrid | ~ | ~ | ~ | 0.94 | 0.91 | 0.92 | ~ | ~ | ~ |
Liu et al. [80] | BDN | IJCAI | 2019 | Hybrid | ~ | ~ | ~ | 0.887 | 0.894 | 0.89 | ~ | ~ | ~ |
Zhou et al. [24] | EAST | CVPR | 2017 | Hybrid | ~ | ~ | ~ | 0.93 | 0.83 | 0.870 | ~ | ~ | ~ |
Yue et al. [48] | ~ | BMVC | 2018 | Hybrid | ~ | ~ | ~ | 0.885 | 0.846 | 0.870 | ~ | ~ | ~ |
Zhong et al. [53] | AF-RPN | arXiv | 2018 | Hybrid | ~ | ~ | ~ | 0.94 | 0.90 | 0.92 | ~ | ~ | ~ |
Xue et al. [85] | MSR | IJCAI | 2019 | Hybrid | ~ | ~ | ~ | 0.918 | 0.885 | 0.901 | ~ | ~ | ~ |
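The P, R and F columns in the table above are read here as precision, recall and F-measure. Assuming F is the usual harmonic mean of P and R (the individual papers may follow slightly different evaluation protocols and round differently), the sketch below recomputes F for a reported row and checks it within a tolerance. The function names and the tolerance of 0.01 are illustrative choices.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def is_consistent(precision: float, recall: float, reported_f: float,
                  tol: float = 0.01) -> bool:
    """True if the reported F is within `tol` of the recomputed harmonic mean."""
    return abs(f_measure(precision, recall) - reported_f) <= tol


# Two rows taken from the IC13 columns of the table above.
print(round(f_measure(0.94, 0.90), 2))      # PSENet [54]
print(is_consistent(0.69, 0.66, 0.67))      # TD-Mixture [1]

# Hypothetical row whose F cannot come from its P and R.
print(is_consistent(0.90, 0.80, 0.70))
```

A useful property for spot checks is that the harmonic mean always lies between P and R, so a reported F outside that range usually signals a transcription slip.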
2.2.2 Detection Results on Arbitrary-Quadrilateral-Text Datasets
Method | Model | Source | Time | Method Category | IC15 [70] P | IC15 [70] R | IC15 [70] F | MSRA-TD500 [71] P | MSRA-TD500 [71] R | MSRA-TD500 [71] F | USTB-SV1K [65] P | USTB-SV1K [65] R | USTB-SV1K [65] F | SVT [66] P | SVT [66] R | SVT [66] F |
Le et al. [5] | HOCC | CVPR | 2014 | Traditional | ~ | ~ | ~ | 0.71 | 0.62 | 0.66 | ~ | ~ | ~ | ~ | ~ | ~ |
Yin et al. [7] | ~ | TPAMI | 2015 | Traditional | ~ | ~ | ~ | 0.81 | 0.63 | 0.71 | 0.499 | 0.454 | 0.475 | ~ | ~ | ~ |
Wu et al. [9] | ~ | TMM | 2015 | Traditional | ~ | ~ | ~ | 0.63 | 0.70 | 0.66 | ~ | ~ | ~ | ~ | ~ | ~ |
Tian et al. [17] | ~ | IJCAI | 2016 | Traditional | ~ | ~ | ~ | 0.95 | 0.58 | 0.721 | 0.537 | 0.488 | 0.51 | ~ | ~ | ~ |
Yang et al. [33] | ~ | TIP | 2017 | Traditional | ~ | ~ | ~ | 0.95 | 0.58 | 0.72 | 0.54 | 0.49 | 0.51 | ~ | ~ | ~ |
Liang et al. [8] | ~ | TIP | 2015 | Traditional | ~ | ~ | ~ | 0.74 | 0.66 | 0.70 | ~ | ~ | ~ | ~ | ~ | ~ |
Zhang et al. [14] | ~ | CVPR | 2016 | Segmentation | 0.71 | 0.43 | 0.54 | 0.83 | 0.67 | 0.74 | ~ | ~ | ~ | ~ | ~ | ~ |
Zhu et al. [16] | ~ | CVPR | 2016 | Segmentation | 0.81 | 0.91 | 0.85 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
He et al. [18] | Text-CNN | TIP | 2016 | Segmentation | ~ | ~ | ~ | 0.76 | 0.61 | 0.69 | ~ | ~ | ~ | ~ | ~ | ~ |
Yao et al. [21] | ~ | arXiv | 2016 | Segmentation | 0.723 | 0.587 | 0.648 | 0.765 | 0.753 | 0.759 | ~ | ~ | ~ | ~ | ~ | ~ |
Hu et al. [27] | WordSup | ICCV | 2017 | Segmentation | 0.793 | 0.77 | 0.782 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Wu et al. [28] | ~ | ICCV | 2017 | Segmentation | 0.91 | 0.78 | 0.84 | 0.77 | 0.78 | 0.77 | ~ | ~ | ~ | ~ | ~ | ~ |
Dai et al. [35] | FTSN | arXiv | 2017 | Segmentation | 0.886 | 0.80 | 0.841 | 0.876 | 0.771 | 0.82 | ~ | ~ | ~ | ~ | ~ | ~ |
Deng et al. [40] | PixelLink | AAAI | 2018 | Segmentation | 0.855 | 0.820 | 0.837 | 0.830 | 0.732 | 0.778 | ~ | ~ | ~ | ~ | ~ | ~ |
Liu et al. [42] | MCN | CVPR | 2018 | Segmentation | 0.72 | 0.80 | 0.76 | 0.88 | 0.79 | 0.83 | ~ | ~ | ~ | ~ | ~ | ~ |
Lyu et al. [43] | ~ | CVPR | 2018 | Segmentation | 0.895 | 0.797 | 0.843 | 0.876 | 0.762 | 0.815 | ~ | ~ | ~ | ~ | ~ | ~ |
Chu et al. [45] | Border | ECCV | 2018 | Segmentation | ~ | ~ | ~ | 0.830 | 0.774 | 0.801 | ~ | ~ | ~ | ~ | ~ | ~ |
Long et al. [46] | TextSnake | ECCV | 2018 | Segmentation | 0.849 | 0.804 | 0.826 | 0.832 | 0.739 | 0.783 | ~ | ~ | ~ | ~ | ~ | ~ |
Yang et al. [47] | IncepText | IJCAI | 2018 | Segmentation | 0.938 | 0.873 | 0.905 | 0.875 | 0.790 | 0.830 | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [54] | PSENet | CVPR | 2019 | Segmentation | 0.8692 | 0.845 | 0.8569 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Xu et al. [57] | TextField | arXiv | 2018 | Segmentation | 0.843 | 0.805 | 0.824 | 0.874 | 0.759 | 0.813 | ~ | ~ | ~ | ~ | ~ | ~ |
Tian et al. [58] | FTDN | ICIP | 2018 | Segmentation | 0.847 | 0.773 | 0.809 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Tian et al. [83] | ~ | CVPR | 2019 | Segmentation | 0.883 | 0.850 | 0.866 | 0.842 | 0.817 | 0.829 | ~ | ~ | ~ | ~ | ~ | ~ |
Baek et al. [62] | CRAFT | CVPR | 2019 | Segmentation | 0.898 | 0.843 | 0.869 | 0.882 | 0.782 | 0.829 | ~ | ~ | ~ | ~ | ~ | ~ |
Richardson et al. [87] | ~ | WACV | 2019 | Segmentation | 0.853 | 0.83 | 0.827 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [88] | SAST | ACM MM | 2019 | Segmentation | 0.8755 | 0.8734 | 0.8744 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [90] | PAN | ICCV | 2019 | Segmentation | 0.84 | 0.819 | 0.829 | 0.844 | 0.838 | 0.821 | ~ | ~ | ~ | ~ | ~ | ~ |
Gupta et al. [15] | FCRN | CVPR | 2016 | Regression | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | 0.651 | 0.599 | 0.624 |
Liu et al. [25] | DMPNet | CVPR | 2017 | Regression | 0.732 | 0.682 | 0.706 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
He et al. [26] | DDR | ICCV | 2017 | Regression | 0.82 | 0.80 | 0.81 | 0.77 | 0.70 | 0.74 | ~ | ~ | ~ | ~ | ~ | ~ |
Jiang et al. [36] | R2CNN | arXiv | 2017 | Regression | 0.856 | 0.797 | 0.825 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Xing et al. [37] | ArbiText | arXiv | 2017 | Regression | 0.792 | 0.735 | 0.759 | 0.78 | 0.72 | 0.75 | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [41] | ITN | CVPR | 2018 | Regression | 0.857 | 0.741 | 0.795 | 0.903 | 0.723 | 0.803 | ~ | ~ | ~ | ~ | ~ | ~ |
Liao et al. [44] | RRD | CVPR | 2018 | Regression | 0.88 | 0.8 | 0.838 | 0.876 | 0.73 | 0.79 | ~ | ~ | ~ | ~ | ~ | ~ |
Liao et al. [49] | TextBoxes++ | TIP | 2018 | Regression | 0.878 | 0.785 | 0.829 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
He et al. [50] | ~ | TIP | 2018 | Regression | 0.85 | 0.80 | 0.82 | 0.91 | 0.81 | 0.86 | ~ | ~ | ~ | ~ | ~ | ~ |
Ma et al. [51] | RRPN | TMM | 2018 | Regression | 0.822 | 0.732 | 0.774 | 0.821 | 0.677 | 0.742 | ~ | ~ | ~ | ~ | ~ | ~ |
Zhu et al. [55] | SLPR | arXiv | 2018 | Regression | 0.855 | 0.836 | 0.845 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Deng et al. [56] | ~ | arXiv | 2018 | Regression | 0.89 | 0.81 | 0.845 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Sabyasachi et al. [60] | RGC | ICIP | 2018 | Regression | 0.83 | 0.81 | 0.82 | 0.85 | 0.76 | 0.80 | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [82] | ~ | CVPR | 2019 | Regression | 0.892 | 0.86 | 0.876 | 0.852 | 0.821 | 0.836 | ~ | ~ | ~ | ~ | ~ | ~ |
He et al. [29] | SSTD | ICCV | 2017 | Regression | 0.80 | 0.73 | 0.77 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Tian et al. [13] | CTPN | ECCV | 2016 | Regression | 0.74 | 0.52 | 0.61 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
He et al. [19] | ~ | ACCV | 2016 | Regression | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | 0.87 | 0.73 | 0.79 |
Shi et al. [23] | SegLink | CVPR | 2017 | Regression | 0.731 | 0.768 | 0.75 | 0.86 | 0.70 | 0.77 | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [86] | DSRN | IJCAI | 2019 | Regression | 0.832 | 0.796 | 0.814 | 0.876 | 0.712 | 0.785 | ~ | ~ | ~ | ~ | ~ | ~ |
Tang et al. [89] | Seglink++ | PR | 2019 | Regression | 0.837 | 0.803 | 0.820 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [92] | ContourNet | CVPR | 2020 | Regression | 0.876 | 0.861 | 0.869 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Tang et al. [52] | SSFT | TMM | 2018 | Hybrid | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | 0.541 | 0.758 | 0.631 |
Xie et al. [61] | SPCNet | AAAI | 2019 | Hybrid | 0.89 | 0.86 | 0.87 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Liu et al. [64] | PMTD | arXiv | 2019 | Hybrid | 0.913 | 0.874 | 0.893 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Liu et al. [80] | BDN | IJCAI | 2019 | Hybrid | 0.881 | 0.846 | 0.863 | 0.87 | 0.815 | 0.842 | ~ | ~ | ~ | ~ | ~ | ~ |
Zhang et al. [81] | LOMO | CVPR | 2019 | Hybrid | 0.878 | 0.876 | 0.877 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Zhou et al. [24] | EAST | CVPR | 2017 | Hybrid | 0.833 | 0.783 | 0.807 | 0.873 | 0.674 | 0.761 | ~ | ~ | ~ | ~ | ~ | ~ |
Yue et al. [48] | ~ | BMVC | 2018 | Hybrid | 0.866 | 0.789 | 0.823 | ~ | ~ | ~ | ~ | ~ | ~ | 0.691 | 0.660 | 0.675 |
Zhong et al. [53] | AF-RPN | arXiv | 2018 | Hybrid | 0.89 | 0.83 | 0.86 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Xue et al. [85] | MSR | IJCAI | 2019 | Hybrid | ~ | ~ | ~ | 0.874 | 0.767 | 0.817 | ~ | ~ | ~ | ~ | ~ | ~ |
Liao et al. [91] | DB | AAAI | 2020 | Hybrid | 0.918 | 0.832 | 0.873 | 0.915 | 0.792 | 0.849 | ~ | ~ | ~ | ~ | ~ | ~ |
Xiao et al. [93] | SDM | ECCV | 2020 | Hybrid | 0.9196 | 0.8922 | 0.9057 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Method | Model | Source | Time | Method Category | IC15 [70] P | IC15 [70] R | IC15 [70] F | MSRA-TD500 [71] P | MSRA-TD500 [71] R | MSRA-TD500 [71] F | USTB-SV1K [65] P | USTB-SV1K [65] R | USTB-SV1K [65] F | SVT [66] P | SVT [66] R | SVT [66] F |
Le et al. [5] | HOCC | CVPR | 2014 | Traditional | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | 0.80 | 0.73 | 0.76 |
Yao et al. [21] | ~ | arXiv | 2016 | Segmentation | 0.432 | 0.27 | 0.333 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Hu et al. [27] | WordSup | ICCV | 2017 | Segmentation | 0.452 | 0.309 | 0.368 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Lyu et al. [43] | ~ | CVPR | 2018 | Segmentation | 0.351 | 0.348 | 0.349 | ~ | ~ | ~ | 0.743 | 0.706 | 0.724 | ~ | ~ | ~ |
Chu et al. [45] | Border | ECCV | 2018 | Segmentation | ~ | ~ | ~ | 0.782 | 0.588 | 0.671 | 0.777 | 0.621 | 0.690 | ~ | ~ | ~ |
Yang et al. [47] | IncepText | IJCAI | 2018 | Segmentation | ~ | ~ | ~ | 0.785 | 0.569 | 0.660 | ~ | ~ | ~ | ~ | ~ | ~ |
Wang et al. [54] | PSENet | CVPR | 2019 | Segmentation | ~ | ~ | ~ | ~ | ~ | ~ | 0.7535 | 0.6918 | 0.7213 | ~ | ~ | ~ |
Baek et al. [62] | CRAFT | CVPR | 2019 | Segmentation | ~ | ~ | ~ | ~ | ~ | ~ | 0.806 | 0.682 | 0.739 | ~ | ~ | ~ |
He et al. [29] | SSTD | ICCV | 2017 | Regression | 0.46 | 0.31 | 0.37 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Gupta et al. [15] | FCRN | CVPR | 2016 | Regression | ~ | ~ | ~ | ~ | ~ | ~ | 0.844 | 0.763 | 0.801 | ~ | ~ | ~ |
Liao et al. [49] | TextBoxes++ | TIP | 2018 | Regression | 0.61 | 0.57 | 0.59 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Ma et al. [51] | RRPN | TMM | 2018 | Regression | ~ | ~ | ~ | ~ | ~ | ~ | 0.7669 | 0.5794 | 0.6601 | ~ | ~ | ~ |
Deng et al. [56] | ~ | arXiv | 2018 | Regression | 0.555 | 0.633 | 0.591 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Cai et al. [59] | FFN | ICIP | 2018 | Regression | 0.43 | 0.35 | 0.39 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Xie et al. [79] | DeRPN | AAAI | 2019 | Regression | 0.586 | 0.557 | 0.571 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ |
Liao et al. [44] | RRD | CVPR | 2018 | ~ | ~ | ~ | 0.591 | 0.775 | 0.670 | ~ | ~ | ~ | ~ | ~ | ~ | |
Richardson et al. [87] | WACV | 2020 | ~ | ~ | ~ | ~ | ~ | ~ | 0.729 | 0.618 | 0.669 | ~ | ~ | ~ | ||
Wang et al. [88] | SAST | ACM MM | 2019 | ~ | ~ | ~ | ~ | ~ | ~ | 0.7935 | 0.6653 | 0.7237 | ~ | ~ | ~ | |
Xie et al. [61] | SPCNet | AAAI | 2019 | Hybrid | ~ | ~ | ~ | ~ | ~ | ~ | 0.806 | 0.686 | 0.741 | ~ | ~ | ~ |
Liu et al. [64] | PMTD | arXiv | 2019 | ~ | ~ | ~ | ~ | ~ | ~ | 0.844 | 0.763 | 0.801 | ~ | ~ | ~ | |
Liu et al. [80] | BDN | IJCAI | 2019 | ~ | ~ | ~ | ~ | ~ | ~ | 0.791 | 0.698 | 0.742 | ~ | ~ | ~ | |
Zhang et al. [81] | LOMO | CVPR | 2019 | ~ | ~ | ~ | 0.791 | 0.602 | 0.684 | 0.802 | 0.672 | 0.731 | ~ | ~ | ~ | |
Zhou et al. [24] | EAST | CVPR | 2017 | 0.504 | 0.324 | 0.395 | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | ~ | |
Zhong et al. [53] | AF-RPN | arXiv | 2018 | ~ | ~ | ~ | ~ | ~ | ~ | 0.75 | 0.66 | 0.70 | ~ | ~ | ~ | |
Liao et al. [91] | DB | AAAI | 2020 | ~ | ~ | ~ | ~ | ~ | ~ | 0.831 | 0.679 | 0.747 | ~ | ~ | ~ | |
Xiao et al. [93] | SDM | ECCV | 2020 | ~ | ~ | ~ | ~ | ~ | ~ | 0.8679 | 0.7526 | 0.8061 | ~ | ~ | ~ |
2.2.3 Detection results on irregular text datasets
In this section, we select only the methods that are suitable for irregular text detection.
Method | Model | Source | Time | Method Category | Total-text [74] | SCUT-CTW1500 [75] | ||||
P | R | F | P | R | F | |||||
Baek et al. [62] | CRAFT | CVPR | 2019 | Segmentation | 0.876 | 0.799 | 0.836 | 0.860 | 0.811 | 0.835 |
Long et al. [46] | TextSnake | ECCV | 2018 | 0.827 | 0.745 | 0.784 | 0.679 | 0.853 | 0.756 | |
Tian et al. [83] | CVPR | 2019 | ~ | ~ | ~ | 0.817 | 0.842 | 0.801 | ||
Wang et al. [54] | PSENet | CVPR | 2019 | 0.840 | 0.779 | 0.809 | 0.848 | 0.797 | 0.822 | |
Wang et al. [88] | SAST | ACM MM | 2019 | 0.8557 | 0.7549 | 0.802 | 0.8119 | 0.8171 | 0.8145 | |
Wang et al. [90] | PAN | ICCV | 2019 | 0.893 | 0.81 | 0.85 | 0.864 | 0.812 | 0.837 | |
Zhu et al. [55] | SLPR | arXiv | 2018 | Regression | ~ | ~ | ~ | 0.801 | 0.701 | 0.748 |
Liu et al. [63] | CTD+TLOC | PR | 2019 | ~ | ~ | ~ | 0.774 | 0.698 | 0.734 | |
Wang et al. [82] | CVPR | 2019 | ~ | ~ | ~ | 0.801 | 0.802 | 0.801 | ||
Liu et al. [84] | CVPR | 2019 | 0.814 | 0.791 | 0.802 | 0.787 | 0.761 | 0.774 | ||
Tang et al. [89] | SegLink++ | PR | 2019 | 0.829 | 0.809 | 0.815 | 0.828 | 0.798 | 0.813 | |
Wang et al. [92] | ContourNet | CVPR | 2020 | 0.869 | 0.839 | 0.854 | 0.837 | 0.841 | 0.839 | |
Zhang et al. [81] | LOMO | CVPR | 2019 | Hybrid | 0.876 | 0.793 | 0.833 | 0.857 | 0.765 | 0.808 |
Xie et al. [61] | SPCNet | AAAI | 2019 | 0.83 | 0.83 | 0.83 | ~ | ~ | ~ | |
Xue et al. [85] | MSR | IJCAI | 2019 | 0.852 | 0.73 | 0.768 | 0.838 | 0.778 | 0.807 | |
Liao et al. [91] | DB | AAAI | 2020 | 0.871 | 0.825 | 0.847 | 0.869 | 0.802 | 0.834 | |
Xiao et al. [93] | SDM | ECCV | 2020 | 0.9085 | 0.8603 | 0.8837 | 0.884 | 0.8442 | 0.8636 |
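The P, R and F columns in the tables above are precision, recall and F-measure. The tables do not spell out the formula, so the sketch below assumes F is the standard balanced F-measure (the harmonic mean of P and R); the function name `f_measure` is only illustrative. It can serve as a quick sanity check when copying numbers from a paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the F column uses this standard definition.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# CRAFT [62] on Total-text, P and R taken from the table above.
print(round(f_measure(0.876, 0.799), 3))  # → 0.836

# DB [91] on Total-text.
print(round(f_measure(0.871, 0.825), 3))  # → 0.847
```

Rows whose reported F does not reproduce from the rounded P and R are worth checking against the cited paper before reuse.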
3. Survey
[A] [TPAMI-2015] Ye Q, Doermann D. Text detection and recognition in imagery: A survey[J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(7): 1480-1500. paper
[B] [Frontiers-Comput. Sci-2016] Zhu Y, Yao C, Bai X. Scene text detection and recognition: Recent advances and future trends[J]. Frontiers of Computer Science, 2016, 10(1): 19-36. paper
[C] [arXiv-2018] Long S, He X, Yao C. Scene Text Detection and Recognition: The Deep Learning Era[J]. arXiv preprint arXiv:1811.04256, 2018. paper
4. Evaluation
If you are interested in developing better scene text detection metrics, the references recommended here may be useful:
[A] Wolf, Christian, and Jean-Michel Jolion. "Object count/area graphs for the evaluation of object detection and segmentation algorithms." International Journal of Document Analysis and Recognition (IJDAR) 8.4 (2006): 280-296. paper
[B] D. Karatzas, L. Gomez-Bigorda, A. Nicolaou, S. K. Ghosh, A. D. Bagdanov, M. Iwamura, J. Matas, L. Neumann, V. R. Chandrasekhar, S. Lu, F. Shafait, S. Uchida, and E. Valveny. ICDAR 2015 competition on robust reading. In ICDAR, pages 1156–1160, 2015. paper
[C] Calarasanu, Stefania, Jonathan Fabrizio, and Severine Dubuisson. "What is a good evaluation protocol for text localization systems? Concerns, arguments, comparisons and solutions." Image and Vision Computing 46 (2016): 1-17. paper
[D] Shi, Baoguang, et al. "ICDAR2017 competition on reading Chinese text in the wild (RCTW-17)." 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Vol. 1. IEEE, 2017. paper
[E] Nayef, N; Yin, F; Bizid, I; et al. ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification (RRC-MLT). In Document Analysis and Recognition (ICDAR), 2017 14th IAPR International Conference on, volume 1, 1454–1459. IEEE. paper
[F] Dangla, Aliona, et al. "A first step toward a fair comparison of evaluation protocols for text detection algorithms." 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 2018. paper
[G] He, Mengchao and Liu, Yuliang, et al. ICPR2018 Contest on Robust Reading for Multi-Type Web Images. ICPR 2018. paper
[H] Liu, Yuliang and Jin, Lianwen, et al. "Tightness-aware Evaluation Protocol for Scene Text Detection" Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2019. paper code
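Most of the protocols referenced above decide whether a detection is correct by comparing its overlap with a ground-truth region against a threshold. The following is a minimal sketch of that idea, not a reimplementation of any of the cited protocols: it assumes axis-aligned boxes, an intersection-over-union (IoU) threshold of 0.5, and greedy one-to-one matching. The real protocols work on quadrilaterals or polygons and add further rules (for example, "do not care" regions), which are omitted here. All names (`Box`, `iou`, `evaluate`) are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(gt: List[Box], det: List[Box],
             thresh: float = 0.5) -> Tuple[float, float, float]:
    """Greedy one-to-one matching; returns (precision, recall, F)."""
    matched_gt = set()
    true_positives = 0
    for d in det:
        # Find the best still-unmatched ground-truth box for this detection.
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gt):
            if j in matched_gt:
                continue
            overlap = iou(d, g)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou > thresh:
            matched_gt.add(best_j)
            true_positives += 1
    precision = true_positives / len(det) if det else 0.0
    recall = true_positives / len(gt) if gt else 0.0
    denom = precision + recall
    f = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
det_boxes = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(evaluate(gt_boxes, det_boxes))  # → (0.5, 0.5, 0.5)
```

In this toy case one of the two detections overlaps a ground-truth box with IoU 0.81 and the other matches nothing, so precision and recall are both 0.5.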
5. OCR Service
OCR | API | Free |
---|---|---|
Tesseract OCR Engine | × | √ |
Azure | √ | √ |
ABBYY | √ | √ |
OCR Space | √ | √ |
SODA PDF OCR | √ | √ |
Free Online OCR | √ | √ |
Online OCR | √ | √ |
Super Tools | √ | √ |
Online Chinese Recognition | √ | √ |
Calamari OCR | × | √ |
Tencent OCR | √ | × |
6. References and Code
[1] Yao C, Bai X, Liu W, et al. Detecting texts of arbitrary orientations in natural images. 2012 IEEE Conference on Computer Vision and Pattern Recognition(CVPR), 2012: 1083-1090. Paper |
[2] Yin X C, Yin X, Huang K, et al. Robust text detection in natural scene images. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2013, 36(5): 970-83. Paper |
[3] Li Y, Jia W, Shen C, et al. Characterness: An indicator of text in the wild. IEEE transactions on image processing, 2014, 23(4): 1666-1677. Paper |
[4] Huang W, Qiao Y, Tang X. Robust scene text detection with convolution neural network induced MSER trees. European Conference on Computer Vision(ECCV), 2014: 497-511. Paper |
[5] Kang L, Li Y, Doermann D. Orientation robust text line detection in natural images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014: 4034-4041. Paper |
[6] Sun L, Huo Q, Jia W, et al. A robust approach for text detection from natural scene images. Pattern Recognition, 2015, 48(9): 2906-2920. Paper |
[7] Yin X C, Pei W Y, Zhang J, et al. Multi-orientation scene text detection with adaptive clustering. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015 (9): 1930-1937. Paper |
[8] Liang G, Shivakumara P, Lu T, et al. Multi-spectral fusion based approach for arbitrarily oriented scene text detection in video images. IEEE Transactions on Image Processing, 2015, 24(11): 4488-4501. Paper |
[9] Wu L, Shivakumara P, Lu T, et al. A New Technique for Multi-Oriented Scene Text Line Detection and Tracking in Video. IEEE Trans. Multimedia, 2015, 17(8): 1137-1152. Paper |
[10] Zheng Z, Wei S, et al. Symmetry-based text line detection in natural scenes. IEEE Conference on Computer Vision & Pattern Recognition(CVPR), 2015. Paper |
[11] Tian S, Pan Y, Huang C, et al. Text flow: A unified text detection system in natural scene images. Proceedings of the IEEE international conference on computer vision(ICCV). 2015: 4651-4659. Paper |
[12] Busta M, et al. FASText: Efficient unconstrained scene text detector. 2015 IEEE International Conference on Computer Vision (ICCV). 2015: 1206-1214. Paper |
[13] Tian Z, Huang W, He T, et al. Detecting text in natural image with connectionist text proposal network. European conference on computer vision(ECCV), 2016: 56-72. Paper Code |
[14] Zhang Z, Zhang C, Shen W, et al. Multi-oriented text detection with fully convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2016: 4159-4167. Paper |
[15] Gupta A, Vedaldi A, Zisserman A. Synthetic data for text localisation in natural images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR). 2016: 2315-2324. Paper Code |
[16] S. Zhu and R. Zanibbi, A Text Detection System for Natural Scenes with Convolutional Feature Learning and Cascaded Classification, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 625-632. Paper |
[17] Tian S, Pei W Y, Zuo Z Y, et al. Scene Text Detection in Video by Learning Locally and Globally. IJCAI. 2016: 2647-2653. Paper |
[18] He T, Huang W, Qiao Y, et al. Text-attentional convolutional neural network for scene text detection. IEEE transactions on image processing, 2016, 25(6): 2529-2541. Paper |
[19] He, Dafang and Yang, Xiao and Huang, Wenyi and Zhou, Zihan and Kifer, Daniel and Giles, C Lee. Aggregating local context for accurate scene text detection. ACCV, 2016. Paper |
[20] Zhong Z, Jin L, Zhang S, et al. Deeptext: A unified framework for text proposal generation and text detection in natural images. arXiv preprint arXiv:1605.07314, 2016. Paper |
[21] Yao C, Bai X, Sang N, et al. Scene text detection via holistic, multi-channel prediction. arXiv preprint arXiv:1606.09002, 2016. Paper |
[22] Liao M, Shi B, Bai X, et al. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. AAAI. 2017: 4161-4167. Paper Code |
[23] Shi B, Bai X, Belongie S. Detecting Oriented Text in Natural Images by Linking Segments. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 3482-3490. Paper Code |
[24] Zhou X, Yao C, Wen H, et al. EAST: an efficient and accurate scene text detector. CVPR, 2017: 2642-2651. Paper Code |
[25] Liu Y, Jin L. Deep matching prior network: Toward tighter multi-oriented text detection. CVPR, 2017: 3454-3461. Paper |
[26] He W, Zhang X Y, Yin F, et al. Deep Direct Regression for Multi-Oriented Scene Text Detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2017: 745-753. Paper |
[27] Hu H, Zhang C, Luo Y, et al. Wordsup: Exploiting word annotations for character based text detection. ICCV, 2017. Paper |
[28] Wu Y, Natarajan P. Self-organized text detection with minimal post-processing via border learning. ICCV, 2017. Paper |
[29] He P, Huang W, He T, et al. Single shot text detector with regional attention. The IEEE International Conference on Computer Vision (ICCV). 2017, 6(7). Paper Code |
[30] Tian S, Lu S, Li C. Wetext: Scene text detection under weak supervision. ICCV, 2017. Paper |
[31] Zhu, Xiangyu and Jiang, Yingying et al. Deep Residual Text Detection Network for Scene Text. ICDAR, 2017. Paper |
[32] Tang Y , Wu X. Scene Text Detection and Segmentation Based on Cascaded Convolution Neural Networks. IEEE Transactions on Image Processing, 2017, 26(3):1509-1520. Paper |
[33] Yang C, Yin X C, Pei W Y, et al. Tracking Based Multi-Orientation Scene Text Detection: A Unified Framework with Dynamic Programming. IEEE Transactions on Image Processing, 2017. Paper |
[34] X. Ren, Y. Zhou, J. He, K. Chen, X. Yang and J. Sun, A Convolutional Neural Network-Based Chinese Text Detection Algorithm via Text Structure Modeling. in IEEE Transactions on Multimedia, vol. 19, no. 3, pp. 506-518, March 2017. Paper |
[35] Dai Y, Huang Z, Gao Y, et al. Fused text segmentation networks for multi-oriented scene text detection. arXiv preprint arXiv:1709.03272, 2017. Paper |
[36] Jiang Y, Zhu X, Wang X, et al. R2CNN: rotational region CNN for orientation robust scene text detection. arXiv preprint arXiv:1706.09579, 2017. Paper |
[37] Xing D, Li Z, Chen X, et al. ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene. arXiv preprint arXiv:1711.11249, 2017. Paper |
[38] C. Wang, F. Yin and C. Liu, Scene Text Detection with Novel Superpixel Based Character Candidate Extraction. in 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017, pp. 929-934. Paper |
[39] Sheng Zhang, Yuliang Liu, Lianwen Jin et al. Feature Enhancement Network: A Refined Scene Text Detector. In AAAI 2018. Paper |
[40] Dan Deng et al. PixelLink: Detecting Scene Text via Instance Segmentation. In AAAI 2018. Paper Code |
[41] Fangfang Wang, Liming Zhao, Xi L et al. Geometry-Aware Scene Text Detection with Instance Transformation Network. In CVPR 2018. Paper |
[42] Zichuan Liu, Guosheng Lin, Sheng Yang et al. Learning Markov Clustering Networks for Scene Text Detection. In CVPR 2018. Paper |
[43] Pengyuan Lyu, Cong Yao, Wenhao Wu et al. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. In CVPR 2018. Paper |
[44] Minghui L, Zhen Z, Baoguang S. Rotation-Sensitive Regression for Oriented Scene Text Detection. In CVPR 2018. Paper |
[45] Chuhui Xue et al. Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping. In ECCV 2018. Paper |
[46] Long, Shangbang and Ruan, Jiaqiang, et al. TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. In ECCV, 2018. Paper |
[47] Qiangpeng Yang, Mengli Cheng et al. IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection. In IJCAI 2018. Paper |
[48] Xiaoyu Yue et al. Boosting up Scene Text Detectors with Guided CNN. In BMVC 2018. Paper |
[49] Liao M, Shi B , Bai X. TextBoxes++: A Single-Shot Oriented Scene Text Detector. IEEE Transactions on Image Processing, 2018, 27(8):3676-3690. Paper Code |
[50] W. He, X. Zhang, F. Yin and C. Liu, Multi-Oriented and Multi-Lingual Scene Text Detection With Direct Regression, in IEEE Transactions on Image Processing, vol. 27, no. 11, pp.5406-5419, 2018. Paper |
[51] Ma J, Shao W, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Transactions on Multimedia, 2018. Paper Code |
[52] Youbao Tang and Xiangqian Wu. Scene Text Detection Using Superpixel-Based Stroke Feature Transform and Deep Learning Based Region Classification. In TMM, 2018. Paper |
[53] Zhuoyao Zhong, Lei Sun and Qiang Huo. An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches. arXiv preprint arXiv:1804.09003. 2018. Paper |
[54] Wenhai W, Enze X, et al. Shape Robust Text Detection with Progressive Scale Expansion Network. In CVPR 2019. Paper Code |
[55] Zhu Y, Du J. Sliding Line Point Regression for Shape Robust Scene Text Detection. arXiv preprint arXiv:1801.09969, 2018. Paper |
[56] Linjie D, Yanxiang Gong, et al. Detecting Multi-Oriented Text with Corner-based Region Proposals. arXiv preprint arXiv: 1804.02690, 2018. Paper Code |
[57] Yongchao Xu, Yukang Wang, Wei Zhou, et al. TextField: Learning A Deep Direction Field for Irregular Scene Text Detection. arXiv preprint arXiv: 1812.01393, 2018. Paper |
[58] Xiaowei Tian, Dao Wu, Rui Wang, Xiaochun Cao. Focal Text: an Accurate Text Detection with Focal Loss. In ICIP 2018. Paper |
[59] Chenqin C, Pin L, Bing S. Feature Fusion Network for Scene Text Detection. In ICIP, 2018. Paper |
[60] Sabyasachi Mohanty et al. Recurrent Global Convolutional Network for Scene Text Detection. In ICIP 2018. Paper |
[61] Enze Xie, et al. Scene Text Detection with Supervised Pyramid Context Network. In AAAI 2019. Paper |
[62] Youngmin Baek, Bado Lee, et al. Character Region Awareness for Text Detection. In CVPR 2019. Paper |
[63] Yuliang L, Lianwen J, Shuaitao Z, et al. Curved Scene Text Detection via Transverse and Longitudinal Sequence Connection. Pattern Recognition, 2019. Paper Code |
[64] Jingchao Liu, Xuebo Liu, et al, Pyramid Mask Text Detector. arXiv preprint arXiv:1903.11800, 2019. Paper Code |
[79] Lele Xie, Yuliang Liu, Lianwen Jin, Zecheng Xie, DeRPN: Taking a further step toward more general object detection. In AAAI, 2019. Paper Code |
[80] Yuliang Liu, Lianwen Jin, et al, Omnidirectional Scene Text Detection with Sequential-free Box Discretization. In IJCAI, 2019. Paper Code |
[81] Chengquan Zhang, Borong Liang, et al, Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes. In CVPR, 2019. Paper |
[82] Xiaobing Wang, Yingying Jiang, et al, Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation. In CVPR, 2019. Paper |
[83] Zhuotao Tian, Michelle Shu, et al, Learning Shape-Aware Embedding for Scene Text Detection. In CVPR, 2019. Paper |
[84] Zichuan Liu, Guosheng Lin, et al, Towards Robust Curve Text Detection with Conditional Spatial Expansion. In CVPR, 2019. Paper |
[85] Xue C, Lu S, Zhang W. MSR: multi-scale shape regression for scene text detection. In IJCAI, 2019. Paper |
[86] Wang Y, Xie H, Fu Z, et al. DSRN: a deep scale relationship network for scene text detection. In IJCAI, 2019: 947-953. Paper |
[87] Elad Richardson, et al, It's All About The Scale -- Efficient Text Detection Using Adaptive Scaling. In WACV, 2020. Paper |
[88] Pengfei Wang, et al, A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning. In ACM MM, 2019. Paper |
[89] Jun Tang, et al, SegLink++: Detecting Dense and Arbitrary-shaped Scene Text by Instance-aware Component Grouping. In PR, 2019. Paper |
[90] Wenhai Wang, et al, Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network. In ICCV, 2019. Paper |
[91] Minghui Liao, et al, Real-time Scene Text Detection with Differentiable Binarization. In AAAI, 2020. Paper Code |
[92] Wang, Yuxin, et al. ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection. CVPR. 2020. Paper Code |
[93] Xiao, et al, Sequential Deformation for Accurate Scene Text Detection. In ECCV, 2020. Paper |
Datasets |
USTB-SV1K[65]: Xu-Cheng Yin, Xuwang Yin, Kaizhu Huang, and Hong-Wei Hao, Robust text detection in natural scene images, IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI), preprint, 2013. Paper |
SVT[66]: Wang, Kai, and S. Belongie. Word Spotting in the Wild. European Conference on Computer Vision(ECCV), 2010: 591-604. Paper |
ICDAR2005[67]: Lucas, S: ICDAR 2005 text locating competition results. In: ICDAR ,2005. Paper |
ICDAR2011[68]: Shahab, A, Shafait, F, Dengel, A: ICDAR 2011 robust reading competition challenge 2: Reading text in scene images. In: ICDAR, 2011. Paper |
ICDAR2013[69]: D. Karatzas, F. Shafait, S. Uchida, et al. ICDAR 2013 robust reading competition. In ICDAR, 2013. Paper |
ICDAR2015[70]: D. Karatzas, L. Gomez-Bigorda, A. Nicolaou, S. K. Ghosh, A. D. Bagdanov, M. Iwamura, J. Matas, L. Neumann, V. R. Chandrasekhar, S. Lu, F. Shafait, S. Uchida, and E. Valveny. ICDAR 2015 competition on robust reading. In ICDAR, pages 1156–1160, 2015. Paper |
MSRA-TD500[71]: C. Yao, X. Bai, W. Liu, Y. Ma, and Z. Tu, Detecting texts of arbitrary orientations in natural images. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2012, pp. 1083–1090. Paper |
COCO-Text[72]: Veit A, Matera T, Neumann L, et al. Coco-text: Dataset and benchmark for text detection and recognition in natural images. arXiv preprint arXiv:1601.07140, 2016. Paper |
RCTW-17[73]: Shi B, Yao C, Liao M, et al. ICDAR2017 competition on reading Chinese text in the wild (RCTW-17). Document Analysis and Recognition (ICDAR), 2017 14th IAPR International Conference on. IEEE, 2017, 1: 1429-1434. Paper |
Total-Text[74]: Chee C K, Chan C S. Total-text: A comprehensive dataset for scene text detection and recognition. Document Analysis and Recognition (ICDAR), 2017 14th IAPR International Conference on. IEEE, 2017, 1: 935-942. Paper |
SCUT-CTW1500[75]: Yuliang L, Lianwen J, Shuaitao Z, et al. Curved Scene Text Detection via Transverse and Longitudinal Sequence Connection. Pattern Recognition, 2019. Paper |
MLT 2017[76]: Nayef, N; Yin, F; Bizid, I; et al. ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification (RRC-MLT). In Document Analysis and Recognition (ICDAR), 2017 14th IAPR International Conference on, volume 1, 1454–1459. IEEE. Paper |
OSTD[77]: Chucai Yi and YingLi Tian, Text string detection from natural scenes by structure-based partition and grouping, In IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2594–2605, 2011. Paper |
CTW[78]: Yuan T L, Zhu Z, Xu K, et al. Chinese Text in the Wild. arXiv preprint arXiv:1803.00085, 2018. Paper |
If you find any problems in our resources, or if we have missed any good papers or code, please let us know at liuchongyu1996@gmail.com. Thank you for your contribution.
Copyright
Copyright © 2019 SCUT-DLVC. All Rights Reserved.