Abstract
Existing scene text detection methods achieve state-of-the-art performance by designing elaborate anchors or complex post-processing. Nonetheless, most methods still face the dilemma of detecting adjacent texts as one instance, or long text with large character spacing as multiple fragments. To tackle these problems, we propose an anchor-free scene text detector, named CRNet, that leverages a Center-aware Representation to achieve accurate arbitrary-shaped scene text detection. First, we propose a center-aware location algorithm that explicitly learns the center regions and center points of text instances, which effectively separates adjacent text instances. Second, a multi-scale context extraction module, capable of adaptively extracting local context, long-range dependencies, and global context, is designed to effectively perceive long text with large character spacing. Finally, a low-level feature enhancement block is introduced to strengthen the geometric information of text. Extensive experiments on several benchmarks, including SCUT-CTW1500, Total-Text, ICDAR2015, ICDAR2017 MLT, and MSRA-TD500, demonstrate the effectiveness of our method. Specifically, without any anchors or complicated post-processing, our CRNet achieves F-measures of 84.2% and 85.1% on CTW1500 and MSRA-TD500 respectively, outperforming all state-of-the-art anchor-based and anchor-free methods.
Zhou, Y., Xie, H., Fang, S., Li, Y., & Zhang, Y. (2020). CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 2571–2580). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3413565