duyongkun committed
Commit e16ae54 · 1 Parent(s): 5de2f8f

update app

Files changed (1)
  1. README.md +10 -283
README.md CHANGED
@@ -1,283 +1,10 @@
- <div align="center">
-
- <h1> OpenOCR: A general OCR system with accuracy and efficiency </h1>
-
- <h5 align="center"> If you find this project useful, please give us a star🌟. </h5>
-
- <a href="https://github.com/Topdu/OpenOCR/blob/main/LICENSE"><img alt="license" src="https://img.shields.io/github/license/Topdu/OpenOCR"></a>
- <a href='https://arxiv.org/abs/2411.15858'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
- <a href="https://huggingface.co/spaces/topdu/OpenOCR-Demo" target="_blank"><img src="https://img.shields.io/badge/%F0%9F%A4%97-Hugging Face Demo-blue"></a>
- <a href="https://modelscope.cn/studios/topdktu/OpenOCR-Demo" target="_blank"><img src="https://img.shields.io/badge/魔搭-Demo-blue"></a>
- <a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-pink.svg"></a>
- <a href="https://github.com/Topdu/OpenOCR/graphs/contributors"><img src="https://img.shields.io/github/contributors/Topdu/OpenOCR?color=9ea"></a>
- <a href="https://pepy.tech/project/openocr"><img src="https://static.pepy.tech/personalized-badge/openocr?period=total&units=abbreviation&left_color=grey&right_color=blue&left_text=Clone%20downloads"></a>
- <a href="https://github.com/Topdu/OpenOCR/stargazers"><img src="https://img.shields.io/github/stars/Topdu/OpenOCR?color=ccf"></a>
- <a href="https://pypi.org/project/openocr-python/"><img alt="PyPI" src="https://img.shields.io/pypi/v/openocr-python"><img src="https://img.shields.io/pypi/dm/openocr-python?label=PyPI%20downloads"></a>
-
- <a href="#quick-start"> 🚀 Quick Start </a> | English | [简体中文](./README_ch.md)
-
- </div>
-
- ______________________________________________________________________
-
- We aim to establish a unified benchmark for training and evaluating models in scene text detection and recognition. Building on this benchmark, we introduce **OpenOCR**, a general OCR system that balances accuracy and efficiency. This repository also serves as the official codebase of the OCR team from the [FVL Laboratory](https://fvl.fudan.edu.cn), Fudan University.
-
- We sincerely welcome researchers to recommend OCR or related algorithms, and to point out any potential factual errors or bugs. Upon receiving such suggestions, we will promptly evaluate and critically reproduce them. We look forward to collaborating with you to advance the development of OpenOCR and to continuously contribute to the OCR community!
-
- ## Features
-
- 🔥**UniRec: Unified Text and Formula Recognition Across Granularities**
-
- ⚡\[[Doc](./docs/unirec.md)\] \[[Model](https://huggingface.co/topdu/unirec_100m)\] \[[ModelScope Demo](https://www.modelscope.cn/studios/topdktu/OpenOCR-UniRec-Demo)\] \[[Hugging Face Demo](https://huggingface.co/spaces/topdu/OpenOCR-UniRec-Demo)\] \[[Local Demo](./docs/unirec.md#local-demo)\] \[Paper coming soon\]
- Recognizes plain text (words, lines, paragraphs), formulas (single-line and multi-line), and mixed text-and-formula content.
- 0.1B parameters.
- Trained from scratch on 50M samples, without pre-training.
-
- 🔥**OpenOCR: A general OCR system with accuracy and efficiency**
-
- ⚡\[[Quick Start](#quick-start)\] \[[Model](https://github.com/Topdu/OpenOCR/releases/tag/develop0.0.1)\] \[[ModelScope Demo](https://modelscope.cn/studios/topdktu/OpenOCR-Demo)\] \[[Hugging Face Demo](https://huggingface.co/spaces/topdu/OpenOCR-Demo)\] \[[Local Demo](#local-demo)\] \[[PaddleOCR Implementation](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/text_recognition/algorithm_rec_svtrv2.html)\]
- [Introduction](./docs/openocr.md)
- A practical OCR system built on SVTRv2.
- Outperforms the [PP-OCRv4](https://paddlepaddle.github.io/PaddleOCR/latest/ppocr/model_list.html) baseline by 4.5% in accuracy on the [OCR competition leaderboard](https://aistudio.baidu.com/competition/detail/1131/0/leaderboard), while maintaining similar inference speed.
- [x] Supports Chinese and English text detection and recognition.
- [x] Provides server and mobile models.
- [x] Supports fine-tuning OpenOCR on a custom dataset: [Fine-tuning Det](./docs/finetune_det.md), [Fine-tuning Rec](./docs/finetune_rec.md).
- [x] [ONNX model export for wider compatibility](#export-onnx-model).
-
- 🔥**SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition (ICCV 2025)**
-
- \[[Paper](https://arxiv.org/abs/2411.15858)\] \[[Doc](./configs/rec/svtrv2/)\] \[[Model](./configs/rec/svtrv2/readme.md#11-models-and-results)\] \[[Datasets](./docs/svtrv2.md#downloading-datasets)\] \[[Config, Training and Inference](./configs/rec/svtrv2/readme.md#3-model-training--evaluation)\] \[[Benchmark](./docs/svtrv2.md#results-benchmark--configs--checkpoints)\]
- [Introduction](./docs/svtrv2.md)
- A unified training and evaluation benchmark (built on top of [Union14M](https://github.com/Mountchicken/Union14M?tab=readme-ov-file#3-union14m-dataset)) for scene text recognition.
- Supports 24 scene text recognition methods trained from scratch on the large-scale real dataset [Union14M-L-Filter](./docs/svtrv2.md#dataset-details); the latest methods will continue to be added.
- Improves accuracy by 20-30% compared to models trained on synthetic datasets.
- Moves towards arbitrary-shaped text recognition and language modeling with a single visual model.
- Surpasses attention-based encoder-decoder methods in both accuracy and speed across challenging scenarios.
- [Get Started](./docs/svtrv2.md#get-started-with-training-a-sota-scene-text-recognition-model-from-scratch) with training a SOTA scene text recognition model from scratch.
-
- ## Our STR Algorithms
-
- [**SVTRv2**](./configs/rec/svtrv2) (*Yongkun Du, Zhineng Chen\*, Hongtao Xie, Caiyan Jia, Yu-Gang Jiang. SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition,* ICCV 2025. [Doc](./configs/rec/svtrv2/), [Paper](https://arxiv.org/abs/2411.15858))
- [**IGTR**](./configs/rec/igtr/) (*Yongkun Du, Zhineng Chen\*, Yuchen Su, Caiyan Jia, Yu-Gang Jiang. Instruction-Guided Scene Text Recognition,* TPAMI 2025. [Doc](./configs/rec/igtr), [Paper](https://ieeexplore.ieee.org/document/10820836))
- [**CPPD**](./configs/rec/cppd/) (*Yongkun Du, Zhineng Chen\*, Caiyan Jia, Xiaoting Yin, Chenxia Li, Yuning Du, Yu-Gang Jiang. Context Perception Parallel Decoder for Scene Text Recognition,* TPAMI 2025. [PaddleOCR Doc](https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/algorithm/text_recognition/algorithm_rec_cppd.en.md), [Paper](https://ieeexplore.ieee.org/document/10902187))
- [**SMTR&FocalSVTR**](./configs/rec/smtr/) (*Yongkun Du, Zhineng Chen\*, Caiyan Jia, Xieping Gao, Yu-Gang Jiang. Out of Length Text Recognition with Sub-String Matching,* AAAI 2025. [Doc](./configs/rec/smtr/), [Paper](https://ojs.aaai.org/index.php/AAAI/article/view/32285))
- [**DPTR**](./configs/rec/dptr/) (*Shuai Zhao, Yongkun Du, Zhineng Chen\*, Yu-Gang Jiang. Decoder Pre-Training with only Text for Scene Text Recognition,* ACM MM 2024. [Paper](https://dl.acm.org/doi/10.1145/3664647.3681390))
- [**CDistNet**](./configs/rec/cdistnet/) (*Tianlun Zheng, Zhineng Chen\*, Shancheng Fang, Hongtao Xie, Yu-Gang Jiang. CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition,* IJCV 2024. [Paper](https://link.springer.com/article/10.1007/s11263-023-01880-0))
- **MRN** (*Tianlun Zheng, Zhineng Chen\*, Bingchen Huang, Wei Zhang, Yu-Gang Jiang. MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition,* ICCV 2023. [Paper](https://openaccess.thecvf.com/content/ICCV2023/html/Zheng_MRN_Multiplexed_Routing_Network_for_Incremental_Multilingual_Text_Recognition_ICCV_2023_paper.html), [Code](https://github.com/simplify23/MRN))
- **TPS++** (*Tianlun Zheng, Zhineng Chen\*, Jinfeng Bai, Hongtao Xie, Yu-Gang Jiang. TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition,* IJCAI 2023. [Paper](https://arxiv.org/abs/2305.05322), [Code](https://github.com/simplify23/TPS_PP))
- [**SVTR**](./configs/rec/svtr/) (*Yongkun Du, Zhineng Chen\*, Caiyan Jia, Xiaoting Yin, Tianlun Zheng, Chenxia Li, Yuning Du, Yu-Gang Jiang. SVTR: Scene Text Recognition with a Single Visual Model,* IJCAI 2022 (Long). [PaddleOCR Doc](https://github.com/Topdu/PaddleOCR/blob/main/doc/doc_ch/algorithm_rec_svtr.md), [Paper](https://www.ijcai.org/proceedings/2022/124))
- [**NRTR**](./configs/rec/nrtr/) (*Fenfen Sheng, Zhineng Chen, Bo Xu. NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition,* ICDAR 2019. [Paper](https://arxiv.org/abs/1806.00926))
-
- ## Recent Updates
-
- **2025.07.10**: Our paper [SVTRv2](https://arxiv.org/abs/2411.15858) is accepted by ICCV 2025. See [Doc](./configs/rec/svtrv2/).
-
- **2025.03.24**: 🔥 Released the feature of fine-tuning OpenOCR on a custom dataset: [Fine-tuning Det](./docs/finetune_det.md), [Fine-tuning Rec](./docs/finetune_rec.md).
-
- **2025.03.23**: 🔥 Released the feature of [ONNX model export for wider compatibility](#export-onnx-model).
-
- **2025.02.22**: Our paper [CPPD](https://ieeexplore.ieee.org/document/10902187) is accepted by TPAMI. See [Doc](./configs/rec/cppd/) and [PaddleOCR Doc](https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/algorithm/text_recognition/algorithm_rec_cppd.en.md).
-
- **2024.12.31**: Our paper [IGTR](https://ieeexplore.ieee.org/document/10820836) is accepted by TPAMI. See [Doc](./configs/rec/igtr/).
-
- **2024.12.16**: Our paper [SMTR](https://ojs.aaai.org/index.php/AAAI/article/view/32285) is accepted by AAAI 2025. See [Doc](./configs/rec/smtr/).
-
- **2024.12.03**: The pre-training code for [DPTR](https://dl.acm.org/doi/10.1145/3664647.3681390) is merged.
-
- **🔥 2024.11.23 release notes**:
-
- **OpenOCR: A general OCR system with accuracy and efficiency**
- ⚡\[[Quick Start](#quick-start)\] \[[Model](https://github.com/Topdu/OpenOCR/releases/tag/develop0.0.1)\] \[[ModelScope Demo](https://modelscope.cn/studios/topdktu/OpenOCR-Demo)\] \[[Hugging Face Demo](https://huggingface.co/spaces/topdu/OpenOCR-Demo)\] \[[Local Demo](#local-demo)\] \[[PaddleOCR Implementation](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/text_recognition/algorithm_rec_svtrv2.html)\]
- [Introduction](./docs/openocr.md)
- **SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition**
- \[[Paper](https://arxiv.org/abs/2411.15858)\] \[[Doc](./configs/rec/svtrv2/)\] \[[Model](./configs/rec/svtrv2/readme.md#11-models-and-results)\] \[[Datasets](./docs/svtrv2.md#downloading-datasets)\] \[[Config, Training and Inference](./configs/rec/svtrv2/readme.md#3-model-training--evaluation)\] \[[Benchmark](./docs/svtrv2.md#results--configs--checkpoints)\]
- [Introduction](./docs/svtrv2.md)
- [Get Started](./docs/svtrv2.md#get-started-with-training-a-sota-scene-text-recognition-model-from-scratch) with training a SOTA scene text recognition model from scratch.
-
- ## Quick Start
-
- **Note**: OpenOCR supports inference with both the ONNX and Torch frameworks, and their dependency environments are isolated from each other. When using ONNX for inference, there is no need to install Torch, and vice versa.
-
- ### 1. ONNX Inference
-
- #### Install OpenOCR and Dependencies:
-
- ```shell
- pip install openocr-python
- pip install onnxruntime
- ```
-
- #### Usage:
-
- ```python
- from openocr import OpenOCR
- onnx_engine = OpenOCR(backend='onnx', device='cpu')
- img_path = '/path/img_folder or /path/img_file'
- result, elapse = onnx_engine(img_path)
- ```
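-
- A minimal sketch of consuming the returned values follows. The precise schema of `result` is not documented in this README, so the sketch only prints it as-is; treat any deeper field access as an assumption:
-
- ```python
- from openocr import OpenOCR
-
- onnx_engine = OpenOCR(backend='onnx', device='cpu')
- # A single call returns the recognition output and the elapsed time.
- result, elapse = onnx_engine('/path/img_file')
-
- # Print as-is rather than assuming a particular result schema.
- print(result)
- print(f'inference time: {elapse}')
- ```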
-
- ### 2. PyTorch Inference
-
- #### Dependencies:
-
- [PyTorch](http://pytorch.org/) version >= 1.13.0
- Python version >= 3.7
-
- ```shell
- conda create -n openocr python=3.8
- conda activate openocr
- # Install the GPU version of PyTorch
- conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia
- # or the CPU version
- conda install pytorch torchvision torchaudio cpuonly -c pytorch
- ```
-
- After installing the dependencies, choose either of the following two installation methods.
-
- #### 2.1. Python Modules
-
- **Install OpenOCR**:
-
- ```shell
- pip install openocr-python
- ```
-
- **Usage**:
-
- ```python
- from openocr import OpenOCR
- engine = OpenOCR()
- img_path = '/path/img_folder or /path/img_file'
- result, elapse = engine(img_path)
-
- # Server mode
- # engine = OpenOCR(mode='server')
- ```
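-
- Since `img_path` may point to a folder or a single file, a whole directory can be handed to the engine in one call. Alternatively, a per-file loop collects timings individually; the sketch below assumes `.jpg` inputs for illustration, which is not a constraint of the API:
-
- ```python
- from pathlib import Path
- from openocr import OpenOCR
-
- engine = OpenOCR()  # or OpenOCR(mode='server') for the server model
-
- timings = {}
- for img_file in sorted(Path('/path/img_folder').glob('*.jpg')):
-     # Each call returns the recognition result and the elapsed time.
-     result, elapse = engine(str(img_file))
-     timings[img_file.name] = elapse
- print(timings)
- ```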
-
- #### 2.2. Clone this repository:
-
- ```shell
- git clone https://github.com/Topdu/OpenOCR.git
- cd OpenOCR
- pip install -r requirements.txt
- wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/openocr_det_repvit_ch.pth
- wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/openocr_repsvtr_ch.pth
- # Rec Server model
- # wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/openocr_svtrv2_ch.pth
- ```
-
- **Usage**:
-
- ```shell
- # OpenOCR system: Det + Rec model
- python tools/infer_e2e.py --img_path=/path/img_folder or /path/img_file
- # Det model
- python tools/infer_det.py --c ./configs/det/dbnet/repvit_db.yml --o Global.infer_img=/path/img_folder or /path/img_file
- # Rec model
- python tools/infer_rec.py --c ./configs/rec/svtrv2/repsvtr_ch.yml --o Global.infer_img=/path/img_folder or /path/img_file
- ```
-
- ##### Export ONNX model
-
- ```shell
- pip install onnx
- python tools/toonnx.py --c configs/rec/svtrv2/repsvtr_ch.yml --o Global.device=cpu
- python tools/toonnx.py --c configs/det/dbnet/repvit_db.yml --o Global.device=cpu
- ```
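-
- To sanity-check an exported model, it can be opened directly with ONNX Runtime. The path below is hypothetical; the actual location of the `.onnx` file depends on the save directory configured in the yml:
-
- ```python
- import onnxruntime as ort
-
- # Hypothetical path: adjust to wherever toonnx.py wrote the exported model.
- sess = ort.InferenceSession('./output/repsvtr_ch.onnx',
-                             providers=['CPUExecutionProvider'])
- # Listing the graph inputs confirms the export loads correctly.
- print([(i.name, i.shape) for i in sess.get_inputs()])
- ```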
-
- ##### Inference with ONNXRuntime
-
- ```shell
- pip install onnxruntime
- # OpenOCR system: Det + Rec model
- python tools/infer_e2e.py --img_path=/path/img_folder or /path/img_file --backend=onnx --device=cpu
- # Det model
- python tools/infer_det.py --c ./configs/det/dbnet/repvit_db.yml --o Global.backend=onnx Global.device=cpu Global.infer_img=/path/img_folder or /path/img_file
- # Rec model
- python tools/infer_rec.py --c ./configs/rec/svtrv2/repsvtr_ch.yml --o Global.backend=onnx Global.device=cpu Global.infer_img=/path/img_folder or /path/img_file
- ```
-
- #### Local Demo
-
- ```shell
- pip install gradio==4.20.0
- wget https://github.com/Topdu/OpenOCR/releases/download/develop0.0.1/OCR_e2e_img.tar
- tar xf OCR_e2e_img.tar
- # start demo
- python demo_gradio.py
- ```
-
- ## Reproduction Schedule
-
- ### Scene Text Recognition
-
- | Method | Venue | Training | Evaluation | Contributor |
- | --------------------------------------------- | ---------------------------------------------------------------------------------------------- | -------- | ---------- | ------------------------------------------- |
- | [CRNN](./configs/rec/svtrs/) | [TPAMI 2016](https://arxiv.org/abs/1507.05717) | ✅ | ✅ | |
- | [ASTER](./configs/rec/aster/) | [TPAMI 2019](https://ieeexplore.ieee.org/document/8395027) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
- | [NRTR](./configs/rec/nrtr/) | [ICDAR 2019](https://arxiv.org/abs/1806.00926) | ✅ | ✅ | |
- | [SAR](./configs/rec/sar/) | [AAAI 2019](https://aaai.org/papers/08610-show-attend-and-read-a-simple-and-strong-baseline-for-irregular-text-recognition/) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
- | [MORAN](./configs/rec/moran/) | [PR 2019](https://www.sciencedirect.com/science/article/abs/pii/S0031320319300263) | ✅ | ✅ | |
- | [DAN](./configs/rec/dan/) | [AAAI 2020](https://arxiv.org/pdf/1912.10205) | ✅ | ✅ | |
- | [RobustScanner](./configs/rec/robustscanner/) | [ECCV 2020](https://www.ecva.net/papers/eccv_2020/papers_ECCV/html/3160_ECCV_2020_paper.php) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
- | [AutoSTR](./configs/rec/autostr/) | [ECCV 2020](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123690732.pdf) | ✅ | ✅ | |
- | [SRN](./configs/rec/srn/) | [CVPR 2020](https://openaccess.thecvf.com/content_CVPR_2020/html/Yu_Towards_Accurate_Scene_Text_Recognition_With_Semantic_Reasoning_Networks_CVPR_2020_paper.html) | ✅ | ✅ | [pretto0](https://github.com/pretto0) |
- | [SEED](./configs/rec/seed/) | [CVPR 2020](https://openaccess.thecvf.com/content_CVPR_2020/html/Qiao_SEED_Semantics_Enhanced_Encoder-Decoder_Framework_for_Scene_Text_Recognition_CVPR_2020_paper.html) | ✅ | ✅ | |
- | [ABINet](./configs/rec/abinet/) | [CVPR 2021](https://openaccess.thecvf.com//content/CVPR2021/html/Fang_Read_Like_Humans_Autonomous_Bidirectional_and_Iterative_Language_Modeling_for_CVPR_2021_paper.html) | ✅ | ✅ | [YesianRohn](https://github.com/YesianRohn) |
- | [VisionLAN](./configs/rec/visionlan/) | [ICCV 2021](https://openaccess.thecvf.com/content/ICCV2021/html/Wang_From_Two_to_One_A_New_Scene_Text_Recognizer_With_ICCV_2021_paper.html) | ✅ | ✅ | [YesianRohn](https://github.com/YesianRohn) |
- | PIMNet | [ACM MM 2021](https://dl.acm.org/doi/10.1145/3474085.3475238) | | | TODO |
- | [SVTR](./configs/rec/svtrs/) | [IJCAI 2022](https://www.ijcai.org/proceedings/2022/124) | ✅ | ✅ | |
- | [PARSeq](./configs/rec/parseq/) | [ECCV 2022](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136880177.pdf) | ✅ | ✅ | |
- | [MATRN](./configs/rec/matrn/) | [ECCV 2022](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136880442.pdf) | ✅ | ✅ | |
- | [MGP-STR](./configs/rec/mgpstr/) | [ECCV 2022](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136880336.pdf) | ✅ | ✅ | |
- | [LPV](./configs/rec/lpv/) | [IJCAI 2023](https://www.ijcai.org/proceedings/2023/0189.pdf) | ✅ | ✅ | |
- | [MAERec](./configs/rec/maerec/) (Union14M) | [ICCV 2023](https://openaccess.thecvf.com/content/ICCV2023/papers/Jiang_Revisiting_Scene_Text_Recognition_A_Data_Perspective_ICCV_2023_paper.pdf) | ✅ | ✅ | |
- | [LISTER](./configs/rec/lister/) | [ICCV 2023](https://openaccess.thecvf.com/content/ICCV2023/papers/Cheng_LISTER_Neighbor_Decoding_for_Length-Insensitive_Scene_Text_Recognition_ICCV_2023_paper.pdf) | ✅ | ✅ | |
- | [CDistNet](./configs/rec/cdistnet/) | [IJCV 2024](https://link.springer.com/article/10.1007/s11263-023-01880-0) | ✅ | ✅ | [YesianRohn](https://github.com/YesianRohn) |
- | [BUSNet](./configs/rec/busnet/) | [AAAI 2024](https://ojs.aaai.org/index.php/AAAI/article/view/28402) | ✅ | ✅ | |
- | DCTC | [AAAI 2024](https://ojs.aaai.org/index.php/AAAI/article/view/28575) | | | TODO |
- | [CAM](./configs/rec/cam/) | [PR 2024](https://arxiv.org/abs/2402.13643) | ✅ | ✅ | |
- | [OTE](./configs/rec/ote/) | [CVPR 2024](https://openaccess.thecvf.com/content/CVPR2024/html/Xu_OTE_Exploring_Accurate_Scene_Text_Recognition_Using_One_Token_CVPR_2024_paper.html) | ✅ | ✅ | |
- | CFF | [IJCAI 2024](https://arxiv.org/abs/2407.05562) | | | TODO |
- | [DPTR](./configs/rec/dptr/) | [ACM MM 2024](https://dl.acm.org/doi/10.1145/3664647.3681390) | | | [fd-zs](https://github.com/fd-zs) |
- | VIPTR | [ACM CIKM 2024](https://arxiv.org/abs/2401.10110) | | | TODO |
- | [IGTR](./configs/rec/igtr/) | [TPAMI 2025](https://ieeexplore.ieee.org/document/10820836) | ✅ | ✅ | |
- | [SMTR](./configs/rec/smtr/) | [AAAI 2025](https://ojs.aaai.org/index.php/AAAI/article/view/32285) | ✅ | ✅ | |
- | [CPPD](./configs/rec/cppd/) | [TPAMI 2025](https://ieeexplore.ieee.org/document/10902187) | ✅ | ✅ | |
- | [FocalSVTR-CTC](./configs/rec/svtrs/) | [AAAI 2025](https://ojs.aaai.org/index.php/AAAI/article/view/32285) | ✅ | ✅ | |
- | [SVTRv2](./configs/rec/svtrv2/) | [ICCV 2025](https://arxiv.org/abs/2411.15858) | ✅ | ✅ | |
- | [ResNet+Trans-CTC](./configs/rec/svtrs/) | | ✅ | ✅ | |
- | [ViT-CTC](./configs/rec/svtrs/) | | ✅ | ✅ | |
-
- #### Contributors
-
- ______________________________________________________________________
-
- Yiming Lei ([pretto0](https://github.com/pretto0)), Xingsong Ye ([YesianRohn](https://github.com/YesianRohn)), and Shuai Zhao ([fd-zs](https://github.com/fd-zs)) from the [FVL Laboratory](https://fvl.fudan.edu.cn), Fudan University, completed the majority of the algorithm reproduction work under the guidance of Dr. Zhineng Chen ([Homepage](https://zhinchenfd.github.io/)). We are grateful for their outstanding contributions.
-
- ### Scene Text Detection (STD)
-
- TODO
-
- ### Text Spotting
-
- TODO
-
- ______________________________________________________________________
-
- ## Citation
-
- If you find our method useful for your research, please cite:
-
- ```bibtex
- @inproceedings{Du2024SVTRv2,
-   title={SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition},
-   author={Yongkun Du and Zhineng Chen and Hongtao Xie and Caiyan Jia and Yu-Gang Jiang},
-   booktitle={ICCV},
-   year={2025}
- }
- ```
-
- ## Acknowledgement
-
- This codebase is built on [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PytorchOCR](https://github.com/WenmuZhou/PytorchOCR), and [MMOCR](https://github.com/open-mmlab/mmocr). Thanks for their awesome work!
 
+ title: OpenOCR UniRecDemo
+ emoji: 😻
+ colorFrom: pink
+ colorTo: pink
+ sdk: gradio
+ sdk_version: 5.6.0
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ short_description: 'OCR System. Homepage: https://github.com/Topdu/OpenOCR'