The ArT scene text dataset contains 10,166 images.

OCR/Text Detection, Action/Event Detection, Image Data Classification

Data Structure (5.59G)

README.md

The ArT dataset contains 10,166 images, split into a training set of 5,603 images and a test set of 4,563 images.

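ArT is a combination of Total-Text [4], SCUT-CTW1500 [5], and Baidu Curved Scene Text. These texts were collected to introduce the arbitrary-shaped text problem to the scene text community. On top of the existing images (3,055), more than 7,111 images were added to the mixture of the two datasets, making ArT one of the larger scene text datasets currently available. The ArT dataset has 10,166 images in total, split into a training set of 5,603 images and a test set of 4,563 newly collected images.

The ArT dataset was collected with text shape diversity in mind, so all existing text shapes (that is, horizontal, multi-oriented, and curved) appear in large numbers. This makes the dataset distinctive, because most existing datasets [1, 2, 3] are dominated by horizontal and multi-oriented text instances.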

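Text instances in the ArT dataset are annotated with (a) quadrilateral bounding boxes and polygon bounding boxes of 8, 10, and 12 vertices (refer to the Tasks tab for details), and (b) transcriptions. These two types of annotation cover the (a) text detection, (b) recognition, and (c) text spotting tasks posed in this challenge.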
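The two annotation types described above can be explored with a short script. This is only a sketch: this page does not show the label schema, so the layout used here (a JSON object mapping an image ID to a list of instances, each with `points` and `transcription` fields) is an assumption, and the sample record is made up for illustration. Check it against the actual `train_labels.json` before relying on it.

```python
import json
from collections import Counter

# Hypothetical sample in an ASSUMED schema: image ID -> list of text instances.
# The real train_labels.json may use different keys; verify before use.
SAMPLE_LABELS = json.loads("""
{
  "gt_0": [
    {"points": [[10, 10], [90, 10], [90, 40], [10, 40]],
     "transcription": "EXIT"},
    {"points": [[5, 60], [30, 50], [60, 48], [90, 55],
                [88, 75], [60, 68], [30, 70], [8, 80]],
     "transcription": "CAFE"}
  ]
}
""")


def vertex_histogram(labels):
    """Count text instances by number of polygon vertices (4, 8, 10, 12, ...)."""
    counts = Counter()
    for instances in labels.values():
        for inst in instances:
            counts[len(inst["points"])] += 1
    return dict(counts)


def transcriptions(labels):
    """Collect the transcription string of every annotated instance."""
    return [inst["transcription"] for insts in labels.values() for inst in insts]


if __name__ == "__main__":
    print(vertex_histogram(SAMPLE_LABELS))
    print(transcriptions(SAMPLE_LABELS))
```

To run it on the real labels, replace `SAMPLE_LABELS` with `json.load(open("train_labels.json"))`, provided the file follows the assumed layout. A histogram dominated by 4-vertex entries would indicate mostly quadrilateral boxes, while 8, 10, and 12 correspond to the polygon annotations.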

Data structure:

Training set

For Task 1 and Task 3

train_images.tar.gz (1.6G) - 5,603 images
train_labels.json (41M) - ground-truth file for the 5,603 images

For Task 2

train_task2_images.tar.gz (439M) - 50,029 images
train_labels_task2.json (35M) - ground-truth file for the 50,029 images
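After downloading, the listed image counts can be checked against the archives. The following is a minimal sketch using Python's standard `tarfile` module. The set of image file extensions and the helper names `count_images` and `check` are assumptions for illustration, not part of the dataset.

```python
import tarfile

# Assumed image extensions; adjust if the archives use other formats.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")

# Expected counts as given in the training set listing.
EXPECTED = {
    "train_images.tar.gz": 5603,
    "train_task2_images.tar.gz": 50029,
}


def count_images(archive_path):
    """Return the number of regular files with an image extension in a tar archive."""
    with tarfile.open(archive_path, "r:*") as tar:
        return sum(
            1
            for member in tar
            if member.isfile() and member.name.lower().endswith(IMAGE_SUFFIXES)
        )


def check(archive_path, expected):
    """Return (matches, found) comparing the counted images with the listed number."""
    found = count_images(archive_path)
    return found == expected, found
```

For example, `check("train_images.tar.gz", EXPECTED["train_images.tar.gz"])` should return `(True, 5603)` if the archive is complete and the extension assumption holds. The archive is streamed member by member, so nothing has to be extracted to disk first.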

Test set

First part of the test set:

For Task 1 and Task 3

test_part1_images.tar.gz (1.4G) - 2,271 images

For Task 2

test_part1_task2_images.tar.gz (439M) - 24,836 images

Last part of the test set:

For Task 1 and Task 3

test_part2_images.tar.gz (1.4G) - 2,292 images

For Task 2

test_part2_task2_images.tar.gz (467M) - 27,795 images
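The counts quoted in the description and in the file listing can be cross-checked with simple arithmetic. The sketch below only restates numbers given on this page; the variable and function names are illustrative.

```python
# Image counts taken from the description and the file listing.
EXISTING_IMAGES = 3055   # images from the existing datasets
ADDED_IMAGES = 7111      # the text says "more than 7,111"; treated as exact here
TRAIN_IMAGES = 5603
TEST_IMAGES = 4563
TOTAL_IMAGES = 10166

# Task 1 / Task 3 test archives.
TEST_PARTS = {
    "test_part1_images.tar.gz": 2271,
    "test_part2_images.tar.gz": 2292,
}

# Task 2 test archives.
TASK2_TEST_PARTS = {
    "test_part1_task2_images.tar.gz": 24836,
    "test_part2_task2_images.tar.gz": 27795,
}


def consistency_report():
    """Return named boolean checks on the published counts."""
    return {
        "existing + added == total": EXISTING_IMAGES + ADDED_IMAGES == TOTAL_IMAGES,
        "train + test == total": TRAIN_IMAGES + TEST_IMAGES == TOTAL_IMAGES,
        "test parts == test set": sum(TEST_PARTS.values()) == TEST_IMAGES,
    }


def task2_test_total():
    """Total number of Task 2 test images across both parts."""
    return sum(TASK2_TEST_PARTS.values())


if __name__ == "__main__":
    for name, ok in consistency_report().items():
        print(f"{name}: {ok}")
    print("Task 2 test images:", task2_test_total())
```

All three checks hold for the numbers given: the two test archives for Task 1 and Task 3 together account for the full 4,563-image test set.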

References

[1] Karatzas, Dimosthenis, et al. "ICDAR 2013 robust reading competition." 12th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2013.
[2] Karatzas, Dimosthenis, et al. "ICDAR 2015 competition on robust reading." 13th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2015.
[3] Gomez, Raul, et al. "ICDAR2017 robust reading challenge on COCO-Text." 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2017.
[4] Ch'ng, Chee Kheng, and Chee Seng Chan. "Total-Text: A comprehensive dataset for scene text detection and recognition." 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Vol. 1. IEEE, 2017.
[5] Yuliang, Liu, Lianwen, Jin, et al. "Curved Scene Text Detection via Transverse and Longitudinal Sequence Connection." Pattern Recognition, 2019.
[6] C. Chng, Y. Liu, Y. Sun, et al. "ICDAR 2019 Robust Reading Challenge on Arbitrary-Shaped Text - RRC-ArT", in Proc. of ICDAR 2019.

Data usage instructions:

I. Data Source and Display Explanation:

1. The data originates from internet data collection or provided by service providers, and this platform offers users the ability to view and browse datasets.

2. This platform serves only as a basic information display for datasets, including but not limited to image, text, video, and audio file types.

3. Basic dataset information comes from the original data source or the information provided by the data provider. If there are discrepancies in the dataset description, please refer to the original data source or service provider's address.

II. Ownership Explanation:

1. All datasets on this site are copyrighted by their original publishers or data providers.

III. Data Reposting Explanation:

1. If you need to repost data from this site, please retain the original data source URL and related copyright notices.

IV. Infringement and Handling Explanation:

1. If any data on this site involves infringement, please contact us promptly, and we will arrange for the data to be taken offline.
