Dataset Ninja:

FSOD Dataset

Tag: general
Task: object detection
Release Year: 2020
License: unknown

Introduction #

Qi Fan, Wei Zhuo, Yu-Wing Tai

FSOD: A High-Diverse Few-Shot Object Detection Dataset is a dataset built specifically for few-shot object detection. It is designed to assess how well a model generalizes to novel categories. With 1000 diverse object categories and high-quality annotations, it is a pioneering effort among few-shot object detection datasets.

Motivation

Object detection finds extensive applications across various fields. However, current methods typically depend heavily on large sets of annotated data and entail prolonged training periods. They also struggle to adapt to unseen objects not included in the training data. In contrast, the human visual system excels at recognizing new objects with minimal guidance. Few-shot learning poses significant challenges due to the diverse characteristics of objects in real-world scenarios, including variations in illumination, shape, and texture. Despite recent advancements in few-shot learning, these techniques have primarily focused on image classification, with little exploration in the realm of few-shot object detection. This is primarily because transferring insights from few-shot classification to few-shot object detection presents considerable complexity. Few-shot object detection faces a critical obstacle in localizing unseen objects within cluttered backgrounds, representing a broader challenge in generalizing object localization from a scant number of training examples belonging to novel categories. This challenge often leads to missed detections or false positives, stemming from inadequate scoring of potentially suitable bounding boxes in region proposal networks (RPNs), rendering novel object detection difficult. Such inherent issues distinguish few-shot object detection from few-shot classification.

Dataset description

The authors endeavor to tackle the challenge of few-shot object detection. Their objective is to detect all foreground objects belonging to a specific target object category in a test set, given only a limited number of support set images depicting the target object. In pursuit of this goal, the authors present two significant contributions. Firstly, they introduce a comprehensive few-shot detection model capable of detecting novel objects without necessitating re-training or fine-tuning. Their approach leverages the matching relationship between pairs of objects within a siamese network across multiple network stages. Experimental results demonstrate that the model benefits from an attention module in the early stages, enhancing proposal quality, and a multi-relation module in the later stages, effectively suppressing and filtering out false detections in complex backgrounds. Secondly, for model training, the authors curate a large, meticulously annotated dataset comprising 1000 categories, each with a few examples. This dataset fosters broad learning in object detection.


Given different objects as supports, the authors' approach can detect all objects of the same categories in the given query image.

Dataset construction

The authors developed their dataset by drawing from existing large-scale supervised object detection datasets. However, these datasets cannot be used directly, for several reasons:

  • Inconsistent labeling systems across datasets: objects with identical semantics are denoted by different terms.

  • Suboptimal annotations: inaccuracies, missing labels, duplicate bounding boxes, and excessively large objects, among other issues.

  • Overlapping train/test categories: the train/test splits in these datasets often contain identical categories, whereas a few-shot dataset needs distinct categories in the train and test sets to evaluate the model’s generalization to unseen objects.

To construct their dataset, the authors first standardized the labeling system by consolidating labels with similar meanings, such as merging “ice bear” and “polar bear” into a single category, and by eliminating semantically irrelevant labels. They then filtered out images with subpar labeling quality and bounding boxes of inappropriate sizes. In particular, bounding boxes smaller than 0.05% of the image size were discarded, since they typically have poor visual quality and are unsuitable as support examples.
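The label merging and size filtering steps can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the alias table, the `(x, y, width, height)` box layout, the function names, and the reading of "image size" as image area are all assumptions made for the example.

```python
# Hypothetical alias table; only the "ice bear" example comes from the text.
LABEL_ALIASES = {"ice bear": "polar bear"}

MIN_AREA_RATIO = 0.0005  # 0.05% of the image area


def normalize_label(label):
    """Map a label to its merged category name (identity if no alias)."""
    return LABEL_ALIASES.get(label, label)


def keep_box(box, image_width, image_height, min_ratio=MIN_AREA_RATIO):
    """Return True if the box covers at least `min_ratio` of the image area.

    `box` is assumed to be (x, y, width, height) in pixels.
    """
    _, _, w, h = box
    return (w * h) / (image_width * image_height) >= min_ratio


def filter_boxes(boxes, image_width, image_height):
    """Drop boxes that are too small to serve as support examples."""
    return [b for b in boxes if keep_box(b, image_width, image_height)]


# Toy example: one reasonably sized box and one tiny 8x8 box.
boxes = [(10, 10, 200, 150), (5, 5, 8, 8)]
kept = filter_boxes(boxes, image_width=1024, image_height=768)
merged = normalize_label("ice bear")
```

In this toy case the 8x8 box covers well under 0.05% of a 1024x768 image, so only the first box is kept.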

Following the principles of few-shot learning, the authors then partitioned the data into training and test sets with no category overlap. The training set was built from categories in the MS COCO Dataset and ImageNet Dataset. For the test set, the 200 categories least similar to those in the training set were selected. The remaining categories were merged into the training set, for a total of 800 training categories.
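When working with a category-disjoint split like this one, a quick sanity check is to confirm that no category appears in both sets. The sketch below uses a hypothetical helper, `check_disjoint_split`, and toy category lists. It does not reproduce how the authors selected the least similar test categories.

```python
def check_disjoint_split(train_classes, test_classes):
    """Raise ValueError if any category appears in both splits.

    Returns the number of distinct categories in each split.
    """
    train, test = set(train_classes), set(test_classes)
    overlap = train & test
    if overlap:
        raise ValueError(f"categories in both splits: {sorted(overlap)}")
    return len(train), len(test)


# Toy category lists, not the real FSOD label set.
train_classes = ["cake", "wheelchair", "orange"]
test_classes = ["goose", "sandal"]
sizes = check_disjoint_split(train_classes, test_classes)
```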

In summary, the authors curated a dataset encompassing 1000 categories with distinct category splits for training and testing, with 531 categories sourced from the ImageNet Dataset and 469 from the Open Image Dataset.

Dataset analysis

The dataset is designed specifically for few-shot learning, and in particular for evaluating how well a model generalizes to novel categories. The authors' dataset contains 1000 categories with an 800/200 split between the training and test sets, and around 66,000 images and 182,000 bounding boxes in total. Detailed statistics are given below.

                  | Train        | Test
No. Class         | 800          | 200
No. Image         | 52350        | 14152
No. Box           | 147489       | 35102
Avg No. Box / Img | 2.82         | 2.48
Min No. Img / Cls | 22           | 30
Max No. Img / Cls | 208          | 199
Avg No. Img / Cls | 75.65        | 74.31
Box Size          | [6, 6828]    | [13, 4605]
Box Area Ratio    | [0.0009, 1]  | [0.0009, 1]
Box W/H Ratio     | [0.0216, 89] | [0.0199, 51.5]
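The headline figures can be cross-checked against the per-split statistics with a few lines of arithmetic. The snippet below only re-derives numbers already given in the table; the dictionary layout is an arbitrary choice for illustration.

```python
# Per-split counts copied from the statistics table.
stats = {
    "train": {"classes": 800, "images": 52350, "boxes": 147489},
    "test": {"classes": 200, "images": 14152, "boxes": 35102},
}

total_classes = sum(s["classes"] for s in stats.values())
total_images = sum(s["images"] for s in stats.values())
total_boxes = sum(s["boxes"] for s in stats.values())

# Average number of boxes per image, rounded as in the table.
avg_boxes_per_image = {
    split: round(s["boxes"] / s["images"], 2) for split, s in stats.items()
}

print(total_classes, total_images, total_boxes)
print(avg_boxes_per_image)
```

The totals come to 1000 categories, 66502 images and 182591 boxes, matching the "around 66,000 images and 182,000 bounding boxes" quoted in the text, and the averages reproduce the 2.82 and 2.48 boxes per image listed for the train and test sets.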


The dataset has the following attributes:

  • Extensive category diversity: The dataset boasts a wide range of semantic categories, encompassing 83 overarching parent semantics such as mammals, clothing, and weaponry, further branching out into 1000 distinct leaf categories. The rigorous dataset split implemented by the authors ensures that the semantic categories in the train and test sets are markedly dissimilar, posing a significant challenge for model evaluation.

  • Demanding evaluation conditions: Evaluating models on this dataset is challenging. Objects vary considerably in box size and aspect ratio, and 26.5% of the test images contain three or more objects. In addition, the test set includes numerous bounding boxes for categories that are not included in the label system, which adds a further layer of complexity to evaluation.
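A figure like the share of images with three or more objects can be computed from a flat list of annotations. The sketch below assumes each annotation is an `(image_id, class_name)` pair, one per bounding box; that layout and the function name are illustrative only, and the toy data is not taken from FSOD.

```python
from collections import Counter


def share_with_at_least(annotations, min_objects=3):
    """Fraction of annotated images that contain at least `min_objects` boxes.

    `annotations` is an iterable of (image_id, class_name) pairs, one per box.
    Images without any box never appear in the input and are not counted.
    """
    per_image = Counter(image_id for image_id, _ in annotations)
    if not per_image:
        return 0.0
    crowded = sum(1 for n in per_image.values() if n >= min_objects)
    return crowded / len(per_image)


# Toy annotations: image "a" has 3 boxes, "b" has 1, "c" has 2, "d" has 4.
annotations = (
    [("a", "cat")] * 3 + [("b", "dog")] + [("c", "cat")] * 2 + [("d", "mug")] * 4
)
share = share_with_at_least(annotations)
```

Here two of the four toy images have three or more boxes, so `share` is 0.5.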


Summary #

FSOD: A High-Diverse Few-Shot Object Detection Dataset is a dataset for an object detection task. It is applicable or relevant across various domains.

The dataset consists of 66502 images with 182591 labeled objects belonging to 800 different classes, including cake, wheelchair, orange, and others: window blind, lipstick, houseplant, guitar, salad, mug, goose, sandal, van, shower cap, worm, shelf, shirt, hedgehog, pillow, doll, backpack, cat, swan, wall clock, butterfly, camper, countertop, dagger, flag, and 772 more.

Images in the FSOD dataset have bounding box annotations. All images are labeled (i.e. with annotations). There are 2 splits in the dataset: train (52350 images) and test (14152 images). Additionally, every image contains information about its sequence. Every label contains information about its supercategory. Explore it in the Supervisely labeling tool. The dataset was released in 2020 by the CN-US joint research group.

Here is a visualized example for randomly selected sample classes:

Explore #

The FSOD dataset has 66502 images. Click on one of the examples below or open the "Explore" tool anytime you need to view dataset images with annotations. This tool has extended visualization capabilities like zoom, translation, objects table, custom filters and more. Hover the mouse over the images to hide or show annotations.


Class balance #

There are 800 annotation classes in the dataset. Find the general statistics and balances for every class in the table below. Click any row to preview images that have labels of the selected class. Sort by column to find the rarest or most prevalent classes.

Rows 1-10 of 800

Class                    | Images | Objects | Count on image (average) | Area on image (average)
cake (rectangle)         | 275    | 593     | 2.16                     | 28.37%
wheelchair (rectangle)   | 272    | 1041    | 3.83                     | 33.91%
orange (rectangle)       | 265    | 1489    | 5.62                     | 33.66%
window blind (rectangle) | 259    | 1175    | 4.54                     | 37.65%
lipstick (rectangle)     | 258    | 750     | 2.91                     | 21.81%
houseplant (rectangle)   | 258    | 981     | 3.8                      | 33.64%
guitar (rectangle)       | 257    | 472     | 1.84                     | 35.93%
salad (rectangle)        | 254    | 344     | 1.35                     | 52.27%
mug (rectangle)          | 241    | 385     | 1.6                      | 36.38%
goose (rectangle)        | 240    | 974     | 4.06                     | 28.35%
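For the rows shown, the "Count on image (average)" value is consistent with dividing the number of objects by the number of images for that class. The check below recomputes it from the listed values. That this is how the site derives the column is an assumption, although all ten rows agree after rounding to two decimals.

```python
# (class, images, objects, reported average count per image), from the table.
rows = [
    ("cake", 275, 593, 2.16),
    ("wheelchair", 272, 1041, 3.83),
    ("orange", 265, 1489, 5.62),
    ("window blind", 259, 1175, 4.54),
    ("lipstick", 258, 750, 2.91),
    ("houseplant", 258, 981, 3.8),
    ("guitar", 257, 472, 1.84),
    ("salad", 254, 344, 1.35),
    ("mug", 241, 385, 1.6),
    ("goose", 240, 974, 4.06),
]

# Collect any class whose recomputed average disagrees with the table.
mismatches = [
    name
    for name, images, objects, reported in rows
    if abs(objects / images - reported) > 0.005
]
print(mismatches)
```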

Images #

Explore every single image in the dataset with respect to the number of annotations of each class it has. Click a row to preview the selected image. Sort by any column to find anomalies and edge cases. Use horizontal scroll if the table has many columns for a large number of classes in the dataset.

Object distribution #

The interactive heatmap chart shows, for every class, how many images in the dataset contain a certain number of objects of that class. Users can click a cell to see the list of all corresponding images.
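The underlying counts for such a chart can be built from a flat annotation list. In the sketch below, each annotation is assumed to be an `(image_id, class_name)` pair, one per box; this layout and the function name are illustrative and not taken from the site's tooling.

```python
from collections import Counter


def object_distribution(annotations):
    """Map (class_name, objects_per_image) to the number of matching images."""
    # First count boxes per (image, class), then count images per (class, n).
    per_image_class = Counter(annotations)
    return Counter((cls, n) for (_, cls), n in per_image_class.items())


# Toy annotations, not real FSOD data.
annotations = [
    ("a", "cat"), ("a", "cat"), ("a", "dog"),
    ("b", "cat"),
    ("c", "cat"), ("c", "cat"),
]
dist = object_distribution(annotations)
```

In this toy data, two images contain exactly two cats, one image contains exactly one cat, and one image contains exactly one dog.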

Class sizes #

The table below gives various size properties of objects for every class. Click a row to see the image with annotations of the selected class. Sort columns to find classes with the smallest or largest objects or understand the size differences between classes.

Rows 1-10 of 800

Class                       | Object count | Avg area | Max area | Min area | Min height    | Max height      | Avg height     | Min width    | Max width
butterfly (rectangle)       | 2285         | 2.9%     | 97.34%   | 0.12%    | 25px (2.44%)  | 1121px (98.05%) | 128px (13.67%) | 11px (1.07%) | 2895px (100%)
cookie (rectangle)          | 1716         | 4.78%    | 81.72%   | 0.11%    | 14px (1.95%)  | 2910px (100%)   | 175px (20.74%) | 15px (1.95%) | 2793px (99.87%)
barrel (rectangle)          | 1683         | 5.82%    | 100%     | 0.1%     | 18px (2.25%)  | 3274px (100%)   | 179px (21.16%) | 18px (1.76%) | 2026px (100%)
strawberry (rectangle)      | 1517         | 4.87%    | 64.02%   | 0.21%    | 24px (2.34%)  | 875px (99.7%)   | 161px (21.69%) | 12px (2.6%)  | 797px (82.39%)
orange (rectangle)          | 1489         | 6.46%    | 99.76%   | 0.11%    | 10px (1.31%)  | 1024px (100%)   | 158px (20.56%) | 18px (1.76%) | 3042px (100%)
football helmet (rectangle) | 1256         | 7.26%    | 100%     | 0.1%     | 18px (2.61%)  | 3588px (100%)   | 212px (22.26%) | 8px (1.6%)   | 3998px (100%)
window blind (rectangle)    | 1175         | 8.56%    | 100%     | 0.11%    | 10px (2.67%)  | 2446px (100%)   | 267px (32.8%)  | 9px (0.88%)  | 3262px (100%)
tomato (rectangle)          | 1161         | 5.05%    | 90.12%   | 0.11%    | 20px (2.83%)  | 1406px (100%)   | 157px (20.24%) | 22px (2.15%) | 1406px (100%)
lavender (rectangle)        | 1118         | 5.19%    | 100%     | 0.1%     | 9px (1.95%)   | 1024px (100%)   | 225px (29.01%) | 13px (1.66%) | 1024px (100%)
wheelchair (rectangle)      | 1041         | 9.92%    | 89.26%   | 0.12%    | 25px (3.26%)  | 1024px (100%)   | 292px (39.04%) | 19px (1.86%) | 989px (100%)

Spatial Heatmap #

The heatmaps below give the spatial distribution of all objects for every class. These visualizations show where objects are most and least likely to appear in the image, which helps analyze object placement in the dataset.
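One simple way to build a spatial heatmap of this kind is to accumulate box coverage on a coarse grid. The sketch below assumes boxes are given as `(x_min, y_min, x_max, y_max)` normalized to the range 0 to 1, and counts a grid cell as covered when its center lies inside the box. It is an illustration of the general idea, not a description of how Dataset Ninja renders its heatmaps.

```python
def spatial_heatmap(boxes, grid=4):
    """Count, per grid cell, how many boxes cover the cell center.

    `boxes` holds (x_min, y_min, x_max, y_max) tuples in normalized
    image coordinates. Returns a grid x grid list of lists (rows first).
    """
    heat = [[0] * grid for _ in range(grid)]
    for x0, y0, x1, y1 in boxes:
        for row in range(grid):
            for col in range(grid):
                cx = (col + 0.5) / grid  # cell center, x
                cy = (row + 0.5) / grid  # cell center, y
                if x0 <= cx <= x1 and y0 <= cy <= y1:
                    heat[row][col] += 1
    return heat


# Two overlapping toy boxes.
boxes = [(0.0, 0.0, 0.5, 0.5), (0.25, 0.25, 0.75, 0.75)]
heat = spatial_heatmap(boxes, grid=4)
for row in heat:
    print(row)
```

A finer grid gives a smoother map at the cost of more work per box; for large datasets an array library would be a more practical choice than nested lists.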


Objects #

The table contains all 100387 objects. Click a row to preview an image with annotations, and use search or pagination to navigate. Sort columns to find outliers in the dataset.

Rows 1-10 of 100387

Object ID | Class                     | Image name           | Image size (height x width) | Height         | Width          | Area
1         | tripod (rectangle)        | 673d8edfcddb2c87.jpg | 1024 x 992                  | 708px (69.14%) | 740px (74.6%)  | 51.58%
2         | spoonbill (rectangle)     | n02006656_2881.jpg   | 375 x 500                   | 65px (17.33%)  | 76px (15.2%)   | 2.63%
3         | spoonbill (rectangle)     | n02006656_2881.jpg   | 375 x 500                   | 68px (18.13%)  | 122px (24.4%)  | 4.42%
4         | swing (rectangle)         | n04371774_14204.jpg  | 332 x 500                   | 284px (85.54%) | 37px (7.4%)    | 6.33%
5         | swing (rectangle)         | n04371774_14204.jpg  | 332 x 500                   | 151px (45.48%) | 229px (45.8%)  | 20.83%
6         | letter opener (rectangle) | n03658185_3637.jpg   | 303 x 575                   | 292px (96.37%) | 278px (48.35%) | 46.59%
7         | letter opener (rectangle) | n03658185_3637.jpg   | 303 x 575                   | 262px (86.47%) | 252px (43.83%) | 37.9%
8         | king penguin (rectangle)  | n02056570_10131.jpg  | 500 x 331                   | 429px (85.8%)  | 113px (34.14%) | 29.29%
9         | king penguin (rectangle)  | n02056570_10131.jpg  | 500 x 331                   | 425px (85%)    | 134px (40.48%) | 34.41%
10        | clog (rectangle)          | n03047690_16008.jpg  | 500 x 469                   | 287px (57.4%)  | 147px (31.34%) | 17.99%
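The percentage columns in this table follow from the pixel sizes: relative height and width are the box dimensions divided by the image dimensions, and relative area is their product. The helper below, whose name is an assumption for illustration, reproduces the values for a few of the rows listed.

```python
def relative_size(box_h, box_w, img_h, img_w):
    """Return (height %, width %, area %) of a box relative to its image."""
    height_pct = 100 * box_h / img_h
    width_pct = 100 * box_w / img_w
    area_pct = 100 * (box_h * box_w) / (img_h * img_w)
    return round(height_pct, 2), round(width_pct, 2), round(area_pct, 2)


# Object 1 (tripod): 708px x 740px box in a 1024 x 992 image.
tripod = relative_size(708, 740, 1024, 992)
# Object 4 (swing): 284px x 37px box in a 332 x 500 image.
swing = relative_size(284, 37, 332, 500)
# Object 10 (clog): 287px x 147px box in a 500 x 469 image.
clog = relative_size(287, 147, 500, 469)
print(tripod, swing, clog)
```

These give 69.14%, 74.6% and 51.58% for the tripod, 85.54%, 7.4% and 6.33% for the swing, and 57.4%, 31.34% and 17.99% for the clog, in line with the table.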

License #

The license is unknown for the FSOD: A High-Diverse Few-Shot Object Detection Dataset.


Citation #

If you make use of the FSOD data, please cite the following reference:

@dataset{FSOD,
  author={Qi Fan and Wei Zhuo and Yu-Wing Tai},
  title={FSOD: A High-Diverse Few-Shot Object Detection Dataset},
  year={2020},
  url={https://github.com/fanq15/Few-Shot-Object-Detection-Dataset?tab=readme-ov-file}
}


If you are happy with Dataset Ninja and use provided visualizations and tools in your work, please cite us:

@misc{ visualization-tools-for-fsod-dataset,
  title = { Visualization Tools for FSOD Dataset },
  type = { Computer Vision Tools },
  author = { Dataset Ninja },
  howpublished = { \url{ https://datasetninja.com/fsod } },
  url = { https://datasetninja.com/fsod },
  journal = { Dataset Ninja },
  publisher = { Dataset Ninja },
  year = { 2024 },
  month = { nov },
  note = { visited on 2024-11-21 },
}

Download #

Please visit the dataset homepage to download the data.


Disclaimer #

Our gal from the legal dep told us we need to post this:

Dataset Ninja provides visualizations and statistics for some datasets that can be found online and can be downloaded by general audience. Dataset Ninja is not a dataset hosting platform and can only be used for informational purposes. The platform does not claim any rights for the original content, including images, videos, annotations and descriptions. Joint publishing is prohibited.

You take full responsibility when you use datasets presented at Dataset Ninja, as well as other information, including visualizations and statistics we provide. You are in charge of compliance with any dataset license and all other permissions. You are required to visit the dataset's homepage and make sure that you can use the data. In case of any questions, get in touch with us at hello@datasetninja.com.