Pink-Eggs Dataset V1 Dataset

Tag: biology, environmental
Task: object detection
Release Year: 2023
License: GNU GPL 2.0
Download: 10 GB

Introduction #

Di Xu, Yang Zhao, Xiang Hao et al.

Pink-Eggs Dataset V1 is curated for object detection tasks in the environmental domain. It comprises 1261 images with 2518 labeled objects belonging to a single class: eggs. The images show pink eggs identified as belonging to the species Pomacea canaliculata, each annotated with precise bounding boxes. Its primary objective is to serve as a resource for researchers who use deep learning techniques to analyze and understand the distribution and proliferation of Pomacea canaliculata. The dataset also supports other investigations that rely on visual data of Pomacea canaliculata eggs, aiding studies in ecological research and environmental sciences.

Motivation

The authors were motivated by the urgent ecological threat posed by the invasive apple snail species Pomacea canaliculata. The species originates from South America, and its rapid global spread, triggered by human activities, has damaged wetland ecosystems, potentially endangering native species and human health. Diverse control methods are under consideration, ranging from pesticides to natural predators, each with distinct risks and benefits. Against this background, the authors developed the Pink-Eggs Dataset. The initiative uses machine learning and computer vision to identify Pomacea canaliculata eggs by their distinct pink color and clustering pattern. This strategy offers a promising approach to invasive species management and also improves the authors' understanding of the behavior and population dynamics of such species, paving the way for more sustainable and environmentally friendly solutions. Further research in this area could yield new strategies for combating invasive species.

About Dataset

Four examples of Pomacea canaliculata detection result with object bounding box localization.

Based on the morphological characteristics of the observed eggs, as well as the presence of Pomacea canaliculata in the surrounding area, the authors infer that the specimens photographed in Shenzhen are the eggs of Pomacea canaliculata. The images were captured between October and December of 2022, during daylight hours and in clear weather. Close-range images were taken with a Redmi K50 Ultra mobile phone using default camera settings. Distant images were taken with a D7200 camera equipped with an 18-140mm lens in auto mode. In both cases, the images were saved in JPG format.

After distortions and imperfections were detected in the collected data, the data were cleaned by removing images that did not meet predetermined quality standards. Specifically, images that could be reliably identified as depicting Pomacea canaliculata eggs were retained, while severely blurred images were removed. Such defects could be attributed to distance, motion-induced distortion, size, and camera-specific effects. Three annotators labeled each image with bounding boxes using the labelImg tool, and all annotations were incorporated into the dataset to facilitate object detection and classification. To minimize subjectivity in the labeling process, random samples of labeled data were reviewed. The three sets of annotations also enable evaluation through methods such as cross-validation and bootstrapping. The average Intersection over Union (IoU) was calculated between every pair of annotation sets, and all values are above 0.87. Based on these measures, the authors are highly confident that the collected images are suitable for supporting research hypotheses.
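The agreement check described above can be illustrated with a minimal sketch. It assumes boxes are given as (xmin, ymin, xmax, ymax) tuples and that boxes from two annotators are already matched by position in their lists. The source does not describe how boxes were paired, so the matching step, the function names, and the sample coordinates are all hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width and height clamp to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mean_pairwise_iou(boxes_a, boxes_b):
    """Average IoU over boxes matched by position in the two lists (assumed pairing)."""
    if not boxes_a or len(boxes_a) != len(boxes_b):
        raise ValueError("expected two non-empty, equally long box lists")
    return sum(iou(a, b) for a, b in zip(boxes_a, boxes_b)) / len(boxes_a)


# Hypothetical boxes drawn by two annotators on the same image.
annotator_1 = [(100, 100, 200, 200), (300, 50, 400, 150)]
annotator_2 = [(110, 100, 200, 200), (300, 50, 400, 140)]
print(round(mean_pairwise_iou(annotator_1, annotator_2), 3))  # 0.9
```

Repeating this for each of the three annotator pairs would give three averages that can be compared against the reported 0.87 threshold.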

The dataset was partitioned into three mutually exclusive subsets: train, val, and test. The training set consists of 1000 images randomly selected from the dataset, while the val and test sets comprise 100 and 161 images, respectively. Additionally, embedded camera information was removed from all images while the original pixel values were preserved.
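A random, mutually exclusive split with these sizes could look like the sketch below. The seed, the function name, and the placeholder file names are assumptions for illustration. The dataset ships with its own fixed split, which this code does not reproduce.

```python
import random


def split_dataset(names, n_train=1000, n_val=100, seed=0):
    """Shuffle names and cut them into disjoint train/val/test lists."""
    if n_train + n_val > len(names):
        raise ValueError("not enough items for the requested split sizes")
    shuffled = list(names)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything left over
    return train, val, test


# Placeholder file names standing in for the 1261 images.
names = [f"image_{i:04d}.jpg" for i in range(1261)]
train, val, test = split_dataset(names)
print(len(train), len(val), len(test))  # 1000 100 161
```

Slicing one shuffled list guarantees that no image lands in more than one subset.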

To make the dataset more comprehensive, further images of Pomacea canaliculata eggs were sourced through online search engines. However, because explicit consent for their reuse, download, and distribution was lacking, these images could not be included in the final curation.

Summary #

Pink-Eggs Dataset V1 is a dataset for an object detection task. It is used in biological research.

The dataset consists of 1261 images with 2518 labeled objects belonging to a single class (eggs).

Images in the Pink-Eggs Dataset V1 dataset have bounding box annotations. All images are labeled (i.e. with annotations). There are 3 splits in the dataset: train (1000 images), test (161 images), and val (100 images). The dataset was released in 2023.

Explore #

The Pink-Eggs Dataset V1 dataset has 1261 images. The "Explore" tool on Dataset Ninja displays dataset images with their annotations and offers extended visualization capabilities such as zoom, translation, an objects table, and custom filters.

Class balance #

There is 1 annotation class in the dataset. Find the general statistics and balances for every class in the table below. Click any row to preview images that have labels of the selected class. Sort by column to find the rarest or most prevalent classes.

Rows 1-1 of 1

| Class | Images | Objects | Count on image (average) | Area on image (average) |
|---|---|---|---|---|
| eggs (rectangle) | 1261 | 2518 | 2 | 3.5% |

Images #

Explore every single image in the dataset with respect to the number of annotations of each class it has. Click a row to preview selected image. Sort by any column to find anomalies and edge cases. Use horizontal scroll if the table has many columns for a large number of classes in the dataset.

Object distribution #

The interactive heatmap chart of object distribution shows, for every class, how many images in the dataset contain a certain number of objects of that class. Users can click a cell to see the list of all corresponding images.

Class sizes #

The table below gives various size properties of objects for every class. Click a row to see the image with annotations of the selected class. Sort columns to find classes with the smallest or largest objects or understand the size differences between classes.

Rows 1-1 of 1

| Property | eggs (rectangle) |
|---|---|
| Object count | 2518 |
| Avg area | 1.76% |
| Max area | 36.66% |
| Min area | 0.01% |
| Min height | 35px (0.88%) |
| Max height | 2993px (74.83%) |
| Avg height | 383px (9.77%) |
| Min width | 36px (0.6%) |
| Max width | 2162px (72.07%) |

Spatial Heatmap #

The heatmaps below give the spatial distributions of all objects for every class. These visualizations provide insights into the most probable and rare object locations on the image. It helps analyze objects' placements in a dataset.

Objects #

The table contains all 2518 objects. Click a row to preview an image with annotations, and use search or pagination to navigate. Sort columns to find outliers in the dataset.

Rows 1-10 of 2518

| Object ID | Class | Image name | Image size (height x width) | Height (px) | Height (% of image) | Width (px) | Width (% of image) | Area |
|---|---|---|---|---|---|---|---|---|
| 1 | eggs (rectangle) | IMG_20220925_135206.jpg | 4000 x 3000 | 419 | 10.47% | 401 | 13.37% | 1.4% |
| 2 | eggs (rectangle) | _WGX2651.JPG | 4000 x 6000 | 174 | 4.35% | 101 | 1.68% | 0.07% |
| 3 | eggs (rectangle) | _WGX2651.JPG | 4000 x 6000 | 158 | 3.95% | 101 | 1.68% | 0.07% |
| 4 | eggs (rectangle) | _WGX2957.JPG | 4000 x 6000 | 180 | 4.5% | 135 | 2.25% | 0.1% |
| 5 | eggs (rectangle) | _WGX2957.JPG | 4000 x 6000 | 114 | 2.85% | 53 | 0.88% | 0.03% |
| 6 | eggs (rectangle) | _WGX2957.JPG | 4000 x 6000 | 279 | 6.97% | 140 | 2.33% | 0.16% |
| 7 | eggs (rectangle) | _WGX2957.JPG | 4000 x 6000 | 227 | 5.67% | 105 | 1.75% | 0.1% |
| 8 | eggs (rectangle) | _WGX2957.JPG | 4000 x 6000 | 188 | 4.7% | 92 | 1.53% | 0.07% |
| 9 | eggs (rectangle) | _WGX2957.JPG | 4000 x 6000 | 280 | 7% | 97 | 1.62% | 0.11% |
| 10 | eggs (rectangle) | _WGX2957.JPG | 4000 x 6000 | 175 | 4.38% | 83 | 1.38% | 0.06% |
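The percentage columns in the objects table appear to relate each box to its image: height and width as a share of the image height and width, and area as a share of the image area. That reading is inferred from the listed values, not from a documented formula, so the sketch below is only a consistency check. The function name is hypothetical.

```python
def relative_size(box_h, box_w, img_h, img_w):
    """Return (height %, width %, area %) of a box relative to its image."""
    height_pct = 100.0 * box_h / img_h
    width_pct = 100.0 * box_w / img_w
    area_pct = 100.0 * (box_h * box_w) / (img_h * img_w)
    return height_pct, width_pct, area_pct


# Object 1 from the table: a 419 x 401 px box in a 4000 x 3000 (height x width) image.
h_pct, w_pct, a_pct = relative_size(419, 401, 4000, 3000)
print(h_pct, w_pct, a_pct)
```

For object 1 the results land close to the listed 10.47%, 13.37%, and 1.4%, up to rounding.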

License #

Pink-Eggs Dataset V1 is released under the GNU GPL 2.0 license.

Citation #

If you make use of the Pink-Eggs Dataset V1 data, please cite the following reference:

@misc{xu2023pinkeggs,
  title={Pink-Eggs Dataset V1: A Step Toward Invasive Species Management Using Deep Learning Embedded Solutions}, 
  author={Di Xu and Yang Zhao and Xiang Hao and Xin Meng},
  year={2023},
  eprint={2305.09302},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

If you are happy with Dataset Ninja and use provided visualizations and tools in your work, please cite us:

@misc{ visualization-tools-for-pink-eggs-dataset-v1-dataset,
  title = { Visualization Tools for Pink-Eggs Dataset V1 Dataset },
  type = { Computer Vision Tools },
  author = { Dataset Ninja },
  howpublished = { \url{ https://datasetninja.com/pink-eggs-dataset-v1 } },
  url = { https://datasetninja.com/pink-eggs-dataset-v1 },
  journal = { Dataset Ninja },
  publisher = { Dataset Ninja },
  year = { 2024 },
  month = { nov },
  note = { visited on 2024-11-21 },
}

Download #

Pink-Eggs Dataset V1 can be downloaded in Supervisely format.

As an alternative, it can be downloaded with the dataset-tools package:

pip install --upgrade dataset-tools

… using the following Python code:

import dataset_tools as dtools

dtools.download(dataset='Pink-Eggs Dataset V1', dst_dir='~/dataset-ninja/')

Make sure not to overlook the Python code example available on the Supervisely Developer Portal. It will give you a clear idea of how to work with the downloaded dataset.
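As a rough illustration of reading bounding boxes once the data is on disk, the sketch below parses one annotation using only the standard library. It assumes the commonly documented Supervisely JSON layout, in which each object has a "classTitle", a "geometryType", and two corner points under "points" / "exterior". The sample annotation and its coordinates are made up, and the function name is hypothetical. Check the Supervisely documentation for the authoritative format.

```python
import json

# Made-up annotation following the assumed Supervisely JSON layout.
SAMPLE_ANNOTATION = """
{
  "size": {"height": 4000, "width": 3000},
  "objects": [
    {
      "classTitle": "eggs",
      "geometryType": "rectangle",
      "points": {"exterior": [[1200, 1500], [1600, 1918]], "interior": []}
    }
  ]
}
"""


def read_rectangles(annotation_text):
    """Return a list of (class, xmin, ymin, xmax, ymax) from one annotation."""
    ann = json.loads(annotation_text)
    boxes = []
    for obj in ann.get("objects", []):
        if obj.get("geometryType") != "rectangle":
            continue  # skip polygons, bitmaps, and other geometry types
        (x1, y1), (x2, y2) = obj["points"]["exterior"]
        boxes.append((obj["classTitle"], min(x1, x2), min(y1, y2),
                      max(x1, x2), max(y1, y2)))
    return boxes


print(read_rectangles(SAMPLE_ANNOTATION))  # [('eggs', 1200, 1500, 1600, 1918)]
```

In practice the text would come from an annotation file inside the downloaded dataset directory rather than from an inline string.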

The data in original format can be downloaded here.

Disclaimer #

Our gal from the legal dep told us we need to post this:

Dataset Ninja provides visualizations and statistics for some datasets that can be found online and can be downloaded by a general audience. Dataset Ninja is not a dataset hosting platform and can only be used for informational purposes. The platform does not claim any rights for the original content, including images, videos, annotations and descriptions. Joint publishing is prohibited.

You take full responsibility when you use datasets presented at Dataset Ninja, as well as other information, including visualizations and statistics we provide. You are in charge of compliance with any dataset license and all other permissions. You are required to visit the dataset's homepage and make sure that you can use it. In case of any questions, get in touch with us at hello@datasetninja.com.