Introduction #
The authors introduce GDXray+, the GRIMA X-ray Database, a public collection of 20,966 X-ray images that is free to use, but only for research and educational purposes. The database covers five groups of X-ray images: castings, welds, baggages, natural objects (nature), and settings. Each group includes several series, and each series comprises multiple X-ray images. Most series are annotated, providing either the coordinates of bounding boxes for objects of interest or labels for the images, stored in standard text files. GDXray, with a size of 3.5 GB, is available for download on the authors' website. The authors assert that GDXray contributes to the X-ray testing community by offering a resource for developing, testing, and evaluating image analysis and computer vision algorithms without the need for expensive X-ray equipment. They also emphasize its use as a benchmark for comparing different approaches on the same data, and its potential use in training programs for human inspectors.
Public databases of X-ray images exist for medical imaging but, to the best of the authors' knowledge, there has been no public database of digital X-ray images for X-ray testing. To serve the X-ray testing community, the authors collected around 21k X-ray images for the development, testing, and evaluation of image analysis and computer vision algorithms. They note that GDXray's structure supports a variety of applications and experiments, and they describe how the database is organized into groups, series, and individual X-ray images.
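The introduction says that bounding boxes are stored in standard text files but does not spell out the layout. The sketch below shows one way such a file could be read in Python. The column order (image index, x1, x2, y1, y2), the function name `parse_boxes`, and the sample rows are all assumptions made for illustration, not the documented format; check the files shipped with each series before relying on it.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed column order, for illustration only: image index, x1, x2, y1, y2.
Box = Tuple[float, float, float, float]


def parse_boxes(text: str) -> Dict[int, List[Box]]:
    """Group bounding boxes by image index from whitespace-separated rows."""
    boxes: Dict[int, List[Box]] = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed rows
        # int(float(...)) also accepts values written like "1.0000e+00"
        image_idx = int(float(fields[0]))
        x1, x2, y1, y2 = (float(value) for value in fields[1:])
        boxes[image_idx].append((x1, x2, y1, y2))
    return dict(boxes)


# Made-up sample rows, not taken from the database.
sample = """\
1 100 140 200 230
1 300 320 50 75
2 10 60 20 90
"""

for image_idx, image_boxes in parse_boxes(sample).items():
    print(image_idx, image_boxes)
```

Grouping by image index keeps all boxes of one image together, which is usually the shape a detection pipeline wants.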
Castings
The Castings group, comprising 2,727 X-ray images in 67 series, focuses on automotive parts, particularly aluminum wheels and knuckles. The authors present examples and details of each series, emphasizing that experiments on this data can be found in various publications. Notably, they highlight a series (C0001) that contains a sequence of 72 X-ray images taken from an aluminum wheel, along with annotations of bounding boxes for 226 small defects and the calibration matrix for each image.
Welds
The Welds group consists of 98 images in 3 series, taken by the BAM Federal Institute for Materials Research and Testing, Berlin, Germany. The authors showcase examples and detail each series, mentioning experiments conducted on this data and highlighting series (W0001 and W0002) that contain annotations for bounding boxes and binary images of the ground truth for 641 defects. Another series (W0003) includes 67 digitized radiographs from a round robin test on flaw recognition in welding seams, providing additional details on the data acquisition process.
Baggages
The Baggage group, containing 9,700 X-ray images in 77 series, focuses on images of various items such as backpacks, pen cases, and wallets. The authors illustrate examples and present details for each series, noting experiments conducted on this data and highlighting series (B0046, B0047, and B0048) that contain 600 X-ray images suitable for the automated detection of handguns, shuriken, and razor blades. These series include bounding box annotations for the objects of interest.
Nature
The Nature group encompasses 8,290 X-ray images in 13 series, featuring images of natural objects such as salmon filets, fruit, and wood pieces. The authors showcase examples and detail each series, emphasizing experiments conducted on this data. They highlight series (N0012 and N0013) that include annotations for bounding boxes and binary images of the ground truth for 73 fish bones. Additionally, they mention a series (N0003) that provides over 7,500 labeled small crops for training purposes.
Settings
The Settings group includes 151 X-ray images in 7 series, focusing on calibration objects like checkerboards and 3D objects with regular patterns. The authors present examples and provide details for each series, noting experiments conducted on this data. They highlight a series (S0001) that contains X-ray images of a copper checkerboard along with the calibration matrix for each view. Another series (S0007) can be used for modeling the distortion of an image intensifier, providing coordinates for each hole of the calibration pattern in each view along with the coordinates of the 3D model.
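The series IDs quoted in the group descriptions (C0001, W0001, B0046, N0012, S0001) and the image names listed in the Objects table (for example `B0063_0015.png`) suggest a naming pattern: a group letter, a four-digit series number, an underscore, and a four-digit image number. The Python sketch below decodes a file name under that assumption. The pattern and the helper `parse_image_name` are inferred from these examples and are hypothetical, not an official specification.

```python
import re
from typing import NamedTuple

# Group letters inferred from the series IDs quoted in the text.
GROUPS = {
    "C": "Castings",
    "W": "Welds",
    "B": "Baggages",
    "N": "Nature",
    "S": "Settings",
}


class ImageId(NamedTuple):
    group: str
    series: int
    image: int


# Assumed pattern: letter + 4-digit series + "_" + 4-digit image + ".png"
_PATTERN = re.compile(r"^([CWBNS])(\d{4})_(\d{4})\.png$")


def parse_image_name(name: str) -> ImageId:
    """Split a file name such as 'B0063_0015.png' into group, series, image."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    letter, series, image = match.groups()
    return ImageId(GROUPS[letter], int(series), int(image))


print(parse_image_name("B0063_0015.png"))
# ImageId(group='Baggages', series=63, image=15)
```

Raising on unexpected names, rather than guessing, makes it easier to spot series that do not follow the assumed pattern.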
Summary #
GDXray+: The GRIMA X-ray Database is a dataset for object detection and classification tasks. It is used in the security industry.
The dataset consists of 20966 images with 1831 labeled objects belonging to a single class (object of interest).
Images in the GDXray+ dataset have bounding box annotations. There are 19834 (95% of the total) unlabeled images (i.e. without annotations). There are no pre-defined train/val/test splits in the dataset. Alternatively, the dataset could be split into 5 groups: baggages (9700 images), nature (8290 images), castings (2727 images), settings (151 images), and welds (98 images). Additionally, every image has its own series tag. The dataset was released in 2015 by the Pontificia Universidad Catolica de Chile, Universidad de Atacama, Chile, BAM, Germany, Universidad de Santiago de Chile, and Universidad Adolfo Ibanez.
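The headline figures can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers quoted on this page: the five group sizes, the 1132 annotated images and the 1831 objects from the class balance table. The variable names are illustrative.

```python
# Group sizes as quoted in the summary.
group_sizes = {
    "baggages": 9700,
    "nature": 8290,
    "castings": 2727,
    "settings": 151,
    "welds": 98,
}

total_images = sum(group_sizes.values())
labeled_images = 1132  # images with at least one annotation (class balance table)
labeled_objects = 1831

unlabeled_images = total_images - labeled_images
unlabeled_share = 100 * unlabeled_images / total_images
objects_per_labeled_image = labeled_objects / labeled_images

print(total_images)                          # 20966
print(unlabeled_images)                      # 19834
print(round(unlabeled_share))                # 95
print(round(objects_per_labeled_image, 2))   # 1.62
```

The group sizes add up to the stated total, and the unlabeled share and the average object count per annotated image match the values reported on this page.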
Explore #
The GDXray+ dataset has 20966 images.
Class balance #
There is 1 annotation class in the dataset. The table below gives its general statistics.
Class | Images | Objects | Count on image average | Area on image average |
---|---|---|---|---|
object of interest (rectangle) | 1132 | 1831 | 1.62 | 4.99% |
Class sizes #
The table below gives various size properties of objects for every class.
Class | Object count | Avg area | Max area | Min area | Min height (px) | Min height (%) | Max height (px) | Max height (%) | Avg height (px) | Avg height (%) | Min width (px) | Min width (%) | Max width (px) | Max width (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
object of interest (rectangle) | 1831 | 3.09% | 40.49% | 0.03% | 9px | 2.1% | 853px | 99.83% | 85px | 13.53% | 9px | 1.43% | 553px | 64.38% |
Objects #
The dataset contains 1831 objects. The first 10 are listed below.
Object ID | Class | Image name | Image size (height x width) | Height (px) | Height (%) | Width (px) | Width (%) | Area (%) |
---|---|---|---|---|---|---|---|---|
1 | object of interest (rectangle) | B0063_0015.png | 1712 x 2136 | 313px | 18.28% | 208px | 9.74% | 1.78% |
2 | object of interest (rectangle) | B0073_0007.png | 1692 x 1688 | 193px | 11.41% | 319px | 18.9% | 2.16% |
3 | object of interest (rectangle) | C0054_0001.png | 574 x 768 | 40px | 6.97% | 41px | 5.34% | 0.37% |
4 | object of interest (rectangle) | C0054_0001.png | 574 x 768 | 30px | 5.23% | 28px | 3.65% | 0.19% |
5 | object of interest (rectangle) | C0045_0035.png | 256 x 256 | 12px | 4.69% | 18px | 7.03% | 0.33% |
6 | object of interest (rectangle) | C0045_0035.png | 256 x 256 | 16px | 6.25% | 17px | 6.64% | 0.42% |
7 | object of interest (rectangle) | C0030_0025.png | 572 x 768 | 21px | 3.67% | 21px | 2.73% | 0.1% |
8 | object of interest (rectangle) | C0030_0025.png | 572 x 768 | 20px | 3.5% | 26px | 3.39% | 0.12% |
9 | object of interest (rectangle) | B0001_0008.png | 2208 x 2688 | 313px | 14.18% | 313px | 11.64% | 1.65% |
10 | object of interest (rectangle) | C0057_0003.png | 574 x 768 | 26px | 4.53% | 30px | 3.91% | 0.18% |
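The percentage columns in the table above appear to be the box dimensions divided by the image dimensions, with the area taken as box area over image area. The Python sketch below reproduces the first rows under that assumption. The helper name `relative_size` and the rounding to two decimals are illustrative choices, not something this page specifies.

```python
from typing import Tuple


def relative_size(box_h: int, box_w: int, img_h: int, img_w: int) -> Tuple[float, float, float]:
    """Return box height, width and area as percentages of the image."""
    height_pct = 100 * box_h / img_h
    width_pct = 100 * box_w / img_w
    area_pct = 100 * (box_h * box_w) / (img_h * img_w)
    return round(height_pct, 2), round(width_pct, 2), round(area_pct, 2)


# Object 1: a 313 x 208 px box in B0063_0015.png (1712 x 2136 px)
print(relative_size(313, 208, 1712, 2136))  # (18.28, 9.74, 1.78)

# Object 2: a 193 x 319 px box in B0073_0007.png (1692 x 1688 px)
print(relative_size(193, 319, 1692, 1688))  # (11.41, 18.9, 2.16)
```

Both results agree with the corresponding table rows, which supports reading the percentage columns as fractions of the full image size.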
License #
The X-ray images included in GDXray+ can be used free of charge, for research and educational purposes only. Redistribution and commercial use are prohibited. Any researcher reporting results that use this database should acknowledge the GDXray+ database by citing:
Mery, D.; Riffo, V.; Zscherpel, U.; Mondragón, G.; Lillo, I.; Zuccar, I.; Lobel, H.; Carrasco, M. (2015): GDXray: The database of X-ray images for nondestructive testing. Journal of Nondestructive Evaluation, 34.4:1-12.
Citation #
If you make use of the GDXray+ data, please cite the following reference:
Mery, D.; Riffo, V.; Zscherpel, U.; Mondragón, G.; Lillo, I.; Zuccar, I.; Lobel, H.; Carrasco, M. (2015):
GDXray: The database of X-ray images for nondestructive testing.
Journal of Nondestructive Evaluation, 34.4:1-12.
If you are happy with Dataset Ninja and use provided visualizations and tools in your work, please cite us:
@misc{ visualization-tools-for-gdxray-dataset,
title = { Visualization Tools for GDXray+ Dataset },
type = { Computer Vision Tools },
author = { Dataset Ninja },
howpublished = { \url{ https://datasetninja.com/gdxray } },
url = { https://datasetninja.com/gdxray },
journal = { Dataset Ninja },
publisher = { Dataset Ninja },
year = { 2024 },
month = { nov },
note = { visited on 2024-11-21 },
}
Download #
Please visit the dataset homepage to download the data.
Disclaimer #
Our gal from the legal dep told us we need to post this:
Dataset Ninja provides visualizations and statistics for some datasets that can be found online and can be downloaded by a general audience. Dataset Ninja is not a dataset hosting platform and can only be used for informational purposes. The platform does not claim any rights for the original content, including images, videos, annotations and descriptions. Joint publishing is prohibited.
You take full responsibility when you use datasets presented at Dataset Ninja, as well as other information, including visualizations and statistics we provide. You are in charge of compliance with any dataset license and all other permissions. You are required to visit the dataset's homepage and make sure that you are permitted to use it. In case of any questions, get in touch with us at hello@datasetninja.com.