Introduction #
The authors present FracAtlas: A Dataset for Fracture Classification, Localization and Segmentation of Musculoskeletal Radiographs, a dataset of X-Ray scans curated from the images collected from 3 major hospitals in Bangladesh. Their dataset includes 4,083 images that have been manually annotated for bone fracture classification, localization, and segmentation with the help of 2 expert radiologists and an orthopedist using the open-source labeling platform, makesense.ai. There are 717 images with 922 instances of fractures. Each of the fracture instances has its own mask and bounding box, whereas the scans also have global labels for classification tasks.
Digital radiography is one of the most common and cost-effective standards for the diagnosis of bone fractures. Such diagnoses require expert intervention, which is time-consuming and demands rigorous training. With the recent growth of computer vision algorithms, there is a surge of interest in computer-aided diagnosis. Developing these algorithms demands large datasets with proper annotations. Existing X-Ray datasets are either small or lack proper annotation, which hinders both the development of machine-learning algorithms and the evaluation of their relative performance for classification, localization, and segmentation.
Dataset creation
The authors have created the FracAtlas dataset in four main steps:
1) Data Collection - general purpose X-ray images were collected in DICOM format and for de-identification, the images were converted to JPG and were given arbitrary names,
2) Data Cleaning - the resulting JPG image set was filtered to remove scans of other body parts,
3) Finding the general distribution of cleaned data - the resulting image set was taken back to the respective hospitals to find out the general distribution,
4) Annotation of the dataset - the resulting image set was annotated by 2 expert radiologists and later verified and merged by an expert orthopedic doctor.
The workflow for creating the FracAtlas dataset.
Throughout 2021 and 2022, a total of 14,068 X-ray scans were collected from 3 hospitals and diagnostic centers. Most of the scans came from Lab-Aid Medical Center, Brahmanbaria; the rest came from Anupam General Hospital and Diagnostic Center, Bogra, and Prime Diagnostic Center, Barishal. The acquired DICOM images were generated by Fujifilm and Philips devices. Ethical clearance for the study was granted by the Institutional Research Ethics Board (IREB) in accordance with the Bangladesh Medical Research Council (BMRC). The IREB approved open publication of the data on the basis that adequate provisions were in place to maintain the confidentiality of individuals through proper filtering of personally identifiable information. Permission to publish the data in the public domain was also obtained at the source. Consent for data collection from all subjects (adults, and parents in the case of minors) was obtained when the diagnosis was initiated at the medical facilities. Because the hospitals and diagnostic centers could not share patient information due to privacy concerns, all DICOM images were given arbitrary names and converted to JPG format, which discarded the sensitive information stored in the DICOM metadata. Alongside bone fracture scans, the collection also contained samples of chest diseases and abnormalities of the skull and spine. Fracture patterns in the chest, skull, and spine were rare in the collected data, so scans of these parts were removed under the supervision of a doctor. This left the authors with 4,083 scans of the hand, leg, hip and shoulder regions.
Dataset distribution
There are 717 abnormal scans in the dataset, containing a total of 922 fracture instances. Each abnormal scan contains at least 1 and at most 5 fracture instances. Some of the scans have multiple views and locales in them. There are 396 images with different views of the same organ in the same image. There are 99 images with orthopedic fixation devices (hardware) in them. The FracAtlas dataset has a total of 1,538 hand scans, of which 437 are fractured. There are 2,272 leg scans, 338 hip scans and 349 shoulder scans, of which 263, 63 and 63 respectively belong to the fractured class. These properties are marked with the hand, leg, hip, shoulder and fractured tags.
The distribution of different locales along with other properties present in the images of the FracAtlas dataset.
The FracAtlas dataset comprises a total of 2,503 frontal, 1,492 lateral, and 418 oblique view images, each pertaining to different organs; they are marked with the frontal, lateral and oblique tags respectively.
The number of samples present for each of the frontal, lateral and oblique views present in the FracAtlas dataset for individual classes.
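The region and view counts quoted above add up to more than the 4,083 images (and more than the 717 fractured scans). The short check below makes that arithmetic explicit. One plausible reading, offered here as an assumption rather than a statement from the authors, is that a single image can carry more than one region tag or view tag, for example when it shows several body parts or several views.

```python
# Counts quoted in the dataset description above.
TOTAL_IMAGES = 4083
TOTAL_FRACTURED = 717

region_scans = {"hand": 1538, "leg": 2272, "hip": 338, "shoulder": 349}
region_fractured = {"hand": 437, "leg": 263, "hip": 63, "shoulder": 63}
view_images = {"frontal": 2503, "lateral": 1492, "oblique": 418}


def surplus(counts, total):
    """Number of tag assignments beyond one per image."""
    return sum(counts.values()) - total


region_surplus = surplus(region_scans, TOTAL_IMAGES)
fractured_surplus = surplus(region_fractured, TOTAL_FRACTURED)
view_surplus = surplus(view_images, TOTAL_IMAGES)

# Share of fractured scans per region, as a percentage.
fracture_rate = {
    region: round(100 * region_fractured[region] / region_scans[region], 1)
    for region in region_scans
}

print(region_surplus, fractured_surplus, view_surplus)
print(fracture_rate)
```

A positive surplus in all three cases means the per-region and per-view counts cannot be read as disjoint partitions of the dataset.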
Dataset labeling
The distribution analysis of the data was followed by a review process by two expert radiologists, each with years of experience in the field. The radiologists went through all 4,083 images and labeled each image by identifying the presence and number of fractures, along with the location name of the fractures. After full observation, the fracture lists generated by the two radiologists were cross-checked against each other. The images that had unanimous labels from the radiologists were taken as fractured scans. In case of any disparities in the location of fractures or the count of fracture locales, the images were referred to an expert orthopedic surgeon for further review and validation. After labeling the referred images independently, the surgeon cross-checked his own findings against those of the radiologists, and the final labels were agreed upon after comparing all 3 sets of findings. After resolving all conflicts, the images were manually annotated. Each image can have multiple locales marked by separate masks, and different masks are allowed to overlap. The number of fractures is marked by the fracture_count tag.
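The first stage of this review (accept unanimous findings, refer disagreements to a third reader) can be sketched roughly as follows. This is a minimal illustration, not the authors' tooling: the `Reading` structure, the `triage` function, the image names and the bone names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One reader's finding for one image (hypothetical structure)."""
    fracture_count: int
    locations: frozenset


def triage(reader_a, reader_b):
    """Split images into agreed findings and names needing a third review."""
    agreed, disputed = {}, []
    for name in sorted(reader_a.keys() | reader_b.keys()):
        a, b = reader_a.get(name), reader_b.get(name)
        if a is not None and a == b:
            agreed[name] = a          # same count and same locations
        else:
            disputed.append(name)     # refer to the third reader
    return agreed, disputed


# Hypothetical findings from two readers.
radiologist_1 = {
    "img_a.jpg": Reading(1, frozenset({"radius"})),
    "img_b.jpg": Reading(2, frozenset({"tibia", "fibula"})),
}
radiologist_2 = {
    "img_a.jpg": Reading(1, frozenset({"radius"})),
    "img_b.jpg": Reading(1, frozenset({"tibia"})),
}

agreed, disputed = triage(radiologist_1, radiologist_2)
print(sorted(agreed), disputed)
```

Comparing both the count and the set of locations mirrors the two kinds of disparity mentioned in the text; how the final three-way agreement was reached is not specified in enough detail to encode here.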
Fully tagged and labeled sample image. (A) shows the original scan with global tags leg, hardware, fractured set to 1 (true) and fracture count set to 2. The remaining tags (hand, hip, shoulder, mixed, multiscan) are set to 0 (false). (B) The boxes mark the local region of the fracture instance for localization tasks. (C) The red borders mask the fracture regions for segmentation tasks.
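The global tags described in this caption could be represented as a flat record of 0/1 flags plus a fracture count. The sketch below is one possible in-memory representation, not the format of the distributed files. The helper names `make_tags` and `check_tags` are hypothetical, and the rule that non-fractured images have a count of 0 is an assumption; the upper bound of 5 comes from the distribution figures quoted earlier.

```python
REGION_TAGS = ("hand", "leg", "hip", "shoulder")
FLAG_TAGS = REGION_TAGS + ("mixed", "hardware", "multiscan", "fractured")


def make_tags(fracture_count=0, **flags):
    """Build a tag record; any flag not passed defaults to 0 (false)."""
    unknown = set(flags) - set(FLAG_TAGS)
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")
    tags = {name: int(bool(flags.get(name, 0))) for name in FLAG_TAGS}
    tags["fracture_count"] = fracture_count
    return tags


def check_tags(tags):
    """Return a list of consistency problems (empty if none found)."""
    problems = []
    # Assumption: 'fractured' is set exactly when the count is positive.
    if bool(tags["fractured"]) != (tags["fracture_count"] > 0):
        problems.append("fractured flag and fracture_count disagree")
    # The text reports at most 5 fracture instances per scan.
    if not 0 <= tags["fracture_count"] <= 5:
        problems.append("fracture_count outside 0..5")
    return problems


# The sample image from the caption: leg, hardware, fractured, count 2.
sample = make_tags(fracture_count=2, leg=1, hardware=1, fractured=1)
print(sample, check_tags(sample))
```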
Summary #
FracAtlas: A Dataset for Fracture Classification, Localization and Segmentation of Musculoskeletal Radiographs is a dataset for instance segmentation, semantic segmentation, object detection, and classification tasks. It is used in the medical industry.
The dataset consists of 4083 images with 1844 labeled objects belonging to a single class (fractured).
Images in the FracAtlas dataset have pixel-level instance segmentation and bounding box annotations. Due to the nature of the instance segmentation task, it can be automatically transformed into a semantic segmentation task (only one mask for every class). There are 3366 (82% of the total) unlabeled images (i.e. without annotations). There are 4 splits in the dataset: not fractured (3366 images), train (574 images), val (82 images), and test (61 images). Additionally, images also have a hardware tag indicating the presence of orthopedic fixation devices in the scan. Some images have multiple views of the same organ projected from the frontal and sagittal planes; those images can be identified using the multiscan tag. Images containing multiple body parts are marked by the mixed tag. The dataset was released in 2023 by the Islamic University of Technology, Gazipur, Bangladesh and United International University, Dhaka, Bangladesh.
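The instance-to-semantic conversion mentioned above amounts to taking the union of all instance masks of a class. A minimal sketch, assuming each instance is already rasterized as a binary grid of the same size (the function name and the toy masks are illustrative, not part of the dataset tooling):

```python
def merge_instances(instance_masks):
    """Union same-shaped binary instance masks into one semantic mask."""
    if not instance_masks:
        raise ValueError("need at least one instance mask")
    height, width = len(instance_masks[0]), len(instance_masks[0][0])
    semantic = [[0] * width for _ in range(height)]
    for mask in instance_masks:
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    semantic[y][x] = 1  # overlapping pixels are counted once
    return semantic


# Two toy fracture instances that overlap in one pixel (row 0, column 1).
fracture_a = [[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 0]]
fracture_b = [[0, 1, 1, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 1]]

semantic_mask = merge_instances([fracture_a, fracture_b])
for row in semantic_mask:
    print(row)
```

The conversion is lossy: once the masks are merged, overlapping or touching fractures can no longer be separated into individual instances.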
Class balance #
There is 1 annotation class in the dataset. The table below gives the general statistics and balance for this class.
Class | Images | Objects | Count on image average | Area on image average |
---|---|---|---|---|
fractured any | 717 | 1844 | 2.57 | 0.89% |
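The headline numbers on this page can be cross-checked against each other with a few lines of arithmetic. All inputs below are figures quoted on this page. The reading that 1844 objects correspond to one mask plus one bounding box for each of the 922 fracture instances is an inference from those figures, not something stated explicitly.

```python
images_total = 4083
splits = {"not fractured": 3366, "train": 574, "val": 82, "test": 61}
fracture_instances = 922   # from the dataset description
labeled_objects = 1844     # from the class balance table
labeled_images = 717

# The four splits should cover every image exactly once.
splits_cover_all = sum(splits.values()) == images_total

# train + val + test should equal the number of fractured (labeled) images.
annotated_images = splits["train"] + splits["val"] + splits["test"]

# Objects per fracture instance (2.0 would fit one mask plus one box each).
objects_per_instance = labeled_objects / fracture_instances

# "Count on image average" column of the table.
avg_objects_per_image = round(labeled_objects / labeled_images, 2)

# Share of images without annotations, as a whole percentage.
unlabeled_pct = round(100 * splits["not fractured"] / images_total)

print(splits_cover_all, annotated_images, objects_per_instance,
      avg_objects_per_image, unlabeled_pct)
```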
Class sizes #
The table below gives various size properties of the objects of every class. Heights and widths are reported both in pixels and as a percentage of the image dimension; areas are reported as a percentage of the image area.
Class | Object count | Avg area | Max area | Min area | Min height (px) | Min height (%) | Max height (px) | Max height (%) | Avg height (px) | Avg height (%) | Min width (px) | Min width (%) | Max width (px) | Max width (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
fractured any | 1844 | 0.53% | 6.42% | 0.03% | 8px | 1.76% | 669px | 33.7% | 60px | 8% | 8px | 1.98% | 574px | 26.01% |
Objects #
The full table contains all 1844 objects; the first 10 rows are shown below.
Object ID | Class | Image name | Image size (height x width) | Height (px) | Height (%) | Width (px) | Width (%) | Area (%) |
---|---|---|---|---|---|---|---|---|
1 | fractured any | IMG0001760.jpg | 454 x 373 | 44px | 9.69% | 36px | 9.65% | 0.48% |
2 | fractured any | IMG0001760.jpg | 454 x 373 | 44px | 9.69% | 36px | 9.65% | 0.94% |
3 | fractured any | IMG0001760.jpg | 454 x 373 | 34px | 7.49% | 40px | 10.72% | 0.61% |
4 | fractured any | IMG0001760.jpg | 454 x 373 | 34px | 7.49% | 40px | 10.72% | 0.8% |
5 | fractured any | IMG0002628.jpg | 373 x 454 | 26px | 6.97% | 38px | 8.37% | 0.38% |
6 | fractured any | IMG0002628.jpg | 373 x 454 | 26px | 6.97% | 38px | 8.37% | 0.58% |
7 | fractured any | IMG0002349.jpg | 2880 x 2304 | 145px | 5.03% | 112px | 4.86% | 0.14% |
8 | fractured any | IMG0002349.jpg | 2880 x 2304 | 145px | 5.03% | 112px | 4.86% | 0.24% |
9 | fractured any | IMG0002126.jpg | 454 x 373 | 23px | 5.07% | 41px | 10.99% | 0.29% |
10 | fractured any | IMG0002126.jpg | 454 x 373 | 23px | 5.07% | 41px | 10.99% | 0.56% |
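The percentage columns in this table can be reproduced from the pixel values and the image size. The sketch below does so for the first pair of rows (the helper name `relative_size` is illustrative). The rows come in pairs with identical extents, and in each pair the larger area equals the area of the enclosing box. That pattern would fit a polygon mask followed by its bounding box, although this interpretation is an assumption based on the numbers alone.

```python
def relative_size(height_px, width_px, image_height, image_width):
    """Express an object's extent as percentages of the image size."""
    return {
        "height_pct": round(100 * height_px / image_height, 2),
        "width_pct": round(100 * width_px / image_width, 2),
        # Area of the enclosing box relative to the whole image.
        "bbox_area_pct": round(
            100 * (height_px * width_px) / (image_height * image_width), 2
        ),
    }


# Objects 1 and 2: 44px x 36px inside a 454 x 373 (height x width) image.
first_pair = relative_size(44, 36, 454, 373)

# Objects 5 and 6: 26px x 38px inside a 373 x 454 image.
third_pair = relative_size(26, 38, 373, 454)

print(first_pair)
print(third_pair)
```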
License #
Citation #
If you make use of the FracAtlas data, please cite the following reference:
@article{Abedeen_2023,
title = {{FracAtlas}: A Dataset for Fracture Classification, Localization and Segmentation of Musculoskeletal Radiographs},
author = {Iftekharul Abedeen and Md. Ashiqur Rahman and Fatema Zohra Prottyasha and Tasnim Ahmed and Tareque Mohmud Chowdhury and Swakkhar Shatabda},
year = 2023,
month = {aug},
journal = {Scientific Data},
publisher = {Springer Science and Business Media {LLC}},
volume = 10,
number = 1,
doi = {10.1038/s41597-023-02432-4},
url = {https://doi.org/10.1038%2Fs41597-023-02432-4}
}
If you are happy with Dataset Ninja and use provided visualizations and tools in your work, please cite us:
@misc{ visualization-tools-for-frac-atlas-dataset,
title = { Visualization Tools for FracAtlas Dataset },
type = { Computer Vision Tools },
author = { Dataset Ninja },
howpublished = { \url{ https://datasetninja.com/frac-atlas } },
url = { https://datasetninja.com/frac-atlas },
journal = { Dataset Ninja },
publisher = { Dataset Ninja },
year = { 2025 },
month = { jan },
note = { visited on 2025-01-15 },
}
Download #
The FracAtlas dataset can be downloaded in Supervisely format.
As an alternative, it can be downloaded with the dataset-tools package:
pip install --upgrade dataset-tools
… using the following Python code:
import dataset_tools as dtools
dtools.download(dataset='FracAtlas', dst_dir='~/dataset-ninja/')
Make sure not to overlook the Python code example available on the Supervisely Developer Portal. It will give you a clear idea of how to work with the downloaded dataset.
The data in original format can be downloaded here.
Disclaimer #
Our gal from the legal dep told us we need to post this:
Dataset Ninja provides visualizations and statistics for some datasets that can be found online and can be downloaded by the general public. Dataset Ninja is not a dataset hosting platform and can only be used for informational purposes. The platform does not claim any rights to the original content, including images, videos, annotations and descriptions. Joint publishing is prohibited.
You take full responsibility when you use datasets presented at Dataset Ninja, as well as other information, including the visualizations and statistics we provide. You are in charge of compliance with any dataset license and all other permissions. You are required to visit the dataset's homepage and make sure that you can use it. In case of any questions, get in touch with us at hello@datasetninja.com.