Dataset Ninja LogoDataset Ninja:

MCubeS Dataset

2000201291
Tagconstruction
Tasksemantic segmentation
Release YearMade in 2022
LicenseMIT
Download6 GB

Introduction #

Yupeng Liang, Ryosuke Wakaki, Shohei Nobuharaet al.

The MCubeS: Multimodal Material Segmentation Dataset was created it from RGB, polarization, and near-infrared images. Dataset contains 500 sets of multimodal images capturing 42 street scenes. The capacity to discern materials based on their visual attributes is fundamental for computer vision applications, especially those engaged in real-world scenarios. Material segmentation, requiring meticulous identification of materials at a per-pixel level, poses a substantial challenge. Unlike objects, materials lack clear visual signatures in standard RGB representations. Nonetheless, the distinct radiometric behaviors of diverse materials can be aptly captured through alternative imaging modalities beyond RGB. Each image and pixel within the dataset is meticulously annotated with ground truth information for both material segmentation and semantic segmentation.

While advancements in object recognition empower computers to precisely identify objects within images, the human visual system surpasses mere object identification. It possesses the ability to discriminate between, for example, a paper cup and a ceramic one, enabling thoughtful planning of interactions before physical engagement. If computers could emulate this discernment of object composition, critical decisions could be executed with enhanced speed and accuracy. The imperative task of densely recognizing materials at the pixel level in an image arises. Successful material segmentation offers notable advantages for road scene analysis, allowing autonomous vehicles or advanced driver assistance systems (ADAS) to differentiate between an asphalt road and a concrete one, or discern a leaf on the road from dirt, thereby enhancing control safety.

Despite the substantial strides in object recognition, outdoor material segmentation remains challenging due to the myriad materials encountered in the real world and the scarcity of annotated data. The intricacy of material segmentation is heightened by the lack of well-defined visual features in regular RGB images for materials, as opposed to objects that frequently exhibit distinctive shapes, contours, and surface textures.

Dataset creation

The authorts realize multimodal material segmentation, the recognition of per-pixel material categories from a set of images from the same vantage point but of different imaging modalities. In particular, they consider the combination of regular RGB, polarization, and near-infrared images at each instance. They build an imaging system consisting of a binocular stereo of quad-Bayer RGB polarization cameras, a monocular near-infrared camera, and a LiDAR to capture outdoor road scenes. In addition to materials, the authors also annotated semantic segmentation labels.

image

Per-pixel recognition of materials from multiple imaging modalities.

MCubeS provides a comprehensive depiction of the visual features of various materials found in everyday outdoor environments, observed from viewpoints situated on roads, pavements, and sidewalks. The primary challenges centered around achieving systematic image capture with precise spatio-temporal alignment and subsequent annotation involving pixel-wise categorization of materials.

In the real world, surfaces composed of diverse materials exhibit distinct subsurface compositions and surface structures at the mesoscopic scale. These differences manifest in varying behaviors of incident light, influencing its reflection, refraction, and absorption. These nuances are particularly discernible in polarization properties, including degrees of linear polarization (DoLP) or angles of linear polarization (AoLP), as well as in the absorption of near-infrared (NIR) light. The MCubeS dataset meticulously captures these intricate radiometric characteristics of diverse materials through a camera system integrating a stereo pair of RGB-polarization (RGB-P) cameras and a near-infrared (NIR) camera. Additionally, the image capture system incorporates LiDAR technology to enhance the accuracy of label propagation.

image

Example multimodal images from the MCubeS Dataset. Different imaging modalities capture characteristic radiometric behaviors of different materials.

Image Capture System

The researchers devised a customized imaging system, affixing it to a cart to capture multimodal sequences of real-world scenes, replicating a viewpoint akin to that of a car. This system is augmented with a sparse LiDAR, strategically incorporated to facilitate the subsequent propagation of annotations across diverse image modalities. The imaging apparatus comprises a duo of RGB-polarization (RGB-P) cameras (LUCID TRI050S-QC, 2/3-inch sensor), an NIR camera (FLIR GS3-U3-41C6NIR-C, 1-inch sensor), and a LiDAR unit (Livox Mid-100).

Dataset description

The MCubeS dataset comprises two distinct categories of image sequences. The first type involves continuous forward movement of the imaging system, replicating the viewpoint of a moving vehicle. The second type consists of sequences captured at a stationary position while the imaging system pans, facilitating the inclusion of materials such as water, which are infrequently observed in road scenes. This second type of data enhances the diversity of sample images, particularly for less common materials.

The dataset encompasses 42 scenes of the first type (continuous forward movement), sampled at three frames per second, with an average sequence length of 309 seconds. The raw image sequences total 26,650 image sets, from which 424 image sets, evenly spaced in time, were annotated. Additionally, 19 scenes were captured for the second type of data (fixed-location panning). For each scene, eight image sets were acquired, covering a 360-degree view, and a total of 76 image sets from these sequences were annotated. Overall, the dataset includes 500 annotated image sets. The authors defined 20 distinct materials through a thorough examination of the data within the MCubeS dataset.

image

Example regions for each class. The actual images are annotated densely for all pixels. Human body (skin and hair), leaf, grass, and sky are object and stuff names selected to represent their unique materials.

MCubeS scenes predominantly feature roadways and sidewalks, encompassing diverse elements ranging from pavements and roads to various objects like bicycles and manholes. Notably, these elements can be composed of distinct materials, even within the same category. For instance, pavements may be constructed from asphalt, concrete, brick, cobblestone, gravel, or sand. Special attention is given to road markings and manholes, considering them as distinct materials due to their significance in driving scenarios. Vehicles are primarily constructed with metal, rubber, and glass components. Human attire is labeled as fabric, attributes almost exclusive to people. In the case of natural elements, trees and grass are prevalent in the scenes, with wood and leaves designated as material labels for trees (as wood) and fallen leaves (as leaves). water, found in puddles, rivers, and ponds, is another noteworthy material.

The category encompasses materials like ceramic, plaster, plastic, and sky, which, while common, have lesser occurrences or significance in driving scenarios. The authors diligently annotate every image with per-pixel material classes. Initially, they annotate the left RGB image and subsequently propagate per-pixel labels to other views of the same frame. Additionally, a dense depth image for the left RGB-P camera is reconstructed by integrating RGB stereo and LiDAR 3D points.

image

MCubeS dataset spans a wide range of road scenes (top row), including river sidewalks to railroad tracks, each densely annotated with materials (bottom row).

Image modalities

Polarization
Light manifests as a transverse wave, characterized by perpendicular electric and magnetic fields. When the electric field of light lies within a singular plane, it is termed linearly polarized. The orientation angle of polarized light can be defined within the plane perpendicular to the transverse direction. In contrast, unpolarized light comprises electric fields uniformly oriented in all directions.

When the orientations of the electric fields form an ellipse on the plane perpendicular to the transverse direction, the light is considered partially linearly polarized. In this scenario, the major axis of the ellipse defines the angle of polarization, and the ratio of the magnitudes of the major and minor axes is denoted as the degree of polarization. The intensity of partially polarized light, with the angle of linear polarization (AoLP) and the degree of linear polarization (DoLP) measured using a polarization filter, plays a significant role.

Near-infrared Light
Light comprises a spectrum of electromagnetic waves with varying wavelengths, leading to distinctive radiometric behaviors dependent on the wavelength. Objects exhibit diverse colors as their reflection is influenced by wavelength. Simultaneously, the absorption and scattering characteristics of light transmitted into a subsurface or a medium vary based on wavelength. For instance, shorter wavelength light tends to undergo more forward scattering compared to longer wavelength light. The absorption of light experiences pronounced changes, particularly beyond the visual spectrum.

The absorption of light, notably through water—a common element in our daily surroundings, including not just puddles but also natural surfaces like water-containing leaves—is highly sensitive to wavelength. In the near-infrared range spanning from 800nm to 1000nm, the absorption coefficient of light in water exhibits nearly linear growth from 0 to 1. Consequently, for a camera equipped with a near-infrared filter within this wavelength range observing water or water-containing surfaces, varying intensity levels encode the depth or “wetness” of the surface, with darker shades indicating greater depth or wetness.

ExpandExpand
Dataset LinkHomepageDataset LinkResearch Paper

Summary #

MCubeS: Multimodal Material Segmentation Dataset is a dataset for semantic segmentation and object detection tasks. It is used in the construction and automotive industries.

The dataset consists of 2000 images with 25164 labeled objects belonging to 20 different classes including concrete, metal, leaf, and other: sky, wood, glass, asphalt, rubber, grass, plastic, brick, ceramic, road marking, fabric leather, human body, sand, cobblestone, plaster, water, and gravel.

Images in the MCubeS dataset have pixel-level semantic segmentation annotations. All images are labeled (i.e. with annotations). There are 3 splits in the dataset: train (1208 images), test (408 images), and val (384 images). Additionally, images are grouped by im id (total 500 images groups). Also every image contains information about its scene (total 42 scenes). Explore it in supervisely labeling tool. The dataset was released in 2022 by the Graduate School of Informatics, Kyoto, Japan.

Dataset Poster

Explore #

MCubeS dataset has 2000 images. Click on one of the examples below or open "Explore" tool anytime you need to view dataset images with annotations. This tool has extended visualization capabilities like zoom, translation, objects table, custom filters and more. Hover the mouse over the images to hide or show annotations.

OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
OpenSample annotation mask from MCubeSSample image from MCubeS
👀
Have a look at 2000 images
View images along with annotations and tags, search and filter by various parameters

Class balance #

There are 20 annotation classes in the dataset. Find the general statistics and balances for every class in the table below. Click any row to preview images that have labels of the selected class. Sort by column to find the most rare or prevalent classes.

Search
Rows 1-10 of 20
Class
Images
Objects
Count on image
average
Area on image
average
concrete
mask
1996
1996
1
13.94%
metal
mask
1960
1960
1
7.66%
leaf
mask
1940
1940
1
12.95%
sky
mask
1888
1888
1
16.2%
wood
mask
1844
1844
1
4.63%
glass
mask
1780
1780
1
1.99%
asphalt
mask
1580
1580
1
22.6%
rubber
mask
1488
1488
1
0.9%
grass
mask
1484
1484
1
6.91%
plastic
mask
1380
1380
1
0.77%

Co-occurrence matrix #

Co-occurrence matrix is an extremely valuable tool that shows you the images for every pair of classes: how many images have objects of both classes at the same time. If you click any cell, you will see those images. We added the tooltip with an explanation for every cell for your convenience, just hover the mouse over a cell to preview the description.

Images #

Explore every single image in the dataset with respect to the number of annotations of each class it has. Click a row to preview selected image. Sort by any column to find anomalies and edge cases. Use horizontal scroll if the table has many columns for a large number of classes in the dataset.

Object distribution #

Interactive heatmap chart for every class with object distribution shows how many images are in the dataset with a certain number of objects of a specific class. Users can click cell and see the list of all corresponding images.

Class sizes #

The table below gives various size properties of objects for every class. Click a row to see the image with annotations of the selected class. Sort columns to find classes with the smallest or largest objects or understand the size differences between classes.

Search
Rows 1-10 of 20
Class
Object count
Avg area
Max area
Min area
Min height
Min height
Max height
Max height
Avg height
Avg height
Min width
Min width
Max width
Max width
concrete
mask
1996
13.94%
57.35%
0.13%
11px
1.07%
1024px
100%
694px
67.81%
168px
13.73%
1224px
100%
metal
mask
1960
7.66%
52.61%
0%
5px
0.49%
1024px
100%
631px
61.59%
5px
0.41%
1224px
100%
leaf
mask
1940
12.95%
54.11%
0.02%
27px
2.64%
1024px
100%
496px
48.45%
18px
1.47%
1224px
100%
sky
mask
1888
16.2%
43.8%
0.02%
17px
1.66%
546px
53.32%
388px
37.91%
28px
2.29%
1224px
100%
wood
mask
1844
4.63%
45.73%
0.01%
9px
0.88%
1024px
100%
428px
41.76%
5px
0.41%
1224px
100%
glass
mask
1780
1.99%
14.01%
0%
4px
0.39%
689px
67.29%
335px
32.7%
11px
0.9%
1224px
100%
asphalt
mask
1580
22.6%
47.17%
0%
2px
0.2%
638px
62.3%
411px
40.16%
10px
0.82%
1224px
100%
rubber
mask
1488
0.9%
6.01%
0%
2px
0.2%
1024px
100%
306px
29.83%
3px
0.25%
1224px
100%
grass
mask
1484
6.91%
44.62%
0.01%
10px
0.98%
614px
59.96%
274px
26.78%
9px
0.74%
1224px
100%
plastic
mask
1380
0.77%
11.7%
0%
2px
0.2%
973px
95.02%
255px
24.88%
3px
0.25%
1224px
100%

Spatial Heatmap #

The heatmaps below give the spatial distributions of all objects for every class. These visualizations provide insights into the most probable and rare object locations on the image. It helps analyze objects' placements in a dataset.

Spatial Heatmap

Objects #

Table contains all 25164 objects. Click a row to preview an image with annotations, and use search or pagination to navigate. Sort columns to find outliers in the dataset.

Search
Rows 1-10 of 25164
Object ID
Class
Image name
click row to open
Image size
height x width
Height
Height
Width
Width
Area
1
asphalt
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
516px
50.39%
1224px
100%
32.21%
2
concrete
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
1024px
100%
1224px
100%
4.51%
3
metal
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
1024px
100%
1224px
100%
13.95%
4
road marking
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
477px
46.58%
240px
19.61%
1.31%
5
glass
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
253px
24.71%
865px
70.67%
0.39%
6
plaster
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
85px
8.3%
317px
25.9%
0.36%
7
plastic
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
116px
11.33%
60px
4.9%
0.16%
8
rubber
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
436px
42.58%
897px
73.28%
1%
9
ceramic
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
73px
7.13%
291px
23.77%
0.18%
10
brick
mask
dolp_outscene1212_5_0000000120.png
1024 x 1224
597px
58.3%
1224px
100%
6.96%

License #

Multimodal Material Segmentation Dataset (MCubeS) is under MIT license.

Source

Citation #

If you make use of the Multimodal Material Segmentation data, please cite the following reference:

@InProceedings{Liang_2022_CVPR,
  author    = {Liang, Yupeng and Wakaki, Ryosuke and Nobuhara, Shohei and Nishino, Ko},
  title     = {Multimodal Material Segmentation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2022},
  pages     = {19800-19808}
}

Source

If you are happy with Dataset Ninja and use provided visualizations and tools in your work, please cite us:

@misc{ visualization-tools-for-multimodal-material-segmentation-dataset,
  title = { Visualization Tools for MCubeS Dataset },
  type = { Computer Vision Tools },
  author = { Dataset Ninja },
  howpublished = { \url{ https://datasetninja.com/multimodal-material-segmentation } },
  url = { https://datasetninja.com/multimodal-material-segmentation },
  journal = { Dataset Ninja },
  publisher = { Dataset Ninja },
  year = { 2024 },
  month = { jul },
  note = { visited on 2024-07-27 },
}

Download #

Dataset MCubeS can be downloaded in Supervisely format:

As an alternative, it can be downloaded with dataset-tools package:

pip install --upgrade dataset-tools

… using following python code:

import dataset_tools as dtools

dtools.download(dataset='MCubeS', dst_dir='~/dataset-ninja/')

Make sure not to overlook the python code example available on the Supervisely Developer Portal. It will give you a clear idea of how to effortlessly work with the downloaded dataset.

The data in original format can be downloaded here.

. . .

Disclaimer #

Our gal from the legal dep told us we need to post this:

Dataset Ninja provides visualizations and statistics for some datasets that can be found online and can be downloaded by general audience. Dataset Ninja is not a dataset hosting platform and can only be used for informational purposes. The platform does not claim any rights for the original content, including images, videos, annotations and descriptions. Joint publishing is prohibited.

You take full responsibility when you use datasets presented at Dataset Ninja, as well as other information, including visualizations and statistics we provide. You are in charge of compliance with any dataset license and all other permissions. You are required to navigate datasets homepage and make sure that you can use it. In case of any questions, get in touch with us at hello@datasetninja.com.