Dataset Ninja LogoDataset Ninja:

RUGD Dataset

Tags: safety, robotics
Task: semantic segmentation
Release year: 2019
License: publicly available
Download: 5 GB

Introduction #

Maggie Wigness, Sungmin Eum, John G. Rogers et al.

The authors introduce RUGD, the Robot Unstructured Ground Driving dataset: video sequences captured from a small, unmanned mobile robot traversing unstructured environments. This data differs from existing autonomous driving benchmarks in that it contains significantly more terrain types, irregular class boundaries, minimal structured markings, and challenging visual properties often encountered in off-road navigation, e.g., blurred frames. Over 7,000 frames of pixel-wise annotation are included with this dataset.

Motivation

The development of extensive, labeled visual benchmark datasets has played a pivotal role in advancing the state-of-the-art in recognition, detection, and semantic segmentation. The success of these techniques has elevated visual perception to a primary information source for making navigation decisions on autonomous vehicles. Consequently, a multitude of benchmark datasets tailored specifically for autonomous navigation have emerged, featuring real-world scenarios where vehicles navigate through urban environments, interacting with pedestrians, bicycles, and other vehicles on roadways. These datasets serve as essential tools for evaluating progress toward the release of safe and reliable autonomous products.

Despite substantial strides in autonomous driving technologies, the current state-of-the-art is primarily tailored to well-paved and clearly outlined roadways. However, applications like Humanitarian Assistance and Disaster Relief (HADR), agricultural robotics, environmental surveying in hazardous areas, and humanitarian demining present environments that lack the structured and easily identifiable features commonly found in urban cities. Operating in these unstructured, off-road driving scenarios necessitates a visual perception system capable of semantically segmenting scenes with highly irregular class boundaries. This system must precisely localize and recognize terrain, vegetation, man-made structures, debris, and other hazards to assess traversability and ensure safety.

image

Irregular boundaries are common in unstructured environments, as seen in this example image and annotation. This challenging property exists throughout all sequences in the RUGD dataset and arises from the natural growth and porous texture of vegetation, as well as occlusion.

Because existing benchmark datasets for self-driving vehicles are collected from urban cities, they seldom contain elements that may be present in unstructured environments. As such, there is a need for an entirely new dataset relevant to these driving conditions, where structured cues cannot be readily relied on by the autonomous system.

Dataset description

The Robot Unstructured Ground Driving (RUGD) dataset is composed of a set of video sequences collected from a small unmanned ground robot performing an exploration task in a variety of natural, unstructured environments and semi-urban areas. In addition to the raw frame sequences, RUGD contains dense pixel-wise annotations for every fifth frame in a sequence, providing a total of 7,453 labeled images for learning and evaluation in this new driving scenario.

image

Example ground truth annotations provided in the RUGD dataset. Frames from the video sequences are densely annotated with pixel-wise labels from 24 different visual classes.
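For a rough idea of how such annotations might be consumed, below is a minimal sketch that converts a color-coded annotation mask into per-pixel class indices. The RGB values and the file paths are hypothetical placeholders; the actual mapping should be taken from whatever colormap the dataset distribution provides.

import numpy as np
from PIL import Image

# Hypothetical excerpt of a color-to-class mapping; replace with the
# mapping shipped with the dataset.
COLOR_TO_CLASS = {
    (0, 0, 0): 0,    # void (assumed)
    (0, 102, 0): 1,  # grass (assumed RGB value)
    (0, 255, 0): 2,  # tree (assumed RGB value)
    # ... remaining classes from the official colormap
}

def mask_to_class_ids(annotation_path: str) -> np.ndarray:
    """Turn an RGB color-coded annotation mask into a 2-D array of class ids."""
    rgb = np.array(Image.open(annotation_path).convert("RGB"))
    class_ids = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for color, class_id in COLOR_TO_CLASS.items():
        class_ids[np.all(rgb == color, axis=-1)] = class_id
    return class_ids

# Example usage; the file name pattern follows the objects table further below.
labels = mask_to_class_ids("trail-11_02156.png")
print(labels.shape, np.unique(labels))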

The unique characteristics, and thus, major contributions, of the RUGD dataset can be summarized as follows:

  • The majority of scenes contain no discernible geometric edges or vanishing points, and semantic boundaries are highly irregular.
  • Robot exploration traversal creates irregular routes that do not adhere to a single terrain type; eight distinct terrains are traversed.
  • Unique frame viewpoints are encountered due to sloped terrains and vegetation occlusion.
  • Off-road traversal of rough terrain results in challenging frame focus.

These characteristics provide realistic off-road driving scenarios that present a variety of new challenges to the research community in addition to existing ones, e.g., illumination changes and harsh shadows. The authors provide an initial semantic segmentation benchmark using several current state-of-the-art semantic segmentation architectures, originally developed for indoor and urban scene segmentation.

Dataset collection

The robot used to collect the RUGD dataset is based on a Clearpath 'Husky' platform. The chassis is equipped with a sensor payload consisting of a Velodyne HDL-32 LiDAR, a Garmin GPS receiver, a Microstrain GX3-25 IMU, and a Prosilica GT2750C camera. The camera is capable of 6.1 megapixels at 19.8 frames per second, but most sequences are collected at half resolution and approximately 15 frames per second; this reduction in resolution and frame rate is a compromise between file size and quality. The images are collected through an 8 mm lens with a wide depth of field for outdoor lighting conditions, with exposure and gain settings chosen to minimize motion blur. The robot typically operated at its top speed of 1.0 m/s. This platform provides a unique camera viewpoint compared to existing datasets: the Husky is significantly smaller than vehicles commonly used in urban environments, with external dimensions of 990 x 670 x 390 mm, and the camera is mounted on the front of the platform just above the external height of the robot, resulting in an environment viewpoint from less than 25 centimeters off the ground.

image

Robot used to collect the RUGD dataset.

RUGD is a collection of robot exploration video sequences captured as a human teleoperates the robot in the environment. The exploration task is defined such that the human operator maneuvers the robot to mimic autonomous behavior aimed at visually observing different regions of the environment. Given this definition, the sequences depict the robot traversing not only what may commonly be defined as a road, but also vegetation, small obstacles, and other terrain present in the area. The average duration of a video sequence is just over 3 minutes. Exploration traversals are performed in areas that represent four general environment categories: creek (areas near a body of water with some vegetation), park (woodsy areas with buildings and paved roads), trail (non-paved, gravel terrain in woods), and village (areas with buildings and limited paved roads).

image

Example frames from robot exploration traversal videos from each of the four environment categories.

The robot’s onboard camera records videos of each traversal at a frame rate of 15 Hz (the village scenario is captured at a lower rate of 1 Hz), with a frame resolution of 1376x1110. These videos not only showcase various terrain types but also capture challenging conditions such as harsh shadows, blurred frames, changes in illumination, washed-out regions, orientation shifts resulting from sloped planes, and occluded perspectives due to traversal through areas with tall vegetation. The deliberate inclusion of these diverse scenes ensures that the dataset faithfully mirrors the realistic conditions encountered in these environments.

Dataset split

Videos from the RUGD dataset are partitioned into train/val/test splits for the benchmark experimental evaluation: ∼64% of the total annotated frames are used for training, ∼10% for validation, and the remaining ∼26% for testing. While videos were selected for each split to roughly produce these proportions, two videos were specifically placed in the test split to probe realistic challenges faced in many motivating applications. First, the creek sequence is the only example with significant rock bed terrain; reserving it for testing demonstrates how existing architectures are able to learn from sparse label instances, a highly realistic scenario in unstructured environments, where it is difficult to collect large amounts of training data for all possible terrain a priori. Second, the trail 7 sequence exhibits significantly more off-road jitter than the others, producing many frames that appear quite blurry. This property is also present in training sequences, but this difficult video is again reserved to determine how well the techniques perform under these harsh conditions.

Split | Video sequences                    | Frames | % of dataset
train | 12 of 18                           | 4759   | 63.83
val   | 2 of 18                            | 733    | 9.83
test  | 4 of 18 (incl. creek and trail 7)  | 1964   | 26.34

Training, validation, and testing splits. Splits are defined over the individual video sequences (C: creek, P: park, T: trail, V: village); the last two columns give the total number of frames and the percentage of each split within the whole dataset.
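As a rough illustration of how such a split could be reproduced from the annotated frames, the sketch below groups frames by sequence name and reports split sizes. The directory layout and the sequence-to-split mapping are placeholders; only creek and trail 7 are explicitly identified above as test sequences.

from collections import Counter
from pathlib import Path

# Placeholder mapping: fill in the remaining 16 sequences according to the split table.
SPLIT_OF_SEQUENCE = {
    "creek": "test",
    "trail-7": "test",
}

def split_of(frame_path: Path) -> str:
    # e.g. "trail-7_00123" -> "trail-7"; unknown sequences default to "train" here.
    sequence = frame_path.stem.rsplit("_", 1)[0]
    return SPLIT_OF_SEQUENCE.get(sequence, "train")

# Assumed directory containing the annotated frames.
frames = sorted(Path("RUGD_annotations").rglob("*.png"))
counts = Counter(split_of(f) for f in frames)
total = sum(counts.values()) or 1
for split, n in counts.items():
    print(f"{split}: {n} frames ({100 * n / total:.2f}%)")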

Dataset links: Homepage | Research Paper

Summary #

RUGD: Robot Unstructured Ground Driving is a dataset for the semantic segmentation task. It is used in the safety and robotics industries.

The dataset consists of 7436 images with 50032 labeled objects belonging to 24 different classes, including tree, sky, and grass, as well as mulch, gravel, bush, pole, log, building, vehicle, container, fence, asphalt, water, rock, sign, concrete, rock-bed, picnic-table, dirt, bridge, sand, person, and bicycle.

Images in the RUGD dataset have pixel-level semantic segmentation annotations. All images are labeled (i.e. with annotations). There are 3 splits in the dataset: train (4779 images), test (1924 images), and val (733 images). Alternatively, the dataset can be split into 4 general environment categories: trail (4843 images), park (1640 images), creek (836 images), and village (117 images). The dataset was released in 2019 by a joint US research group.


Explore #

The RUGD dataset has 7436 images. Open the "Explore" tool anytime you need to view dataset images with annotations. The tool has extended visualization capabilities like zoom, translation, an objects table, custom filters, and more. Hover the mouse over the images to hide or show annotations.


Class balance #

There are 24 annotation classes in the dataset. Find the general statistics and balances for every class in the table below. Click any row to preview images that have labels of the selected class. Sort by column to find the most rare or prevalent classes.

Rows 1-10 of 24 (all classes are annotated as masks):

Class    | Images | Objects | Count on image (avg) | Area on image (avg)
tree     | 7408   | 7408    | 1                    | 38.89%
sky      | 7035   | 7035    | 1                    | 8.19%
grass    | 6798   | 6798    | 1                    | 24.78%
mulch    | 4113   | 4113    | 1                    | 19.07%
gravel   | 3595   | 3595    | 1                    | 14.31%
bush     | 2958   | 2958    | 1                    | 11.62%
pole     | 2565   | 2565    | 1                    | 0.3%
log      | 2546   | 2546    | 1                    | 1.81%
building | 2407   | 2407    | 1                    | 4.42%
vehicle  | 1969   | 1969    | 1                    | 2.34%
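As a minimal sketch of how these per-class statistics could be recomputed from the annotation masks, the snippet below counts images per class and averages the area fraction; it assumes the mask_to_class_ids helper from the earlier sketch and the same hypothetical directory layout.

from collections import defaultdict
from pathlib import Path
import numpy as np

image_count = defaultdict(int)      # class_id -> number of images containing it
area_fractions = defaultdict(list)  # class_id -> per-image area fractions

for path in sorted(Path("RUGD_annotations").rglob("*.png")):
    labels = mask_to_class_ids(str(path))   # helper from the earlier sketch
    total_pixels = labels.size
    for class_id in np.unique(labels):
        image_count[class_id] += 1
        area_fractions[class_id].append((labels == class_id).sum() / total_pixels)

for class_id in sorted(image_count):
    avg_area = 100 * np.mean(area_fractions[class_id])
    print(f"class {class_id}: {image_count[class_id]} images, avg area {avg_area:.2f}%")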

Co-occurrence matrix #

The co-occurrence matrix is a valuable tool that shows, for every pair of classes, how many images contain objects of both classes at the same time. Click any cell to see those images. Hover the mouse over a cell to preview a tooltip with an explanation.
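A small sketch of how such a matrix can be built, assuming per-image sets of class indices (e.g. obtained with np.unique over class-index masks) are already available:

import numpy as np

def cooccurrence_matrix(per_image_classes, num_classes=24):
    """Entry (i, j) counts images that contain both class i and class j."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for classes in per_image_classes:
        present = sorted(set(classes))
        for i in present:
            for j in present:
                matrix[i, j] += 1
    return matrix

# Toy example with arbitrary class indices.
example = [[1, 2, 5], [1, 5], [2, 3]]
print(cooccurrence_matrix(example, num_classes=6))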

Images #

Explore every single image in the dataset with respect to the number of annotations of each class it has. Click a row to preview the selected image. Sort by any column to find anomalies and edge cases. Use horizontal scroll if the table has many columns due to a large number of classes in the dataset.

Object distribution #

The interactive heatmap chart for every class shows how many images in the dataset contain a certain number of objects of that class. Click a cell to see the list of all corresponding images.
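A minimal sketch of how this distribution could be computed, assuming per-image lists of object class labels are available (the label names below are only for illustration):

from collections import Counter, defaultdict

def object_distribution(per_image_labels):
    """class -> Counter mapping objects-per-image to the number of such images."""
    distribution = defaultdict(Counter)
    for labels in per_image_labels:
        for cls, n in Counter(labels).items():
            distribution[cls][n] += 1
    return distribution

# Toy example: two images, each with a list of object class labels.
print(object_distribution([["tree", "tree", "sky"], ["tree", "grass"]]))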

Class sizes #

The table below gives various size properties of objects for every class. Click a row to see the image with annotations of the selected class. Sort columns to find classes with the smallest or largest objects or understand the size differences between classes.

Rows 1-10 of 24 (all classes are annotated as masks):

Class    | Object count | Avg area | Max area | Min area | Min height     | Max height     | Avg height     | Min width    | Max width
tree     | 7408         | 38.89%   | 91.15%   | 0.02%    | 6px (1.09%)    | 550px (100%)   | 313px (56.86%) | 14px (2.03%) | 688px (100%)
sky      | 7035         | 8.19%    | 42.9%    | 0%       | 2px (0.36%)    | 550px (100%)   | 205px (37.28%) | 1px (0.15%)  | 688px (100%)
grass    | 6798         | 24.78%   | 73.78%   | 0%       | 1px (0.18%)    | 550px (100%)   | 236px (42.91%) | 1px (0.15%)  | 688px (100%)
mulch    | 4113         | 19.07%   | 89.61%   | 0%       | 1px (0.18%)    | 550px (100%)   | 184px (33.47%) | 1px (0.15%)  | 688px (100%)
gravel   | 3595         | 14.31%   | 52.86%   | 0%       | 1px (0.18%)    | 543px (98.73%) | 175px (31.88%) | 1px (0.15%)  | 688px (100%)
bush     | 2958         | 11.62%   | 95.73%   | 0%       | 1px (0.18%)    | 550px (100%)   | 173px (31.46%) | 1px (0.15%)  | 688px (100%)
pole     | 2565         | 0.3%     | 12.15%   | 0%       | 1px (0.18%)    | 493px (89.64%) | 130px (23.58%) | 1px (0.15%)  | 688px (100%)
log      | 2546         | 1.81%    | 19.1%    | 0%       | 1px (0.18%)    | 503px (91.45%) | 96px (17.43%)  | 1px (0.15%)  | 688px (100%)
building | 2407         | 4.42%    | 55.5%    | 0.01%    | 5px (0.91%)    | 550px (100%)   | 101px (18.31%) | 4px (0.58%)  | 688px (100%)
vehicle  | 1969         | 2.34%    | 38.03%   | 0%       | 2px (0.36%)    | 550px (100%)   | 70px (12.79%)  | 4px (0.58%)  | 688px (100%)
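As a sketch of how the per-object size properties above could be derived from a single-class binary mask, the snippet below computes the tight bounding-box height and width and the area fraction; the bounding-box convention is an assumption about how these statistics are produced.

import numpy as np

def mask_size_stats(mask: np.ndarray):
    """Bounding-box height/width (px) and area fraction of a boolean class mask."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    if not rows.any():
        return None  # class not present in this image
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    height, width = bottom - top + 1, right - left + 1
    area_fraction = mask.sum() / mask.size
    return {"height_px": int(height), "width_px": int(width),
            "area_pct": round(100 * float(area_fraction), 2)}

# Toy example: a 550 x 688 frame with a small rectangular region labeled.
mask = np.zeros((550, 688), dtype=bool)
mask[100:200, 50:400] = True
print(mask_size_stats(mask))  # {'height_px': 100, 'width_px': 350, 'area_pct': 9.25}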

Spatial Heatmap #

The heatmaps below give the spatial distributions of all objects for every class. These visualizations provide insight into the most probable and rare object locations in the image and help analyze object placement across the dataset.

Spatial Heatmap
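A minimal sketch of how a per-class spatial heatmap could be computed, assuming fixed-size binary masks for the class of interest across all frames (e.g. derived from the class-index masks in the earlier sketches):

import numpy as np

def spatial_heatmap(binary_masks):
    """Per-pixel frequency with which the class appears across all frames."""
    masks = np.stack(list(binary_masks)).astype(np.float32)  # (N, H, W)
    return masks.mean(axis=0)                                # values in [0, 1]

# Toy example with three 4x4 masks; real RUGD frames would be 550 x 688.
toy = [np.random.rand(4, 4) > 0.5 for _ in range(3)]
print(spatial_heatmap(toy))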

Objects #

The table contains all 50032 objects. Click a row to preview an image with annotations, and use search or pagination to navigate. Sort columns to find outliers in the dataset.

Rows 1-10 of 50032 (all objects are masks; click a row to open the image):

Object ID | Class  | Image name         | Image size (h x w) | Height         | Width          | Area
1         | sky    | trail-11_02156.png | 550 x 688          | 280px (50.91%) | 688px (100%)   | 6.84%
2         | tree   | trail-11_02156.png | 550 x 688          | 366px (66.55%) | 688px (100%)   | 42.76%
3         | mulch  | trail-11_02156.png | 550 x 688          | 252px (45.82%) | 688px (100%)   | 39.54%
4         | bush   | trail-11_02156.png | 550 x 688          | 197px (35.82%) | 615px (89.39%) | 10.86%
5         | sky    | trail-11_01626.png | 550 x 688          | 245px (44.55%) | 662px (96.22%) | 7.31%
6         | grass  | trail-11_01626.png | 550 x 688          | 291px (52.91%) | 688px (100%)   | 23.51%
7         | water  | trail-11_01626.png | 550 x 688          | 185px (33.64%) | 366px (53.2%)  | 6.68%
8         | tree   | trail-11_01626.png | 550 x 688          | 296px (53.82%) | 688px (100%)   | 42.72%
9         | dirt   | trail-11_01626.png | 550 x 688          | 169px (30.73%) | 480px (69.77%) | 10.13%
10        | person | trail-11_01626.png | 550 x 688          | 93px (16.91%)  | 26px (3.78%)   | 0.38%

License #

The RUGD: Robot Unstructured Ground Driving Dataset is publicly available.

Source

Citation #

If you make use of the RUGD data, please cite the following reference:

@inproceedings{RUGD2019IROS,
  author = {Wigness, Maggie and Eum, Sungmin and Rogers, John G and Han, David and Kwon, Heesung},
  title = {A RUGD Dataset for Autonomous Navigation and Visual Perception in Unstructured Outdoor Environments},
  booktitle = {International Conference on Intelligent Robots and Systems (IROS)},
  year = {2019}
}

Source

If you are happy with Dataset Ninja and use provided visualizations and tools in your work, please cite us:

@misc{ visualization-tools-for-rugd-dataset,
  title = { Visualization Tools for RUGD Dataset },
  type = { Computer Vision Tools },
  author = { Dataset Ninja },
  howpublished = { \url{ https://datasetninja.com/rugd } },
  url = { https://datasetninja.com/rugd },
  journal = { Dataset Ninja },
  publisher = { Dataset Ninja },
  year = { 2024 },
  month = { nov },
  note = { visited on 2024-11-21 },
}

Download #

The RUGD dataset can be downloaded in Supervisely format:

As an alternative, it can be downloaded with the dataset-tools package:

pip install --upgrade dataset-tools

… using the following Python code:

import dataset_tools as dtools

dtools.download(dataset='RUGD', dst_dir='~/dataset-ninja/')

Make sure not to overlook the Python code example available on the Supervisely Developer Portal. It will give you a clear idea of how to effortlessly work with the downloaded dataset.

The data in original format can be downloaded here:

. . .

Disclaimer #

Our gal from the legal dep told us we need to post this:

Dataset Ninja provides visualizations and statistics for some datasets that can be found online and can be downloaded by general audience. Dataset Ninja is not a dataset hosting platform and can only be used for informational purposes. The platform does not claim any rights for the original content, including images, videos, annotations and descriptions. Joint publishing is prohibited.

You take full responsibility when you use datasets presented at Dataset Ninja, as well as other information, including visualizations and statistics we provide. You are in charge of compliance with any dataset license and all other permissions. You are required to navigate datasets homepage and make sure that you can use it. In case of any questions, get in touch with us at hello@datasetninja.com.