Meat Cut - Dataset Ninja

Introduction #

Released 2021-04-19 · McCarren Andrew, Scriney Michael, Roantree Mark, et al.

The Meat Cut Image Dataset (BEEF) was created to identify five different meat cuts from images and weights collected by a trained operator within the working environment of a commercial Irish beef plant. After editing, individual cut images and weights were available for 7,987 meat cuts extracted from semimembranosus muscles (i.e., topside muscle).

Motivation

The identification of different meat cuts for labeling and quality control on production lines is still largely a manual process. As a result, it is a labor-intensive exercise with the potential for not only error but also bacterial cross-contamination. Artificial intelligence is used in many disciplines to identify objects within images, but these approaches usually require a considerable volume of images for training and validation. Processes such as meat cutting, fat determination, and meat deboning have been partially automated. However, the labeling and identification of meat cuts still require a substantial amount of human intervention and manual handling. This can incur additional labor costs as well as being a source of error and potential microbiological contamination. Primal boning lines are a typical example of where multiple operators simultaneously work on a range of meat cuts. Each cut will eventually arrive at a weighing station where a single operator will inspect, identify, and weigh the arriving meat cut. The automation of the weighing process on boning lines has traditionally been conducted on single-meat-cut production lines. However, due to spatial restrictions in many meat plants, there is a preference in the beef industry to operate multiple meat cut types simultaneously on a single processing line.

Dataset creation

The data collected for this project came from beef cuts taken from a Topside (i.e., semimembranosus muscle) trimming line of a major Irish beef processor. The process flow for this line required an operator to weigh the primal topside cut on the start-of-line (SOL) weighing scales. Each cut was then placed on a conveyor belt where a team of operators removed fat, gristle, and secondary muscles. The remaining meat cuts were then labeled, weighed, and photographed by a trained operator at the end-of-line (EOL) weighing scales, where the meat cuts were vacuum packed and labeled. For this dataset, there were five different meat cuts derived from the topside primal. The data acquisition required a hardware setup of weighing scales at both the SOL and EOL, together with a Vivotek bullet camera (IP8362—Bullet–Network Camera) at the EOL to capture a photo image of each meat cut. In addition, bespoke data capture software was used to acquire the characteristics of each meat cut being weighed in a 4-step process:

A manual capture of the carcass identifier number, primal weight, and the time of arrival at the SOL scales.
The time, the ID of the operator validating the meat cut image, the meat cut weight, the meat cut label, and a photo image were all captured at the EOL scales by the bespoke data capture software, which served as a form of data acquisition in the development of an Agri Data Warehouse.
The EOL operator identified the meat cut using the data capture interface, ensuring the correct image was stored to disk and linked to the appropriate database entry containing the variables captured at both EOL and SOL points.
After each meat cut was removed from the scales, an image of the empty scales was captured. This was done to help remove image noise (discussed later).

Topside cuts: five meat cut variations. (a) Cap Off Pear Off, PAD topside muscle (20001); (b) Cap off, Pear on topside muscle (20002); (c) Topside Heart muscle (20003); (d) Topside Bullet muscle (20004); and (e) Cap Off, Non-PAD, Blue Skin Only topside muscle (20010).

End of line (EOL): a user interface for data collection.

A trained operator identified the meat cuts for subsequent categorization; the cuts were categorized as:

cap off pear off, pad topside muscle
cap off, pear on topside muscle
topside heart muscle
topside bullet muscle
cap off, non-pad, blue skin only topside muscle

Meat cut ID	N	Meat cut description	Weight, x̄ ± S	Cut yield, %
20001	1,060	Cap Off, Pear Off, PAD	6.47 ± 1.17	55.11
20002	14	Cap Off, PAD On	8.87 ± 0.98	68.18
20003	2,132	Topside Heart PAD	5.87 ± 1.10	44.00
20004	2,085	Topside Bullet	1.40 ± 0.29	9.45
20010	2,696	Cap Off Non-PAD Blue Skin Only	7.82 ± 1.59	61.55

Dataset summary statistics.

The data collection period lasted 3 weeks. At the end of the data collection period, an analysis was conducted to determine whether there were any outlying weights; this was done by comparing the weight of the primal cut weighed on the SOL scales with the weight of the corresponding meat cut on the EOL scales. The ratio of each meat cut weighed on the EOL relative to the primal cut on the SOL is known as the product yield. Boning operators generally have target product yields which depend on the product specification. As the beef plant operator had a specification limit of 10.00% for each of the meat cuts used in these experiments, any record whose absolute difference between the actual product yield and the target product yield exceeded 10.00% was flagged as an outlier and removed from the dataset.
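The yield check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names, the target yields, and the sample weights are all hypothetical, and the 10.00% limit is read here as 10 percentage points of yield.

```python
from dataclasses import dataclass

SPEC_LIMIT = 10.0  # specification limit from the text, read as percentage points


@dataclass
class CutRecord:
    cut_id: str
    sol_weight: float  # primal weight at the start-of-line scales
    eol_weight: float  # trimmed cut weight at the end-of-line scales (same unit)


def product_yield(record: CutRecord) -> float:
    """Return the EOL weight as a percentage of the SOL primal weight."""
    return 100.0 * record.eol_weight / record.sol_weight


def is_outlier(record: CutRecord, target_yields: dict, limit: float = SPEC_LIMIT) -> bool:
    """Flag a record whose yield deviates from its target by more than the limit."""
    deviation = abs(product_yield(record) - target_yields[record.cut_id])
    return deviation > limit


# Hypothetical target yields and records, for illustration only.
TARGET_YIELDS = {"20001": 55.0, "20004": 10.0}

records = [
    CutRecord("20001", sol_weight=12.0, eol_weight=6.6),  # yield about 55%, kept
    CutRecord("20001", sol_weight=12.0, eol_weight=9.0),  # yield 75%, flagged
    CutRecord("20004", sol_weight=14.0, eol_weight=1.4),  # yield about 10%, kept
]

kept = [r for r in records if not is_outlier(r, TARGET_YIELDS)]
```

Because the target yield is looked up per meat cut ID, each cut type is compared against its own target rather than a single global value.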

Image preprocessing

When conducting image preprocessing, one generally aims to improve the prediction process by enhancing certain characteristics and/or blurring others. For this dataset, each meat cut image was accompanied by its associated background image. In order to remove distracting or confusing items (e.g., operator hands or small meat blobs), the background image was subtracted from the meat cut image. The resulting difference image was then converted to grayscale, and finally the meat cut was segmented from the scale using the Gaussian blur technique. This final set of original and grayscale images was used in the model construction.

Images at various stages of preprocessing: (a) the background image showing the scale on which the meat cuts were placed, (b) the scale with a meat cut on it, (c) the difference between images (a) and (b), (d) the grayscale conversion of image (c), and (e) the segmented meat cut.
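The preprocessing chain described above (background subtraction, grayscale conversion, blur-based segmentation) might be sketched as follows. This is a minimal pure-Python illustration, not the authors' code: the 3x3 kernel, the BT.601 grayscale weights, the threshold of 40, the toy 7x7 image, and all function names are assumptions chosen for the example. A real pipeline would more likely rely on an image-processing library.

```python
# 3x3 Gaussian kernel; the weights sum to 16.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def subtract_background(image, background):
    """Per-pixel, per-channel absolute difference of two RGB images."""
    return [
        [tuple(abs(a - b) for a, b in zip(px, bg)) for px, bg in zip(row, bg_row)]
        for row, bg_row in zip(image, background)
    ]


def to_grayscale(image):
    """Convert RGB pixels to luminance using the ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in image]


def gaussian_blur(gray):
    """Blur with the 3x3 kernel, clamping coordinates at the image border."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += KERNEL[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = total / 16.0
    return out


def segment(image, background, threshold=40.0):
    """Return a binary mask: 1 where the blurred difference exceeds the threshold."""
    blurred = gaussian_blur(to_grayscale(subtract_background(image, background)))
    return [[1 if v > threshold else 0 for v in row] for row in blurred]


# Toy data: a uniform "scale" background, a 3x3 meat cut, and one stray blob.
BG, MEAT = (200, 200, 200), (150, 40, 40)
background = [[BG] * 7 for _ in range(7)]
image = [row[:] for row in background]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = MEAT
image[5][5] = MEAT  # small isolated blob that the blur should suppress

mask = segment(image, background)
```

In this toy case the blur spreads the single stray pixel thinly enough that it falls below the threshold, while the 3x3 cut remains in the mask. The exact behaviour depends entirely on the assumed kernel and threshold.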

The frequency of meat cut 20002 was disproportionately low as it is not frequently harvested in this plant. Therefore, it was decided to use data augmentation to create artificial training samples for meat cut 20002 in order to improve the imbalanced nature of the dataset. As part of the augmentation process, transformations such as anticlockwise rotation, clockwise rotation, horizontal flip, vertical flip, noise addition, and blurring were implemented. These processes created 84 additional images for meat cut 20002 resulting in a final count of 98 images.
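As a rough sketch of how six transformations could turn 14 images into 84 additional samples, the example below applies each listed transformation once per image. That one-output-per-transformation scheme is an assumption that happens to match the arithmetic (14 x 6 = 84); the source does not state it. The 90-degree rotation angle, the noise level, the mean-filter blur, and the tiny random images are likewise illustrative choices.

```python
import random


def rotate_clockwise(img):
    """Rotate a 2D grayscale image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotate_anticlockwise(img):
    """Rotate a 2D grayscale image by 90 degrees anticlockwise."""
    return [list(row) for row in zip(*img)][::-1]


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def add_noise(img, rng, sigma=5.0):
    """Add Gaussian noise to every pixel, clipping to the 0..255 range."""
    return [[min(255.0, max(0.0, p + rng.gauss(0, sigma))) for p in row] for row in img]


def blur(img):
    """Apply a 3x3 mean filter, clamping coordinates at the border."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(sum(window) / 9.0)
        out.append(row)
    return out


def augment(img, rng):
    """Return one augmented copy per transformation (six in total)."""
    return [
        rotate_anticlockwise(img),
        rotate_clockwise(img),
        flip_horizontal(img),
        flip_vertical(img),
        add_noise(img, rng),
        blur(img),
    ]


rng = random.Random(0)
# Fourteen placeholder 3x4 images standing in for the meat cut 20002 samples.
originals = [
    [[float(rng.randrange(256)) for _ in range(4)] for _ in range(3)]
    for _ in range(14)
]
augmented = [new for img in originals for new in augment(img, rng)]
print(len(augmented), len(originals) + len(augmented))  # prints "84 98"
```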

Note: the authors of the dataset did not provide a way to match the weights to their corresponding images.

Summary #

Meat Cut Image Dataset (BEEF) is a dataset for classification and identification tasks. It is used in the food industry.

The dataset consists of 9017 images with 0 labeled objects. There are no pre-defined train/val/test splits in the dataset. Instead, the dataset can be split by meat cut into 5 classes: cap off, non-pad, blue skin only topside muscle (2740 images), topside bullet muscle (2188 images), topside heart muscle (2178 images), cap off pear off, pad topside muscle (1092 images), and cap off, pear on topside muscle (14 images), or into 2 image sets: beef (8212 images) and background (805 images). Additionally, every image is marked with product id, plant id, timestamp, and date tags. The dataset was released in 2021 by the Big Data and Analytics Research Centre, Ireland.
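Since no train/val/test splits are provided, one option is to build a class-stratified split from the meat cut tags. The sketch below is an assumption-laden illustration: the class counts are taken from the summary above, but the image IDs, the 20% validation fraction, and the function name are hypothetical.

```python
import random
from collections import defaultdict

# Per-class image counts as listed in the summary.
CLASS_COUNTS = {
    "cap off, non-pad, blue skin only topside muscle": 2740,
    "topside bullet muscle": 2188,
    "topside heart muscle": 2178,
    "cap off pear off, pad topside muscle": 1092,
    "cap off, pear on topside muscle": 14,
}


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split item IDs into (train, val), sampling each class separately.

    labels maps an item ID to its class name. Every class contributes at
    least one item to the validation set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item_id, label in labels.items():
        by_class[label].append(item_id)

    train, val = [], []
    for ids in by_class.values():
        ids.sort()          # fixed starting order so the seed is reproducible
        rng.shuffle(ids)
        n_val = max(1, round(len(ids) * val_fraction))
        val.extend(ids[:n_val])
        train.extend(ids[n_val:])
    return train, val


# Placeholder IDs; real IDs would come from the downloaded dataset.
labels = {
    f"{name}/{i}": name for name, count in CLASS_COUNTS.items() for i in range(count)
}
train, val = stratified_split(labels)
```

Splitting per class keeps the rare cap off, pear on class represented in both subsets, which a purely random split over 8212 images would not guarantee.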

Explore #

Meat Cut dataset has 9017 images. Click on one of the examples below or open "Explore" tool anytime you need to view dataset images with annotations. This tool has extended visualization capabilities like zoom, translation, objects table, custom filters and more. Hover the mouse over the images to hide or show annotations.

License #

Meat Cut Image Dataset (BEEF) is released under the CC BY 4.0 license.

Citation #

If you make use of the Meat Cut data, please cite the following reference:

@dataset{mccarren_2021_4704391,
  author       = {McCarren, Andrew and
                  Scriney, Michael and
                  Roantree, Mark and
                  Gualano, Leonardo and
                  Onibonoje, Oluwadurotimi and
                  Prakash, Satya},
  title        = {Meat Cut Image Dataset (BEEF)},
  month        = apr,
  year         = 2021,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.4704391},
  url          = {https://doi.org/10.5281/zenodo.4704391}
}

If you are happy with Dataset Ninja and use provided visualizations and tools in your work, please cite us:

@misc{ visualization-tools-for-meat-cut-dataset,
  title = { Visualization Tools for Meat Cut Dataset },
  type = { Computer Vision Tools },
  author = { Dataset Ninja },
  howpublished = { \url{ https://datasetninja.com/meat-cut } },
  url = { https://datasetninja.com/meat-cut },
  journal = { Dataset Ninja },
  publisher = { Dataset Ninja },
  year = { 2025 },
  month = { aug },
  note = { visited on 2025-08-25 },
}

Download #

The Meat Cut dataset can be downloaded in Supervisely format.

As an alternative, it can be downloaded with the dataset-tools package:

pip install --upgrade dataset-tools

… using the following Python code:

import dataset_tools as dtools

dtools.download(dataset='Meat Cut', dst_dir='~/dataset-ninja/')

Make sure not to overlook the Python code example available on the Supervisely Developer Portal. It will give you a clear idea of how to work with the downloaded dataset.

The data in original format can be downloaded here.

Disclaimer #

Our gal from the legal dep told us we need to post this:

Dataset Ninja provides visualizations and statistics for some datasets that can be found online and can be downloaded by a general audience. Dataset Ninja is not a dataset hosting platform and can only be used for informational purposes. The platform does not claim any rights to the original content, including images, videos, annotations and descriptions. Joint publishing is prohibited.

You take full responsibility when you use datasets presented at Dataset Ninja, as well as other information, including the visualizations and statistics we provide. You are in charge of compliance with any dataset license and all other permissions. You are required to visit the dataset's homepage and make sure that you can use it. In case of any questions, get in touch with us at hello@datasetninja.com.