This blog post was originally published at Unity Technologies’ website. It is reprinted here with the permission of Unity Technologies.

This Sample Home Interior Dataset is a small example of the synthetic data provided by our Unity Computer Vision Datasets offering. Our team of experts works with customers worldwide to generate custom datasets at any scale, tailored to their specific requirements.

This dataset is an example of what a consumer electronics or smart home technology company could capture using a camera or robotic system.

How do I get the sample dataset?

Click here to download this sample dataset.

The dataset features two types of homes (a multi-level townhome and a traditional two-story home) with major living areas, kitchens and bedrooms. It includes:

  • 1,000 synthetic RGB images
  • 1,000 synthetic images with instance segmentation
  • 1,000 synthetic images with semantic segmentation
  • JSON metadata for each RGB image to locate 2D and 3D bounding boxes
  • Unity Computer Vision Dataset Visualizer, a Python-based tool that allows you to visualize datasets created using Unity Computer Vision tools

What can be done with a full home synthetic dataset?

This type of dataset may be used for a range of computer vision applications for smart homes, such as:

  • Detecting animals, people and other items on smart cameras
  • Home security
  • Interior navigation for robots
  • Smart appliances (e.g., refrigerators) and smart home hubs
  • Interior design recommendations

How are the images labeled?

All objects in the Unity scene have known 2D coordinates in the image. The Unity Perception Package takes advantage of the predetermined environment layout to label objects in several ways at the same time as the images are generated. This example includes images that show 2D and 3D bounding boxes as well as instance and semantic segmentation.

RGB imagery

This RGB image shows the visual quality and accuracy possible with Unity Computer Vision Datasets.


RGB Image

2D bounding boxes

The 2D bounding boxes precisely locate and label objects in screen space for recognition.


2D bounding boxes

3D bounding boxes

The 3D bounding boxes provide precise world-space coordinates of object locations.


3D bounding boxes

Instance segmentation

This image shows the instance segmentation of the dataset, where every labeled object is uniquely identified.


Instance segmentation

Semantic segmentation

Semantic segmentation provides a clear and precise mask to identify every instance of a class of objects, such as types of chairs or furniture.


Semantic segmentation
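
To make the difference between the two segmentation types concrete, here is a minimal Python sketch. It is an illustration only: the tiny 4x4 mask, the instance IDs and the class names are all hypothetical and are not taken from the dataset. It shows how an instance mask, where every object has its own ID, collapses into a semantic mask, where objects of the same class share one label.

```python
# Hypothetical 4x4 instance mask: 0 = background, every other value = one object.
instance_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [3, 3, 0, 2],
    [3, 3, 0, 0],
]

# Hypothetical mapping from instance ID to class label.
instance_to_class = {1: "chair", 2: "chair", 3: "table"}


def to_semantic(mask, id_to_class, background="background"):
    """Replace each instance ID with the label of its class."""
    return [[id_to_class.get(pixel, background) for pixel in row] for row in mask]


semantic_mask = to_semantic(instance_mask, instance_to_class)

# Instance segmentation distinguishes the two chairs; semantic segmentation does not.
instance_ids = {p for row in instance_mask for p in row if p != 0}
classes = {c for row in semantic_mask for c in row if c != "background"}
print(len(instance_ids), "instances;", len(classes), "classes")
# prints "3 instances; 2 classes"
```

In the real dataset the masks are images rather than nested lists, so a loader would first need to decode the pixel values into IDs or labels.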

What parameters were used to help diversify the scene?

To generate a diverse dataset, multiple parameters are adjusted automatically to vary the scene. The house plans were selected to provide a wide variety of spatial configurations, windows and doors, ceiling heights, stairs, kitchen layouts and lighting.

Layout variations

Kitchen variations

Lighting variations

Afternoon sun angle | Sunset sun angle

Camera position variations

Centered and straight | Low and angled
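
The idea of adjusting parameters automatically can be sketched in a few lines of Python. This is a conceptual illustration, not the Perception Package's randomizer API (which runs inside Unity): the `VARIATIONS` table and the `sample_scene` function are hypothetical names, and the options are just the examples named in this post.

```python
import random

# Hypothetical parameter table built from the variations named in this post.
VARIATIONS = {
    "house": ["multi-level townhome", "traditional two-story home"],
    "lighting": ["afternoon sun angle", "sunset sun angle"],
    "camera": ["centered and straight", "low and angled"],
}


def sample_scene(rng):
    """Pick one option per parameter to describe a single frame."""
    return {name: rng.choice(options) for name, options in VARIATIONS.items()}


# A seeded generator keeps the sampled sequence reproducible.
rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(5)]
for scene in scenes:
    print(scene)
```

A real pipeline would likely sample continuous values (sun angle, camera height) rather than a few named presets, but the principle of drawing a fresh configuration for every frame is the same.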

What is included in the JSON file?

The dataset includes labeling based on the standard COCO format. The Perception Package used to generate this dataset also outputs JSON files with metadata describing camera intrinsics, formatting, and labeling, such as the overlay of 2D and 3D bounding boxes, plus object counts and other reference metrics. Details of the format can be found on our Synthetic Dataset Schema page.
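
As a rough illustration of working with COCO-style labels, the Python sketch below parses a small hand-written annotation and counts objects per category. The JSON text, file name and values are hypothetical; only the general COCO layout (`images`, `categories`, `annotations`, with `bbox` stored as `[x, y, width, height]`) is standard. The files in the sample dataset may carry additional or different fields, so check the Synthetic Dataset Schema page before relying on any key.

```python
import json
from collections import Counter

# Hypothetical, hand-written annotation in the general COCO layout.
coco_text = """
{
  "images": [{"id": 1, "file_name": "example.png", "width": 640, "height": 480}],
  "categories": [{"id": 1, "name": "chair"}, {"id": 2, "name": "table"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [100, 200, 50, 80]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [300, 210, 40, 70]},
    {"id": 12, "image_id": 1, "category_id": 2, "bbox": [150, 250, 200, 100]}
  ]
}
"""

coco = json.loads(coco_text)
names = {c["id"]: c["name"] for c in coco["categories"]}


def xywh_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


# Count labeled objects per category and list each box as corner coordinates.
counts = Counter(names[a["category_id"]] for a in coco["annotations"])
boxes = [(names[a["category_id"]], xywh_to_corners(a["bbox"]))
         for a in coco["annotations"]]

print(dict(counts))
for name, corners in boxes:
    print(name, corners)
```

The corner form is convenient for drawing overlays or computing overlap between boxes, which is why the conversion is shown alongside the counts.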

What can I do with the dataset visualizer?

Unity Computer Vision Dataset Visualizer is a Python-based tool that allows you to visualize and explore datasets created using Unity Computer Vision tools.

The main features include:

  • Ability to easily switch datasets by selecting a dataset folder
  • Grid view of all frames in the dataset with the ability to change zoom level
  • Individual frame view along with the JSON data associated with each frame
  • Labeler (ground truth) overlay on frames, both in the grid view and individual frame view. Supported types of ground truth include 2D and 3D bounding boxes, semantic and instance segmentation, and keypoints.
  • Ability to turn each type of ground truth overlay on or off

Requirements for Dataset Visualizer

  • Windows 10 or macOS
  • Chrome, Firefox or Safari 14 and newer (older versions of Safari are not supported)
  • Python 3.7 or 3.8 (Note: This application is not compatible with Python 3.9)
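
Before installing the visualizer, it can help to confirm that the active interpreter matches the Python requirement above. The snippet below is a small sketch of such a check; the function name `is_supported_python` is hypothetical and is not part of the visualizer.

```python
import sys

# Versions listed in the requirements above.
SUPPORTED = {(3, 7), (3, 8)}


def is_supported_python(version_info=sys.version_info):
    """Return True if the major.minor version is one of the supported ones."""
    return tuple(version_info[:2]) in SUPPORTED


if not is_supported_python():
    print("Warning: the visualizer requires Python 3.7 or 3.8; found "
          f"{sys.version_info[0]}.{sys.version_info[1]}")
```

Running the check inside a dedicated virtual environment makes it easier to keep a compatible interpreter alongside a newer system Python.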

How long did it take to generate this dataset?

Generating the actual 1,000-frame dataset took less than 8 minutes.

What can I do with this dataset?

This is a sample of a full dataset; it does not have the quantity or diversity of images required for a production machine learning model. It will give you the following:

  • Confidence in the visual quality possible with Unity-generated datasets
  • An example of the labels we can generate from the data
  • Ability to experiment with ingesting synthetic data into a machine learning pipeline

What types of assets were used to create this dataset?

The content team modeled the houses from scratch with all components set up for domain randomization, including interior/exterior doors, kitchen cabinetry and appliances, windows, and even the wall paint.

We licensed the furniture from content partners and prepared it for labeling and domain randomization.

How did these assets get into Unity?

These are purely virtual models. Most of the modeling was done in Autodesk Maya with materials created in Adobe Substance Designer.

How were these assets placed in the scene?

The house interior was assembled like a typical Unity scene as a collection of imported meshes and Prefabs. Furniture is placed using a structured grammar-based placement system, developed by Unity’s computer vision engineers. The project uses the High-Definition Render Pipeline (HDRP) with a combination of real-time and baked lighting.

Everything in the house was set up with labels and randomizers from the Unity Computer Vision Perception Package.


Door Prefab, showing components, materials, and randomization


Get started with synthetic data today! Learn more about Unity Computer Vision or contact us to talk to our computer vision experts about purchasing a custom dataset of your own.
