
A Dataset of 3D-Scanned Common Household Items

Many recent advances in computer vision and robotics rely on deep learning, but training deep learning models requires a wide variety of data to generalize to new scenarios. Historically, deep learning for computer vision has relied on datasets with millions of items gathered by web scraping, examples of which include ImageNet, Open Images, YouTube-8M, and COCO. However, the process of creating these datasets can be labor-intensive, and the datasets can still contain labeling errors that distort the perception of progress. Furthermore, this strategy does not readily generalize to arbitrary three-dimensional shapes or real-world robotic data.

Real-world robotic data collection is very useful, but difficult to scale and challenging to label (figure from BC-Z).

Simulating robots and environments using tools such as Gazebo, MuJoCo, and Unity can mitigate many of the inherent limitations of these datasets. However, simulation is only an approximation of reality: handcrafted models built from polygons and primitives often correspond poorly to real objects. Even if a scene is built directly from a 3D scan of a real environment, the movable objects in that scan will act like fixed background scenery and will not respond the way real-world objects would. Due to these challenges, there are few large libraries with high-quality models of 3D objects that can be incorporated into physical and visual simulations to provide the variability needed for deep learning.

In “Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items”, presented at ICRA 2022, we describe our efforts to address this need by creating the Scanned Objects dataset, a curated collection of over 1000 3D-scanned common household items. The Scanned Objects dataset is usable in tools that read Simulation Description Format (SDF) models, including the Gazebo and PyBullet robotics simulators. Scanned Objects is hosted on Open Robotics, an open-source hosting environment for models compatible with the Gazebo simulator.
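Because the models are plain SDF, any SDF-aware tool can read them. As a minimal sketch of what that format looks like to a consumer, the snippet below parses a model description and lists the meshes it references; the inline SDF string is a made-up stand-in for a real downloaded model, not an actual file from the dataset.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for a model.sdf file from the dataset.
SDF = """<?xml version="1.0"?>
<sdf version="1.6">
  <model name="example_mug">
    <link name="body">
      <visual name="visual">
        <geometry><mesh><uri>meshes/model.obj</uri></mesh></geometry>
      </visual>
    </link>
  </model>
</sdf>"""

root = ET.fromstring(SDF)
name = root.find("model").get("name")            # model name attribute
meshes = [uri.text for uri in root.iter("uri")]  # referenced mesh files
print(name, meshes)  # → example_mug ['meshes/model.obj']
```

In PyBullet, such a file can typically be loaded directly with `pybullet.loadSDF("model.sdf")`, and Gazebo picks models up from directories on its model path.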

History

Robotics researchers within Google began scanning objects in 2011, creating high-fidelity 3D models of common household objects to help robots recognize and grasp things in their environments. However, it became apparent that 3D models have many uses beyond object recognition and robotic grasping, including scene construction for physical simulations and 3D object visualization for end-user applications. Therefore, this Scanned Objects project was expanded to bring 3D experiences to Google at scale, collecting a large number of 3D scans of household objects through a process that is more efficient and cost-effective than traditional commercial-grade product photography.

Scanned Objects was an end-to-end effort, involving innovations at nearly every stage of the process, including curation of objects at scale for 3D scanning, the development of novel 3D scanning hardware, efficient 3D scanning software, fast 3D rendering software for quality assurance, and specialized frontends for web and mobile viewers. We also conducted human-computer interaction studies to create effective experiences for interacting with 3D objects.

Objects that were acquired for scanning.

These object models proved useful in 3D visualizations for Everyday Robots, which used the models to bridge the sim-to-real gap for training, work later published as RetinaGAN and RL-CycleGAN. Building on these earlier 3D scanning efforts, in 2019 we began preparing an external version of the Scanned Objects dataset and transforming the previous set of 3D images into graspable 3D models.

Object Scanning

To create high-quality models, we built a scanning rig to capture images of an object from multiple directions under controlled and carefully calibrated conditions. The system consists of two machine vision cameras for shape detection, a DSLR camera for high-quality HDR color frame extraction, and a computer-controlled projector for pattern recognition. The scanning rig uses a structured light technique that infers a 3D shape from camera images of patterns of light projected onto an object.

The scanning rig used to capture 3D models.
A shoe being scanned (left). Images are captured from multiple directions with different patterns of light and color. A shadow passing over an object (right) illustrates how a 3D shape can be captured with an off-axis view of a shadow edge.
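The geometry behind structured light can be sketched briefly: each projected pattern edge defines a plane of light in space, and a surface point is recovered by intersecting the camera ray through a pixel with the light plane that illuminated it. The snippet below is a hedged, minimal illustration of that intersection; the real rig's calibration and pattern decoding are far more involved, and all the numbers are made up.

```python
import numpy as np

def triangulate(ray_origin, ray_dir, plane_point, plane_normal):
    """Intersect a camera ray with a projector light plane."""
    t = np.dot(plane_point - ray_origin, plane_normal) / np.dot(ray_dir, plane_normal)
    return ray_origin + t * ray_dir

# Camera at the origin looking down +Z; a light plane 1 m away, tilted 45 deg.
cam_origin = np.zeros(3)
ray = np.array([0.0, 0.0, 1.0])           # ray through the image center
plane_pt = np.array([0.0, 0.0, 1.0])      # a point on the light plane
plane_n = np.array([1.0, 0.0, 1.0])
plane_n = plane_n / np.linalg.norm(plane_n)

point = triangulate(cam_origin, ray, plane_pt, plane_n)
print(point)  # → [0. 0. 1.]
```

Repeating this for every decoded pixel yields a dense point cloud, which is then fused across views into a single mesh.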

Simulation Model Conversion

The early internal scanned models used protocol buffer metadata, high-resolution visuals, and formats that were not suitable for simulation. For some objects, physical properties, such as mass, were captured by weighing the objects at scanning time, but surface properties, such as friction or deformation, were not represented.

So, following data collection, we built an automated pipeline to solve these issues and enable the use of scanned models in simulation systems. The automated pipeline filters out invalid or duplicate objects, automatically assigns object names using text descriptions of the objects, and eliminates object mesh scans that do not meet simulation requirements. Next, the pipeline estimates simulation properties (e.g., mass and moment of inertia) from shape and volume, constructs collision volumes, and downscales the model to a usable size. Finally, the pipeline converts each model to SDF format, creates thumbnail images, and packages the model for use in simulation systems.
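One of these steps, estimating physical properties from shape and volume, can be sketched as follows. The mesh volume comes from the divergence theorem (summing signed tetrahedra between each triangle and the origin); the density value and the bounding-box inertia approximation are illustrative assumptions, not the pipeline's actual constants.

```python
import numpy as np

def mesh_volume(vertices, faces):
    """Volume of a closed triangle mesh via signed tetrahedra."""
    v = vertices[faces]                                   # (F, 3, 3)
    signed = np.einsum("ij,ij->i", v[:, 0], np.cross(v[:, 1], v[:, 2]))
    return abs(signed.sum()) / 6.0

# A unit right tetrahedron as a stand-in mesh (true volume = 1/6 m^3).
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], float)
faces = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])

volume = mesh_volume(vertices, faces)
mass = 500.0 * volume                                     # assumed density, kg/m^3
extent = vertices.max(axis=0) - vertices.min(axis=0)      # bounding box
ixx = mass / 12.0 * (extent[1] ** 2 + extent[2] ** 2)     # box approximation
print(round(volume, 4), round(mass, 2))  # → 0.1667 83.33
```

The collision-volume and mesh-downscaling steps are standard geometry-processing operations (convex decomposition and decimation) and are omitted here.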

The pipeline filters models that are not suitable for simulation, generates collision volumes, computes physical properties, downsamples meshes, generates thumbnails, and packages them all for use in simulation systems.
A set of Scanned Objects models rendered in Blender.

The output of this pipeline is a simulation model in an appropriate format with a name, mass, friction, inertia, and collision information, along with searchable metadata in a public interface compatible with our open-source hosting on Open Robotics' Gazebo.

The output objects are represented as SDF models that refer to Wavefront OBJ meshes averaging 1.4 MB per model. Textures for these models are in PNG format and average 11.2 MB. Together, these provide high-resolution shape and texture.
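Concretely, each packaged model pairs an SDF description with its OBJ mesh (the PNG texture is typically referenced from the OBJ's companion material file rather than the SDF itself). The fragment below sketches that structure; element names follow the SDF specification, but the name, paths, and mass value are illustrative, not copied from a real model.

```xml
<sdf version="1.6">
  <model name="example_object">
    <link name="link">
      <inertial>
        <mass>0.35</mass>  <!-- illustrative value -->
      </inertial>
      <collision name="collision">
        <geometry><mesh><uri>meshes/model.obj</uri></mesh></geometry>
      </collision>
      <visual name="visual">
        <geometry><mesh><uri>meshes/model.obj</uri></mesh></geometry>
      </visual>
    </link>
  </model>
</sdf>
```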


The Scanned Objects dataset contains 1030 scanned objects and their associated metadata, totaling 13 GB, licensed under the CC-BY 4.0 License. Because these models are scanned rather than modeled by hand, they realistically reflect real object properties, not idealized recreations, reducing the difficulty of transferring learning from simulation to the real world.

Input views (left) and reconstructed shape and texture from two novel views on the right (figure from Differentiable Stereopsis).
Visualized action scoring predictions over three real-world 3D scans from the Replica dataset and Scanned Objects (figure from Where2Act).

The Scanned Objects dataset has already been used in over 25 papers across as many projects, spanning computer vision, computer graphics, robotic manipulation, robotic navigation, and 3D shape processing. Most projects used the dataset to provide synthetic training data for learning algorithms. For example, the Scanned Objects dataset was used in Kubric, an open-sourced generator of scalable datasets for use in over a dozen vision tasks, and in LAX-RAY, a system for searching shelves with lateral access X-rays to automate the mechanical search for occluded objects on shelves.

We hope that the Scanned Objects dataset will be used by more robotics and simulation researchers in the future, and that the example set by this dataset will inspire other owners of 3D model repositories to make them available to researchers everywhere. If you would like to try it yourself, head to Gazebo and start browsing!


The authors thank the Scanned Objects team, including Peter Anderson-Sprecher, J.J. Blumenkranz, James Bruce, Ken Conley, Katie Dektar, Charles DuHadway, Anthony Francis, Chaitanya Gharpure, Topraj Gurung, Kristy Headley, Ryan Hickman, John Isidoro, Sumit Jain, Brandon Kinman, Greg Kline, Mach Kobayashi, Nate Koenig, Kai Kohlhoff, James Kuffner, Thor Lewis, Mike Licitra, Lexi Martin, Julian (Mac) Mason, Rus Maxham, Pascal Muetschard, Kannan Pashupathy, Barbara Petit, Arshan Poursohi, Jared Russell, Matt Seegmiller, John Sheu, Joe Taylor, Vincent Vanhoucke, Josh Weaver, and Tommy McHugh.

Special thanks go to Krista Reymann for organizing this project, helping write the paper, and editing this blog post, James Bruce for the scanning pipeline design, and Pascal Muetschard for maintaining the database of object models.


