2D + 3D Capture Strategy

If you require both 2D and 3D data for your application, then this tutorial is for you.

We explain the pros and cons of different 2D-3D capture approaches, clarify some limitations, and explain how they affect cycle times. We also touch upon the difference between using external light and using the internal projector for 2D capture.

2D data: RGB image

3D data: Point Cloud

There are two different ways to get 2D data:

  1. Independently via camera.capture(Zivid::Settings2D).imageRGBA(), see 2D Image Capture Process.

  2. As part of a 3D capture, via camera.capture(Zivid::Settings).pointCloud().copyImageRGBA(), see Point Cloud Capture Process.

Which one to use, however, depends on your requirements.

Different scenarios will lead to different tradeoffs. We break it down by which data you require first. Then we will discuss tradeoffs of speed versus quality for the different scenarios.

About external light

Before we go into the different strategies, we have to discuss external light. The ideal light source for a 2D capture is strong and diffuse because this limits blooming effects. With the internal projector as the light source, blooming effects are almost inevitable. Mounting the camera at an angle significantly reduces this effect, but an external diffuse light source is still better. However, external light introduces noise in the 3D data, so ideally the external light should be turned off during 3D capture.

In addition to the reduction in blooming effects, strong external light can smooth out variations in exposure due to variations in ambient light. Typical sources for variations in ambient light:

  • changes in daylight (day/night, clouds, etc.)

  • doors opening and closing

  • ceiling light turned on and off

Such variations in exposure impact 3D and 2D data differently. The impact of exposure variations in the 2D data depends on the detection algorithm used. If segmentation is performed in 2D, then these variations may or may not impact segmentation performance. For the point cloud, you may find variations in point cloud completeness due to variations in noise.

This leads to the question: Should we use the projector for 2D?

How to decide whether or not to use the projector as the light for 2D capture.

Check out Optimizing Color Image for more information on that topic.

On Zivid One+ it is important to be aware of the switching penalty that occurs when the projector is on during 2D capture. This time penalty only applies if the 2D capture settings use brightness > 0. For more information, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.

If there is enough time between capture cycles, it is possible to mitigate the switching limitation by taking the penalty while the system is doing something else, for example while the robot is moving in front of the camera. In this tutorial, we call this a dummy capture.
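To illustrate the idea, here is a minimal, self-contained sketch that does not use the Zivid SDK. SimulatedOnePlus and runCycle are hypothetical names, and the timings are rough assumptions based on the approximate Zivid One+ "2D with projector" figures quoted later in this tutorial. The real switching behavior is more nuanced; the sketch only shows how a dummy 2D capture during idle time moves the penalty off the critical path.

```cpp
#include <cassert>

// Assumed, approximate timings in milliseconds (not measured by this code).
constexpr int base2DMs = 30;         // 2D capture when no switch is needed
constexpr int switchPenaltyMs = 400; // extra time when switching into 2D
constexpr int capture3DMs = 870;     // 3D capture, constant in this model

// Hypothetical, simplified model of a Zivid One+; not part of the Zivid SDK.
class SimulatedOnePlus
{
public:
    // A 2D capture with the projector pays the switching penalty
    // unless the previous capture was also a 2D capture.
    int capture2D()
    {
        const int cost = base2DMs + (m_lastWas2D ? 0 : switchPenaltyMs);
        m_lastWas2D = true;
        return cost;
    }

    int capture3D()
    {
        m_lastWas2D = false;
        return capture3DMs;
    }

private:
    bool m_lastWas2D = false; // assume the previous cycle ended with a 3D capture
};

// Returns the time the application waits for captures in one 2D -> 3D cycle.
// The dummy capture runs while the robot moves, so its cost is deliberately
// not added to the waiting time.
int runCycle(SimulatedOnePlus &camera, bool dummyCaptureWhileIdle)
{
    assert(base2DMs > 0 && capture3DMs > 0);
    int waitedMs = camera.capture2D();
    waitedMs += camera.capture3D();
    if(dummyCaptureWhileIdle)
    {
        camera.capture2D(); // result discarded; absorbs the next switch penalty
    }
    return waitedMs;
}
```

In this model, back-to-back cycles wait about 1300 ms each, while cycles that end with a dummy capture wait about 900 ms from the second cycle onward.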

Note

For Zivid 2 and 2+ there is no longer any switching limitation. The firmware adapts to the following three scenarios:

  1. Capture 2D in a loop, with the same settings

  2. Capture 3D in a loop, with the same settings

  3. Capture 2D and then 3D, or vice versa, with the same settings in each loop

Do not apply a dummy capture for Zivid 2 and 2+.

2D data before 3D data

If you, for example, perform segmentation in 2D and then later determine your picking pose, then you need 2D faster than 3D. The fastest way to get 2D data is via a separate 2D capture. Hence, if you need 2D data before 3D data then you should perform a separate 2D capture.

The following code sample shows how you can:

  1. Capture 2D

  2. Use 2D data and capture 3D in parallel

  3. [One+ only] If duty cycle permits: perform a dummy capture to absorb the penalty where it does not impact performance.

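// 1. Capture 2D first, so the 2D data is available as early as possible.
const auto frame2dAndCaptureTime = captureAndMeasure<Zivid::Frame2D>(camera, settings2D);
// 2. Use the 2D frame on a separate thread ...
std::future<Duration> userThread =
    std::async(std::launch::async, useFrame<Zivid::Frame2D>, std::ref(frame2dAndCaptureTime.frame));
// ... while the 3D capture and the use of the 3D frame run on this thread.
const auto frameAndCaptureTime = captureAndMeasure<Zivid::Frame>(camera, settings);
const auto processTime = useFrame(frameAndCaptureTime.frame);
// Wait for the 2D thread to finish and collect its duration.
const auto processTime2D = userThread.get();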
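
The same overlap pattern can be tried without a camera. The following is a minimal sketch in which FakeFrame, fakeCapture, fakeProcess, and runOverlappedCycle are hypothetical stand-ins that just sleep; they are not Zivid SDK calls, and the durations are arbitrary.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <string>
#include <thread>

// Hypothetical stand-in for a captured frame; not a Zivid type.
struct FakeFrame
{
    std::string kind; // "2D" or "3D"
};

// Pretends the camera is busy for the given duration.
FakeFrame fakeCapture(const std::string &kind, std::chrono::milliseconds duration)
{
    std::this_thread::sleep_for(duration);
    return FakeFrame{ kind };
}

// Pretends to run user code (for example segmentation) on a frame.
std::string fakeProcess(const FakeFrame &frame, std::chrono::milliseconds duration)
{
    std::this_thread::sleep_for(duration);
    return "processed " + frame.kind;
}

struct CycleResult
{
    std::string result2D;
    std::string result3D;
    std::chrono::milliseconds elapsed;
};

CycleResult runOverlappedCycle()
{
    using namespace std::chrono;
    const auto start = steady_clock::now();

    // 1. Capture 2D first.
    const FakeFrame frame2D = fakeCapture("2D", milliseconds{ 20 });

    // 2. Use the 2D frame on a worker thread while this thread captures 3D.
    std::future<std::string> worker =
        std::async(std::launch::async, fakeProcess, std::cref(frame2D), milliseconds{ 50 });
    const FakeFrame frame3D = fakeCapture("3D", milliseconds{ 50 });
    const std::string result3D = fakeProcess(frame3D, milliseconds{ 10 });

    // 3. Wait for the 2D work before frame2D goes out of scope.
    const std::string result2D = worker.get();

    assert(!result2D.empty() && !result3D.empty());
    const auto elapsed = duration_cast<milliseconds>(steady_clock::now() - start);
    return CycleResult{ result2D, result3D, elapsed };
}
```

Because the 2D processing overlaps with the 3D capture, the cycle takes roughly the 2D capture time plus the longer of the two parallel branches, rather than the sum of all four steps.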

Following is a table with the expected performance for the different scenarios.

Platform used: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz with NVIDIA GeForce RTX 3070

                     2D followed by 3D, back-to-back            2D followed by 3D, with delay [1]
Camera      Capture  2D with projector   2D without projector   2D with projector   2D without projector
Zivid One+  2D       ~430 ms [3]         ~70 ms                 ~30 ms              ~30 ms
Zivid One+  3D       ~870 ms [2]         ~330 ms                ~870 ms [2]         ~320 ms
Zivid One+  2D+3D    ~1300 ms [2]        ~400 ms                ~900 ms [2]         ~350 ms
Zivid 2     2D       ~15 ms              ~10 ms                 ~25 ms              ~25 ms
Zivid 2     3D       ~290 ms             ~290 ms                ~290 ms             ~290 ms
Zivid 2     2D+3D    ~300 ms             ~300 ms                ~300 ms             ~300 ms
Zivid 2+    2D       ~110 ms             ~100 ms                ~55 ms              ~55 ms
Zivid 2+    3D       ~220 ms             ~220 ms                ~220 ms             ~220 ms
Zivid 2+    2D+3D    ~300 ms             ~300 ms                ~260 ms             ~260 ms

Note

One+ only

A new capture will not start until all the processing on any ongoing capture (2D or 3D) on the same camera is completed. This affects the course of events when sequentially calling two captures with the same camera. See Performance limitation of sequential captures with the same camera for more information.

The following shows actual benchmark numbers. You will find a more extensive table at the bottom of the page.

2D data as part of how I use 3D data

In this case, we don’t have to get access to the 2D data before the 3D data. You always get 2D data as part of a 3D acquisition. Thus we only have to care about overall speed and quality.

Speed

For optimal speed, we simply rely on 3D acquisitions to provide good 2D data. There is no additional acquisition or separate capture for 2D data.

2D Quality

For optimal 2D quality, it is recommended to use a separate acquisition for 2D. This can be either a separate 2D capture, as discussed in the previous section, or an additional acquisition in an HDR capture with Color Mode set to UseFirstAcquisition. Adding a separate acquisition for color to a 3D HDR capture can be costly in terms of speed, because the exposure time of that acquisition is multiplied by the number of patterns for the chosen Vision Engine. This is a limitation that may be removed in future SDK updates.
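
As a rough illustration of that cost, the sketch below multiplies an exposure time by a pattern count. The function name extraColorAcquisitionTime and the example numbers are assumptions for illustration only; the actual number of patterns depends on the Vision Engine and is not stated here.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of the Zivid SDK.
// Estimates the time added by one extra acquisition in a 3D capture,
// assuming the exposure is repeated once per projected pattern.
std::chrono::microseconds extraColorAcquisitionTime(std::chrono::microseconds exposureTime, int patternCount)
{
    assert(patternCount > 0);
    return exposureTime * patternCount;
}
```

For example, with a hypothetical 10 ms exposure and 13 patterns, the extra acquisition would add about 130 ms of exposure, compared to a single 10 ms exposure in a separate 2D capture.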

Following is a table that shows what you can expect from the different configurations. At the end you will find a table showing actual measurements on different hardware.

Fast: Use 2D data from 3D capture. No special acquisition or settings for 2D.

Medium Fast: Separate 2D capture followed by 3D capture.

Slow: 3D capture with an additional acquisition with special settings for optimal 2D.

Platform used: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz with NVIDIA GeForce RTX 3070

                     Fast        Medium Fast                                            Slow
Camera      Capture  3D [4]      2D with projector + 3D    2D without projector + 3D    3D (+1 for 2D) [5]
Zivid One+  2D       N/A         ~30 ms                    ~30 ms                       N/A
Zivid One+  3D       ~365 ms     ~870 ms                   ~320 ms                      ~960 ms
Zivid One+  2D+3D    ~365 ms     ~900 ms                   ~350 ms                      ~960 ms
Zivid 2     2D       N/A         ~25 ms                    ~25 ms                       N/A
Zivid 2     3D       ~295 ms     ~290 ms                   ~290 ms                      ~375 ms
Zivid 2     2D+3D    ~295 ms     ~300 ms                   ~300 ms                      ~375 ms
Zivid 2+    2D       N/A         ~55 ms                    ~55 ms                       N/A
Zivid 2+    3D       ~170 ms     ~220 ms                   ~220 ms                      ~640 ms
Zivid 2+    2D+3D    ~170 ms     ~260 ms                   ~260 ms                      ~640 ms

The following shows actual benchmark numbers. You will find a more extensive table at the bottom of the page.

2D data after I have used the 3D data

You always get 2D data as part of a 3D acquisition. The table below shows 3D capture time examples.

However, optimizing for 3D quality does not always optimize for 2D quality. Thus, it might be a good idea to have a separate 2D capture after the 3D capture. Following is a table with the expected performance for the different scenarios.

Platform used: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz with NVIDIA GeForce RTX 3070

                     3D followed by 2D, back-to-back            3D followed by 2D, with delay [6]
Camera      Capture  2D with projector   2D without projector   2D with projector   2D without projector
Zivid One+  3D       ~870 ms [7]         ~330 ms                ~370 ms             ~370 ms
Zivid One+  2D       ~440 ms [8]         ~90 ms                 ~440 ms [8]         ~90 ms
Zivid One+  2D+3D    ~1250 ms [8]        ~300 ms                ~730 ms [8]         ~380 ms
Zivid 2     3D       ~290 ms             ~290 ms                ~290 ms             ~290 ms
Zivid 2     2D       ~20 ms              ~20 ms                 ~20 ms              ~20 ms
Zivid 2     2D+3D    ~290 ms             ~290 ms                ~290 ms             ~290 ms
Zivid 2+    3D       ~200 ms             ~200 ms                ~170 ms             ~170 ms
Zivid 2+    2D       ~110 ms             ~110 ms                ~110 ms             ~110 ms
Zivid 2+    2D+3D    ~300 ms             ~300 ms                ~270 ms             ~270 ms

Note

One+ only

A new capture will not start until all the processing on any ongoing capture (2D or 3D) on the same camera is completed. This affects the course of events when sequentially calling two captures with the same camera. See Performance limitation of sequential captures with the same camera for more information.

The following shows actual benchmark numbers. You will find a more extensive table at the bottom of the page.

Summary

The following tables list the different 2D+3D capture configurations. They show how the configurations are expected to perform relative to each other with respect to speed and quality. We separate them into two scenarios:

  • Cycle time is so fast that each capture cycle needs to happen right after the other.

  • Cycle time is slow enough to allow an additional dummy capture between each capture cycle (only relevant for Zivid One+). An additional capture can take up to 800 ms in the worst case. As a rule of thumb, a dummy capture saves time when the cycle time is greater than 2 seconds.
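
The rule of thumb above can be written as a small helper. This is only a sketch: CameraModel and shouldUseDummyCapture are hypothetical names, not part of the Zivid SDK, and the 2-second threshold is the rule of thumb from this tutorial rather than a hard limit.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical enumeration of the camera families discussed in this tutorial.
enum class CameraModel
{
    ZividOnePlus,
    Zivid2,
    Zivid2Plus
};

// Returns true when a dummy capture between cycles is expected to save time.
bool shouldUseDummyCapture(CameraModel model, std::chrono::milliseconds cycleTime)
{
    assert(cycleTime.count() >= 0);
    // Zivid 2 and 2+ have no switching limitation, so a dummy capture only costs time.
    if(model != CameraModel::ZividOnePlus)
    {
        return false;
    }
    // Rule of thumb: worthwhile for cycle times greater than 2 seconds.
    return cycleTime > std::chrono::milliseconds{ 2000 };
}
```

With this helper, a Zivid One+ running a 3-second cycle would use a dummy capture, while the same camera on a 1-second cycle, or any Zivid 2 or 2+, would not.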

Back-to-back captures

Capture Cycle (no wait between cycles)

Speed

2D-Quality

Zivid One+

Zivid 2

Zivid 2+

2D with Projector

2D without Projector

Faster

Fast

Best

3D ➞ 2D [10]

Slower

Faster

Faster

Fast

Best

2D ➞ 3D [9]

Slowest

Fast

Fast

Faster

Best

3D (w/2D [12])

Slow

Slowest

Slowest

Slowest

Best

3D

Fastest

Fastest

Fastest

Good

Note

One+ only

For back-to-back captures, it is not possible to avoid switching delay, unless the projector brightness is the same. However, in this case, it is better to set Color Mode to UseFirstAcquisition, see Color Mode.

Captures with low duty cycle

Capture Cycle (time to wait for next cycle)

Speed

2D-Quality

Zivid One+

Zivid 2

Zivid 2+

2D with Projector

2D without Projector

Faster

Fast

Best

3D ➞ 2D [10] ➞ 3D ([11])

Slow

Fast

Faster

Fast

Best

2D ➞ 3D [9] ➞ 2D ([11])

Slower

Faster

Fast

Faster

Best

3D (w/2D [12]) ➞ 3D (w/2D [12])

Slowest

Slowest

Slowest

Best

3D ➞ 3D

Fastest

Fastest

Fastest

Good

Following is a table showing actual measurements on different hardware. For the 3D capture we use the Fast Consumer Goods settings.

Zivid 2+ (Z2+ M130 Fast)

Zivid 2 (Z2 M70 Fast)

Zivid One+ (Z1+ M Fast)

Tip

To test different 2D-3D strategies on your PC, you can run the ZividBenchmark.cpp sample with settings loaded from YML files. Go to Samples, and select C++ for instructions.

Version History

SDK      Changes
2.11.0   Zivid 2 and 2+ now support concurrent processing and acquisition for 2D ➞ 3D and 3D ➞ 2D, and switching between capture modes has been optimized.