2D + 3D Capture Strategy
If you require both 2D and 3D data for your application, then this tutorial is for you.
We explain and emphasize the pros and cons of different 2D-3D capture approaches, clarify some limitations, and explain how they affect cycle times. We touch upon the difference between external light for 2D and using the internal projector.
- 2D data
RGB image
- 3D data
There are two different ways to get 2D data:
- Independently, via camera.capture(Zivid::Settings2D).imageRGBA(), see 2D Image Capture Process.
- As part of a 3D capture, via camera.capture(Zivid::Settings).pointCloud().copyImageRGBA(), see Point Cloud Capture Process.
Which one to use, however, depends on your requirements.
Different scenarios will lead to different tradeoffs. We break it down by which data you require first. Then we will discuss tradeoffs of speed versus quality for the different scenarios.
I need 2D data before 3D data
I only need 2D data after I have used the 3D data
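As a rough orientation, the scenarios above can be mapped to a capture order. The following is a minimal sketch of that mapping, based on the recommendations given later in this tutorial. The enum `Need2D` and the function `suggestCaptureOrder` are hypothetical names used for illustration only; they are not part of the Zivid SDK.

```cpp
#include <string>

// Hypothetical names for illustration; none of these exist in the Zivid SDK.
enum class Need2D
{
    before3D, // 2D data is needed before the 3D data
    with3D,   // 2D data is used together with the 3D data
    after3D   // 2D data is only needed after the 3D data has been used
};

// Returns a short description of the capture order this tutorial suggests.
// prioritize2DQuality selects a dedicated 2D acquisition over the 2D image
// that comes with every 3D capture.
std::string suggestCaptureOrder(Need2D need, bool prioritize2DQuality)
{
    switch(need)
    {
        case Need2D::before3D: return "separate 2D capture, then 3D capture";
        case Need2D::with3D:
            return prioritize2DQuality
                       ? "separate 2D capture, or 3D capture with a dedicated first acquisition (UseFirstAcquisition)"
                       : "3D capture only, 2D image taken from the 3D capture";
        case Need2D::after3D:
            return prioritize2DQuality ? "3D capture, then separate 2D capture"
                                       : "3D capture only, 2D image taken from the 3D capture";
    }
    return "unknown";
}
```

For example, `suggestCaptureOrder(Need2D::before3D, false)` returns "separate 2D capture, then 3D capture". Which option is best in practice also depends on the camera model and the cycle time, as discussed in the sections below.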
About external light
Before we go into the different strategies we have to discuss external light. The ideal light source for a 2D capture is strong and diffuse because this limits the blooming effects. With the internal projector as the light source, the blooming effects are almost inevitable. Mounting the camera at an angle significantly reduces this effect, but still an external diffuse light source is better. External light introduces noise in the 3D data, so one should ideally turn the external light off during 3D capture.
In addition to reducing blooming effects, strong external light can smooth out variations in exposure caused by variations in ambient light. Typical sources of variations in ambient light:
- changes in daylight (day/night, clouds, etc.)
- doors opening and closing
- ceiling light turned on and off
Such variations in exposure impact 3D and 2D data differently. The impact of exposure variations in the 2D data depends on the detection algorithm used. If segmentation is performed in 2D, then these variations may or may not impact segmentation performance. For the point cloud, you may find variations in point cloud completeness due to variations in noise.
This leads to the question: Should we use the projector for 2D?
Check out Optimizing Color Image for more information on that topic.
On Zivid One+ it is important to be aware of the switching penalty that occurs when the projector is on during 2D capture. For more information, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
If there is enough time between capture cycles, the switching limitation can be mitigated: we take on the penalty while the system is doing something else, for example while the robot is moving in front of the camera. In this tutorial, we call this a dummy capture.
2D data before 3D data
If you, for example, perform segmentation in 2D and then later determine your picking pose, then you need 2D faster than 3D. The fastest way to get 2D data is via a separate 2D capture. Hence, if you need 2D data before 3D data then you should perform a separate 2D capture.
The following code sample shows how you can:
- Capture 2D
- Use 2D data and capture 3D in parallel
- If duty cycle permits: perform a dummy capture to absorb the penalty where it does not impact performance.
    auto camera = zivid.connectCamera();

    // Absorb the switching penalty before the time-critical part of the cycle
    dummyCapture2D(camera, settings2D);

    // Capture 2D first, since the 2D data is needed before the 3D data
    const auto frame2dAndCaptureTime = captureAndMeasure<Zivid::Frame2D>(camera, settings2D);

    std::cout
        << "Starting 3D capture in current thread and using 2D data in separate thread, such that the two happen in parallel"
        << std::endl;

    // Use the 2D frame in a separate thread ...
    std::future<void> userThread =
        std::async(std::launch::async, useFrame<Zivid::Frame2D>, std::ref(frame2dAndCaptureTime.frame));

    // ... while the 3D capture runs in the current thread
    const auto frameAndCaptureTime = captureAndMeasure<Zivid::Frame>(camera, settings);
    useFrame(frameAndCaptureTime.frame);

    std::cout << "Wait for usage of 2D frame to finish" << std::endl;
    userThread.get();

    printCaptureFunctionReturnTime(frame2dAndCaptureTime.captureTime, frameAndCaptureTime.captureTime);
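The sample above relies on a helper, captureAndMeasure, whose definition is not shown here. The following is a minimal sketch of how such a timing helper could look. It is an assumption, not the helper from the Zivid sample: the names `measureCapture`, `FrameAndCaptureTime`, and `FakeFrame` are hypothetical, and the capture is passed in as a callable so that the sketch compiles without the Zivid SDK.

```cpp
#include <chrono>
#include <utility>

// Pairs a captured frame with the time the capture call took to return.
template<typename FrameT>
struct FrameAndCaptureTime
{
    FrameT frame;
    std::chrono::microseconds captureTime;
};

// Invokes capture() and measures how long the call takes to return.
// capture is any callable returning a frame by value, for example a lambda
// that wraps a call to camera.capture(...).
template<typename CaptureFunction>
auto measureCapture(CaptureFunction &&capture) -> FrameAndCaptureTime<decltype(capture())>
{
    const auto before = std::chrono::steady_clock::now();
    auto frame = capture();
    const auto after = std::chrono::steady_clock::now();
    return { std::move(frame), std::chrono::duration_cast<std::chrono::microseconds>(after - before) };
}

// Stand-in frame type so that the sketch compiles without the Zivid SDK.
struct FakeFrame
{
    int id;
};
```

A call such as `measureCapture([] { return FakeFrame{ 7 }; })` returns the frame together with the elapsed time. `std::chrono::steady_clock` is used because it is monotonic, which makes it suitable for measuring intervals.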
Following is a table with the expected performance for the different scenarios.
| Camera | Capture | 2D followed by 3D, back-to-back: 2D with projector | 2D followed by 3D, back-to-back: 2D without projector | 2D followed by 3D, with delay [1]: 2D with projector | 2D followed by 3D, with delay [1]: 2D without projector |
|---|---|---|---|---|---|
| One+ | 2D | ~420 ms [3] | ~70 ms | ~30 ms | ~30 ms |
| One+ | 3D | ~870 ms [2] | ~320 ms | ~870 ms [2] | ~320 ms |
| Two | 2D | ~50 ms | ~35 ms | ~30 ms | ~25 ms |
| Two | 3D | ~140 ms | ~130 ms | ~140 ms | ~130 ms |
Note
A new capture will not start until all the processing on any ongoing capture (2D or 3D) on the same camera is completed. This affects the course of events when sequentially calling two captures with the same camera. See Performance limitation of sequential captures with the same camera for more information.
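The effect described in the note can be expressed as a simple timeline rule: the next capture starts at whichever is later, the time it was requested or the time processing of the previous capture finished. The sketch below illustrates only this rule. The function names are hypothetical, and the timestamps are plain inputs rather than values obtained from the SDK.

```cpp
#include <algorithm>
#include <chrono>

using Ms = std::chrono::milliseconds;

// Earliest time the camera can begin a new capture, given when the capture was
// requested and when processing of the previous capture on the same camera ends.
// Both arguments are timestamps relative to a common starting point.
Ms nextCaptureStart(Ms requestedAt, Ms previousProcessingDoneAt)
{
    return std::max(requestedAt, previousProcessingDoneAt);
}

// Time the new capture has to wait before it can start.
Ms waitBeforeNextCapture(Ms requestedAt, Ms previousProcessingDoneAt)
{
    return nextCaptureStart(requestedAt, previousProcessingDoneAt) - requestedAt;
}
```

For example, if a capture is requested at 50 ms while the previous capture is still being processed until 120 ms, the new capture waits 70 ms. If it is requested at 200 ms, it starts immediately.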
2D data as part of how I use 3D data
In this case, we don’t have to get access to the 2D data before the 3D data. You always get 2D data as part of a 3D acquisition. Thus we only have to care about overall speed and quality.
Speed
For optimal speed, we simply rely on 3D acquisitions to provide good 2D data. There is no additional acquisition or separate capture for 2D data.
2D Quality
For optimal 2D quality, it is recommended to use a separate acquisition for 2D. This can be either a separate 2D capture, as discussed in the previous section, or an HDR capture with UseFirstAcquisition. Adding a separate color acquisition to a 3D HDR capture can be costly in terms of speed, because the exposure time is multiplied by the number of patterns for the chosen Vision Engine. This limitation may be removed in future SDK updates.
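The cost of the extra acquisition can be estimated with a one-line calculation. The sketch below assumes only what is stated above, namely that the exposure time is repeated once per pattern. The function name is hypothetical, and the pattern count is an input that you have to look up for your Vision Engine; no specific value is implied here.

```cpp
#include <cassert>
#include <chrono>

using Us = std::chrono::microseconds;

// Approximate exposure cost of one extra acquisition inside a 3D capture:
// the exposure time is repeated once per projected pattern.
// patternCount depends on the chosen Vision Engine and is an input here,
// not a value taken from the SDK.
Us extraAcquisitionExposure(Us exposureTime, int patternCount)
{
    assert(patternCount > 0);
    return exposureTime * patternCount;
}
```

With an exposure time of 10000 µs and an illustrative count of 13 patterns, the extra acquisition adds about 130 ms of exposure, compared to 10 ms for the same exposure in a separate 2D capture.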
Following is a table that shows what you can expect from the different configurations.
At the end you will find a table showing actual measurements on different hardware with the Fast Consumer Goods settings for Zivid Two M70 (Z2 M70 Fast).
- Fast
Use 2D data from 3D capture. No special acquisition or settings for 2D.
- Medium Fast
Separate 2D capture followed by 3D capture.
- Slow
3D capture with an additional acquisition with special settings for optimal 2D.
| Camera | Capture | Fast: 3D [4] | Medium Fast: 2D with projector + 3D | Medium Fast: 2D without projector + 3D | Slow: 3D (+1 for 2D) [5] |
|---|---|---|---|---|---|
| One+ | 2D | N/A | ~420 ms | ~30 ms | N/A |
| One+ | 3D | ~340 ms | ~870 ms | ~320 ms | ~910 ms |
| Two | 2D | N/A | ~40 ms | ~40 ms | N/A |
| Two | 3D | ~280 ms | ~330 ms | ~290 ms | ~370 ms |

[4] These are Fast Consumer Goods settings for Zivid Two M70 (Z2 M70 Fast).
[5] These are Fast Consumer Goods settings for Zivid Two M70 (Z2 M70 Fast) with one additional acquisition for 2D with or without projector. The 2D acquisition is placed as the first acquisition and we set UseFirstAcquisition.
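To compare the configurations, it can help to add up the 2D and 3D times for each column of the table above. The sketch below does this for the Zivid Two rows. It assumes that the separate 2D capture and the 3D capture run strictly one after the other and that nothing else is on the critical path; the struct and function names are hypothetical.

```cpp
#include <cassert>

// Approximate capture times in milliseconds, copied from the table above.
// A value of 0 for separate2DMs means there is no separate 2D capture (N/A in the table).
struct ApproxCaptureTimes
{
    int separate2DMs;
    int capture3DMs;
};

// Sum of the 2D and 3D capture times, assuming the captures run strictly one after the other.
int totalCaptureMs(const ApproxCaptureTimes &times)
{
    assert(times.separate2DMs >= 0 && times.capture3DMs >= 0);
    return times.separate2DMs + times.capture3DMs;
}

// Zivid Two rows of the table.
const ApproxCaptureTimes twoFast{ 0, 280 };
const ApproxCaptureTimes twoMediumWithProjector{ 40, 330 };
const ApproxCaptureTimes twoMediumWithoutProjector{ 40, 290 };
const ApproxCaptureTimes twoSlow{ 0, 370 };
```

Under these assumptions the totals for Zivid Two are about 280 ms (Fast), 370 ms and 330 ms (Medium Fast with and without projector), and 370 ms (Slow). The separate 2D capture still delivers the 2D data earlier, which matters when the 2D data is needed before the 3D data.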
2D data after I have used the 3D data
You always get 2D data as part of a 3D acquisition. The table below shows 3D capture time examples.
| Consumer Goods Settings | Zivid One+: Intel UHD 750 (High-end [6]) | Zivid One+: Intel UHD G1 (Low-end [7]) | Zivid One+: NVIDIA 3070 (High-end [8]) | Zivid Two: Intel UHD 750 (High-end [6]) | Zivid Two: Intel UHD G1 (Low-end [7]) | Zivid Two: NVIDIA 3070 (High-end [8]) |
|---|---|---|---|---|---|---|
| | NA | NA | NA | 556 (±9) ms | 947 (±379) ms | 280 (±4) ms |
| | NA | NA | NA | 558 (±9) ms | 949 (±394) ms | 281 (±5) ms |
| | 586 (±2) ms | 903 (±4) ms | 342 (±1) ms | NA | NA | NA |
| | 586 (±2) ms | 901 (±4) ms | 342 (±1) ms | NA | NA | NA |

[6] High-end machine with GPU: Intel UHD Graphics 750 (ID:0x4C8A) and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
[7] Low-end machine with GPU: Intel UHD Graphics G1 (ID:0x8A56) and CPU: Intel(R) Core(TM) i3-1005G1 CPU @ 1.20GHz, 1GbE
[8] High-end machine with GPU: NVIDIA GeForce RTX 3070 and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
However, optimizing for 3D quality does not always optimize for 2D quality. Thus, it might be a good idea to have a separate 2D capture after the 3D capture. Following is a table with the expected performance for the different scenarios.
| Camera | Capture | 3D followed by 2D, back-to-back: 2D with projector | 3D followed by 2D, back-to-back: 2D without projector | 3D followed by 2D, with delay [9]: 2D with projector | 3D followed by 2D, with delay [9]: 2D without projector |
|---|---|---|---|---|---|
| One+ | 3D | ~680 ms [10] | ~120 ms | ~120 ms | ~120 ms |
| One+ | 2D | ~380 ms [11] | ~30 ms | ~380 ms [11] | ~30 ms |
| Two | 3D | ~140 ms | ~140 ms | ~130 ms | ~110 ms |
| Two | 2D | ~40 ms | ~40 ms | ~40 ms | ~40 ms |

[9] Duty cycle is low enough that we have time to execute a dummy capture.
[10] 350ms 2D ➞ 3D penalty on One+ with projector on during 2D capture, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
[11] 650ms 3D ➞ 2D penalty on One+ with projector on during 2D capture, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
Note
A new capture will not start until all the processing on any ongoing capture (2D or 3D) on the same camera is completed. This affects the course of events when sequentially calling two captures with the same camera. See Performance limitation of sequential captures with the same camera for more information.
Summary
The following tables list the different 2D+3D capture configurations and show how they are expected to perform relative to each other with respect to speed and quality. We separate into two scenarios:
Cycle time is so fast that each capture cycle needs to happen right after the other.
Cycle time is slow enough to allow an additional dummy capture between each capture cycle. An additional capture can take up to 800ms in the worst case. A rule of thumb is that for cycle time greater than 2 seconds a dummy capture saves time.
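Whether a dummy capture fits into a given cycle can be checked with a simple budget calculation. The sketch below is one possible reading of the rule of thumb above, not an SDK feature: it assumes the dummy capture must fit entirely into the idle time of the cycle, and it uses the 800 ms worst case stated above. The function name is hypothetical.

```cpp
#include <cassert>
#include <chrono>

using Ms = std::chrono::milliseconds;

// Worst-case duration of one additional (dummy) capture, as stated above.
const Ms dummyCaptureWorstCase{ 800 };

// True when the idle time left in a cycle can hold a worst-case dummy capture.
// cycleTime: time from the start of one capture cycle to the start of the next.
// captureTime: time the real 2D and 3D captures occupy within that cycle.
bool dummyCaptureFits(Ms cycleTime, Ms captureTime)
{
    assert(captureTime <= cycleTime);
    return cycleTime - captureTime >= dummyCaptureWorstCase;
}
```

For example, with a 2-second cycle and about 900 ms of real capture time (roughly a 2D capture with projector followed by a 3D capture on One+ with delay), 1100 ms remain, so a dummy capture fits. With a 1.2-second cycle only 300 ms remain, and it does not.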
Back-to-back captures
| Capture Cycle (no wait between cycles) | Speed: One+ | Speed: Two | 2D-Quality |
|---|---|---|---|
| 3D ➞ 2D [13] | Slowest | Fast | Best |
| 2D ➞ 3D [12] | Slow | Faster | Best |
| 3D (w/2D [15]) | Fast | Fast | Better |
| 3D | Fastest | Fastest | Good |
For back-to-back captures, it is not possible to avoid the switching delay unless the projector brightness is the same. However, in this case, it is better to set Color Mode to UseFirstAcquisition, see Color Mode.
Captures with low duty cycle
| Capture Cycle (time to wait for next cycle) | Speed: One+ | Speed: Two | 2D-Quality |
|---|---|---|---|
| | Slow | Fast | Best |
| | Good | Faster | Best |
| | Fast | Fast | Best |
| 3D ➞ 3D | Fastest | Fastest | Good |
[12] 350ms 2D ➞ 3D penalty on One+ with projector on during 2D capture, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
[13] 650ms 3D ➞ 2D penalty on One+ with projector on during 2D capture, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
[14] (Applies to One+) For low duty cycle applications we can append a dummy 2D/3D capture to move the 3D➞2D/2D➞3D switching penalty away from the critical section.
[15] 3D with 2D acquisition as first capture, UseFirstAcquisition, see Color Mode.
Following is a table showing actual measurements on different hardware with the Fast Consumer Goods settings for Zivid Two M70 (Z2 M70 Fast).
Note
We use the Fast Consumer Goods settings for Zivid Two (Z2 M70 Fast) and Zivid One+ (Z1+ M Fast).
| Projector [16] | Back-to-back [17] | Capture | Zivid One+: Intel UHD 750 (High-end [18]) | Zivid One+: Intel UHD G1 (Low-end [19]) | Zivid One+: NVIDIA 3070 (High-end [20]) | Zivid Two: Intel UHD 750 (High-end [18]) | Zivid Two: Intel UHD G1 (Low-end [19]) | Zivid Two: NVIDIA 3070 (High-end [20]) |
|---|---|---|---|---|---|---|---|---|
| Capture 2D and then 3D | | | | | | | | |
| ✓ | ✓ | 2D | 418 (±0.7) ms | 420 (±0.8) ms | 419 (±0.6) ms | 51 (±1) ms | 52 (±7) ms | 50 (±1) ms |
| ✓ | ✓ | 3D | 1110 (±7) ms | 1439 (±6) ms | 865 (±9) ms | 607 (±3) ms | 1024 (±426) ms | 325 (±2) ms |
| ✓ | | 2D | 23 (±0.4) ms | 24 (±0.4) ms | 26 (±1) ms | 23 (±0.3) ms | 25 (±1) ms | 24 (±0.3) ms |
| ✓ | | 3D | 1104 (±8) ms | 1438 (±9) ms | 861 (±5) ms | 608 (±3) ms | 1030 (±456) ms | 328 (±1) ms |
| | ✓ | 2D | 73 (±0.5) ms | 74 (±0.7) ms | 73 (±0.5) ms | 50 (±1) ms | 51 (±0.5) ms | 49 (±1) ms |
| | ✓ | 3D | 566 (±5) ms | 893 (±3) ms | 319 (±4) ms | 607 (±3) ms | 1024 (±426) ms | 326 (±0.5) ms |
| | | 2D | 23 (±0.6) ms | 23 (±0.4) ms | 26 (±1) ms | 12 (±0.2) ms | 14 (±10) ms | 13 (±0.3) ms |
| | | 3D | 563 (±2) ms | 897 (±5) ms | 318 (±1) ms | 610 (±3) ms | 1027 (±407) ms | 329 (±2) ms |
| Capture 3D and then 2D | | | | | | | | |
| ✓ | ✓ | 2D | 418 (±0.7) ms | 420 (±0.7) ms | 419 (±0.6) ms | 51 (±1) ms | 52 (±10) ms | 50 (±1) ms |
| ✓ | ✓ | 3D | 1105 (±6) ms | 1436 (±10) ms | 864 (±6) ms | 604 (±3) ms | 1010 (±393) ms | 324 (±1) ms |
| ✓ | | 2D | 418 (±0.8) ms | 420 (±0.7) ms | 419 (±0.6) ms | 43 (±0.2) ms | 43 (±7) ms | 43 (±0.3) ms |
| ✓ | | 3D | 593 (±3) ms | 889 (±17) ms | 352 (±2) ms | 592 (±3) ms | 981 (±456) ms | 323 (±0.9) ms |
| | ✓ | 2D | 73 (±0.5) ms | 74 (±0.6) ms | 73 (±0.5) ms | 50 (±1) ms | 51 (±14) ms | 49 (±1) ms |
| | ✓ | 3D | 562 (±2) ms | 891 (±3) ms | 319 (±3) ms | 604 (±3) ms | 1008 (±338) ms | 325 (±3) ms |
| | | 2D | 73 (±0.6) ms | 74 (±0.6) ms | 73 (±0.5) ms | 40 (±0.4) ms | 40 (±12) ms | 40 (±0.3) ms |
| | | 3D | 593 (±3) ms | 889 (±15) ms | 352 (±2) ms | 557 (±3) ms | 945 (±408) ms | 287 (±3) ms |
| Capture 3D including 2D | | | | | | | | |
| ✓ | ✓ | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| ✓ | ✓ | 3D | 1108 (±41) ms | 1403 (±44) ms | 910 (±43) ms | 622 (±3) ms | 1108 (±503) ms | 370 (±1) ms |
| ✓ | | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| ✓ | | 3D | 1111 (±41) ms | 1403 (±42) ms | 911 (±42) ms | 627 (±5) ms | 1106 (±471) ms | 373 (±4) ms |

[16] Projector used during 2D capture.
[17] Back-to-Back captures or delay between each capture cycle. This allows the elimination of a potential initial switch-cost.
[18] High-end machine with GPU: Intel UHD Graphics 750 (ID:0x4C8A) and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
[19] Low-end machine with GPU: Intel UHD Graphics G1 (ID:0x8A56) and CPU: Intel(R) Core(TM) i3-1005G1 CPU @ 1.20GHz, 1GbE
[20] High-end machine with GPU: NVIDIA GeForce RTX 3070 and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
Tip
To test different 2D-3D strategies on your PC, you can run Capture2D+3D.cpp sample with settings loaded from YML files. Go to Samples, and select C++ for instructions.