2D + 3D Capture Strategy
Note: if you do not need color information, skip straight to the next section, Selecting 3D and 2D Settings Based on Capture Speed.
Many detection algorithms commonly used in piece-picking applications rely on 2D data to identify which object to pick. In this article, we provide insights into different ways to acquire 2D information, their pros and cons, and external lighting conditions. We also touch upon various 2D-3D approaches, their data quality, and how they affect cycle times.
There are two approaches to get 2D data:
- Separate 2D capture via camera.capture(Zivid::Settings2D).imageRGBA(), see 2D Image Capture Process.
- Part of 3D capture via camera.capture(Zivid::Settings).pointCloud().copyImageRGBA(), see Point Cloud Capture Process.
Which one to use depends on your requirements and the machine vision pipeline. We advocate for a dedicated 2D capture as it provides better control over the 2D settings for color optimization and can leverage multi-threading and optimized scheduling. It also grants you increased flexibility in configuring desired camera resolution and projector settings. Utilizing 2D data from the 3D capture is simpler, but you may have to compromise speed to get desired 2D quality.
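As a rough illustration of the multi-threading point above, the sketch below runs detection on the 2D image in a background thread while the 3D capture proceeds. The functions capture_2d, capture_3d and detect_objects are hypothetical stand-ins that only sleep and return dummy data; they are not Zivid SDK calls, and the timings are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def capture_2d():
    """Hypothetical stand-in for a separate 2D capture."""
    time.sleep(0.01)
    return [[0, 0, 0]]


def capture_3d():
    """Hypothetical stand-in for a 3D capture with color disabled."""
    time.sleep(0.03)
    return "point-cloud"


def detect_objects(image):
    """Hypothetical stand-in for a 2D detection algorithm."""
    time.sleep(0.02)
    return ["object-0"]


def run_cycle():
    image = capture_2d()
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Detection starts in the background as soon as the 2D image exists.
        detections_future = pool.submit(detect_objects, image)
        # The 3D capture runs on the main thread in the meantime.
        point_cloud = capture_3d()
        detections = detections_future.result()
    return detections, point_cloud
```

With this ordering the detection result can be ready by the time the point cloud arrives, so the two steps overlap instead of adding up.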
Tip
By taking a separate 2D capture, you can disable color in your 3D capture by setting Sampling::Color to disabled. This will reduce the capture time for the 3D acquisition.
- Our recommendation:
Separate 2D capture with full resolution and projector on.
Subsampled 3D capture with color disabled.
Camera resolution and 1-to-1 mapping
For accurate 2D segmentation and detection, a high-resolution color image is beneficial. Zivid 2+ has a 5 MPx imaging sensor, while Zivid 2 and One+ have 2.3 MPx sensors. The following tables show the resolution outputs of the different cameras for both 2D and 3D captures.
| 2D capture | Zivid One+ | Zivid 2 | Zivid 2+ |
|---|---|---|---|
| Full resolution | 1920 x 1200 | 1944 x 1200 | 2448 x 2048 |

| 3D capture | Zivid One+ | Zivid 2 | Zivid 2+ |
|---|---|---|---|
| Full resolution [1] | 1920 x 1200 | 1944 x 1200 | 2448 x 2048 |
| 2x2 subsampled [1] | Not available | 972 x 600 | 1224 x 1024 |
Observe that 2D captures output full-resolution images, while 3D captures may be subsampled depending on the pixel sampling setting. When the 3D capture is subsampled, there is no longer a 1-to-1 correspondence between a 2D pixel and a 3D point, which makes it harder to extract the 3D data that belongs to a segmented mask in the 2D image. To restore the correspondence, we can either subsample or downsample the 2D image, or recompute the mapping by extracting RGB values from the pixels that correspond to the blue or red pixels of the Bayer grid. The code below shows how to do this:
C++:

    // Maps a full-resolution BGR image to the resolution of the 3D capture.
    // The function name and signature are chosen here for illustration.
    cv::Mat mapBGR(const Zivid::Settings::Sampling::Pixel &pixelsToSample, const cv::Mat &fullResolutionBGR)
    {
        using PixelValue = Zivid::Settings::Sampling::Pixel::ValueType;
        std::cout << "Pixels to sample: " << pixelsToSample << std::endl;
        // Full resolution keeps every pixel; 2x2 subsampling keeps every second pixel.
        const int subsampleDivider = (pixelsToSample.value() == PixelValue::all) ? 1 : 2;
        // Only redSubsample2x2 starts at pixel (1, 1); blueSubsample2x2 and all start at (0, 0).
        const int offset = (pixelsToSample.value() == PixelValue::redSubsample2x2) ? 1 : 0;
        cv::Mat mappedBGR(
            fullResolutionBGR.rows / subsampleDivider, fullResolutionBGR.cols / subsampleDivider, CV_8UC3);
        std::cout << "Mapped width: " << mappedBGR.cols << ", height: " << mappedBGR.rows << std::endl;
        for(int row = 0; row < fullResolutionBGR.rows - offset; row += subsampleDivider)
        {
            for(int col = 0; col < fullResolutionBGR.cols - offset; col += subsampleDivider)
            {
                mappedBGR.at<cv::Vec3b>(row / subsampleDivider, col / subsampleDivider) =
                    fullResolutionBGR.at<cv::Vec3b>(row + offset, col + offset);
            }
        }
        return mappedBGR;
    }

Python:

    import zivid

    # The function name is chosen here for illustration; rgba is an H x W x 4 array.
    def map_rgb(pixels_to_sample, rgba):
        if pixels_to_sample == zivid.Settings.Sampling.Pixel.blueSubsample2x2:
            return rgba[::2, ::2, 0:3]  # even rows and columns
        if pixels_to_sample == zivid.Settings.Sampling.Pixel.redSubsample2x2:
            return rgba[1::2, 1::2, 0:3]  # odd rows and columns
        if pixels_to_sample == zivid.Settings.Sampling.Pixel.all:
            return rgba[:, :, 0:3]  # full resolution, alpha channel dropped
        raise RuntimeError(f"Invalid pixels to sample: {pixels_to_sample}")
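The same mapping can also be applied to a single pixel, for example to look up the 3D point behind a pixel inside a 2D segmentation mask. The sketch below is a minimal illustration in plain Python; the function name to_point_cloud_index and the use of plain strings for the sampling modes are assumptions made for this example and are not part of the Zivid SDK.

```python
from typing import Optional, Tuple

# (divider, offset) per pixel sampling mode, following the slicing shown above:
# blue keeps even rows/columns, red keeps odd rows/columns.
_SAMPLING = {
    "all": (1, 0),
    "blueSubsample2x2": (2, 0),
    "redSubsample2x2": (2, 1),
}


def to_point_cloud_index(row: int, col: int, pixels_to_sample: str) -> Optional[Tuple[int, int]]:
    """Map a full-resolution 2D pixel to its index in the (possibly subsampled) point cloud.

    Returns None when the pixel was not sampled by the 3D capture.
    """
    divider, offset = _SAMPLING[pixels_to_sample]
    if (row - offset) % divider != 0 or (col - offset) % divider != 0:
        return None
    return (row - offset) // divider, (col - offset) // divider
```

For pixels that return None, one option is to fall back to the nearest sampled neighbor; whether that is acceptable depends on the application.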
For more insight into resolution, sampling and mapping, check out Monochrome Capture.
Note
If you use intrinsics and your 2D and 3D captures have different resolutions, make sure you use the intrinsics that match each resolution. See Camera Intrinsics for more information.
- Our recommendation:
2D capture with full resolution
3D monochrome capture with subsampled resolution
External light considerations
The ideal light source for a 2D capture is strong, because it reduces the influence of ambient light, and diffuse, because this limits the blooming effects. This light source can either come from the internal projector or from an external light source. A third option is not to use any light at all.
Regardless of your chosen option, you may encounter blooming. When using the internal projector as the light source, tilting the camera, changing the background, or tuning the 2D acquisition settings can mitigate the blooming effect. If you use external light, ensuring the light is diffuse or angling it may help. Note that external light introduces noise in the 3D data, so you should turn it off during the 3D capture. Consequently, external lights add complexity to your cell setup and to the scheduling of your machine vision pipeline.
Exposure variations caused by changes in ambient light, such as transitions from day to night, doors opening and closing, or changes in ceiling lighting, affect 2D and 3D data differently. For 2D data, they can impact segmentation performance, especially when the segmentation model is trained on specific datasets. For 3D data, exposure variations may affect point cloud completeness due to varying noise levels. Using either the internal projector or external diffuse light helps reduce these variations.
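One way to keep an external light out of the 3D capture is to wrap the capture in a small guard that switches the light off and restores it afterwards. The sketch below is only an illustration: ExternalLight is a hypothetical controller that records state changes instead of driving real hardware, and external_light_off is a name chosen for this example.

```python
from contextlib import contextmanager


class ExternalLight:
    """Hypothetical light controller; logs state changes instead of driving hardware."""

    def __init__(self):
        self.is_on = True
        self.log = []

    def set(self, on: bool) -> None:
        self.is_on = on
        self.log.append("on" if on else "off")


@contextmanager
def external_light_off(light: ExternalLight):
    """Switch the light off for the duration of the block, then restore it if it was on."""
    was_on = light.is_on
    light.set(False)
    try:
        yield light
    finally:
        if was_on:
            light.set(True)
```

The 3D capture would then go inside a `with external_light_off(light):` block, so the light is restored even if the capture raises an exception.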
The table below summarizes the pros and cons of the different options with respect to 2D quality.

| | Internal projector | External light [2] | Ambient light |
|---|---|---|---|
| Robot cell setup | Simple | Complex | Simple |
| Resilience to ambient light variations | Acceptable | Good | Bad |
| Blooming in 2D images | Likely | Unlikely | Likely |
| 2D color balance needed | No | Likely | Yes |

[2] Assuming strong diffuse light.
Zivid One+ projector switching penalty
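On Zivid One+ it is important to be aware of the switching penalty that occurs when the projector is on during the 2D capture. This time penalty only applies if the 2D capture settings use brightness > 0. For more information, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
If there is enough time budget between capture cycles, the switching limitation can be mitigated: the switch can be performed while the system is doing something else, for example while the robot is moving in front of the camera. In this tutorial, we call this a dummy capture.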
- Our recommendation:
Separate 2D capture with internal projector on
Capture strategies
Optimizing for 3D quality does not necessarily give you satisfactory 2D quality. Therefore, if you depend on color information, we recommend having a separate 2D capture. We can break it down to which data you need first. This gives us the three following strategies:
2D data before 3D data
2D data as part of 3D data
2D data after 3D data
Which strategy you should choose depends on your machine vision algorithms and pipeline. Below we summarize the performance of the different strategies. For a more in-depth understanding and comprehensive Zivid benchmarks, please see 2D + 3D Capture Strategy.
The following tables list the different 2D+3D capture configurations and show how they perform relative to each other with respect to speed and image quality. We separate the applications into two scenarios:
Cycle time is so fast that each capture cycle needs to happen right after the previous one.
Cycle time is slow enough to allow an additional dummy capture between each capture cycle (only relevant for Zivid One+). An additional capture can take up to 800 ms in the worst case. A rule of thumb is that for cycle times greater than 2 seconds, a dummy capture saves time.
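The rule of thumb above can be written down as a small check. The sketch below is a minimal illustration: the 2 second threshold and the 800 ms worst case come from the text, while the function names and the idea of comparing the worst case against the idle time between cycles are assumptions made for this example.

```python
DUMMY_CAPTURE_WORST_CASE_S = 0.8  # worst-case duration of an additional capture (from the text)
CYCLE_TIME_THRESHOLD_S = 2.0      # rule-of-thumb cycle time (from the text)


def dummy_capture_recommended(cycle_time_s: float) -> bool:
    """Rule of thumb: a dummy capture saves time for cycle times above 2 seconds."""
    return cycle_time_s > CYCLE_TIME_THRESHOLD_S


def dummy_capture_fits(idle_time_s: float) -> bool:
    """Assumption for this sketch: the dummy capture must fit in the idle time between cycles."""
    return idle_time_s >= DUMMY_CAPTURE_WORST_CASE_S
```

For example, a cell with a 2.5 s cycle passes the rule of thumb, while a 1.5 s cycle does not.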
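Back-to-back captures

| Capture cycle (no wait between cycles) | Speed, Zivid One+ | Speed, Zivid 2 | 2D quality |
|---|---|---|---|
| 3D ➞ 2D [4] | Slowest | Fast | Best |
| 2D ➞ 3D [3] | Slow | Faster | Best |
| 3D (w/2D [6]) | Fast | Fast | Better |
| 3D | Fastest | Fastest | Good |

For back-to-back captures, the switching delay cannot be avoided unless the projector brightness settings are the same. In that case, it is better to set the color mode to UseFirstAcquisition, see Color Mode.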
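Low duty cycle captures

| Capture cycle (time to wait for the next cycle) | Speed, Zivid One+ | Speed, Zivid 2 | 2D quality |
|---|---|---|---|
| | Slow | Fast | Best |
| | Good | Faster | Best |
| | Fast | Fast | Best |
| 3D ➞ 3D | Fastest | Fastest | Good |

[3] When the projector is used for the 2D capture, One+ cameras have a 350 ms 2D ➞ 3D switching delay, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
[4] When the projector is used for the 2D capture, One+ cameras have a 650 ms 3D ➞ 2D switching delay, see Limitation when performing captures in a sequence while switching between 2D and 3D capture calls.
[5] (Applies to One+) For low duty cycle applications, we can append a dummy 2D/3D capture to move the 3D ➞ 2D / 2D ➞ 3D switching delay out of the critical path.
[6] 3D capture with the 2D acquisition placed first and the color mode set to UseFirstAcquisition, see Color Mode.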
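To make the effect of the One+ switching delays concrete, the sketch below adds them to the capture times under a deliberately simplified model. The 350 ms and 650 ms values come from the notes above and only apply when the projector is used for the 2D capture; the function name, the order strings, and the assumption that a dummy capture removes exactly the between-cycle switch from the critical path are illustrative choices, not SDK behavior.

```python
PENALTY_2D_TO_3D_MS = 350  # One+ switching delay, 2D capture followed by 3D capture
PENALTY_3D_TO_2D_MS = 650  # One+ switching delay, 3D capture followed by 2D capture


def critical_path_ms(order: str, t_2d_ms: float, t_3d_ms: float, dummy_capture: bool) -> float:
    """Estimate the time on the critical path of one capture cycle.

    order is "2D->3D" or "3D->2D". The switch inside the cycle is always paid.
    The switch back between cycles is paid on the critical path unless a dummy
    capture absorbs it while the system is busy with something else.
    """
    if order == "2D->3D":
        in_cycle, between_cycles = PENALTY_2D_TO_3D_MS, PENALTY_3D_TO_2D_MS
    elif order == "3D->2D":
        in_cycle, between_cycles = PENALTY_3D_TO_2D_MS, PENALTY_2D_TO_3D_MS
    else:
        raise ValueError(f"Unknown capture order: {order}")
    total = t_2d_ms + t_3d_ms + in_cycle
    if not dummy_capture:
        total += between_cycles
    return total
```

Under this model, the 2D ➞ 3D order benefits most from a dummy capture, because the larger 650 ms delay is the one moved out of the critical path.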
The following tables show actual measurements on different hardware for Zivid 2+, Zivid 2, and Zivid One+, in that order. For the 3D capture we use the Fast Consumer Goods settings.
Zivid 2+

| 2D+3D capture [7] [8] | | Intel UHD i5G1 (Low-end [9]) | NVIDIA 4070 | Intel UHD 770 (High-end [10]) |
|---|---|---|---|---|
| 2D capture first, then 3D capture | | | | |
| ✓ ✓ | 2D | 74 (±7) ms | 73 (±1) ms | 75 (±0.3) ms |
| | 3D | 1951 (±198) ms | 605 (±2) ms | 1301 (±2) ms |
| ✓ | 2D | 81 (±23) ms | 81 (±0.4) ms | 81 (±0.4) ms |
| | 3D | 1980 (±9) ms | 634 (±5) ms | 1334 (±3) ms |
| ✓ | 2D | 74 (±0.4) ms | 73 (±1) ms | 74 (±0.3) ms |
| | 3D | 1968 (±151) ms | 605 (±2) ms | 1302 (±2) ms |
| | 2D | 43 (±20) ms | 43 (±0.4) ms | 43 (±0.5) ms |
| | 3D | 1966 (±8) ms | 606 (±3) ms | 1307 (±3) ms |
| 3D capture first, then 2D capture | | | | |
| ✓ ✓ | 2D | 74 (±0.4) ms | 74 (±1) ms | 75 (±0.3) ms |
| | 3D | 1944 (±257) ms | 605 (±2) ms | 1303 (±2) ms |
| ✓ | 2D | 85 (±16) ms | 83 (±0.3) ms | 86 (±0.4) ms |
| | 3D | 1817 (±540) ms | 593 (±2) ms | 1251 (±2) ms |
| ✓ | 2D | 73 (±7) ms | 73 (±1) ms | 74 (±0.3) ms |
| | 3D | 1963 (±192) ms | 606 (±5) ms | 1303 (±2) ms |
| | 2D | 74 (±76) ms | 71 (±0.3) ms | 74 (±0.3) ms |
| | 3D | 1780 (±500) ms | 554 (±2) ms | 1212 (±2) ms |
| 3D capture including 2D data | | | | |
| ✓ ✓ | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| | 3D | 3997 (±269) ms | 1741 (±3) ms | 2516 (±2) ms |
| ✓ | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| | 3D | 3948 (±269) ms | 1741 (±8) ms | 2516 (±8) ms |

[7] The projector is used for the 2D capture.
[8] Back-to-back captures, or a delay between capture cycles. This makes it possible to remove the potential initial switching penalty.
[9] Low-end machine with GPU: Intel UHD Graphics i5 G1 (ID:0x8A56) and CPU: Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz, 1GbE
[10] High-end machine with GPU: Intel UHD Graphics 770 and CPU: 13th Gen Intel(R) Core(TM) i9-13900K, 10GbE
Zivid 2

| 2D+3D capture [11] [12] | | Intel UHD 750 (High-end [13]) | Intel UHD i3G1 (Low-end [14]) | NVIDIA 3070 (High-end [15]) |
|---|---|---|---|---|
| 2D capture first, then 3D capture | | | | |
| ✓ ✓ | 2D | 52 (±1) ms | 52 (±0.4) ms | 50 (±1) ms |
| | 3D | 814 (±7) ms | 1259 (±8) ms | 363 (±5) ms |
| ✓ | 2D | 25 (±0.5) ms | 25 (±0.3) ms | 25 (±0.5) ms |
| | 3D | 813 (±7) ms | 1261 (±8) ms | 361 (±3) ms |
| ✓ | 2D | 50 (±1) ms | 51 (±0.5) ms | 49 (±1) ms |
| | 3D | 815 (±7) ms | 1257 (±6) ms | 363 (±5) ms |
| | 2D | 13 (±0.2) ms | 13 (±0.2) ms | 13 (±0.2) ms |
| | 3D | 813 (±6) ms | 1256 (±8) ms | 361 (±4) ms |
| 3D capture first, then 2D capture | | | | |
| ✓ ✓ | 2D | 51 (±1) ms | 52 (±0.4) ms | 50 (±1) ms |
| | 3D | 800 (±4) ms | 1242 (±8) ms | 363 (±5) ms |
| ✓ | 2D | 47 (±0.3) ms | 48 (±0.5) ms | 46 (±0.3) ms |
| | 3D | 779 (±2) ms | 1228 (±9) ms | 353 (±3) ms |
| ✓ | 2D | 50 (±1) ms | 51 (±0.4) ms | 49 (±1) ms |
| | 3D | 799 (±4) ms | 1242 (±6) ms | 362 (±5) ms |
| | 2D | 40 (±0.3) ms | 43 (±0.3) ms | 40 (±0.3) ms |
| | 3D | 739 (±3) ms | 1189 (±72) ms | 312 (±4) ms |
| 3D capture including 2D data | | | | |
| ✓ ✓ | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| | 3D | 817 (±3) ms | 1345 (±9) ms | 395 (±2) ms |
| ✓ | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| | 3D | 822 (±7) ms | 1374 (±331) ms | 397 (±5) ms |

[11] The projector is used for the 2D capture.
[12] Back-to-back captures, or a delay between capture cycles. This makes it possible to remove the potential initial switching penalty.
[13] High-end machine with GPU: Intel UHD Graphics 750 (ID:0x4C8A) and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
[14] Low-end machine with GPU: Intel UHD Graphics i3 G1 (ID:0x8A56) and CPU: Intel(R) Core(TM) i3-1005G1 CPU @ 1.20GHz, 1GbE
[15] High-end machine with GPU: NVIDIA GeForce RTX 3070 and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
Zivid One+

| 2D+3D capture [16] [17] | | Intel UHD 750 (High-end [18]) | Intel UHD i3G1 (Low-end [19]) | NVIDIA 3070 (High-end [20]) |
|---|---|---|---|---|
| 2D capture first, then 3D capture | | | | |
| ✓ ✓ | 2D | 419 (±0.7) ms | 421 (±0.6) ms | 418 (±0.6) ms |
| | 3D | 1381 (±7) ms | 1829 (±9) ms | 880 (±5) ms |
| ✓ | 2D | 24 (±0.5) ms | 24 (±0.3) ms | 24 (±0.5) ms |
| | 3D | 1380 (±7) ms | 1829 (±6) ms | 880 (±5) ms |
| ✓ | 2D | 73 (±0.7) ms | 75 (±0.6) ms | 73 (±0.5) ms |
| | 3D | 856 (±8) ms | 1291 (±4) ms | 339 (±2) ms |
| | 2D | 24 (±0.5) ms | 24 (±0.4) ms | 24 (±0.3) ms |
| | 3D | 844 (±8) ms | 1286 (±4) ms | 337 (±1) ms |
| 3D capture first, then 2D capture | | | | |
| ✓ ✓ | 2D | 419 (±0.6) ms | 421 (±0.6) ms | 418 (±0.6) ms |
| | 3D | 1380 (±10) ms | 1822 (±9) ms | 878 (±5) ms |
| ✓ | 2D | 419 (±0.6) ms | 421 (±0.7) ms | 418 (±0.5) ms |
| | 3D | 876 (±7) ms | 1271 (±4) ms | 368 (±2) ms |
| ✓ | 2D | 73 (±0.5) ms | 75 (±0.6) ms | 73 (±0.5) ms |
| | 3D | 849 (±7) ms | 1282 (±5) ms | 338 (±2) ms |
| | 2D | 73 (±0.5) ms | 75 (±0.6) ms | 73 (±0.5) ms |
| | 3D | 876 (±6) ms | 1269 (±4) ms | 368 (±2) ms |
| 3D capture including 2D data | | | | |
| ✓ ✓ | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| | 3D | 1378 (±47) ms | 1797 (±44) ms | 924 (±42) ms |
| ✓ | 2D | 0 (±0.0) ms | 0 (±0.0) ms | 0 (±0.0) ms |
| | 3D | 1389 (±48) ms | 1824 (±44) ms | 922 (±42) ms |

[16] The projector is used for the 2D capture.
[17] Back-to-back captures, or a delay between capture cycles. This makes it possible to remove the potential initial switching penalty.
[18] High-end machine with GPU: Intel UHD Graphics 750 (ID:0x4C8A) and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
[19] Low-end machine with GPU: Intel UHD Graphics i3 G1 (ID:0x8A56) and CPU: Intel(R) Core(TM) i3-1005G1 CPU @ 1.20GHz, 1GbE
[20] High-end machine with GPU: NVIDIA GeForce RTX 3070 and CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz, 10GbE
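A quick way to compare strategies from such measurements is to add the 2D and 3D times per strategy. The sketch below does this for the Zivid One+ measurements in the NVIDIA 3070 column, using the first row pair of each section above (the one with both checkmarks). Treating the plain sum as the cycle time is a simplifying assumption, and the dictionary keys and function name are chosen for this example.

```python
# (2D time, 3D time) in ms, copied from the Zivid One+ table, NVIDIA 3070 column,
# first row pair of each section.
MEASUREMENTS_MS = {
    "2D then 3D": (418, 880),
    "3D then 2D": (418, 878),
    "3D including 2D": (0, 924),
}


def total_times_ms(measurements):
    """Sum the 2D and 3D capture times for each strategy."""
    return {name: t_2d + t_3d for name, (t_2d, t_3d) in measurements.items()}


totals = total_times_ms(MEASUREMENTS_MS)
fastest = min(totals, key=totals.get)
print(totals)
print("Fastest strategy:", fastest)
```

A plain sum ignores which data is needed first. If detection only needs the 2D image, the time until the 2D image is available may matter more than the total.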
Tip
To test different 2D-3D strategies on your PC, you can run the ZividBenchmark.cpp sample with settings loaded from YML files. Go to Samples, and select C++ for instructions.
In the next section, we cover how to select 3D and 2D settings based on capture speed.