2D + 3D Capture Strategy
If you require both 2D and 3D data for your application, then this tutorial is for you.
We explain and emphasize the pros and cons of different 2D-3D capture approaches, clarify some limitations, and explain how they affect cycle times. We also touch upon the difference between using an external light source for 2D capture and using the internal projector.
- 2D data: RGB image
- 3D data: point cloud
There are two different ways to get 2D data (both sketched in the example below):
- Independently, via camera.capture(Zivid::Settings2D).imageRGBA(); see 2D Image Capture Process.
- As part of a 3D capture, via camera.capture(Zivid::Settings).pointCloud().copyImageRGBA(); see Point Cloud Capture Process.
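The following is a minimal sketch of the two options, assuming a single available camera and default acquisition settings (settings tuning is covered elsewhere):
#include <Zivid/Zivid.h>

int main()
{
    Zivid::Application zivid;
    auto camera = zivid.connectCamera();

    // Option 1: independent 2D capture
    const auto settings2D =
        Zivid::Settings2D{ Zivid::Settings2D::Acquisitions{ Zivid::Settings2D::Acquisition{} } };
    const auto frame2D = camera.capture(settings2D);
    const auto image = frame2D.imageRGBA();

    // Option 2: 2D data copied from a 3D capture
    const auto settings = Zivid::Settings{ Zivid::Settings::Acquisitions{ Zivid::Settings::Acquisition{} } };
    const auto frame = camera.capture(settings);
    const auto imageFrom3D = frame.pointCloud().copyImageRGBA();

    return 0;
}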
Which one to use depends on your requirements, and different scenarios lead to different tradeoffs. We first break it down by when you need which data, and then discuss the speed-versus-quality tradeoffs for each scenario.
- I need 2D data before 3D data
- I need 2D data as part of how I use the 3D data
- I only need 2D data after I have used the 3D data
About external light
Before we go into the different strategies, we have to discuss external light. The ideal light source for a 2D capture is strong and diffuse because this limits blooming effects. With the internal projector as the light source, blooming effects are almost inevitable. Mounting the camera at an angle significantly reduces this effect, but an external diffuse light source is still better. External light introduces noise in the 3D data, so the external light should ideally be turned off during the 3D capture.
In addition to the reduction in blooming effects, strong external light can smooth out variations in exposure due to variations in ambient light. Typical sources for variations in ambient light:
- changes in daylight (day/night, clouds, etc.)
- doors opening and closing
- ceiling lights turned on and off
Such variations in exposure impact 2D and 3D data differently. For the 2D data, the impact depends on the detection algorithm used; if segmentation is performed in 2D, these variations may or may not affect segmentation performance. For the point cloud, variations in noise may show up as variations in point cloud completeness.
This leads to the question: Should we use the projector for 2D?
Check out Optimizing Color Image for more information on that topic.
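As an illustration of the two lighting strategies, here is a sketch of two 2D settings objects: one that relies on external light with the projector turned off, and one that uses the internal projector. The brightness values are illustrative assumptions, not recommendations:
// 2D acquisition relying on external light (projector off)
const auto settings2DExternalLight = Zivid::Settings2D{ Zivid::Settings2D::Acquisitions{
    Zivid::Settings2D::Acquisition{ Zivid::Settings2D::Acquisition::Brightness{ 0.0 } } } };

// 2D acquisition using the internal projector as the light source
const auto settings2DProjector = Zivid::Settings2D{ Zivid::Settings2D::Acquisitions{
    Zivid::Settings2D::Acquisition{ Zivid::Settings2D::Acquisition::Brightness{ 1.8 } } } };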
2D data before 3D data
If you, for example, perform segmentation in 2D and then later determine your picking pose, then you need 2D faster than 3D. The fastest way to get 2D data is via a separate 2D capture. Hence, if you need 2D data before 3D data then you should perform a separate 2D capture.
Tip
When you capture 2D separately, you should disable RGB in the 3D capture. This saves both acquisition and processing time. Disable RGB in the 3D capture by setting Sampling::Color to disabled, as in the sketch below.
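A minimal sketch of such a 3D settings object, assuming the color image comes from the separate 2D capture:
const auto settings = Zivid::Settings{
    Zivid::Settings::Acquisitions{ Zivid::Settings::Acquisition{} },
    Zivid::Settings::Sampling::Color::disabled // skip RGB acquisition and processing in the 3D capture
};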
Warning
On Zivid 2+ we have a 4x4 subsampling mode, Monochrome Capture. When it is used, there is a 35 ms switching penalty between 2D and 3D captures. This penalty applies only when the captures happen right after each other.
The following code sample shows how you can:
- Capture 2D
- Use the 2D data and capture 3D in parallel
// Capture 2D and measure the capture time
const auto frame2dAndCaptureTime = captureAndMeasure<Zivid::Frame2D>(camera, settings2D);
// Start processing the 2D data on a separate thread...
std::future<Duration> userThread =
    std::async(std::launch::async, useFrame<Zivid::Frame2D>, std::ref(frame2dAndCaptureTime.frame));
// ...while the 3D capture runs on the main thread
const auto frameAndCaptureTime = captureAndMeasure<Zivid::Frame>(camera, settings);
const auto processTime = useFrame(frameAndCaptureTime.frame);
const auto processTime2D = userThread.get();
The following shows actual benchmark numbers. You will find a more extensive table at the bottom of the page.
2D data as part of how I use 3D data
In this case, we don’t have to get access to the 2D data before the 3D data. You always get 2D data as part of a 3D acquisition. Thus we only have to care about overall speed and quality.
Speed
For optimal speed, we simply rely on 3D acquisitions to provide good 2D data. There is no additional acquisition or separate capture for 2D data.
2D Quality
For optimal 2D quality, it is recommended to use a separate acquisition for 2D.
Following is a table that shows what you can expect from the different configurations. At the end you will find a table showing actual measurements on different hardware.
- Fast: Use the 2D data from the 3D capture. No separate acquisition or settings for 2D.
- Best: A separate 2D capture followed by the 3D capture.
|  |  | Fast (3D [2]) | Best (2D + 3D) |
|---|---|---|---|
| Zivid 2 | 2D | N/A | ~25 ms |
|  | 3D | ~295 ms | ~290 ms |
|  | 2D+3D | ~295 ms | ~300 ms |
| Zivid 2+ | 2D | N/A | ~55 ms |
|  | 3D | ~170 ms | ~220 ms |
|  | 2D+3D | ~170 ms | ~260 ms |
The following shows actual benchmark numbers. You will find a more extensive table at the bottom of the page.
2D data after I have used the 3D data
You always get 2D data as part of a 3D acquisition. The table below shows 3D capture time examples.
However, optimizing for 3D quality does not always optimize for 2D quality. Thus, it might be a good idea to have a separate 2D capture after the 3D capture.
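A sketch of such a 3D ➞ 2D cycle, assuming camera, a 3D settings object, and a 2D settings object like those shown earlier:
// Capture 3D first and use the point cloud (e.g., for detection and pose estimation)
const auto frame = camera.capture(settings);
const auto pointCloud = frame.pointCloud();
// ... use the point cloud ...

// Then capture 2D separately, tuned for color quality
const auto frame2D = camera.capture(settings2D);
const auto image = frame2D.imageRGBA();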
The following shows actual benchmark numbers. You will find a more extensive table at the bottom of the page.
Camera resolution and 1-to-1 mapping
For accurate 2D segmentation and detection, a high-resolution color image is beneficial. Zivid 2+ has a 5 MPx imaging sensor, while Zivid 2 has a 2.3 MPx sensor. The following tables show the output resolutions of the different cameras for both 2D and 3D captures.
| 2D capture | Zivid 2 | Zivid 2+ |
|---|---|---|
| Full resolution | 1944 x 1200 | 2448 x 2048 |
| 2x2 subsampled | 972 x 600 | 1224 x 1024 |
| 4x4 subsampled | Not available | 612 x 512 |
| 3D capture | Zivid 2 | Zivid 2+ |
|---|---|---|
| Full resolution [1] | 1944 x 1200 | 2448 x 2048 |
| 2x2 subsampled [1] | 972 x 600 | 1224 x 1024 |
| 4x4 subsampled [1] | Not available | 612 x 512 |
[1] 2D information extracted from the 3D data will have the same resolution.
The output resolution of both 2D and 3D captures is controlled via the combination of the Sampling::Pixel and Processing::Resampling settings; see Pixel Sampling and Resampling.
This means that a 2D pixel and a 3D point may no longer have a 1-to-1 correspondence. Consequently, it becomes more challenging to extract the 3D data corresponding to a segmented mask in the 2D image.
As mentioned, it is common to require high-resolution 2D data for segmentation and detection.
For example, our recommended Consumer Goods Z2+ M130 Quality preset uses Sampling::Pixel set to blueSubsample2x2. In this case we should either:
- Upsample the 3D data to restore the 1-to-1 correspondence, or
- Map the 2D pixel indices to the indices of the subsampled 3D data.
The two options are covered in the following subsections.
Resampling
In order to match the resolution of the 2D capture, simply apply an upsampling which undoes the subsampling. This retains the speed advantages of the subsampled capture. For example:
auto settings2D = Zivid::Settings2D{
    Zivid::Settings2D::Acquisitions{ Zivid::Settings2D::Acquisition{} },
    Zivid::Settings2D::Sampling::Pixel::all,
};
auto settings = Zivid::Settings{
    Zivid::Settings::Engine::phase,
    Zivid::Settings::Acquisitions{ Zivid::Settings::Acquisition{} },
    Zivid::Settings::Sampling::Pixel::blueSubsample2x2,
    Zivid::Settings::Sampling::Color::disabled,
    Zivid::Settings::Processing::Resampling::Mode::upsample2x2,
};
settings_2d = zivid.Settings2D()
settings_2d.acquisitions.append(zivid.Settings2D.Acquisition())
settings_2d.sampling.pixel = zivid.Settings2D.Sampling.Pixel.all
settings = zivid.Settings()
settings.engine = "phase"
settings.acquisitions.append(zivid.Settings.Acquisition())
settings.sampling.pixel = zivid.Settings.Sampling.Pixel.blueSubsample2x2
settings.sampling.color = zivid.Settings.Sampling.Color.disabled
settings.processing.resampling.mode = zivid.Settings.Processing.Resampling.Mode.upsample2x2
For more details see Resampling.
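As a sketch of what this buys you, assuming camera and the settings2D and settings objects from the snippets above: with Resampling::Mode::upsample2x2 the point cloud is restored to the same resolution as the full-resolution 2D image, so a 2D pixel index can be used directly on the 3D data.
const auto frame2D = camera.capture(settings2D);
const auto frame = camera.capture(settings);
const auto image = frame2D.imageRGBA();              // full resolution, e.g., 2448 x 2048 on Zivid 2+
const auto xyz = frame.pointCloud().copyPointsXYZ(); // upsampled back to the same resolution
// A pixel (row, col) found by 2D segmentation indexes the 3D data directly:
// const auto point = xyz(row, col);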
Mapping pixel indices between different resolutions
The other option is to map the 2D indices to the indices in the subsampled 3D data. This option is a bit more complicated, but it is potentially more efficient. The point cloud can remain subsampled, and thus consume less memory and processing power.
To establish a correlation between the full-resolution 2D image and the subsampled point cloud, a specific mapping technique is required. This process involves extracting RGB values from the pixels that correspond to the Blue or Red pixels from the Bayer grid.
Zivid::Experimental::Calibration::pixelMapping(camera, settings) can be used to get the parameters required to perform this mapping. Following is an example which uses this function.
const auto pixelMapping = Zivid::Experimental::Calibration::pixelMapping(camera, settings);
std::cout << "Pixel mapping: " << pixelMapping << std::endl;
// Destination image with the same resolution as the subsampled point cloud
cv::Mat mappedBGR(
    fullResolutionBGR.rows / pixelMapping.rowStride(),
    fullResolutionBGR.cols / pixelMapping.colStride(),
    CV_8UC3);
std::cout << "Mapped width: " << mappedBGR.cols << ", height: " << mappedBGR.rows << std::endl;
// Pick the full-resolution pixels that correspond to points in the subsampled point cloud
for(size_t row = 0; row < static_cast<size_t>(fullResolutionBGR.rows - pixelMapping.rowOffset());
    row += pixelMapping.rowStride())
{
    for(size_t col = 0; col < static_cast<size_t>(fullResolutionBGR.cols - pixelMapping.colOffset());
        col += pixelMapping.colStride())
    {
        mappedBGR.at<cv::Vec3b>(row / pixelMapping.rowStride(), col / pixelMapping.colStride()) =
            fullResolutionBGR.at<cv::Vec3b>(row + pixelMapping.rowOffset(), col + pixelMapping.colOffset());
    }
}
return mappedBGR;
pixel_mapping = calibration.pixel_mapping(camera, settings)
return rgba[
    int(pixel_mapping.row_offset) :: pixel_mapping.row_stride,
    int(pixel_mapping.col_offset) :: pixel_mapping.col_stride,
    0:3,
]
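A hypothetical usage sketch, assuming frame is the 3D frame captured with blueSubsample2x2 (and no resampling) and mappedBGR is the image returned above; a pixel found by 2D detection in mappedBGR indexes the subsampled point cloud directly:
// Example indices only; in practice (row, col) comes from detection in mappedBGR
const std::size_t row = 100;
const std::size_t col = 200;
const auto xyz = frame.pointCloud().copyPointsXYZ(); // subsampled resolution, same as mappedBGR
const auto point = xyz(row, col);
std::cout << "Picked 3D point: " << point.x << ", " << point.y << ", " << point.z << std::endl;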
For more details about the mapping (for example, for blueSubsample2x2): the RGB values that correspond to the blue pixels of the Bayer grid are extracted using one set of row and column indices, and the values that correspond to the red pixels using another. The offsets and strides returned by pixelMapping define these indices.
Note
If you use intrinsics and 2D and 3D capture have different resolutions, ensure you use them correctly. See Camera Intrinsics for more information.
Summary
- Our recommendation:
  - 2D capture with full resolution
  - 3D monochrome capture with subsampled resolution
Note
Subsampling or downsampling in user code is only necessary if you want to have 1-to-1 pixel correspondence when you capture and copy 2D and 3D with different resolutions.
The following table lists the different 2D+3D capture configurations and shows how they are expected to perform relative to each other with respect to speed and 2D quality.
| Capture Cycle | Speed (Zivid 2) | Speed (Zivid 2+) | 2D-Quality |
|---|---|---|---|
|  | Faster | Fast | Best |
| 3D ➞ 2D / 2D ➞ 3D | Fast | Fast | Best |
| 3D (w/RGB enabled) | Fastest | Fastest | Good |
Following is a table showing actual measurements on different hardware. For the 3D capture we use the Fast Consumer Goods settings.
- Zivid 2+
- Zivid 2
Tip
To test different 2D-3D strategies on your PC, you can run the ZividBenchmark.cpp sample with settings loaded from YML files. Go to Samples and select C++ for instructions.
Version History
| SDK | Changes |
|---|---|
| 2.12.0 | Acquisition time is reduced by up to 50% for 2D captures and up to 5% for 3D captures for Zivid 2+. Zivid One+ has reached its End-of-Life and is no longer supported; thus, most of the complexities related to 2D+3D captures are no longer applicable. |
| 2.11.0 | Zivid 2 and Zivid 2+ now support concurrent processing and acquisition for 3D ➞ 2D and 2D ➞ 3D, and switching between capture modes has been optimized. |