2D + 3D Capture Strategy

Note: if you do not need color information, skip ahead to the next section, which covers selecting 3D and 2D settings based on capture speed.

Many detection algorithms commonly used in piece-picking applications rely on 2D data to identify which object to pick. In this article, we provide insights into different ways to acquire 2D information, their pros and cons, and external lighting conditions. We also touch upon various 2D-3D approaches, their data quality, and how they affect cycle times.

There are two approaches to get 2D data:

Separate 2D capture via camera.capture2D(Zivid::Settings).imageRGBA(), see 2D Image Capture Process.
As part of a 3D capture via camera.capture2D3D(Zivid::Settings).pointCloud().copyImageRGBA(), see Point Cloud Capture Process.

Which one to use depends on your requirements and the machine vision pipeline. We advocate for a dedicated 2D capture as it can leverage multi-threading and optimized scheduling. Utilizing 2D data from the 3D capture is simpler, but you may have to compromise on speed to get the desired 2D quality.

Tip

When you capture 2D separately, you should disable RGB in the 3D capture. This saves both acquisition and processing time. Disable RGB in the 3D capture by setting Sampling::Color to disabled.

Our recommendation:

Separate 2D capture with full resolution and projector on.
Subsampled 3D capture with color disabled.

Camera resolution and 1-to-1 mapping

For accurate 2D segmentation and detection, a high-resolution color image is beneficial. Zivid 2+ has a 5 MPx imaging sensor, while Zivid 2 has a 2.3 MPx sensor. The following tables show the output resolutions of the different cameras for both 2D and 3D captures.

2D capture resolutions
2D capture	Zivid 2+	Zivid 2
Full resolution	2448 x 2048	1944 x 1200
2x2 subsampled	1224 x 1024	972 x 600
4x4 subsampled	612 x 512	Not available

3D capture resolutions
3D capture	Zivid 2+	Zivid 2
Full resolution	2448 x 2048	1944 x 1200
2x2 subsampled	1224 x 1024	972 x 600
4x4 subsampled	612 x 512	Not available

When performing a capture2D3D() capture, the result is a frame that contains both 2D and 3D data.

2D data can be extracted in two ways:

frame.frame2D().imageRGBA_SRGB(): This is the same image you would get from an independent 2D capture.
frame.pointCloud().copyImageRGBA_SRGB(): This will ensure 1-to-1 mapping, even in cases where the settings define that 2D and 3D should have different resolutions.

Output resolution of 2D captures is controlled via the Settings2D::Sampling::Pixel setting and the output resolution of 3D captures via the combination of the Settings::Sampling::Pixel and the Settings::Processing::Resampling settings. See Pixel Sampling (2D), Pixel Sampling (3D) and Resampling.
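
As a quick sanity check, the resolution arithmetic behind the tables above can be sketched in plain Python. This is only an illustration: `output_resolution` and `FULL_RESOLUTION` are hypothetical names, not part of the Zivid SDK, and the sketch assumes that subsampling divides and upsampling multiplies each image dimension by the stated factor.

```python
# Hypothetical helper, not part of the Zivid SDK.
# Full-resolution sizes (width, height) are taken from the tables above.
FULL_RESOLUTION = {"Zivid 2+": (2448, 2048), "Zivid 2": (1944, 1200)}


def output_resolution(full, subsample=1, upsample=1):
    """Return (width, height) after subsampling and optional upsampling.

    Assumption: subsampling divides and upsampling multiplies each
    dimension by the given factor.
    """
    width, height = full
    return (width // subsample * upsample, height // subsample * upsample)


full = FULL_RESOLUTION["Zivid 2+"]
res_2d = output_resolution(full)                             # full-resolution 2D
res_3d_sub = output_resolution(full, subsample=2)            # 2x2 subsampled 3D
res_3d_up = output_resolution(full, subsample=2, upsample=2)  # upsampled back

print(res_2d, res_3d_sub, res_3d_up)
# 1-to-1 mapping between 2D and 3D requires identical resolutions.
print("1-to-1 without upsampling:", res_2d == res_3d_sub)
print("1-to-1 with upsampling:", res_2d == res_3d_up)
```

When the two resolutions differ, one of the approaches described below is needed to relate 2D pixels to 3D points.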

As mentioned, it is common to require high-resolution 2D data for segmentation and detection. For example, our recommended Consumer Goods Z2+ MR130 Quality preset uses Sampling::Pixel set to by2x2. In this case, we should do one of the following:

Upsample the 3D data to restore 1-to-1 correspondence, or
Map 2D indices to the indices in the subsampled 3D data, or
Get 2D data from the pointcloud via frame.pointCloud().copyImageRGBA_SRGB()

Resampling

To match the resolution of the 2D capture, apply an upsampling that undoes the subsampling. This retains the speed advantages of the subsampled capture. For example:

C++

auto settings2D = Zivid::Settings2D{
    Zivid::Settings2D::Acquisitions{ Zivid::Settings2D::Acquisition{} },
    Zivid::Settings2D::Sampling::Pixel::all,
};
auto settings = Zivid::Settings{
    Zivid::Settings::Engine::phase,
    Zivid::Settings::Acquisitions{ Zivid::Settings::Acquisition{} },
    Zivid::Settings::Sampling::Pixel::blueSubsample2x2,
    Zivid::Settings::Sampling::Color::disabled,
    Zivid::Settings::Processing::Resampling::Mode::upsample2x2,
};

Python

settings_2d = zivid.Settings2D()
settings_2d.acquisitions.append(zivid.Settings2D.Acquisition())
settings_2d.sampling.pixel = zivid.Settings2D.Sampling.Pixel.all
settings = zivid.Settings()
settings.engine = "phase"
settings.acquisitions.append(zivid.Settings.Acquisition())
settings.sampling.pixel = zivid.Settings.Sampling.Pixel.blueSubsample2x2
settings.sampling.color = zivid.Settings.Sampling.Color.disabled
settings.processing.resampling.mode = zivid.Settings.Processing.Resampling.Mode.upsample2x2

For more details see Resampling.

The other option is to map the 2D indices to the indices in the subsampled 3D data. This option is a bit more complicated, but it is potentially more efficient. The point cloud can remain subsampled, and thus consume less memory and processing power.

To establish a correlation between the full-resolution 2D data and the subsampled point cloud, a specific mapping technique is required. This process involves extracting RGB values from the pixels that correspond to the Blue or Red pixels from the Bayer grid.

Zivid::Experimental::Calibration::pixelMapping(camera, settings) can be used to get the parameters required to perform this mapping. The following example uses this function.

C++

const auto pixelMapping = Zivid::Experimental::Calibration::pixelMapping(camera, settings);
std::cout << "Pixel mapping: " << pixelMapping << std::endl;
cv::Mat mappedBGR(
    fullResolutionBGR.rows / pixelMapping.rowStride(),
    fullResolutionBGR.cols / pixelMapping.colStride(),
    CV_8UC3);
std::cout << "Mapped width: " << mappedBGR.cols << ", height: " << mappedBGR.rows << std::endl;
for(size_t row = 0; row < static_cast<size_t>(fullResolutionBGR.rows - pixelMapping.rowOffset());
    row += pixelMapping.rowStride())
{
    for(size_t col = 0; col < static_cast<size_t>(fullResolutionBGR.cols - pixelMapping.colOffset());
        col += pixelMapping.colStride())
    {
        mappedBGR.at<cv::Vec3b>(row / pixelMapping.rowStride(), col / pixelMapping.colStride()) =
            fullResolutionBGR.at<cv::Vec3b>(row + pixelMapping.rowOffset(), col + pixelMapping.colOffset());
    }
}
return mappedBGR;

Python

pixel_mapping = calibration.pixel_mapping(camera, settings)
return rgba[
    int(pixel_mapping.row_offset) :: pixel_mapping.row_stride,
    int(pixel_mapping.col_offset) :: pixel_mapping.col_stride,
    0:3,
]
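
For a single detection, it may be enough to convert one pixel coordinate instead of slicing the whole image. The following is a minimal sketch in plain Python, assuming the same offset-and-stride relationship as the examples above. `PixelMapping` here is a hypothetical stand-in dataclass with integer fields, not the SDK type, and the stride and offset values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelMapping:
    """Hypothetical stand-in for the SDK pixel mapping (integer values assumed)."""

    row_stride: int
    col_stride: int
    row_offset: int
    col_offset: int


def to_subsampled_index(row, col, mapping):
    """Map a full-resolution 2D pixel to an index in the subsampled 3D data.

    Pixels before the first sampled row or column are clamped to index 0.
    """
    sub_row = max(row - mapping.row_offset, 0) // mapping.row_stride
    sub_col = max(col - mapping.col_offset, 0) // mapping.col_stride
    return sub_row, sub_col


def to_full_resolution_index(sub_row, sub_col, mapping):
    """Map a subsampled 3D index back to the full-resolution pixel it was sampled from."""
    return (
        sub_row * mapping.row_stride + mapping.row_offset,
        sub_col * mapping.col_stride + mapping.col_offset,
    )


# Illustrative values only; query the real ones from the SDK.
mapping = PixelMapping(row_stride=2, col_stride=2, row_offset=1, col_offset=1)
sub = to_subsampled_index(1001, 500, mapping)
print(sub)                                      # index into the subsampled point cloud
print(to_full_resolution_index(*sub, mapping))  # sampled full-resolution pixel
```

Converting only the pixels of interest keeps the point cloud subsampled and avoids building a second full image, at the cost of a small rounding step toward the sampled pixel.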

Note

If you use intrinsics and the 2D and 3D captures have different resolutions, ensure you use them correctly. See Camera Intrinsics for more information.

External light considerations

The ideal light source for a 2D capture is strong, because it reduces the influence of varying ambient light, and diffuse, because this limits the blooming effects. For Zivid cameras, this light source comes from the internal projector. Therefore, you do not need any additional lighting in your robot cell.

You may encounter blooming when using the internal projector as the light source. Tilting the camera, changing the background, or tuning the 2D acquisition settings can mitigate the blooming effect.

Exposure variations caused by changes in ambient light, such as transitions from day to night, doors opening and closing, or changes in ceiling lighting, affect 2D and 3D data differently. For 2D data, they can impact segmentation performance, especially when the segmentation model is trained on specific datasets. For 3D data, exposure variations may affect point cloud completeness due to varying noise levels. Zivid cameras are very robust to such exposure variations.

The table below summarizes the pros and cons of using a Zivid camera with respect to 2D quality.

	Internal projector
Robot Cell setup	Simple
Resilience to ambient light variations	Strong
Blooming in 2D images	Likely
2D color balance needed	No

Capture strategies

There are three capture strategies depending on which data (2D or 3D) you need first.

2D data before 3D data
2D data as part of 3D data
2D data after 3D data

Which strategy you should go for depends on your machine vision algorithms and pipeline. We recommend 2D data before 3D data (taking a 2D capture first, followed by a 3D capture with color disabled). This approach allows you to process the color image (e.g., segmentation) in parallel with capturing 3D data, thus achieving the best pick rates on the system level.
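
The overlap between 2D processing and 3D acquisition can be sketched with Python's standard thread pool. This is a rough illustration only: `capture_2d`, `capture_3d`, and `segment` are hypothetical placeholders that simulate work with short sleeps. They are not Zivid SDK calls, and the durations are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def capture_2d():
    """Placeholder for a separate 2D capture (hypothetical, not an SDK call)."""
    time.sleep(0.01)
    return "color image"


def capture_3d():
    """Placeholder for a 3D capture with color disabled (hypothetical)."""
    time.sleep(0.03)
    return "point cloud"


def segment(image):
    """Placeholder for a 2D detection or segmentation step (hypothetical)."""
    time.sleep(0.02)
    return f"mask from {image}"


def capture_2d_then_3d():
    """Capture 2D first, then segment it in a worker thread while 3D is captured."""
    image = capture_2d()
    with ThreadPoolExecutor(max_workers=1) as pool:
        mask_future = pool.submit(segment, image)  # starts immediately
        point_cloud = capture_3d()                 # runs while segmentation is busy
        mask = mask_future.result()                # wait for segmentation to finish
    return mask, point_cloud


mask, point_cloud = capture_2d_then_3d()
print(mask, "|", point_cloud)
```

In this arrangement the total time is roughly the 2D capture plus the longer of the 3D capture and the segmentation, rather than the sum of all three steps.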

Below we summarize the performance of the different strategies. For a more in-depth understanding and comprehensive ZividBenchmarks, please see 2D + 3D Capture Strategy.

The following table shows actual measurements on different hardware.

Cameras and presets: Zivid 2+ (Z2+ LR110 Fast, Z2+ L110 Fast), Zivid 2 (Z2 M70 Fast)

Tip

To test different 2D-3D strategies on your PC, you can run ZividBenchmark.cpp sample with settings loaded from YML files. Go to Samples, and select C++ for instructions.

Capture times (2+R, High-end NVIDIA)
Strategy	2D	3D
Capture 2D and then 3D
Projector On for 2D
Capture 3D and then 2D
Capture 3D including 2D

In the following section, we guide you on selecting 3D and 2D settings based on capture speed.

Version History

SDK	Changes
2.12.0	Acquisition time is reduced by up to 50% for 2D captures and up to 5% for 3D captures for Zivid 2+. Zivid One+ has reached its End-of-Life and is no longer supported.
2.11.0	Added support for `redSubsample4x4` and `blueSubsample4x4`.