How To Use The Result Of Hand-Eye Calibration

우리는 물건을 피킹하기 위해 Zivid 카메라와 머신 비전 소프트웨어가 동반된 로봇에 대한 요구 사항을 분석했습니다. 이로 인해 Hand-Eye Calibration Problem 에 대한 문제를 설명하였습니다. 핸드-아이 칼리브레이션 결과를 활용하는 방법을 잘 모르는 경우 올바른 페이지를 찾으셨습니다. 이번 글에서 물체의 좌표를 Zivid 카메라 좌표계에서 로봇 베이스 좌표계로 변환하는 방법을 설명합니다.

Zivid 포인트 클라우드에서 머신 비전 소프트웨어를 실행한다고 가정해 보겠습니다. 이것은 이 Zivid gem과 같은 관심의 대상을 감지하고 그것의 위치를 추정합니다. 선택 지점을 설명하는 x, y, z 값은 Zivid 카메라의 좌표계를 기준으로 제공됩니다.

팁

애플리케이션을 실행하기 전에 핸드-아이 칼리브레이션과 동일한 캡처 주기를 사용하여 카메라를 Warm-up 하는 것이 좋습니다. 온도 의존적 성능 요인의 영향을 더욱 줄이려면 Thermal Stabilization 을 활성화하세요.

../../../_images/hand-eye-use-result-point-cloud-screenshot.png

어떤 경우에는 알고리즘이 롤, 피치 및 요 각도같은 객체의 방향도 출력합니다. 이런 매개변수는 Zivid 카메라의 좌표계를 기준으로 제공됩니다.

../../../_images/hand-eye-use-result-point-cloud-screenshot-full-pose.png

객체의 자세(위치 및 방향)는 Homogeneous Transformation Matrix로 설명할 수 있습니다. (로봇) 자세 및 좌표계에 익숙하지 않은 경우 Position, Orientation and Coordinate Transformations 을 확인하십시오.

In practice, the easiest way is to transform the point cloud from the camera to the robot base reference frame. The Zivid SDK supports transforming the data before it is copied to the CPU and is therefore very fast.

Below you will see the mathematical theory of transforming a single point or an entire point cloud from the camera to the robot base reference frame. This is followed by a code tutorial that demonstrates putting this into practice.

Eye-to-hand

Eye-to-hand 시스템을 다루는 경우 단일 3D 포인트를 Zivid 카메라에서 로봇 베이스 좌표계로 변환하는 방법은 다음과 같습니다.

p^{R O B} = H_{C A M}^{R O B} \cdot p^{C A M}

\begin{array}{r} [\begin{array}{c} x^{r} \\ y^{r} \\ z^{r} \\ 1 \end{array}] = [\begin{array}{c} R_{c}^{r} & t_{c}^{r} \\ 0 & 1 \end{array}] \cdot [\begin{array}{c} x^{c} \\ y^{c} \\ z^{c} \\ 1 \end{array}] \end{array}

전체 Zivid 포인트 클라우드를 카메라 좌표계에서 로봇 베이스 좌표계로 변환하려면 포인트 클라우드의 각 포인트에 위의 방정식을 적용하십시오.

반면에 Zivid 카메라를 기준으로 개체의 포즈를 변환하려면 다음 방정식을 적용합니다.

H_{O B J}^{R O B} = H_{C A M}^{R O B} \cdot H_{O B J}^{C A M}

\begin{array}{r} [\begin{array}{c} R_{o}^{r} & t_{o}^{r} \\ 0 & 1 \end{array}] = [\begin{array}{c} R_{c}^{r} & t_{c}^{r} \\ 0 & 1 \end{array}] \cdot [\begin{array}{c} R_{o}^{c} & t_{o}^{c} \\ 0 & 1 \end{array}] \end{array}

포즈가 Homogeneous Transformation Matrix로 표현되어 있다고 가정합니다. 그렇지 않은 경우 다음 기사를 확인하십시오. Conversions Between Common Orientation Representations.

결과 포즈는 로봇 TCP(Tool Center Point)가 피킹을 위해 달성해야 하는 포즈입니다. TCP와 로봇 플랜지 사이의 오프셋은 로봇 측에서 고려해야 합니다.

Eye-in-hand

Eye-in-hand 시스템에 대한 접근 방식도 비슷합니다. 차이점은 로봇의 현재 자세가 방정식에 포함되어야 한다는 것입니다. 다른 포즈와 마찬가지로 로봇 포즈가 Homogeneous Transformation Matrix로 표현된다고 가정합니다. 로봇 포즈가 다르게 표현되는 경우 다음 기사를 확인하세요. Conversions Between Common Orientation Representations.

다음 방정식은 단일 3D 포인트를 Zivid 카메라에서 로봇 베이스 좌표계로 변환하는 방법을 설명합니다.

p^{R O B} = H_{E E}^{R O B} \cdot H_{C A M}^{E E} \cdot p^{C A M}

\begin{array}{r} [\begin{array}{c} x^{r} \\ y^{r} \\ z^{r} \\ 1 \end{array}] = [\begin{array}{c} R_{e}^{r} & t_{e}^{r} \\ 0 & 1 \end{array}] \cdot [\begin{array}{c} R_{c}^{e} & t_{c}^{e} \\ 0 & 1 \end{array}] \cdot [\begin{array}{c} x^{c} \\ y^{c} \\ z^{c} \\ 1 \end{array}] \end{array}

전체 Zivid 포인트 클라우드를 카메라 좌표계에서 로봇 베이스 좌표계로 변환하려면 포인트 클라우드의 각 포인트에 위의 방정식을 적용하십시오.

Zivid 카메라를 기준으로 개체의 포즈를 변환하려면 다음 방정식을 사용합니다.

H_{O B J}^{R O B} = H_{E E}^{R O B} \cdot H_{C A M}^{E E} \cdot H_{O B J}^{C A M}

\begin{array}{r} [\begin{array}{c} R_{o}^{r} & t_{o}^{r} \\ 0 & 1 \end{array}] = [\begin{array}{c} R_{e}^{r} & t_{e}^{r} \\ 0 & 1 \end{array}] \cdot [\begin{array}{c} R_{c}^{e} & t_{c}^{e} \\ 0 & 1 \end{array}] \cdot [\begin{array}{c} R_{o}^{c} & t_{o}^{c} \\ 0 & 1 \end{array}] \end{array}

결과 포즈는 로봇 TCP(Tool Center Point)가 피킹을 위해 달성해야 하는 포즈입니다. TCP와 로봇 플랜지 사이의 오프셋은 로봇 측에서 고려해야 합니다.

To put this theory into practice, let’s follow the tutorial to transform a single point or an entire point cloud from the camera to the robot base reference frame.

Python

"""
Transform single data point or entire point cloud from camera to robot base reference frame using Hand-Eye calibration
matrix.

This example shows how to utilize the result of Hand-Eye calibration to transform either (picking) point coordinates
or the entire point cloud from the camera to the robot base reference frame.

For both Eye-To-Hand and Eye-In-Hand, there is a Zivid gem placed approx. 500 mm away from the robot base (see below).
The (picking) point is the Zivid gem centroid, defined as image coordinates in the camera reference frame and hard-coded
in this code example. Open the ZDF files in Zivid Studio to inspect the gem's 2D and corresponding 3D coordinates.

Eye-To-Hand
- ZDF file: ZividGemEyeToHand.zdf
- 2D image coordinates: (1035,255)
- Corresponding 3D coordinates: (37.77 -145.92 1227.1)
- Corresponding 3D coordinates (robot base reference frame): (-12.4  514.37 -21.79)

Eye-In-Hand:
- ZDF file: ZividGemEyeInHand.zdf
- 2D image coordinates: (1460,755)
- Corresponding 3D coordinates (camera reference frame): (83.95  28.84 305.7)
- Corresponding 3D coordinates (robot base reference frame): (531.03  -5.44 164.6)

For verification, check that the Zivid gem centroid 3D coordinates are the same as above after the transformation.

The YAML files for this sample can be found under the main instructions for Zivid samples.

"""

Eye-to-hand

First, we get the camera pose in the robot base reference frame (result of eye-to-hand calibration).

Python

eye_to_hand_transform_file_path = get_sample_data_path() / "EyeToHandTransform.yaml"

print("Reading camera pose in robot base reference frame (result of eye-to-hand calibration)")
base_to_camera_transform = load_and_assert_affine_matrix(eye_to_hand_transform_file_path)

The point cloud is given in the camera reference frame.

../../../_images/zivid-gem-eye-to-hand-in-camera-frame.png

At this point, your machine vision software should detect the workpiece and estimate the picking point. In the example, we assume that the picking point is known, defined as image coordinates in the camera reference frame.

Python

# The (picking) point is defined as image coordinates in camera reference frame. It is hard-coded
# for the ZividGemEyeToHand.zdf (1035,255) X: 37.77 Y: -145.92 Z: 1227.1
image_coordinate_x = 1035
image_coordinate_y = 255

The picking point in the robot base reference frame is then calculated as follows. We first get the point cloud.

Python

        file_name = "ZividGemEyeInHand.zdf"
data_file = get_sample_data_path() / file_name
print(f"Reading point cloud from file: {data_file}")

frame = zivid.Frame(data_file)
point_cloud = frame.point_cloud()

Eye-in-hand

First, we get the following poses:

Camera pose in flange (end-effector) reference frame (result of eye-in-hand calibration)
Flange (end-effector) pose in robot base reference frame (current robot pose)

Python

eye_in_hand_transform_file_path = get_sample_data_path() / "EyeInHandTransform.yaml"
robot_transform_file_path = get_sample_data_path() / "RobotTransform.yaml"

print("Reading camera pose in flange (end-effector) reference frame (result of eye-in-hand calibration)")
flange_to_camera_transform = load_and_assert_affine_matrix(eye_in_hand_transform_file_path)

print("Reading flange (end-effector) pose in robot base reference frame")
base_to_flange_transform = load_and_assert_affine_matrix(robot_transform_file_path)

By multiplying these two transformation matrices, we get the camera pose in the robot base reference frame.

Python

print("Computing camera pose in robot base reference frame")
base_to_camera_transform = np.matmul(base_to_flange_transform, flange_to_camera_transform)

The point cloud is given in the camera reference frame.

../../../_images/zivid-gem-eye-in-hand-in-camera-frame.png

At this point, your machine vision software should detect the workpiece and estimate the picking point. In the example, we assume that the picking point is known, defined as image coordinates in the camera reference frame.

Python

# The (picking) point is defined as image coordinates in camera reference frame. It is hard-coded
# for the ZividGemEyeInHand.zdf (1460,755) X: 83.95 Y: 28.84 Z: 305.7
image_coordinate_x = 1460
image_coordinate_y = 755

The picking point in the robot base reference frame is then calculated as follows. We first get the point cloud.

Python

        file_name = "ZividGemEyeInHand.zdf"
data_file = get_sample_data_path() / file_name
print(f"Reading point cloud from file: {data_file}")

frame = zivid.Frame(data_file)
point_cloud = frame.point_cloud()

Then, from the point cloud, we get the picking point (XYZ coordinates) in the camera reference frame.

Python

print("Transforming single point")

xyz = point_cloud.copy_data("xyz")

point_in_camera_frame = np.array(
    [
        xyz[image_coordinate_y, image_coordinate_x, 0],
        xyz[image_coordinate_y, image_coordinate_x, 1],
        xyz[image_coordinate_y, image_coordinate_x, 2],
        1,
    ]
)
print(f"Point coordinates in camera reference frame: {point_in_camera_frame[0:3]}")

Finally, we transform the picking point to the robot base reference frame.

Python

print("Transforming (picking) point from camera to robot base reference frame")
point_in_base_frame = np.matmul(base_to_camera_transform, point_in_camera_frame)

print(f"Point coordinates in robot base reference frame: {point_in_base_frame[0:3]}")

Instead of transforming only the picking point, we can transform the whole point cloud to the robot base reference frame. This enables your machine vision software to find the picking point in the robot base reference frame directly from the point cloud.

Python

point_cloud.transform(base_to_camera_transform)

Eye-to-hand

To get the point cloud in ZDF format, you can run the code sample or download it as part of Sample Data (ZividGemEyeToHandInRobotBaseFrame.zdf).

Eye-in-hand

To get the point cloud in ZDF format, you can run the code sample or download it as part of Sample Data (ZividGemEyeInHandInRobotBaseFrame.zdf).

Version History

SDK	Changes
2.12.0	Add a section on how to utilize the result of hand-eye calibration to transform the object’s coordinates from the Zivid camera coordinate system to the robot base coordinate system.