📷 Camera#

System Camera#

When you use the renderer feature, MotrixSim automatically creates a system camera. The system camera accepts user mouse operations to enable movement, zooming, and other functions.

System camera operation effects

Operation Instructions:

Press and drag left mouse button: Rotate camera around focal point
Press and drag right mouse button: Move focal point (red circle indicates current focal point)
Mouse wheel: Zoom in/out (cannot zoom in further when reaching focal point)

The system camera supports partial configurations from mjcf/visual/global, such as:

orthographic: Whether to use orthographic projection
fovy: Camera field of view
azimuth: Initial azimuth angle (rotation around z-axis) of system camera
elevation: Initial elevation angle (pitch) of system camera

The system camera can be enabled/disabled.

system_camera = render.system_camera # get system camera
system_camera.active = True # enable camera
system_camera.active = False # disable camera

After render.launch(...), you can set the system camera view with orbit/look-at parameters. This controls the renderer’s system camera, not a model-defined MJCF camera. azimuth and elevation are in degrees.

render.system_camera.set_view(
    lookat=[0.0, 0.0, 0.75],
    distance=6.0,
    elevation=-20.0,
    azimuth=90.0,
)

Scene Camera#

In addition to the system camera, you can configure additional Camera tags in MJCF files. We refer to these as scene cameras. Scene cameras can provide additional viewing angles for visualization.

Custom Additional Cameras#

Taking go1.xml as an example, we define cameras in MJCF as follows:

<body name="trunk" pos="0 0 0.445" childclass="go1">
    <camera name="track" pos="0.846 -1.3 0.316" xyaxes="0.866 0.500 0.000 -0.171 0.296 0.940" mode="track"/>
    <camera name="top" pos="-1 0 1" xyaxes="0 -1 0 0.7 0 0.7" mode="track"/>
    <camera name="side" pos="0 -2 1" xyaxes="1 0 0 0 1 2" mode="track"/>
    <camera name="back" pos="-2.4 0 0.8" target="trunk" mode="targetbody"/>
</body>

This means we’ve defined 4 additional cameras on the go1’s trunk. The camera position and orientation are specified by pos and xyaxes, while mode specifies the camera’s movement mode.

Similarly, the scene camera can also be enabled/disabled.

camera_index = 0 # The first scene camera index
scene_camera = render.get_camera(camera_index) # get scene camera
scene_camera.active = True # enable camera
scene_camera.active = False # disable camera

Switching Scene Main Camera#

By default, MotrixSim uses the system camera as the main camera. If you want to use a scene camera as the main camera, you can switch via the Python API:

cameras = model.cameras # Get all cameras (note: system camera not included here)
preview_cameras = [None,*cameras] # None indicates system camera

Switch cameras using keyboard events:

if render.input.is_key_just_pressed("right"):
    # change to next camera
    preview_camera_idx = (preview_camera_idx + 1) % len(preview_cameras)
    render.set_main_camera(preview_cameras[preview_camera_idx])

if render.input.is_key_just_pressed("left"):
    # change to previous camera
    preview_camera_idx = (preview_camera_idx + len(preview_cameras) - 1) % len(preview_cameras)
    render.set_main_camera(preview_cameras[preview_camera_idx])

Switch scene main camera using left/right arrow keys

RGBD Sensors#

MotrixSim supports converting scene cameras into RGBD vision sensors, allowing you to obtain both RGB images and depth images. You can use the following interface to render a scene camera to an offline image:

cameras = model.cameras
cameras[0].set_render_target("image", 320, 240)  # let this camera render to a image with 320x240 resolution

By default, the camera uses RGB rendering mode. If you want the camera to only render depth images, you can set the camera’s depth_only property to True:

cameras[1].set_render_target("image", 640, 480)
cameras[1].depth_only = True  # This is a depth only camera
cameras[1].set_near_far(0.1, 1)  # Set the near and far plane of the camera

Note

The values stored in the depth map are in NDC (normalized device coordinates) space. If you want to convert them to actual distances from the camera in 3D space, you can use the following formula:

view_z = camera.near_plane / depth

The real-time effects of the RGBD camera will be displayed in the Camera panel on the left:

The top-right corner displays real-time RGB and depth camera effects

Reading Camera Image Data#

To read camera image data, you need to obtain the RenderCamera object from RenderApp and then perform a capture operation.

# press space to capture the rcamera.
if render.input.is_key_just_pressed("space"):
    rcam = render.get_camera(0)  # get render camera from index
    capture_tasks.append((capture_index, rcam.capture()))
    capture_index += 1

render.sync(data)

while len(capture_tasks) > 0:
    task: CaptureTask
    idx, task = capture_tasks[0]
    if task.state != "pending":
        capture_tasks.popleft()
        try:
            img = task.take_image()
            assert img is not None
            import os

            os.makedirs("shot", exist_ok=True)
            img.save_to_disk(f"shot/capture_{idx}.png")
            print(f"Captured image: shot/capture_{idx}.png")
        except Exception as e:
            print(f"Error saving image: {e}")
    else:
        break

Note that our capture operation is asynchronous, so the RenderCamera.capture method will return a CaptureTask object. You need to keep this CaptureTask object and continuously check its status until it is complete. Once finished, you can obtain the image data through the task.take_image() method.

You can refer to the API documentation for more details: