API Usage Example

As an example of how to use the API for inference, we'll first take a look at the dcc.molab_maya.client module to see how to connect to the backend and send a job.

Then we will look at some methods in dcc.molab_maya.motion_io to see how to handle input and output motions.

Tip

A Blender integration is an open issue and a good opportunity for a contribution!

Collect Input Motion

To pass motion data around, we need to understand the motion format at both input and output. The input format, packed_motion, is a bit special due to its sparse nature, while the output format is more straightforward.

Part of the InferenceArgs is the packed_motion field. This is a dictionary mapping frame indices to packed poses, where a packed pose contains the root position followed by all 22 joint rotations, giving it a shape of (23, 3). NaN values indicate sparse keyframes, i.e. missing joint information, and are later converted to a joint mask for the inference model.
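For illustration, here is a minimal hand-built input (the values are made up) with one full keypose on frame 0 and one sparse keypose on frame 42 where only the root position is known:

import numpy as np

# Full keypose on frame 0: root position plus all 22 joint rotations.
full_pose = np.zeros((23, 3))
full_pose[0] = [0.0, 0.966, 0.0]  # root position; rows 1-22 are rotations

# Sparse keypose on frame 42: only the root position is known, all joint
# rotations are NaN and will be masked out for the inference model.
sparse_pose = np.full((23, 3), np.nan)
sparse_pose[0] = [-0.34, 0.966, 2.35]

packed_motion = {0: full_pose.tolist(), 42: sparse_pose.tolist()}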

For details on how to pack keyframes, see the pack_keyframes function of the Maya integration below. It takes a list of positions and a list of rotations that can contain NaN values and converts them to a packed_motion dictionary.

dcc.molab_maya.motion_io.pack_keyframes(positions, rotations, verbose=False)

Pack the keyframes into a more compact format.

Parameters:

Name        Type      Description                              Default
positions   ndarray   (N, 22, 3) array of joint positions      required
rotations   ndarray   (N, 22, 3) array of joint rotations      required
verbose     bool      Whether to print the compacted frames.   False

Returns:

Name            Type              Description
packed_motion   dict[int, list]   A dictionary mapping frame indices to packed poses.

Source code in dcc/molab_maya/motion_io.py
def pack_keyframes(
    positions: np.ndarray, rotations: np.ndarray, verbose=False
) -> dict[int, list]:
    """Pack the keyframes into a more compact format.

    Arguments:
        positions: (N, 22, 3) array of joint positions
        rotations: (N, 22, 3) array of joint rotations
        verbose: Whether to print the compacted frames.

    Returns:
        packed_motion: A dictionary mapping frame indices to packed poses.
    """
    root_pos = positions[:, 0]
    frame_mask_pos = ~np.isnan(root_pos).any(axis=1)
    frame_mask_rot = ~np.isnan(rotations).any(axis=(1, 2))

    # Combine all frames that have non-nan values
    frame_mask = frame_mask_pos | frame_mask_rot
    valid_frames = np.where(frame_mask)[0]

    if verbose:
        print(f"Compacting to (partial) Keyposes on Frames:\n{valid_frames}")

    packed_motion: dict = {}
    for frame in valid_frames.tolist():
        packed_motion[frame] = np.ones((23, 3)) * np.nan
        packed_motion[frame][0] = root_pos[frame]
        packed_motion[frame][1:] = rotations[frame]

    return {frame: packed_motion[frame].tolist() for frame in packed_motion}
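A quick usage sketch with made-up data: keying only frame 0 of a 60-frame clip and leaving everything else NaN yields a single packed pose.

import numpy as np

N = 60
positions = np.full((N, 22, 3), np.nan)
rotations = np.full((N, 22, 3), np.nan)

# Key only frame 0: a root position and zeroed joint rotations.
positions[0, 0] = [0.0, 0.966, 0.0]
rotations[0] = np.zeros((22, 3))

packed_motion = pack_keyframes(positions, rotations, verbose=True)
print(list(packed_motion.keys()))  # [0]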

Send Job to Backend

Now that we have the packed_motion dictionary, we can assemble our inference arguments and send them to the backend. Below is the documentation for MoLabQClient.infer, which does the following:

  • Takes a dictionary of inference arguments (see InferenceArgs)
  • Tags the request with "type": "infer"
  • Sends it to the backend over the already-open WebSocket connection

The inference results (see InferenceResults) are then delivered asynchronously over the same connection.

Example

Below is a full payload example for the MoLabQClient.infer method containing two keyposes on frames 0 and 42 as well as a text prompt.

{
    "packed_motion": {
        0: [
            [0.0, 0.966035, 0.0],
            [-2.461784, 1.602837, 3.02837205],
            [-1.15376, -0.314741, -3.40727],
            [1.123194, -0.57072, 9.6684651],
            [5.430555, 12.008284, 3.450807],
            [nan, nan, nan],
            [0.699208, 0.478575, -4.624878],
            [-0.736858, 0.472722, 8.868751],
            [1.046311, 0.36474, 4.513838],
            [nan, nan, nan],
            [-2.097379, 0.002352, 1.437366],
            [6.395149, -0.91336201, -23.965065],
            [0.203726, -2.443971, 29.728936],
            [0.572847, 0.573686, -19.469958],
            [nan, nan, nan],
            [-13.751465, 5.598898, -3.18948],
            [-67.052628, -7.37833, -6.440387],
            [-23.210149, -25.2472202, 7.097196],
            [nan, nan, nan],
            [7.634399, -1.97200502, -1.282972],
            [69.428653, 6.069861, -6.181875],
            [10.862561, 27.296937, 3.88993195],
            [nan, nan, nan],
        ],
        42: [
            [-0.339078, 0.965653, 2.350223],
            [-2.312009, 2.433177, -1.82445502],
            [3.73811, -1.674805, -25.165169],
            [-8.567456, 1.59987202, 19.844779],
            [17.532981, 30.0588684, 19.2560214],
            [nan, nan, nan],
            [-0.172514997, -1.10521, 10.58829],
            [1.696808, -0.023274, 7.59669],
            [-2.601748, -8.717805, 7.61317205],
            [nan, nan, nan],
            [-1.07286802, -0.173289, 5.375921],
            [4.720364, -0.666077, -21.613376],
            [5.729873, 8.757588, 32.877729],
            [-4.825603, 6.19364, -15.489852],
            [nan, nan, nan],
            [-12.313785, 3.51262305, -3.91042395],
            [-60.028995, -1.354013, -10.010531],
            [-21.320532, -21.962499, 4.864524],
            [nan, nan, nan],
            [1.25623, 5.556887, 2.128571],
            [49.840547, 17.558537, 0.472948984],
            [1.83383397, 29.8473266, 2.299576],
            [nan, nan, nan],
        ],
    },
    "text_prompt": "A person walks forward",
    "num_samples": 3,
    "type": "infer",
}
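Note that the example above uses Python literal notation (integer keys, bare nan values) rather than strict JSON. MoLabQClient.infer serializes the arguments with json.dumps, which turns the integer frame keys into strings and emits NaN values as the non-standard NaN token, as this small demonstration shows:

import json, math

payload = {"packed_motion": {0: [[0.0, 0.966, 0.0], [math.nan] * 3]}}
print(json.dumps(payload))
# {"packed_motion": {"0": [[0.0, 0.966, 0.0], [NaN, NaN, NaN]]}}

The receiving side therefore has to parse string frame keys and the NaN token accordingly.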

dcc.molab_maya.qclient.MoLabQClient.infer(inference_args)

Sends an inference request to the backend.

Parameters:

Name             Type   Description                          Default
inference_args   dict   The data to be sent for inference.   required
Source code in dcc/molab_maya/qclient.py
def infer(self, inference_args):
    """
    Sends an inference request to the backend.

    Args:
        inference_args (dict): The data to be sent for inference.
    """
    print("Sending inference request ...")
    inference_args["type"] = "infer"
    self.websocket.sendTextMessage(json.dumps(inference_args))
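Putting the pieces together, sending a job can look like the following sketch. It assumes client is an already-connected MoLabQClient instance and reuses the packed_motion dictionary from above.

inference_args = {
    "packed_motion": packed_motion,
    "text_prompt": "A person walks forward",
    "num_samples": 3,
}
client.infer(inference_args)  # infer() adds "type": "infer" before sending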

Apply Output Motion

The InferenceResults contains root_positions and joint_rotations fields of shape (S, F, 3) and (S, F, 22, 3) respectively, where S is the sample count and F is the number of frames.

For debugging purposes, the results also contain the input motion as obs_root_positions and obs_joint_rotations.
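Assuming the results arrive as a dictionary of nested lists (the variable name results below is hypothetical), a single sample can be extracted like this:

import numpy as np

root_positions = np.asarray(results["root_positions"])    # (S, F, 3)
joint_rotations = np.asarray(results["joint_rotations"])  # (S, F, 22, 3)

sample_root = root_positions[0]        # (F, 3) root positions of sample 0
sample_rotations = joint_rotations[0]  # (F, 22, 3) rotations of sample 0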

In the _apply_motion function below, we show how to apply the output motion to a character rig in Maya.

dcc.molab_maya.motion_io._apply_motion(skeleton_group, root_pos, rotations, joint_mask=None, start_frame=1, name='sample')

Apply the motion to the skeleton.

Parameters:

Name             Type                Description                                                 Default
skeleton_group   Transform           The skeleton group to duplicate and apply the motion to.   required
root_pos         ndarray             The root positions for each frame.                         required
rotations        ndarray             The joint rotations for each frame.                         required
joint_mask       Optional[ndarray]   The mask for each joint and frame.                          None
start_frame      int                 The starting frame for the motion.                          1
name             str                 A prefix for the duplicated skeleton.                       'sample'
Source code in dcc/molab_maya/motion_io.py
def _apply_motion(
    skeleton_group: pmc.nodetypes.Transform,
    root_pos: np.ndarray,
    rotations: np.ndarray,
    joint_mask: Optional[np.ndarray] = None,
    start_frame: int = 1,
    name: str = "sample",
):
    """Apply the motion to the skeleton.

    Args:
        skeleton_group: The skeleton group to duplicate and apply the motion to.
        root_pos: The root positions for each frame.
        rotations: The joint rotations for each frame.
        joint_mask: The mask for each joint and frame.
        start_frame: The starting frame for the motion.
        name: A prefix for the duplicated skeleton.
    """
    # Duplicate the source skeleton
    root_grp = pmc.duplicate(skeleton_group, name=f"{name}_{skeleton_group}")
    root_grp = pmc.ls(root_grp)[0]
    root_obj = root_grp.listRelatives(children=True, type="joint")[0]
    joints = _get_hierarchy(root_obj)

    # Get the start and end frames
    start = start_frame
    end = start_frame + root_pos.shape[0] - 1

    print(f"Applying keyframes to '{joints[0].getParent()}' [{start}-{end}]")

    for frame_time in range(start, end + 1):
        frame_idx = frame_time - start
        for joint_idx, joint_name in enumerate(joints):
            # Skip joints that are not part of the mask
            if joint_mask is not None and not joint_mask[frame_idx, joint_idx]:
                continue

            # Apply root position to the root joint
            if joint_idx == 0:
                pmc.setKeyframe(
                    joint_name,
                    time=frame_time,
                    attribute="tx",
                    value=root_pos[frame_idx, 0],
                )
                pmc.setKeyframe(
                    joint_name,
                    time=frame_time,
                    attribute="ty",
                    value=root_pos[frame_idx, 1],
                )
                pmc.setKeyframe(
                    joint_name,
                    time=frame_time,
                    attribute="tz",
                    value=root_pos[frame_idx, 2],
                )

            # Apply joint rotations to all joints
            pmc.setKeyframe(
                joint_name,
                time=frame_time,
                attribute="rx",
                value=rotations[frame_idx, joint_idx, 0],
            )
            pmc.setKeyframe(
                joint_name,
                time=frame_time,
                attribute="ry",
                value=rotations[frame_idx, joint_idx, 1],
            )
            pmc.setKeyframe(
                joint_name,
                time=frame_time,
                attribute="rz",
                value=rotations[frame_idx, joint_idx, 2],
            )
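Finally, a usage sketch that applies every returned sample to its own duplicate of the rig; the skeleton_grp node name is hypothetical and depends on your scene:

import pymel.core as pmc

skeleton_group = pmc.ls("skeleton_grp")[0]

for s in range(root_positions.shape[0]):
    _apply_motion(
        skeleton_group,
        root_positions[s],    # (F, 3)
        joint_rotations[s],   # (F, 22, 3)
        start_frame=1,
        name=f"sample{s}",
    )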