
Inference Worker

The MotionInferenceWorker is the core of MoLab's inference system. It generates motion sequences from the inference parameters defined in InferenceArgs. The output is an InferenceResults object that contains both the input and the output motion sequences.

As the underlying motion generation model, we use a fork of setarehc/diffusion-motion-inbetweening, also known as CondMDI.

models.condmdi.molab_condmdi.inference_worker

InferenceArgs dataclass

Bases: GenerateOptions, CondSyntOptions, CustomSyntOptions

Arguments and options for motion inference.

Attributes:

Name Type Description
text_prompt str

Input text for motion generation.

num_samples int

The number of motion samples to generate. Default is 3.

packed_motion dict[int, list]

A mapping of frame indices to poses of shape (J+1, 3), where each pose consists of the root position followed by all J joint rotations. When no motion is supplied, it is initialized as an "empty" motion. Alternatively, use bvh_path to load a motion from a file.

unpack_mode str

Mode for unpacking and converting packed_motion to features. Choices are "step" (stepped animation) and "linear" (interpolation). Default is "linear".

unpack_randomness float

Randomness applied during unpacking. Default is 0.0.

editable_features str

Indicates which features of the input are observed. Default is "pos_rot_vel", alternatives are "pos_rot" or just "pos".

bvh_path str

Path to the BVH file. Must be used in combination with edit_mode, which specifies the masking. Cannot be used together with packed_motion.

edit_mode str

Masking strategy for the BVH input motion. For all options see parser_util.CondSyntOptions.

jacobian_ik bool

Flag to switch from Basic to Jacobian IK. Default is False.

foot_ik bool

Flag to enable Foot IK to reduce foot sliding. Default is False.

imputate bool

Flag to enable imputation between inference steps. Default is False.

reconstruction_guidance bool

Flag to enable reconstruction guidance during inference. Default is False.
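The constraints between these attributes can be sketched as a small check. This is only an illustration: `InferenceArgsSketch` and `check_args` are hypothetical stand-ins that mirror the attributes and defaults documented above, not the real InferenceArgs class. The defaults for `text_prompt`, `packed_motion`, `bvh_path` and `edit_mode` are placeholders, since the documentation does not state them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InferenceArgsSketch:
    """Hypothetical stand-in mirroring only the documented attributes."""
    text_prompt: str = ""                 # placeholder default (not documented)
    num_samples: int = 3
    packed_motion: Optional[dict] = None  # placeholder default (not documented)
    unpack_mode: str = "linear"
    unpack_randomness: float = 0.0
    editable_features: str = "pos_rot_vel"
    bvh_path: Optional[str] = None        # placeholder default (not documented)
    edit_mode: Optional[str] = None       # placeholder default (not documented)
    jacobian_ik: bool = False
    foot_ik: bool = False
    imputate: bool = False
    reconstruction_guidance: bool = False


def check_args(args: InferenceArgsSketch) -> list:
    """Return a list of violated constraints (empty if none are violated)."""
    problems = []
    if args.packed_motion is not None and args.bvh_path is not None:
        problems.append("packed_motion and bvh_path are mutually exclusive")
    if args.bvh_path is not None and args.edit_mode is None:
        problems.append("bvh_path requires edit_mode")
    if args.unpack_mode not in ("step", "linear"):
        problems.append("unpack_mode must be 'step' or 'linear'")
    if args.editable_features not in ("pos_rot_vel", "pos_rot", "pos"):
        problems.append("editable_features must be 'pos_rot_vel', 'pos_rot' or 'pos'")
    return problems
```

Whether the real worker performs such checks itself is not stated here; running a check like this before calling infer is simply a way to catch conflicting inputs early.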

InferenceResults

Bases: BaseModel

Inference results containing samples and input motions.

Attributes:

Name Type Description
root_positions list

List of root positions.

joint_rotations list

List of joint rotations.

obs_root_positions list

List of observed root positions.

obs_joint_rotations list

List of observed joint rotations.

ModelArgs dataclass

Bases: BaseOptions, DataOptions, ModelOptions, DiffusionOptions, TrainingOptions, SamplingOptions

Contains mostly model options:

  • BaseOptions (cuda, device, seed)
  • DataOptions (Dataset Type and Path, Data Representation, Augmentation, ...)
  • ModelOptions (Model Architecture)
  • DiffusionOptions (Diffusion Hyperparameters)
  • TrainingOptions (Save path, Batchsize, LR, Loss Weights, ...)
  • SamplingOptions (Model and Output Path, Samples and Reps, CFG Guidance)

MotionInferenceWorker

__init__(name, model_args)

Initialize the worker and parse arguments without starting the model.

get_output_path(infer_config)

Get the output path for the results of the inference.

infer(infer_config, save_results=True)

Run inference using InferenceArgs and return InferenceResults. Optionally, save intermediate results to the output directory.

Parameters:

Name Type Description Default
infer_config InferenceArgs

The arguments for motion inference.

required
save_results bool

Whether to save outputs to disk. Defaults to True.

True

Returns:

Name Type Description
InferenceResults InferenceResults

The generated motion.

start()

Start the model and load the checkpoint.

get_abs_data_from_jointpos(joint_positions)

Convert joint positions to HML3D_abs format.

get_jointpos_from_bvh(filepath)

Load a BVH file and convert it to HML3D_abs format.

unpack_motion(packed_motion, mode='linear', randomness=0.0)

Unpack the packed motion input and return the root positions, joint rotations, and joint mask.

Packed Motion Format
  • A dictionary mapping frame indices to packed poses
  • A packed pose contains the root position followed by all 22 joint rotations
  • Values stored as NaN mark sparse keyframes and are converted to a joint mask
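As an illustration of the format above, the following sketch builds a packed motion with two keyframes using NumPy. The helpers `make_packed_pose` and `joint_mask` are hypothetical and not part of the module. The rotation values are arbitrary placeholders, and the mask logic only approximates the NaN-to-mask conversion described above.

```python
import numpy as np

NUM_JOINTS = 22  # joint count stated in the format description


def make_packed_pose(root_pos, known_rotations):
    """Build one packed pose with NUM_JOINTS + 1 rows of 3 values.

    Row 0 holds the root position; rows 1..22 hold the joint rotations.
    Joints missing from `known_rotations` stay NaN (unconstrained).
    """
    pose = np.full((NUM_JOINTS + 1, 3), np.nan)
    pose[0] = root_pos
    for joint_idx, rotation in known_rotations.items():
        pose[1 + joint_idx] = rotation
    return pose.tolist()


def joint_mask(pose):
    """Per-joint mask of shape (22, 1): True where a rotation is given."""
    rotations = np.asarray(pose, dtype=float)[1:]
    return ~np.isnan(rotations).any(axis=1, keepdims=True)


# Frame 0 constrains every joint; frame 40 constrains only joint 0.
packed_motion = {
    0: make_packed_pose([0.0, 0.9, 0.0], {j: [0.0, 0.0, 0.0] for j in range(NUM_JOINTS)}),
    40: make_packed_pose([1.0, 0.9, 0.0], {0: [0.0, 0.5, 0.0]}),
}
```

A dictionary like `packed_motion` has the shape expected by the packed_motion parameter below, with frames 1 to 39 left for the unpacking step to fill in.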

Parameters:

Name Type Description Default
packed_motion dict[int, list]

Packed motion data

required
mode str

Interpolation mode, either "linear" or "step". Defaults to "linear".

'linear'
randomness float

Random noise to add to the missing values. Defaults to 0.0.

0.0

Returns:

Type Description
ndarray

np.ndarray: Root positions (n_frames, 3)

ndarray

np.ndarray: Joint rotations (n_frames, 22, 3)

ndarray

np.ndarray: Joint mask (n_frames, 22, 1)
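The difference between the "linear" and "step" modes can be sketched for root positions as follows. This is a simplified illustration under stated assumptions, not the module's implementation: `fill_between_keyframes` is a hypothetical helper, it handles only 3-vectors such as root positions, and it ignores the randomness parameter and the joint mask.

```python
import numpy as np


def fill_between_keyframes(keyframes, n_frames, mode="linear"):
    """Expand sparse keyframes to a dense (n_frames, 3) array.

    keyframes: dict mapping frame index -> three values.
    mode: "linear" interpolates between neighbouring keyframes;
          "step" holds the most recent keyframe until the next one.
    """
    frames = np.array(sorted(keyframes))
    values = np.array([keyframes[f] for f in frames], dtype=float)
    t = np.arange(n_frames)
    if mode == "linear":
        # Interpolate each coordinate independently over the frame axis.
        return np.stack(
            [np.interp(t, frames, values[:, k]) for k in range(3)], axis=1
        )
    if mode == "step":
        # Index of the last keyframe at or before each frame.
        idx = np.searchsorted(frames, t, side="right") - 1
        idx = np.clip(idx, 0, len(frames) - 1)
        return values[idx]
    raise ValueError("mode must be 'linear' or 'step'")


root_keys = {0: [0.0, 0.0, 0.0], 4: [4.0, 0.0, 8.0]}
linear = fill_between_keyframes(root_keys, 5, mode="linear")
step = fill_between_keyframes(root_keys, 5, mode="step")
```

Here `linear[2]` lies halfway between the two keyframes, while `step[2]` still equals the keyframe at frame 0. Per-component linear interpolation is a natural fit for positions; how the module interpolates rotations is not specified here.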

unpacked_motion_to_jointpos(root_pos, rotations)

Convert the unpacked motion to joint positions using the BVH template and Forward Kinematics.

This relies on the following assumptions:

  • The skeleton is the same as in template.bvh (22 joints)
    • This is used to get parents, offsets and names
  • Frame time is 1/20 s (as in the HML3D dataset)
  • Rotation order is XYZ
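The forward kinematics step under these assumptions can be sketched on a toy three-joint chain. This is a minimal illustration, not the module's code: the helper names are hypothetical, the parents and offsets are made up rather than read from template.bvh, angles are assumed to be in degrees, and "XYZ" is interpreted here as R = Rx · Ry · Rz, which may differ from the convention the module uses.

```python
import numpy as np

FRAME_TIME = 1 / 20  # seconds per frame, as stated above


def euler_xyz_to_matrix(angles_deg):
    """Rotation matrix composed as Rx @ Ry @ Rz (assumed convention)."""
    x, y, z = np.radians(angles_deg)
    cx, sx = np.cos(x), np.sin(x)
    cy, sy = np.cos(y), np.sin(y)
    cz, sz = np.cos(z), np.sin(z)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rx @ ry @ rz


def forward_kinematics(root_pos, rotations_deg, parents, offsets):
    """Global joint positions for a single frame.

    parents[j] is the parent index of joint j (-1 for the root);
    every parent must appear before its children.
    """
    n = len(parents)
    global_rot = [None] * n
    positions = np.zeros((n, 3))
    for j in range(n):
        local = euler_xyz_to_matrix(rotations_deg[j])
        if parents[j] < 0:
            global_rot[j] = local
            positions[j] = root_pos
        else:
            p = parents[j]
            # The offset is expressed in the parent's frame.
            positions[j] = positions[p] + global_rot[p] @ offsets[j]
            global_rot[j] = global_rot[p] @ local
    return positions


# Toy chain: root -> joint 1 -> joint 2, each one unit along +Y.
parents = [-1, 0, 1]
offsets = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])

rest_pose = forward_kinematics(np.zeros(3), np.zeros((3, 3)), parents, offsets)

bent = np.zeros((3, 3))
bent[0] = [0.0, 0.0, 90.0]  # rotate the root 90 degrees about Z
bent_pose = forward_kinematics(np.zeros(3), bent, parents, offsets)

duration_s = 40 * FRAME_TIME  # a 40-frame clip lasts 2 seconds
```

With zero rotations the chain stays stacked along +Y; rotating the root about Z swings the whole chain onto the negative X axis, because child offsets are transformed by the accumulated parent rotation.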