Overview and Purpose

UsdSkel provides an encoding of simple skeletons and blend shapes, with the goal of interchanging basic skeletons and skinned models, both for game studios as well for the performant, scalable interchange of large-scale crowds. This encoding supports both interchanging skinnable rigs, as well as the interchange of skeletal control rigs on their own.

Motivation & Trade-Offs

UsdSkel was designed with the goal in mind of both being able to represent small numbers of characters, while also being efficient enough to be able to scale up to encoding crowds, consisting of hundreds of thousands of agents.

Towards this end, some expressibility trade-offs have been made in the name of scalability:

In order to enable UsdSkel to efficiently encode very large numbers of characters, joint animations are stored using a compact, vectorized encoding on each Skeleton. This avoids both excess prim and property bloat, while also greatly improving both read and write performance. Although these factors are negligible on a small number of characters, this provides substantial performance improvements when interchanging large numbers of characters.

As a consequence of this, although UsdSkel allows a sparse skeletal animation to be defined, sparse layering of joint transforms is not supported. Sparse animation-layering capabilities may eventually be introduced as part of an encoding of animation blending.
UsdSkel leverages scene graph instancing to describe rest state instancing, by which the rest state of characters may be shared across multiple instances of a character, each of which has been assigned a unique animation – without deinstancing. This is a critical encoding feature for dealing with large numbers of characters. For the sake of enabling this feature, the encoding of a skeleton has been broken into two parts: The Skeleton itself, and an Animation associated with that skeleton.

What UsdSkel Is Not

UsdSkel does not signal an intent to provide general rigging and execution behaviors as core USD features. We strongly believe that such features are beyond the scope of USD's core concerns.

Terminology

Before discussing any concepts in detail, it's important to establish some common terminology:

Skeleton:

An encoding of an animated joint hierarchy. A Skeleton may additional hold other properties that affect deformations, such as weights for a set of blend shapes.
Skeleton Topology:

The topology is the description of the parent<->child relationships of all joints in a joint hierarchy.
Skel Animation:

An encoding of animation for a Skeleton's joints and blendshapes.
Binding:

An encoding of the binding of a Skeleton to a geometry hierarchy, describing which geometric primitves the Skeleton deforms (if any), and how.
Joint Influences:

The set of joint weights and indices used by skinning algorithms to describe how each point is influenced by the change in joint transformations.

What Can Be Skinned?

UsdSkel makes an attempt at encoding skinning information in a way that generalizes to many types of prims, rather than targeting a strict subset of known types.

As a general rule, if a primitive's type inherits from UsdGeomBoundable, and if that primitive is not itself a UsdSkelSkeleton, then that primitive is considered by UsdSkel to be a valid candidate for skinning. The utility method UsdSkelIsSkinnablePrim can be used to test this rule.

When clients iterate over the set of skinnable primitives – for example, when traversing the skinning targets of a UsdSkelBinding – it is the client's responsibility to check prim types to determine whether or not they know how to skin each candidate primitive.

Transforms and Transform Spaces

There are a few special transform spaces and transforms that will frequently be referenced:

Joint-Local Space:

The local transform space of each joint.
Skeleton Space:

The object space of a Skeleton. This space does not contain the transformation that positions the root of the Skeleton in the world, and therefore serves as the canonical space in which to apply joint animations.

A Skeleton is transformed into world space using the world space transform of the Skeleton primitive.
Geom Bind Transform:

The transform that positions a skinned primitive in its binding state in world space.
Skinning Transform:

The transform that describes a joint's change in transformation from its world space bind pose, to its animated, skeleton space pose.

I.e., inv(jointWorldSpaceBindTransform)*jointSkelSpaceTransform.

For the sake of planned animation blending capabilities, UsdSkel stores all joint transforms in Joint Local Space.

To clarify the description of these spaces, a Skeleton Space joint transform is computed from a Joint Local transform as follows:

jointSkelSpaceTransform = jointLocalSpaceTransform * parentJointSkelSpaceTransform

Where the parentJointSkelSpaceTransform is an identity matrix for root joints, or jointSkelSpaceTransform of the parent joint otherwise.

The World Space transform of a joint is computed as:

jointWorldSpaceTransform = jointLocalSpaceTransform *

parentJointSkelSpaceTransform * skelLocalToWorldTransform

Where skelLocalToWorldTransform is the world space transform of the Skeleton primtiive.

Skinning a Point (Linear Blend Skinning)

Using the terminology established above, in linear blend skinning, the Skeleton Space position of a skinned point is computed as:

skelSpacePoint = geomBindTransform.Transform(localSpacePoint)
p = (0,0,0)
for jointIndex,jointWeight in jointInfluencesForPoint:
    p += skinningTransforms[jointIndex].Transform(skelSpacePoint)*jointWeight

Where localSpacePoint is a point of a skinned primitive, given in the local space of that primitive.