USD's composition arcs allow timeSampled animation to be assembled from a variety of sources into a single composition. However, because stage composition must not (for scalability) take time into account when "indexing" layers, the value resolution behavior we are able to provide for layers reached through composition arcs stipulates that the first (strongest) layer that contains any timeSample for an attribute is the source of all timeSamples for the attribute. For many uses of USD this is sufficient, and additionally flexible because each Reference and SubLayer can specify a constant time offset and scale to be applied to the referenced or sublayered timeSamples. However, sometimes more flexibility is required!
The USD Value Clips feature allows users to decompose time-varying data across many layers that can then be sequenced and re-sequenced back together in flexible ways. This feature is purely a value-resolution-level feature, not a composition-level feature. Value clips allow users to retime sequences in various ways, so a set of value clips can be reused in different scenarios with only the sequencing metadata changing. At Pixar, we have found value clips useful for efficiently animating medium to large crowds, and for representing very large, simulated effects. For more detail on these use cases, see the glossary entry for Value Clips.
At a very high level, value clips consume special metadata on a prim that indicates which clip layers to draw values from, when each clip is active, and how stage time maps to time within each clip.
Before going further, let's establish some terminology:
A "clip set" is a named group of value clips. The set of value clips along with sequencing and timing information and other value resolution behaviors are specified in the clip set's definition metadata.
In this example, the prim "Prim" has two clip sets, "clip_set_1" and "clip_set_2", each with a different definition:
The clip set definitions are stored in a dictionary-valued metadata field named "clips", which is composed according to the rules in Dictionary-valued Metadata. This allows users to define clip sets in various layers and have them compose together, or sparsely override metadata in clip sets non-destructively.
Users can specify the clip set to author to when using the UsdClipsAPI schema to author clip metadata. If no clip set is specified, UsdClipsAPI will author to a clip set named "default".
Clip sets authored on multiple prims are ordered by distance from the attribute. Clip sets authored on an attribute's owning prim are strongest, followed by those authored on the owning prim's parent, and so on.
Clip sets authored on a single prim are ordered lexicographically by name. However, users can control the strength ordering or even remove a clip set from consideration by specifying the ordering/membership in the clipSets list-op metadata field via UsdClipsAPI::SetClipSets.
Clip sets may be defined using one of two possible forms: template and explicit metadata. Explicit metadata encodes the exact assets and sequence timings. Template metadata, on the other hand, authors a regex-style asset path template, and infers the explicit metadata when a UsdStage is opened. Template metadata is strictly less powerful than explicit metadata (it can't achieve behaviors such as looping, reversing, or holding clip data), but provides an extremely compact and easy-to-debug encoding for situations in which animation is broken up into a large number of regularly named files. Regardless of which form a value clip application takes, there is also a set of "universal" metadata common to both.
USD provides schema level support for authoring this metadata via UsdClipsAPI. This gives a typesafe way to interact with the relevant metadata as well as various helper functions.
If a clip set is defined using template clip metadata, USD will use that data to derive the explicit clip metadata with the following logic:
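As an illustration of the kind of derivation involved, here is a minimal pure-Python sketch. It is not USD's implementation. It assumes the template contains a single run of `#` characters standing for a zero-padded integer time, that one clip is produced per stride step from a start time to an end time, and that each clip's stage time maps to the same clip time. The function name `expand_template` and its parameters are hypothetical; sub-integer templates and any active-time offset are not modelled.

```python
def expand_template(template, start, end, stride=1):
    """Derive explicit (assetPaths, active, times) lists from template fields.

    Hypothetical helper: assumes one contiguous run of '#' in the template,
    replaced by the zero-padded integer time of each clip.
    """
    width = template.count("#")
    placeholder = "#" * width
    if width == 0 or placeholder not in template:
        raise ValueError("expected exactly one contiguous run of '#'")

    asset_paths, active, times = [], [], []
    t = start
    while t <= end:
        index = len(asset_paths)
        asset_paths.append(template.replace(placeholder, str(t).zfill(width)))
        active.append((t, index))  # clip 'index' becomes active at stage time t
        times.append((t, t))       # assumed identity mapping: stage time t -> clip time t
        t += stride
    return asset_paths, active, times


paths, active, times = expand_template("./clip.###.usd", 101, 103)
print(paths)
print(active)
print(times)
```

The derived lists have the same shape as explicit metadata, so the rest of value resolution does not need to know which form was authored.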
The entries in the active metadata determine when a particular clip is active. Value resolution will retrieve values from the active clip at a given time.
A (stageTime, assetIndex) entry indicates that the clip in the assetPaths metadata at position assetIndex is active from time stageTime up to the stageTime of the next entry in the list. As special cases, the first clip in the active metadata is also considered active for all earlier times, and the last clip is considered active for all later times.
For example, given:
Clip "foo.usd" is considered active in the time range (-inf, 105), "bar.usd" is active in the time range [105, 110), and "baz.usd" is active in the time range [110, +inf).
Conceptually, the (stageTime, clipTime) entries in the times metadata define a timing curve that specifies the time in the active clip to retrieve samples from when requesting an attribute's value at a given stage time. This timing curve is made up of linear segments whose endpoints are specified by the entries in times, sorted by stageTime (see Ordering).
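The lookup rule above can be modelled in a few lines of standard-library Python. This is a sketch of the described behavior, not USD code. The function name `active_asset` is hypothetical, and the first stage time (101) in the sample data is an assumed value, since only the boundaries 105 and 110 are stated in the text.

```python
from bisect import bisect_right


def active_asset(active, asset_paths, stage_time):
    """Return the asset path of the clip that is active at stage_time."""
    entries = sorted(active, key=lambda entry: entry[0])
    start_times = [start for start, _ in entries]
    # Last entry whose stageTime is <= the query time.
    i = bisect_right(start_times, stage_time) - 1
    # Special case from the text: the first clip is also active for all earlier times.
    i = max(i, 0)
    return asset_paths[int(entries[i][1])]


asset_paths = ["foo.usd", "bar.usd", "baz.usd"]
active = [(101, 0), (105, 1), (110, 2)]  # 101 is an assumed start time

for t in (50, 104.5, 105, 109, 110, 500):
    print(t, active_asset(active, asset_paths, t))
```

The last clip being active for all later times falls out of the search naturally, because no later entry exists to take over.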
For example, given this times metadata:
When an attribute value at time 0 is requested, UsdStage will retrieve the time sample value authored at time 5 in the active clip, and at time 10 UsdStage will ask for the value authored at time 15. As mentioned above, these entries are the endpoints for a linear segment in the timing curve, so times between these entries will be linearly interpolated. For example, requesting an attribute value at time 3 will cause UsdStage to ask for the value authored at time 8 in the active clip.
The times metadata can be used to offset and scale animation from clips, providing flexibility in how they are applied to the stage.
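The mapping in the example above can be checked with a small Python sketch. It assumes the times entries are (0, 5) and (10, 15), as implied by the text, and that no stage time is repeated. The function name `stage_to_clip_time` is hypothetical, and behavior outside the authored range is not modelled here.

```python
def stage_to_clip_time(times, stage_time):
    """Map a stage time to a clip time along a piecewise-linear timing curve."""
    points = sorted(times)
    for (s0, c0), (s1, c1) in zip(points, points[1:]):
        if s0 <= stage_time <= s1:
            # Linear interpolation between the two segment endpoints.
            return c0 + (c1 - c0) * (stage_time - s0) / (s1 - s0)
    raise ValueError("stage time lies outside the authored timing curve")


times = [(0, 5), (10, 15)]
print(stage_to_clip_time(times, 0))   # 5.0
print(stage_to_clip_time(times, 3))   # 8.0
print(stage_to_clip_time(times, 10))  # 15.0
```

In this model, a constant offset shows up as segments with slope 1, and a segment whose clip-time span differs from its stage-time span scales the animation.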
Jump discontinuities in the timing curve can be represented in the times metadata by authoring two entries with the same stage time, but different clip times. The clip time in the left-most entry is used for time mappings up to the specified stage time, while the clip time in the right-most entry is used for time mappings at that stage time and afterwards.
For example, let's say you had two clips and wanted to use animation from times 0 to 10 in the first clip followed by times 25 to 35 in the second clip. This could be specified with the active and times metadata like this:
A jump discontinuity has been specified at stage time 10. For times in the range [0, 10), UsdStage will retrieve values from the first clip at times [0, 10). For times in the range [10, 20], UsdStage will retrieve values from the second clip at times [25, 35].
See Looping for a common use-case for this functionality.
A given stageTime may appear at most twice in the times metadata. In the typical case, a stageTime will only appear once; the only time it may appear twice is to specify a jump discontinuity (see Jump Discontinuities).
USD will perform a stable sort of the times metadata by stageTime to establish the timing curve described above. This means the order of the entries authored in times does not matter, except for jump discontinuities: the left-most entry with a given stageTime represents the left side of the discontinuity and the right-most entry represents the right side.
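The ordering rule and the jump-discontinuity rule can be sketched together in Python. This is a model of the described behavior rather than USD's code. The times list is reconstructed from the two-clip example in the text (clip times 0 to 10, then 25 to 35), and the function name `stage_to_clip_time` is hypothetical. Python's `sorted` is stable, which mirrors the stable sort described above.

```python
from collections import Counter


def stage_to_clip_time(times, stage_time):
    """Evaluate a timing curve that may contain jump discontinuities."""
    # Stable sort: entries sharing a stageTime keep their authored order.
    points = sorted(times, key=lambda entry: entry[0])
    if any(n > 2 for n in Counter(s for s, _ in points).values()):
        raise ValueError("a stageTime may appear at most twice")

    # Scan segments from the right so that, exactly at a jump, the
    # right-most entry supplies the clip time.
    for i in range(len(points) - 1, 0, -1):
        (s0, c0), (s1, c1) = points[i - 1], points[i]
        if s0 == s1:
            continue  # a repeated stageTime marks the jump itself
        if s0 <= stage_time <= s1:
            return c0 + (c1 - c0) * (stage_time - s0) / (s1 - s0)
    raise ValueError("stage time lies outside the authored timing curve")


times = [(0, 0), (10, 10), (10, 25), (20, 35)]
print(stage_to_clip_time(times, 9.5))  # just before the jump
print(stage_to_clip_time(times, 10))   # at the jump: right-hand side
print(stage_to_clip_time(times, 20))
```

Shuffling the entries leaves the result unchanged as long as the two entries at stage time 10 keep their relative order; swapping those two would swap the sides of the discontinuity.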
The clip manifest is a layer that declares the attributes that have time samples in the value clips for the associated clip set. This serves as an index that allows value resolution to determine whether an attribute has time samples in a clip set without having to examine every value clip.
If a clip set's value clips contain data for an attribute, that attribute must be declared in the manifest. Otherwise, that data will be ignored.
Each clip set has one manifest which may be specified via the manifestAssetPath metadata. If no manifest is specified, USD will generate one automatically at runtime. See Generating a Manifest for more details.
In its simplest form, the clip manifest just contains declarations for attributes. For example,
Clip 1:

    #usda 1.0
    def "Model"
    {
        double attr.timeSamples = {
            0: 100
        }
        def "Child"
        {
            double childAttr.timeSamples = {
                0: 200
            }
        }
    }

Clip 2:

    #usda 1.0
    def "Model"
    {
        double attr.timeSamples = {
            1: 200
        }
        def "Child"
        {
            double childAttr.timeSamples = {
                1: 300
            }
        }
    }

Manifest:

    #usda 1.0
    def "Model"
    {
        double attr
        def "Child"
        {
            double childAttr
        }
    }

As with value clips, any metadata, relationships, and composition arcs in the manifest are ignored. Attributes in the manifest may have default values or time samples containing value blocks. See Value Resolution Semantics for how these values may be used.
The Usd and Sdf authoring APIs can be used to manually create a manifest. For convenience, clients can use UsdClipsAPI::GenerateClipManifest or UsdClipsAPI::GenerateClipManifestFromLayers to generate a manifest from a given clip set or set of clip layers.
If a clip set does not have a manifest specified, USD will automatically generate a manifest at runtime from the value clips in the clip set using the methods described above. This is convenient but imposes the extra cost of opening and traversing every clip layer. To avoid this cost, you can use the UsdClipsAPI methods above to generate a clip manifest, save it out, and then set that as the clip set's manifest via UsdClipsAPI::SetManifestAssetPath.
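Conceptually, generating a manifest amounts to collecting every attribute that has time samples in any clip. The Python sketch below models that idea with plain dictionaries. It is not the UsdClipsAPI implementation, and the function name `generate_manifest` and the dictionary layout are hypothetical.

```python
def generate_manifest(clips):
    """Return the sorted attribute paths that have time samples in any clip.

    Each clip is modelled as a dict: attribute path -> {time: value}.
    """
    declared = set()
    for clip in clips:
        for attr_path, samples in clip.items():
            if samples:  # only attributes with authored time samples are declared
                declared.add(attr_path)
    return sorted(declared)


clip1 = {"/Model.attr": {0: 100}, "/Model/Child.childAttr": {0: 200}}
clip2 = {"/Model.attr": {1: 200}, "/Model/Child.childAttr": {1: 300}}

manifest = generate_manifest([clip1, clip2])
print(manifest)
```

The cost of the automatic runtime path is visible in the loop: every clip must be visited, which is why saving the result once and pointing the clip set at it avoids repeated work.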
A clip set may provide values for attributes on the prim on which the clip set is defined and any attributes on descendants of that prim. It is important to note that value clips do not define attributes on a UsdStage, they just provide values. If a clip set has values for an attribute but that attribute is not defined on the UsdStage (for example, the attribute is not a built-in attribute of a schema), the clip set will not cause the attribute to come into existence.
The strength of data in a set of value clips is based on the anchor point. The clip data is just weaker than the "Local" (L in LIVRPS) data of the anchoring layer. Clip data can be overridden by adding overrides to a stronger layer or in a local opinion, just as for any other kind of data.
During attribute value resolution, if clip sets are defined on the attribute's owning prim or any ancestors, USD will do the following:
A clip set has "gaps" if some of the value clips in the set do not contain authored time samples for an attribute that has been declared in the manifest.
By default, if a value clip does not contain time samples for an attribute, a time sample at the clip's active time will be generated using the default value for the attribute authored in the clip manifest. If no default value has been authored, the generated time sample will be a value block.
In the example below, the value for /TestModel.a at time 2 will be 10.0 since clip2.usd does not have time samples for this attribute and 10.0 is the default value authored in the manifest. The value for /TestModel.b at time 2 will be a value block, since no default is authored in the manifest.
clip1.usd:

    #usda 1.0
    def "Model"
    {
        double a.timeSamples = {
            1: 1
        }
        double b.timeSamples = {
            1: 1
        }
    }

clip2.usd:

    #usda 1.0
    def "Model"
    {
    }

clip3.usd:

    #usda 1.0
    def "Model"
    {
        double a.timeSamples = {
            3: 3
        }
        double b.timeSamples = {
            3: 3
        }
    }

manifest.usd:

    #usda 1.0
    def "Model"
    {
        double a = 10.0
        double b
    }

stage.usd:

    #usda 1.0
    def "TestModel" (
        clips = {
            dictionary default = {
                double2[] active = [(1, 0), (2, 1), (3, 2)]
                asset[] assetPaths = [@./clip1.usd@, @./clip2.usd@, @./clip3.usd@]
                asset manifestAssetPath = @./manifest.usd@
                string primPath = "/Model"
            }
        }
    )
    {
        double a
        double b
    }

The above behavior allows USD to avoid opening an arbitrary number of clips if a gap is encountered in the clip set and can be useful in some situations. For example, see Animated Visibility. However, in these cases USD can also optionally interpolate values based on the surrounding clips. This makes value clips behave like time samples split up into different files, which is more intuitive but comes at a performance cost.
This feature can be enabled for a clip set by setting interpolateMissingClipValues to true in a clip set definition. When enabled, if a query is made at a time when the clip set has a gap, and the attribute does not have a default value specified in the manifest, USD will search forward and backwards from the active clip at that time to find the nearest clips that contain authored time sample values. The final value will be interpolated from these time samples.
Note that in the pessimal case, this may wind up opening and querying all clips in the set. To accelerate this search, users can author time sample blocks in the manifest at the active time for each clip that does not have time samples for a given attribute. Value resolution will use this information to determine what clips have time samples without actually opening the clips themselves.
In the example below, the value for /TestModel.a at time 2 will be 2.0, which is interpolated from the time sample in clip1.usd at time 1 and the time sample in clip4.usd at time 4. Similarly, the value at time 3 will be 3.0. If interpolateMissingClipValues were not set to true, these values would be value blocks instead.
clip1.usd:

    #usda 1.0
    def "Model"
    {
        double a.timeSamples = {
            1: 1
        }
    }

clip2.usd and clip3.usd (identical contents):

    #usda 1.0
    def "Model"
    {
    }

clip4.usd:

    #usda 1.0
    def "Model"
    {
        double a.timeSamples = {
            4: 4
        }
    }

manifest.usd:

    #usda 1.0
    def "Model"
    {
        double a.timeSamples = {
            2: None,
            3: None
        }
    }

stage.usd:

    #usda 1.0
    def "TestModel" (
        clips = {
            dictionary default = {
                double2[] active = [(1, 0), (2, 1), (3, 2), (4, 3)]
                asset[] assetPaths = [@./clip1.usd@, @./clip2.usd@, @./clip3.usd@, @./clip4.usd@]
                asset manifestAssetPath = @./manifest.usd@
                string primPath = "/Model"
                bool interpolateMissingClipValues = true
            }
        }
    )
    {
        double a
    }

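Both gap behaviors described in this section can be modelled in a short Python sketch. This is a simplified model, not USD's value resolution code. It handles one attribute, assumes stage time equals clip time, represents a value block as `None`, and stands in for ordinary in-clip interpolation by taking the nearest authored sample. The case where only one neighboring clip has samples is not described in the text, so the sketch leaves it unresolved. The function name `resolve` and its parameters are hypothetical.

```python
from bisect import bisect_right


def resolve(stage_time, active, clips, manifest_default=None,
            interpolate_missing=False):
    """Resolve one attribute; clips[i] maps clip time -> value ({} is a gap)."""
    entries = sorted(active, key=lambda entry: entry[0])
    i = max(bisect_right([s for s, _ in entries], stage_time) - 1, 0)
    samples = clips[entries[i][1]]
    if samples:
        # Simplification: nearest sample instead of in-clip interpolation.
        nearest = min(samples, key=lambda t: abs(t - stage_time))
        return samples[nearest]
    if manifest_default is not None:
        return manifest_default  # the manifest default fills the gap
    if not interpolate_missing:
        return None              # value block

    def nearest_samples(positions):
        for p in positions:
            found = clips[entries[p][1]]
            if found:
                return found
        return None

    before = nearest_samples(range(i - 1, -1, -1))
    after = nearest_samples(range(i + 1, len(entries)))
    if before is None or after is None:
        return None  # assumption: case not covered by the text
    t0, t1 = max(before), min(after)
    return before[t0] + (after[t1] - before[t0]) * (stage_time - t0) / (t1 - t0)


# First example: the manifest default 10.0 fills the gap for 'a' at time 2.
print(resolve(2, [(1, 0), (2, 1), (3, 2)], [{1: 1.0}, {}, {3: 3.0}],
              manifest_default=10.0))
# Second example: interpolation across the gap left by clip2 and clip3.
print(resolve(2, [(1, 0), (2, 1), (3, 2), (4, 3)],
              [{1: 1.0}, {}, {}, {4: 4.0}], interpolate_missing=True))
```

The search loops make the worst case explicit: with no hints, every clip on either side of the gap may need to be inspected.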
Layer offsets affect value clips in the following ways:
The flexibility and reuse of animated data that clips provide come with some performance characteristics with which pipeline builders may want to be familiar.
In Pixar's use of clips, it is not uncommon for a single UsdStage to consume thousands to tens of thousands of clip layers. If the act of opening a stage were to discover and open all of the layers consumed by clips, it would, in these cases, add considerable time and memory to the cost. Further, many clients of the stage (such as a single-frame render) only require data from a small time range, which generally translates to a small fraction of the total number of clip layers. Therefore, clip layers are opened lazily, only when value resolution must interrogate a particular clip. Because USD supports value resolution in multiple threads concurrently, resolving attributes affected by clips may require acquiring a lock that is unnecessary during "normal" value resolution, so there is some performance penalty.
Further, the broader the time interval over which an application extracts attribute values, the more layers that will be opened and cached (until the stage is closed). We deem this an acceptable cost since it is in keeping with our general principle of paying for what you use. The alternative would be adding a more sophisticated caching strategy to clip-layer retention that limits the number of cached layers; however, since the most memory-conscious clients (renderers) are generally unaffected, and the applications that do want to stream through time generally prioritize highest performance over memory consumption, we are satisfied with the caching strategy for now.
Flattening a UsdStage with value clips will merge the appropriate time samples from the value clips into the time samples on the attribute on the flattened stage and remove the clip set definitions. Querying for time samples and values on the flattened stage should always give the same result as on the unflattened stage.
The usdstitchclips utility will generate a stage that uses value clips to stitch together the time samples in a given set of clip layers. This utility will generate the necessary clip set definitions (using either explicit or template metadata) and also generate a topology layer defining the attributes and a manifest layer.
For example, given a directory containing three clip files clip.101.usd, clip.102.usd and clip.103.usd:
Will generate the following result.usda:
and the following result.topology.usd:
For generating template metadata:
Will generate the following result.usda:
The UsdUtils library contains several utility functions for stitching together multiple layers using value clips in usdUtils/stitchClips.h.
A common use case is to loop over animation authored in a clip or set of clips. For example, at Pixar clips containing a handful of frames of keep-alive animation are applied to background characters with looping so they remain in motion throughout an entire shot.
Looping can be specified using the times metadata and jump discontinuities (see Stage Times and Clip Times). This example shows 25 frames of animation from a clip being looped on the UsdStage from time 0 to 25, then 25 to 50.
For proper looping, we want the UsdStage to retrieve animation at all times in the range [0, 25) from the clip at times [0, 25). However, at exactly time 25 we want the UsdStage to jump back to using animation in the clip at time 0. This is represented by a jump discontinuity at time 25. (See Jump Discontinuities for more details.)
Note that we were able to achieve this solely through the metadata. No additional asset loads or restructuring needed to happen.
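The times entries for a loop like this follow a regular pattern, which the Python sketch below generates. The helper name `looping_times` and its parameters are hypothetical, and the output is the list of (stageTime, clipTime) pairs one would author in the times metadata under the looping scheme described above.

```python
def looping_times(clip_start, clip_end, stage_start, loop_count):
    """Build (stageTime, clipTime) entries that replay a clip range loop_count times."""
    duration = clip_end - clip_start
    times = []
    for n in range(loop_count):
        loop_begin = stage_start + n * duration
        # Each loop contributes one linear segment. The end of one loop and
        # the start of the next share a stage time, forming a jump discontinuity.
        times.append((loop_begin, clip_start))
        times.append((loop_begin + duration, clip_end))
    return times


print(looping_times(0, 25, 0, 2))
# [(0, 0), (25, 25), (25, 0), (50, 25)]
```

Each interior stage time appears exactly twice, once closing a loop and once opening the next, so the result stays within the limit of two entries per stage time.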
Value clips are used at Pixar to stitch together the results of simulators or procedural generation tools like Houdini that are run in parallel for each frame of a shot or effect. There are often situations where geometry data (e.g. points) are generated for some of the frames but not others. In these cases we would like to set the "visibility" attribute to "invisible" at the times where no geometry was generated. However, since the processes for these times didn't generate any geometry and are being run independently from the other times, they don't know where to author the "visibility" attribute to achieve this.
To solve this, we use the fact that value clips will use the default value authored in the manifest if no value is authored in a clip. (See Missing Values in Clip Set). In the manifest, we author a default value of "invisible" for the "visibility" attribute. Then, if the processes that generate the value clips write out any geometry data, they also write out a time sample for the "visibility" attribute making the prim visible at that time. If they do not write out geometry, they don't write out the "visibility" attribute. This makes value resolution use the "invisible" value for "visibility" at times when the clips have no geometry.