Overview

USD's composition arcs allow timeSampled animation to be assembled from a variety of sources into a single composition. However, because stage composition must not (for scalability) take time into account when "indexing" layers, the value resolution behavior we are able to provide for layers reached through composition arcs stipulates that the first (strongest) layer that contains any timeSample for an attribute is the source of all timeSamples for the attribute. For many uses of USD this is sufficient, and additionally flexible because each Reference and SubLayer can specify a constant time offset and scale to be applied to the referenced or sublayered timeSamples. However, sometimes more flexibility is required!

The USD Value Clips feature allows users to decompose time-varying data across many layers that can then be sequenced and re-sequenced back together in flexible ways. This is purely a value resolution-level feature, not a composition-level feature. Value clips allow users to retime sequences in various ways, so a set of value clips can be reused in different scenarios with only the sequencing metadata changing. At Pixar, we have found value clips useful for efficiently animating medium to large crowds, and for representing very large, simulated effects. For more detail on these use cases, see the glossary entry for Value Clips.

At a very high level, value clips consume special metadata on a prim, indicating:

the targeted "clip" layers (which are assets) to be sequenced
the intervals over which each clip is active
how "stage time" maps into each clip

Terminology

Before going further, let's establish some terminology:

Value Clip: An individual layer containing time varying data over some interval. All metadata, relationships, and default values present in a layer are ignored when the layer is consumed as a Value Clip.
Clip Set: A named set of value clips defined by a group of clip metadata. A prim may have multiple clip sets.
Clip Metadata: A set of prim-level metadata which control USD's value resolution semantics.
Clip Manifest: An individual layer that declares the attributes that have time samples in a clip set's value clips. An attribute must be declared in the manifest in order for value clips to be considered when resolving values for that attribute.
Anchor Point: The strongest layer in which either assetPaths or templateAssetPath is authored for a given clip set. This determines the strength of clips with respect to value resolution. See Value Resolution Semantics for details.

Clip Sets

A "clip set" is a named group of value clips. The set of value clips along with sequencing and timing information and other value resolution behaviors are specified in the clip set's definition metadata.

In this example, the prim "Prim" has two clip sets, "clip_set_1" and "clip_set_2", each with a different definition:

#usda 1.0
 
def "Prim" (
    clips = {
        dictionary clip_set_1 = {
            double2[] active = [(101, 0), (102, 1), (103, 2)] 
            asset[] assetPaths = [@./clip1.usda@, @./clip2.usda@, @./clip3.usda@]
            asset manifestAssetPath = @./clipset1.manifest.usda@
            string primPath = "/ClipSet1"
            double2[] times = [(101, 101), (102, 102), (103, 103)]
        }
        dictionary clip_set_2 = {
            string templateAssetPath = "clipset2.#.usd"
            double templateStartTime = 101
            double templateEndTime = 103
            double templateStride = 1
            asset manifestAssetPath = @./clipset2.manifest.usda@
            string primPath = "/ClipSet2"
        }
    }
    clipSets = ["clip_set_2", "clip_set_1"]
)
{
}

The clip set definitions are stored in a dictionary-valued metadata field named "clips", which is composed according to the rules in Dictionary-valued Metadata. This allows users to define clip sets in various layers and have them compose together, or sparsely override metadata in clip sets non-destructively.
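
The element-wise composition of the "clips" dictionary can be pictured with a small sketch. This is a simplified model in plain Python, not USD's implementation: compose_dicts is a hypothetical helper, and the two dictionaries stand in for opinions authored in a stronger and a weaker layer.

```python
def compose_dicts(stronger, weaker):
    """Merge two metadata dictionaries; the stronger opinion wins per key.

    Nested dictionaries are merged recursively, so a stronger layer can
    override one field of a clip set without restating the others.
    """
    result = dict(weaker)
    for key, value in stronger.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = compose_dicts(value, result[key])
        else:
            result[key] = value
    return result


# Hypothetical opinions from two layers for the same prim.
weaker_layer = {
    "clip_set_1": {
        "assetPaths": ["./clip1.usda", "./clip2.usda"],
        "primPath": "/ClipSet1",
    }
}
stronger_layer = {
    "clip_set_1": {"primPath": "/Override"},
    "clip_set_2": {"templateAssetPath": "clipset2.#.usd"},
}

composed = compose_dicts(stronger_layer, weaker_layer)
```

Here the stronger layer sparsely overrides primPath in clip_set_1 and contributes clip_set_2, while the weaker layer's assetPaths survive untouched.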

Users can specify the clip set to author to when using the UsdClipsAPI schema to author clip metadata. If no clip set is specified, UsdClipsAPI will author to a clip set named "default".

Strength Ordering

Clip sets authored on multiple prims are ordered by distance from the attribute. Clip sets authored on an attribute's owning prim are strongest, followed by those authored on the owning prim's parent, and so on.

Clip sets authored on a single prim are ordered lexicographically by name. However, users can control the strength ordering or even remove a clip set from consideration by specifying the ordering/membership in the clipSets list-op metadata field via UsdClipsAPI::SetClipSets.
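
A rough sketch of the single-prim ordering rule follows. It assumes that the clipSets list-op has already been resolved to a plain list, that "lexicographic" means the name sorting first is strongest, and that names listed in clipSets but not defined are skipped; order_clip_sets is a hypothetical helper, not part of UsdClipsAPI.

```python
def order_clip_sets(defined_names, clip_sets=None):
    """Return the clip set names on one prim, strongest first.

    defined_names: names present in the prim's composed "clips" dictionary.
    clip_sets: the resolved clipSets list, or None if it was never authored.
    """
    if clip_sets is None:
        # Default: lexicographic order by name.
        return sorted(defined_names)
    # An authored list fixes both membership and order.
    return [name for name in clip_sets if name in defined_names]


defined = {"clip_set_1", "clip_set_2"}
default_order = order_clip_sets(defined)
authored_order = order_clip_sets(defined, ["clip_set_2", "clip_set_1"])
only_second = order_clip_sets(defined, ["clip_set_2"])
```

With the clipSets value from the earlier example, clip_set_2 is consulted before clip_set_1; listing only clip_set_2 removes clip_set_1 from consideration.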

Clip Set Definitions

Clip sets may be defined using one of two possible forms: template and explicit metadata. Explicit metadata encodes the exact assets and sequence timings. Template metadata, on the other hand, authors a regex-style asset path template, and infers the explicit metadata when a UsdStage is opened. Template metadata is strictly less powerful than explicit metadata (it can't achieve behaviors such as looping, reversing, or holding clip data), but provides an extremely compact and easy to debug encoding for situations in which animation is broken up into a large number of regularly named files. Regardless of which form a value clip application takes, there is also a set of "universal" metadata common to both.

Universal Clip Metadata
- primPath
  - A prim path (SdfPath) that will be substituted for the stage prim's path when querying data in the clips. For instance, let's say clip metadata is authored on prim '/Prim_1', and primPath is '/Prim'. If we want to get values for the attribute '/Prim_1.size', we will substitute '/Prim' for '/Prim_1', yielding '/Prim.size'. This is the path that will be used to look for values in each clip.
- manifestAssetPath
  - An asset path (SdfAssetPath) representing the path to a layer that contains an index of the attributes with time samples authored in the set of clips. See Clip Manifest for more details.
- interpolateMissingClipValues
  - A boolean flag indicating whether values for clips that do not have authored time samples for attributes in the manifest should be interpolated from surrounding clips. See Interpolating Missing Values in Clip Set for more details.
Explicit Clip Metadata
- assetPaths
  - An ordered list of asset paths to the clips holding time varying data.
- active
  - A list of pairs of the form (stageTime, assetIndex) representing when a particular clip in assetPaths is active and should be considered during value resolution. See Active Clips for more details.
- times
  - A list of pairs of the form (stageTime, clipTime) representing the mapping from stage time to clip time, for whichever clip is active at the given stage time. Note that every unique stageTime in this list will be in the list of time samples obtained by calling UsdAttribute::GetTimeSamples(). See Stage Times and Clip Times for more details.
Template Clip Metadata
- templateAssetPath
  - A regex-esque template string representing the form of our asset paths' names. This can be of two forms: 'path/basename.###.usd' and 'path/basename.###.###.usd'. These represent integer stage times and sub-integer stage times respectively. In both cases the number of hashes in each section is variable, and indicates to USD how much padding to apply when looking for asset paths. Note that USD is strict about the format of this string: there must be exactly one or two groups of hashes, and if there are two, they must be adjacent, separated by a dot.
- templateStartTime
  - The (double precision float) first number to substitute into our template asset path. For example, given 'path/basename.###.usd' as a template string, and 12 as a template start time, USD will populate the internal asset path list with 'path/basename.012.usd' as its first element, if it resolves to a valid identifier through the active ArResolver. If the template asset path represents integer frames and the start time has a fractional component, USD will truncate this to an integer.
- templateEndTime
  - The (double precision float) last number to substitute into our template string. If the template asset path represents integer frames and the end time has a fractional component, USD will truncate this to an integer.
- templateStride
  - A (double precision float) number indicating the stride at which USD will increment when looking for files to resolve. For example, given a start time of 12, an end time of 25, a template string of 'path/basename.#.usd', and a stride of 6, USD will look to resolve the following paths: 'path/basename.12.usd', 'path/basename.18.usd' and 'path/basename.24.usd'.
- templateActiveOffset
  - An optional (double precision float) number indicating the offset USD will use when calculating the clipActive value.
  - Given a start time of 101, an endTime of 103, a stride of 1, and an offset of 0.5, USD will generate the following:
    - times = [(100.5,100.5), (101,101), (102,102), (103,103), (103.5,103.5)]
    - active = [(101.5, 0), (102.5, 1), (103.5, 2)]
  - Note that USD generates two additional clip time 'knots' on the ends of the clipTime array. This allows users to query time samples outside the start/end range based on the absolute value of their offset.
  - Note that templateActiveOffset cannot exceed the absolute value of templateStride.

Warning: In the case where both explicit clip metadata and template clip metadata are authored, USD will prefer the explicit metadata for composition.

USD provides schema level support for authoring this metadata via UsdClipsAPI. This gives a typesafe way to interact with the relevant metadata as well as various helper functions.

Template Clip Metadata

If a clip set is defined using template clip metadata, USD will use that data to derive the explicit clip metadata with the following logic:

The set of explicit asset paths (assetPaths) is derived by taking the template pattern (templateAssetPath) and substituting times from templateStartTime to templateEndTime, incrementing by templateStride.
Once the set of relevant asset paths has been determined, the times and active metadata can be derived. For each time t specified in each derived assetPath, the time (t, t) will be authored; similarly, the active (t, n) will be authored, where n represents the index of the derived assetPath. If templateActiveOffset is specified, it will be applied to the times and active derivation as described in the previous section.
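
The derivation above can be sketched in plain Python. This is an illustrative model only: derive_explicit_metadata is a hypothetical name, only the integer-frame template form is handled, asset paths are not run through a resolver (so nothing is filtered out), and the offset handling simply mirrors the templateActiveOffset example given earlier.

```python
import re


def derive_explicit_metadata(template, start, end, stride, active_offset=None):
    """Derive (assetPaths, times, active) from integer-frame template metadata."""
    groups = list(re.finditer(r"#+", template))
    if len(groups) != 1:
        raise ValueError("this sketch handles exactly one group of '#'")
    if stride <= 0:
        raise ValueError("stride must be positive")
    match = groups[0]
    padding = len(match.group(0))

    # Step by index rather than accumulating, to avoid floating-point drift.
    frames = []
    n = 0
    while start + n * stride <= end:
        frames.append(start + n * stride)
        n += 1

    prefix, suffix = template[:match.start()], template[match.end():]
    asset_paths = [prefix + str(int(t)).zfill(padding) + suffix for t in frames]
    times = [(t, t) for t in frames]
    offset = active_offset or 0
    active = [(t + offset, index) for index, t in enumerate(frames)]

    if active_offset and frames:
        # Extra knots at both ends, as in the templateActiveOffset example.
        pad = abs(active_offset)
        times = [(start - pad, start - pad)] + times + [(end + pad, end + pad)]
    return asset_paths, times, active


stride_paths, _, _ = derive_explicit_metadata("path/basename.#.usd", 12, 25, 6)
_, offset_times, offset_active = derive_explicit_metadata("clip.#.usd", 101, 103, 1, 0.5)
```

The first call yields the three paths from the templateStride example; the second reproduces the times and active lists from the templateActiveOffset example.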

Active Clips

The entries in the active metadata determine when a particular clip is active. Value resolution will retrieve values from the active clip at a given time.

A (stageTime, assetIndex) entry indicates that the clip in the assetPaths metadata at position assetIndex is active from time stageTime up to the stageTime of the next entry in the list. As special cases, the first clip in the active metadata is also considered active for all earlier times, and the last clip is considered active for all later times.

For example, given:

asset[] assetPaths = [ @foo.usd@, @bar.usd@, @baz.usd@ ]

double2[] active = [ (101, 0), (105, 1), (110, 2) ]

Clip "foo.usd" is considered active in the time range (-inf, 105), "bar.usd" is active in the time range [105, 110), and "baz.usd" is active in the time range [110, +inf).
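
As a minimal sketch of this lookup, the function below finds the active clip for a stage time using a binary search. active_clip_index is a hypothetical helper written for illustration; it is not how USD implements the query.

```python
from bisect import bisect_right


def active_clip_index(active, stage_time):
    """Return the assetPaths index of the clip active at stage_time."""
    entries = sorted(active)  # order by stageTime
    start_times = [t for t, _ in entries]
    # Last entry whose start time is <= stage_time; clamp to 0 so the
    # first clip also covers all earlier times.
    position = max(bisect_right(start_times, stage_time) - 1, 0)
    return int(entries[position][1])


asset_paths = ["foo.usd", "bar.usd", "baz.usd"]
active = [(101, 0), (105, 1), (110, 2)]
clip_at_50 = asset_paths[active_clip_index(active, 50)]
clip_at_105 = asset_paths[active_clip_index(active, 105)]
clip_at_500 = asset_paths[active_clip_index(active, 500)]
```

Times before the first entry map to the first clip, and times at or after the last entry map to the last clip, matching the special cases described above.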

Stage Times and Clip Times

Conceptually, the (stageTime, clipTime) entries in the times metadata define a timing curve that specifies the time in the active clip to retrieve samples from when requesting an attribute's value at a given stage time. This timing curve is made up of linear segments whose endpoints are specified by the entries in times, sorted by stageTime. (See Ordering.)

For example, given this times metadata:

double2[] times = [(0, 5), (10, 15)]

When an attribute value at time 0 is requested, UsdStage will retrieve the time sample value authored at time 5 in the active clip, and at time 10 UsdStage will ask for the value authored at time 15. As mentioned above, these entries are the endpoints for a linear segment in the timing curve, so times between these entries will be linearly interpolated. For example, requesting an attribute value at time 3 will cause UsdStage to ask for the value authored at time 8 in the active clip.

The times metadata can be used to offset and scale animation from clips, providing flexibility in how they are applied to the stage.
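
The linear mapping can be written out as a short sketch. It assumes a timing curve with no repeated stage times and only answers queries inside the authored range; stage_to_clip_time is a hypothetical helper, and behavior outside the range is deliberately not modeled.

```python
def stage_to_clip_time(times, stage_time):
    """Map a stage time to a clip time along a piecewise-linear timing curve.

    Assumes `times` has no repeated stage times and that stage_time lies
    within the authored range; anything else raises ValueError.
    """
    knots = sorted(times)
    for (s0, c0), (s1, c1) in zip(knots, knots[1:]):
        if s0 <= stage_time <= s1:
            fraction = (stage_time - s0) / (s1 - s0)
            return c0 + fraction * (c1 - c0)
    raise ValueError("stage_time is outside the authored times range")


offset_curve = [(0, 5), (10, 15)]   # clip time = stage time + 5
scaled_curve = [(0, 0), (10, 20)]   # clip plays at double speed
mapped = stage_to_clip_time(offset_curve, 3)
```

The first curve offsets the animation by 5, so stage time 3 maps to clip time 8. The second curve scales it, so stage time 5 would map to clip time 10.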

Jump Discontinuities

Jump discontinuities in the timing curve can be represented in the times metadata by authoring two entries with the same stage time, but different clip times. The clip time in the left-most entry is used for time mappings up to the specified stage time, while the clip time in the right-most entry is used for time mappings at that stage time and afterwards.

For example, let's say you had two clips and wanted to use animation from times 0 to 10 in the first clip followed by times 25 to 35 in the second clip. This could be specified with the active and times metadata like this:

double2[] active = [(0, 0), (10, 1)]

double2[] times = [(0, 0), (10, 10), (10, 25), (20, 35)]

A jump discontinuity has been specified at stage time 10. For times in the range [0, 10), UsdStage will retrieve values from the first clip at times [0, 10). For times in the range [10, 20], UsdStage will retrieve values from the second clip at times [25, 35].

See Looping for a common use-case for this functionality.

Ordering

A given stageTime may appear at most twice in the times metadata. In the typical case, a stageTime will only appear once; the only time it may appear twice is to specify a jump discontinuity (see Jump Discontinuities).

USD will perform a stable sort of the times metadata by stageTime to establish the timing curve described above. This means the order of the entries authored in times does not matter, except for jump discontinuities: the left-most entry with a given stageTime represents the left side of the discontinuity and the right-most entry represents the right side.
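
The sorting rule and the handling of jump discontinuities can be sketched together. The code below is an illustrative model, not USD's implementation: map_stage_time is a hypothetical name, and queries outside the authored range raise an error rather than guessing at extrapolation behavior.

```python
def map_stage_time(times, stage_time):
    """Map stage time to clip time, honoring jump discontinuities."""
    # sorted() is stable, so entries that share a stageTime keep their
    # authored order: the left-most one stays on the left of the jump.
    knots = sorted(times, key=lambda entry: entry[0])
    if not knots or not knots[0][0] <= stage_time <= knots[-1][0]:
        raise ValueError("stage_time is outside the authored times range")

    # Right-most knot at or before stage_time. At a jump this selects the
    # right-hand entry, which governs that stage time and later ones.
    i = max(k for k, (s, _) in enumerate(knots) if s <= stage_time)
    s0, c0 = knots[i]
    if i == len(knots) - 1 or stage_time == s0:
        return c0
    s1, c1 = knots[i + 1]
    return c0 + (stage_time - s0) / (s1 - s0) * (c1 - c0)


# Authored out of order on purpose; only the relative order of the two
# entries at stage time 10 matters.
times = [(20, 35), (10, 10), (10, 25), (0, 0)]
before_jump = map_stage_time(times, 9.5)
at_jump = map_stage_time(times, 10)
after_jump = map_stage_time(times, 15)
```

Just before stage time 10 the curve is still on the first segment; at exactly 10 it has already jumped to clip time 25.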

Clip Manifest

The clip manifest is a layer that declares the attributes that have time samples in the value clips for the associated clip set. This serves as an index that allows value resolution to determine whether an attribute has time samples in a clip set without having to examine every value clip.

If a clip set's value clips contain data for an attribute, that attribute must be declared in the manifest. Otherwise, that data will be ignored.

Each clip set has one manifest which may be specified via the manifestAssetPath metadata. If no manifest is specified, USD will generate one automatically at runtime. See Generating a Manifest for more details.

What Data Is In a Manifest?

In its simplest form, the clip manifest just contains declarations for attributes. For example,

Clip 1:

#usda 1.0
def "Model"
{
    double attr.timeSamples = { 0: 100 }
    def "Child"
    {
        double childAttr.timeSamples = { 0: 200 }
    }
}

Clip 2:

#usda 1.0
def "Model"
{
    double attr.timeSamples = { 1: 200 }
    def "Child"
    {
        double childAttr.timeSamples = { 1: 300 }
    }
}

Manifest:

#usda 1.0
def "Model"
{
    double attr
    def "Child"
    {
        double childAttr
    }
}

Like value clips, metadata, relationships, and composition arcs in the manifest are ignored. Attributes in the manifest may have default values or time samples containing value blocks. See Value Resolution Semantics for how these values may be used.

Generating a Manifest

The Usd and Sdf authoring APIs can be used to manually create a manifest. For convenience, clients can use UsdClipsAPI::GenerateClipManifest or UsdClipsAPI::GenerateClipManifestFromLayers to generate a manifest from a given clip set or set of clip layers.

If a clip set does not have a manifest specified, USD will automatically generate a manifest at runtime from the value clips in the clip set using the methods described above. This is convenient but imposes the extra cost of opening and traversing every clip layer. To avoid this cost, you can use the UsdClipsAPI methods above to generate a clip manifest, save it out, and then set that as the clip set's manifest via UsdClipsAPI::SetManifestAssetPath.
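
Conceptually, generating a manifest amounts to collecting every attribute that has time samples in any clip. The sketch below models that idea with plain dictionaries standing in for clip layers; build_manifest_index is a hypothetical helper and does not call the UsdClipsAPI methods named above.

```python
def build_manifest_index(clips):
    """Collect every attribute path that has time samples in any clip.

    clips: iterable of dicts mapping attribute path -> {time: value}.
    """
    declared = set()
    for samples_by_attribute in clips:
        for attribute_path, samples in samples_by_attribute.items():
            if samples:  # only attributes with authored time samples
                declared.add(attribute_path)
    return declared


clip_1 = {"/Model.attr": {0: 100}, "/Model/Child.childAttr": {0: 200}}
clip_2 = {"/Model.attr": {1: 200}, "/Model/Child.childAttr": {1: 300}}
manifest_index = build_manifest_index([clip_1, clip_2])
```

Every clip has to be visited once to build the index, which is the cost a saved manifest avoids at stage-open time.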

Value Resolution Semantics

A clip set may provide values for attributes on the prim on which the clip set is defined and any attributes on descendants of that prim. It is important to note that value clips do not define attributes on a UsdStage, they just provide values. If a clip set has values for an attribute but that attribute is not defined on the UsdStage (for example, the attribute is not a built-in attribute of a schema), the clip set will not cause the attribute to come into existence.

The strength of data in a set of value clips is based on the anchor point. The clip data is just weaker than the "Local" (L in LIVRPS) data of the anchoring layer. Clip data can be overridden by adding overrides to a stronger layer or in a local opinion, just as for any other kind of data.

During attribute value resolution, if clip sets are defined on the attribute's owning prim or any ancestors, USD will do the following:

Determine the path we will consult within clip layers when looking for values. The path will be constructed using the attribute's path within the local LayerStack, with a prefix substitution based on the clipset's primPath metadata. Please note that this implies that composition arcs are ignored within clip files, i.e. all data must be recorded directly, not inside variants or across reference arcs.
Find the strongest clip set that has the attribute at the above path declared in the manifest. This involves looking at the clip sets authored on the attribute's owning prim as well as that prim's ancestors. See Strength Ordering for more details.
If no clip set is found, end clip value resolution and move to the next data source in the LIVRPS strength ordering.
Query the clip set for the attribute value at the specified time. This "external" time will be mapped to the "internal" time of the clip set using the times metadata. The active clip will be opened and queried using this "internal" time. See Active Clips and Stage Times and Clip Times for more details.
If an authored time sample at the "internal" time is found in the active clip, that is the final value. If there is no sample at that time, but there are other samples in the active time range of the clip, the final value will be interpolated from those samples. See Missing Values in Clip Set and Interpolating Missing Values in Clip Set for the behavior when the active clip does not have any authored time samples.
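
The first three steps can be sketched as follows. This is a simplified model under stated assumptions: paths are plain strings, each clip set is a dictionary with hypothetical keys ("name", "stagePrimPath", "primPath", "manifest"), and the list passed in is already ordered strongest first.

```python
def clip_attribute_path(attribute_path, stage_prim_path, clip_prim_path):
    """Substitute the clip set's primPath for the stage prim's path prefix."""
    remainder = attribute_path[len(stage_prim_path):]
    # The remainder must start a new path element, so "/Prim_1" does not
    # accidentally match "/Prim_10.size".
    if not attribute_path.startswith(stage_prim_path) or remainder[:1] not in (".", "/"):
        raise ValueError("attribute is not under the prim holding the clip set")
    return clip_prim_path + remainder


def find_strongest_clip_set(attribute_path, clip_sets):
    """Return (name, clip_path) for the first clip set declaring the attribute.

    Returns None when no manifest declares it, in which case resolution
    moves on to the next data source.
    """
    for clip_set in clip_sets:
        try:
            clip_path = clip_attribute_path(
                attribute_path, clip_set["stagePrimPath"], clip_set["primPath"])
        except ValueError:
            continue  # clip set is not on the owning prim or an ancestor
        if clip_path in clip_set["manifest"]:
            return clip_set["name"], clip_path
    return None


clip_sets = [
    {"name": "on_model", "stagePrimPath": "/World/Model",
     "primPath": "/Model", "manifest": {"/Model.other"}},
    {"name": "on_world", "stagePrimPath": "/World",
     "primPath": "/Root", "manifest": {"/Root/Model.attr"}},
]
found = find_strongest_clip_set("/World/Model.attr", clip_sets)
```

The clip set on the owning prim is checked first but does not declare the attribute, so the one authored on the parent prim is used.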

Missing Values in Clip Set

A clip set has "gaps" if some of the value clips in the set do not contain authored time samples for an attribute that has been declared in the manifest.

By default, if a value clip does not contain time samples for an attribute, a time sample at the clip's active time will be generated using the default value for the attribute authored in the clip manifest. If no default value has been authored, the generated time sample will be a value block.

In the example below, the value for /TestModel.a at time 2 will be 10.0 since clip2.usd does not have time samples for this attribute and 10.0 is the default value authored in the manifest. The value for /TestModel.b at time 2 will be a value block, since no default is authored in the manifest.

clip1.usd:

#usda 1.0
def "Model"
{
    double a.timeSamples = { 1: 1 }
    double b.timeSamples = { 1: 1 }
}

clip2.usd:

#usda 1.0
def "Model"
{
}

clip3.usd:

#usda 1.0
def "Model"
{
    double a.timeSamples = { 3: 3 }
    double b.timeSamples = { 3: 3 }
}

manifest.usd:

#usda 1.0
def "Model"
{
    double a = 10.0
    double b
}

stage.usd:

#usda 1.0
def "TestModel" (
    clips = {
        dictionary default = {
            double2[] active = [(1, 0), (2, 1), (3, 2)]
            asset[] assetPaths = [@./clip1.usd@, @./clip2.usd@, @./clip3.usd@]
            asset manifestAssetPath = @./manifest.usd@
            string primPath = "/Model"
        }
    }
)
{
    double a
    double b
}
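
The default behavior in this example can be modeled with a few lines of Python. The sketch makes several simplifying assumptions: dictionaries stand in for the clip and manifest layers, paths are given in clip namespace ("/Model"), None stands in for a value block, stage time is used directly as clip time because no times metadata is authored, and a clip that does have samples simply returns its nearest sample. resolve is a hypothetical helper.

```python
from bisect import bisect_right

CLIPS = [
    {"/Model.a": {1: 1.0}, "/Model.b": {1: 1.0}},  # clip1.usd
    {},                                            # clip2.usd (no samples)
    {"/Model.a": {3: 3.0}, "/Model.b": {3: 3.0}},  # clip3.usd
]
MANIFEST_DEFAULTS = {"/Model.a": 10.0, "/Model.b": None}  # None: no default
ACTIVE = [(1, 0), (2, 1), (3, 2)]


def resolve(attribute_path, stage_time):
    """Value from the active clip, or the manifest default if it has a gap."""
    starts = [t for t, _ in ACTIVE]
    index = ACTIVE[max(bisect_right(starts, stage_time) - 1, 0)][1]
    samples = CLIPS[index].get(attribute_path)
    if not samples:
        # Gap: fall back to the manifest default (None models a value block).
        return MANIFEST_DEFAULTS.get(attribute_path)
    # Interpolation within a clip is out of scope here; use the nearest sample.
    nearest = min(samples, key=lambda t: abs(t - stage_time))
    return samples[nearest]


a_at_2 = resolve("/Model.a", 2)
b_at_2 = resolve("/Model.b", 2)
```

At time 2 the active clip is clip2.usd, which has no samples, so attribute a takes the manifest default and attribute b resolves to a block.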

Interpolating Missing Values in Clip Set

The above behavior allows USD to avoid opening an arbitrary number of clips if a gap is encountered in the clip set and can be useful in some situations. For example, see Animated Visibility. However, in these cases USD can also optionally interpolate values based on the surrounding clips. This makes value clips behave like time samples split up into different files, which is more intuitive but comes at a performance cost.

This feature can be enabled for a clip set by setting interpolateMissingClipValues to true in a clip set definition. When enabled, if a query is made at a time when the clip set has a gap, and the attribute does not have a default value specified in the manifest, USD will search forward and backwards from the active clip at that time to find the nearest clips that contain authored time sample values. The final value will be interpolated from these time samples.

Note that in the pessimal case, this may wind up opening and querying all clips in the set. To accelerate this search, users can author time sample blocks in the manifest at the active time for each clip that does not have time samples for a given attribute. Value resolution will use this information to determine what clips have time samples without actually opening the clips themselves.

In the example below, the value for /TestModel.a at time 2 will be 2.0, which is interpolated from the time sample in clip1.usd at time 1 and the time sample in clip4.usd at time 4. Similarly, the value at time 3 will be 3.0. If interpolateMissingClipValues were not set to true, these values would be a value block instead.

clip1.usd:

#usda 1.0
def "Model"
{
    double a.timeSamples = { 1: 1 }
}

clip2.usd and clip3.usd:

#usda 1.0
def "Model"
{
}

clip4.usd:

#usda 1.0
def "Model"
{
    double a.timeSamples = { 4: 4 }
}

manifest.usd:

#usda 1.0
def "Model"
{
    double a.timeSamples = { 2: None, 3: None }
}

stage.usd:

#usda 1.0
def "TestModel" (
    clips = {
        dictionary default = {
            double2[] active = [(1, 0), (2, 1), (3, 2), (4, 3)]
            asset[] assetPaths = [@./clip1.usd@, @./clip2.usd@, @./clip3.usd@, @./clip4.usd@]
            asset manifestAssetPath = @./manifest.usd@
            string primPath = "/Model"
            bool interpolateMissingClipValues = true
        }
    }
)
{
    double a
}
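
The outward search and interpolation can be sketched as below. This is an illustrative model only: dictionaries stand in for the clip layers, stage time is used directly as clip time, the manifest's time sample blocks are not consulted as an accelerator, and interpolate_across_gap is a hypothetical helper.

```python
CLIPS = [
    {"/Model.a": {1: 1.0}},  # clip1.usd
    {},                      # clip2.usd
    {},                      # clip3.usd
    {"/Model.a": {4: 4.0}},  # clip4.usd
]
ACTIVE = [(1, 0), (2, 1), (3, 2), (4, 3)]


def interpolate_across_gap(attribute_path, stage_time):
    """Interpolate from the nearest clips that do have samples."""
    position = max(
        (i for i, (start, _) in enumerate(ACTIVE) if start <= stage_time),
        default=0,
    )

    def nearest_sample(positions, pick):
        # Walk outward until a clip with samples is found.
        for p in positions:
            samples = CLIPS[ACTIVE[p][1]].get(attribute_path)
            if samples:
                t = pick(samples)
                return t, samples[t]
        return None

    before = nearest_sample(range(position, -1, -1), max)
    after = nearest_sample(range(position + 1, len(ACTIVE)), min)
    if before is None or after is None:
        found = before or after
        return found[1] if found else None
    (t0, v0), (t1, v1) = before, after
    return v0 + (stage_time - t0) / (t1 - t0) * (v1 - v0)


a_at_2 = interpolate_across_gap("/Model.a", 2)
a_at_3 = interpolate_across_gap("/Model.a", 3)
```

In the worst case both loops visit every clip in the set, which is the performance cost noted above.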

Layer Offsets

Layer offsets affect value clips in the following ways:

If using template metadata encoding:
- Layer offsets are applied to generated times and active metadata relative to the anchor point. Note that layer offsets will not affect the generated set of asset paths, as they are not applied to templateStartTime, templateEndTime and templateStride.
If using explicit metadata encoding:
- Layer offsets are applied to times and active metadata relative to the strongest layer in which they were authored. Note that this layer may be different from the anchor point.
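
As a rough illustration, applying a layer offset to times or active metadata transforms only the stageTime component of each pair. The sketch assumes the usual layer offset mapping of scale * time + offset; apply_layer_offset is a hypothetical helper.

```python
def apply_layer_offset(entries, offset=0.0, scale=1.0):
    """Map the stageTime of each (stageTime, x) pair through a layer offset.

    Only the first component is a stage time; the second (a clip time or an
    asset index) is left untouched.
    """
    return [(stage_time * scale + offset, other) for stage_time, other in entries]


times = [(101, 101), (102, 102), (103, 103)]
active = [(101, 0), (102, 1), (103, 2)]
shifted_times = apply_layer_offset(times, offset=10)
shifted_active = apply_layer_offset(active, offset=10)
```

After a +10 offset the clips become active ten frames later on the stage, while still being sampled at their original clip times 101 to 103.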

Additional Notes

The flexibility and reuse of animated data that clips provide come with some performance characteristics that pipeline builders may want to be familiar with.

Clip Layers Opened On-Demand

In Pixar's use of clips, it is not uncommon for a single UsdStage to consume thousands to tens of thousands of clip layers. If the act of opening a stage were to discover and open all of the layers consumed by clips, it would, in these cases, add considerable time and memory to the cost. Further, many clients of the stage (such as a single-frame render) only require data from a small time range, which generally translates to a small fraction of the total number of clip layers. Therefore, clip layers are opened lazily, only when value resolution must interrogate a particular clip. Of course, since USD supports value resolution in multiple threads concurrently, resolving attributes affected by clips may require acquiring a lock that is unnecessary during "normal" value resolution, so there is some performance penalty.

Further, the broader the time interval over which an application extracts attribute values, the more layers that will be opened and cached (until the stage is closed). We deem this an acceptable cost since it is in keeping with our general principle of paying for what you use. The alternative would be adding a more sophisticated caching strategy to clip-layer retention that limits the number of cached layers; however, since the most memory-conscious clients (renderers) are generally unaffected, and the applications that do want to stream through time generally prioritize highest performance over memory consumption, we are satisfied with the caching strategy for now.
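
The open-on-demand strategy described here resembles a simple lock-protected cache. The sketch below is an analogy in plain Python, not USD's implementation: LazyClipCache and its open_layer callable are hypothetical names.

```python
import threading


class LazyClipCache:
    """Open clip layers on first use and keep them until the cache is dropped."""

    def __init__(self, open_layer):
        self._open_layer = open_layer  # hypothetical callable: path -> layer
        self._layers = {}
        self._lock = threading.Lock()

    def get(self, asset_path):
        # The lock is the extra cost paid by clip-backed value resolution.
        with self._lock:
            if asset_path not in self._layers:
                self._layers[asset_path] = self._open_layer(asset_path)
            return self._layers[asset_path]

    def opened(self):
        with self._lock:
            return sorted(self._layers)


open_calls = []


def fake_open(path):
    # Stand-in for opening a layer; records each real "open".
    open_calls.append(path)
    return {"path": path}


cache = LazyClipCache(fake_open)
cache.get("clip.101.usd")
cache.get("clip.101.usd")  # served from the cache, no second open
cache.get("clip.102.usd")
```

Only the clips that are actually queried get opened, and nothing is evicted, which mirrors the trade-off between memory and streaming performance discussed above.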

Flattening

Flattening a UsdStage with value clips will merge the appropriate time samples from the value clips into the time samples on the attribute on the flattened stage and remove the clip set definitions. Querying for time samples and values on the flattened stage should always give the same result as on the unflattened stage.

usdview

Usdview supports value clip debugging through the layer stack viewer (lower left). When an attribute whose value is held in a clip layer is highlighted, the layer stack viewer will show which clip the value is coming from.
The metadata tab will display the value of each piece of metadata authored on the prim introducing clips.

usdstitchclips

The usdstitchclips utility will generate a stage that uses value clips to stitch together the time samples in a given set of clip layers. This utility will generate the necessary clip set definitions (using either explicit or template metadata) and also generate a topology layer defining the attributes and a manifest layer.

For example, given a directory containing three clip files clip.101.usd, clip.102.usd and clip.103.usd:

$ usdstitchclips --clipPath /World/model --out result.usda clip*

Will generate the following result.usda:

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
    subLayers = [
        @./result.topology.usda@
    ]
)
 
over "World" 
{
    over "model" (
        clips = {
            dictionary default = {
                double2[] active = [(101, 0), (102, 1), (103, 2)]
                asset[] assetPaths = [@./clip.101.usd@, @./clip.102.usd@, @./clip.103.usd@]
                asset manifestAssetPath = @./result.topology.usda@
                string primPath = "/World/model"
                double2[] times = [(101, 101), (102, 102), (103, 103)]
            }
        }
    )
    {
    }
}

and the following result.topology.usda:

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
    upAxis = "Z"
)
 
def "World"
{
    def "model"
    {
        int x
    }
}

For generating template metadata:

$ usdstitchclips --clipPath /World/model \
                 --templateMetadata \
                 --startTimeCode 101 \
                 --endTimeCode 103 \
                 --stride 1 \
                 --templatePath clip.#.usd \
                 --out result.usda clip*

Will generate the following result.usda:

#usda 1.0
(
    endTimeCode = 103
    startTimeCode = 101
    subLayers = [
        @./result.topology.usda@
    ]
)
 
over "World"
{
    over "model" (
        clips = {
            dictionary default = {
                string templateAssetPath = "clip.#.usd"
                double templateStartTime = 101
                double templateEndTime = 103
                double templateStride = 1
                asset manifestAssetPath = @./result.topology.usda@
                string primPath = "/World/model"
            }
        }
    )
    {
    }
}

UsdUtils Utility Functions

The UsdUtils library contains several utility functions for stitching together multiple layers using value clips in usdUtils/stitchClips.h.

Examples

Looping

A common use case is to loop over animation authored in a clip or set of clips. For example, at Pixar clips containing a handful of frames of keep-alive animation are applied to background characters with looping so they remain in motion throughout an entire shot.

Looping can be specified using the times metadata and jump discontinuities (see Stage Times and Clip Times). This example shows 25 frames of animation from a clip being looped on the UsdStage from time 0 to 25, then 25 to 50.

#usda 1.0
(
    endTimeCode = 50
    startTimeCode = 1
    subLayers = [
        @./shot.topology.usd@
    ]
)
 
over "World" 
{
    over "Model" (
        clips = {
            dictionary default = {
                asset manifestAssetPath = @./shot.manifest.usd@
                string primPath = "/World/Model"
                asset[] assetPaths = [@./clip1.usd@]
                double2[] active = [(0, 0)]
                double2[] times = [(0, 0), (25, 25), (25, 0), (50, 25)]
            }
        }
    )
    {
    }
}

For proper looping, we want the UsdStage to retrieve animation at all times in the range [0, 25) from the clip at times [0, 25). However, at exactly time 25 we want the UsdStage to jump back to using animation in the clip at time 0. This is represented by a jump discontinuity at time 25. (See Jump Discontinuities for more details.)

Note that we were able to achieve this solely through the metadata. No additional asset loads or restructuring needed to happen.

Warning: This supposes that the final frame and the first frame of the clip transition smoothly.
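
For longer loops, the times entries follow a regular pattern that is easy to generate. The helper below is a hypothetical convenience for illustration, not part of USD; it only produces the list of (stageTime, clipTime) pairs that would then be authored as times metadata.

```python
def looping_times(clip_start, clip_end, stage_start, loop_count):
    """Build times entries that repeat [clip_start, clip_end) loop_count times."""
    duration = clip_end - clip_start
    times = []
    for loop in range(loop_count):
        loop_start = stage_start + loop * duration
        # Start of this loop: the right side of the jump at loop_start.
        times.append((loop_start, clip_start))
        # End of this loop: the left side of the jump into the next loop.
        times.append((loop_start + duration, clip_end))
    return times


two_loops = looping_times(0, 25, 0, 2)
three_loops = looping_times(0, 25, 0, 3)
```

Two loops reproduce the times value in the example above; each extra loop adds one more jump discontinuity without touching the clip itself.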

Animated Visibility

Value clips are used at Pixar to stitch together the results of simulators or procedural generation tools like Houdini that are run in parallel for each frame of a shot or effect. There are often situations where geometry data (e.g. points) are generated for some of the frames but not others. In these cases we would like to set the "visibility" attribute to "invisible" at the times where no geometry was generated. However, since the processes for these times didn't generate any geometry and are being run independently from the other times, they don't know where to author the "visibility" attribute to achieve this.

To solve this, we use the fact that value clips will use the default value authored in the manifest if no value is authored in a clip. (See Missing Values in Clip Set.) In the manifest, we author a default value of "invisible" for the "visibility" attribute. Then, if the processes that generate the value clips write out any geometry data, they also write out a time sample for the "visibility" attribute making the prim visible at that time. If they do not write out geometry, they don't write out the "visibility" attribute. This makes value resolution use the "invisible" value for "visibility" at times when the clips have no geometry.