Behavior monitoring
Coverage and Performance Metrics
Metrics collected during test execution are used to answer two critical questions: how well was the SUT (System Under Test) tested, and how well did the SUT perform within these tests. The first question is answered by the coverage grade, the multi-dimensional representation of all situations encountered during testing. The second question is answered by the performance grade, the collection of Key Performance Indicators, normalized within their context.
Together, these metrics provide insight into the following questions:
Coverage:
- What is the current coverage grade (overall and specifically for a given scenario)?
- What are the main coverage holes for a scenario? How do they cluster, in other words, are there big uncovered areas?
Performance:
- What were the values for a specific KPI (overall and specifically for a given scenario)? Do those cluster in some interesting way?
- How well does the SUT perform on specific KPI grades (overall and for a scenario)? Where is this worse / better than the previous SW release?
- How many runs actually failed with a SUT error, in other words, with a grade below the threshold? How do they cluster?
- What is the trend in all of these relative to the previous week? Which metrics improved and which degraded?
Both coverage and performance metrics defined in OSC2 typically implement a verification plan, specifying goals and thresholds. The verification plan is a result of an engineering effort driven by requirements such as AV performance, ODD, safety standards and so on.
cover()
Define a coverage data collection point.
Struct, actor, or scenario member
Note
cover() is also allowed in the with block of field declarations.
cover([name: ] <name>
[, expression: <exp>]
[, <param>* ])
<name>- (Required) Is a user-defined identifier composed of any number of characters A–Z, a-z, 0-9, and underscore (_). Identifiers beginning with a digit or an underscore are not allowed.
<exp>- (Optional) Is an expression using objects in the enclosing construct. The expression must be of scalar type. The value of the cover item is the value of the expression when the cover group event occurs. If <exp> is not provided, the expression is derived from the name.
<param>*- (Optional) See cover() and record() parameters.
Coverage is a mechanism for sampling key parameters related to scenario execution. Analyzing aggregate coverage helps determine how safely the AV behaved and what level of confidence you can assign to the results.
For example, to determine the conditions under which a cut_in_and_slow_down scenario failed or succeeded, you might need to measure:
- The speed of the sut.car
- The relative speed of the passing car
- The distance between the two cars
You can specify when to sample these items. For example, the key events for this scenario are the start and end events of the change_lane phase.
Cover items that have the same sampling event are aggregated into a single metric group, along with record data sampled by the same event. The default event for collecting coverage is end.
If the range of data that you want to collect is large, you might want to slice that range into subranges or buckets. For example, if you expect the SUT to travel at a speed between 10 kph and 130 kph, specifying a bucket size of 10 gives you 12 buckets, with the first bucket covering 10 kph up to, but not including, 20 kph.
Note
Buckets are always open on the right end, so [1..2] includes 1 but not 2.
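The slicing arithmetic above can be sketched in Python. This is only an illustrative model, not Foretify code: the helper names slice_range and bucket_index are hypothetical, and treating the upper bound of the range (130) as excluded is an assumption that follows from the right-open rule.

```python
def slice_range(low, high, every):
    """Slice [low..high] into right-open (start, end) buckets of size `every`."""
    buckets = []
    start = low
    while start < high:
        end = min(start + every, high)
        buckets.append((start, end))
        start = end
    return buckets


def bucket_index(value, buckets):
    """Return the index of the bucket that holds `value`, or None if none does."""
    for index, (start, end) in enumerate(buckets):
        if start <= value < end:  # right-open: `end` itself is excluded
            return index
    return None


speed_buckets = slice_range(10, 130, 10)
print(len(speed_buckets))               # 12
print(speed_buckets[0])                 # (10, 20)
print(bucket_index(20, speed_buckets))  # 1
```

A sampled speed of exactly 20 kph lands in the second bucket, because the first bucket includes 10 but not 20.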
You can also specify an explanatory line of text to display about the cover item during coverage analysis.
This example defines a name, a unit of measurement, a line of display text, and a range and a range slice for the field speed:
speed1: speed
cover(speed1, unit: kph,
    text: "Absolute speed of ego (in km/h)",
    range: [10..130], every: 10)
See complete example.
You can declare a field and define coverage for it at the same time. The following examples are equivalent.
# Example 1: use 'it' for 'expression' or omit if it can be derived from the name
current_speed: speed with: cover(current_speed, expression: it, unit: kph)
# Example 2: 'expression' is omitted because it can be derived from name
current_speed: speed
cover(current_speed, unit: kph)
See complete example.
See Also: Defining coverage metrics.
The following example extends the sut.cut_in_and_slow scenario to add a variable field called rel_d_slow_end. It assigns that variable the value returned by the map.abs_distance_between_positions() method at the end event of the slow phase of the scenario. It then defines coverage for that field, including a unit, display text and so on.
extend sut.cut_in_and_slow:
    var rel_d_slow_end: length = sample(map.abs_distance_between_positions(
        sut.car.state.msp_pos.road_position,
        car1.state.msp_pos.road_position), @slow.end) with:
        cover(rel_d_slow_end,
            unit: centimeter,
            text: "How far ahead is car1 relative to dut at slow end (in cm)",
            range: [0..6000],
            every: 50)
See complete example.
Cross coverage: combining coverage from different items
You can combine the coverage of two or more items by specifying the items as a list. This coverage, sometimes called cross coverage, creates a Cartesian product of the two cover vectors, showing every combination of values of the first and second items, every combination of the third item and the first item, and so on.
The following example creates a Cartesian product of three cover vectors at the start of the change_lane phase of a scenario: the relative distance between two cars, the absolute velocity of the SUT vehicle, and the relative speed of the other vehicle.
cover(cross_dist_vel, items: [rel_d_cls, dut_v_cls, rel_v_cls],
    event: change_lane_start,
    text: "Cross coverage of relative distance and absolute velocity")
See complete example.
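To make the Cartesian product concrete, here is a small Python model of a three-item cross. It is a sketch under assumptions: the bucket labels are made up for illustration, and the hit-counting helper record_cross_sample is hypothetical, not part of OSC2 or Foretify.

```python
from itertools import product

# Hypothetical bucket labels for the three crossed items.
rel_d_buckets = ["0..50", "50..100"]
dut_v_buckets = ["10..20", "20..30", "30..40"]
rel_v_buckets = ["-10..0", "0..10"]

# A cross item has one bucket per combination of the underlying buckets.
cross_buckets = list(product(rel_d_buckets, dut_v_buckets, rel_v_buckets))
hits = {combo: 0 for combo in cross_buckets}


def record_cross_sample(rel_d, dut_v, rel_v):
    """Count one hit for the combination observed at the sampling event."""
    hits[(rel_d, dut_v, rel_v)] += 1


record_cross_sample("0..50", "20..30", "0..10")

# Combinations that were never hit are the coverage holes of the cross.
holes = [combo for combo, count in hits.items() if count == 0]
print(len(cross_buckets))  # 12
print(len(holes))          # 11
```

The number of cross buckets is the product of the bucket counts of the crossed items, so crosses of finely sliced items grow quickly.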
You can only cross cover items that have the same sampling event. To overcome this limitation, define a secondary cover point with the common sampling event. For example, if you want to include a field car1.speed in a cross with other items that are sampled at the start of a scenario, you have to define a second field with that sampling event and cover the second field. The reason is that the default sampling event for car1.speed is end, not start.
var speed1:= sample(car1.state.speed, @start) with:
    cover(speed1, unit:kph)
See complete example.
record()
Define a performance metric or other data collection point. Any scalar or string value can be captured by record().
Struct, actor, or scenario member
Note
record() is also allowed in the with block of field declarations.
record([name: ] <name>
[, expression: <exp>]
[, <param>* ])
<name>- (Required) Is a user-defined identifier composed of any number of characters A–Z, a-z, 0-9, and underscore (_). Identifiers beginning with a digit or an underscore are not allowed.
<exp>- (Optional) Is an expression using objects in the enclosing construct. The expression must be of scalar type. The value of the record item is the value of the expression when the record group event occurs. If <exp> is not provided, the expression is derived from the name.
<param>*- (Optional) See cover() and record() parameters.
record() is used to capture performance indicators and other data items that are not part of the coverage model, such as the name and version strings identifying the SUT. Record metrics are typically raw, and thus require interpretation or a user-defined formula for grading.
The purpose of performance evaluation is to see how well the AV performed in specific conditions as they occur within a test. Performance is evaluated along multiple dimensions, like safety, ride comfort and so on.
Performance metrics can provide pass/fail indication: sampled values that cross a specified threshold will raise an error message, indicating that the SUT (System Under Test) performance was outside the acceptable range.
KPIs or Key Performance Indicators are the raw metrics measured to see how well the AV performed. There can be safety-related KPIs (such as min-Time-To-Collision or min-TTC, measured in seconds), comfort-related KPIs (such as max-deceleration, measured in meter/second²), and so on.
Often, raw KPI values need to be interpreted in the context of a specific scenario. For example, it may be acceptable to cross the max-deceleration threshold if emergency braking is required. For this purpose, raw KPIs are converted to performance grades. A performance grade (also called a normalized KPI) is a context-dependent number between 0 ("really bad") and 1 ("excellent") that is attributed to some aspect of the SUT behavior. This grade is computed using a user-defined grading formula, which converts one or more raw KPIs into a grade in a context-dependent way.
See Also: Defining Key Performance Indicators.
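As an illustration of how a raw KPI might be normalized into a performance grade, the following Python sketch shows two hypothetical grading formulas. All function names and thresholds (1 s, 4 s, 3 and 8 meter/second²) are assumptions chosen for the example; actual grading formulas are user-defined and come from the verification plan.

```python
def clamp01(value):
    """Clamp a number into the [0, 1] grade interval."""
    return max(0.0, min(1.0, value))


def ttc_grade(min_ttc_s, fail_below_s=1.0, excellent_above_s=4.0):
    """Map min-TTC (seconds) linearly onto 0 ("really bad") .. 1 ("excellent")."""
    return clamp01((min_ttc_s - fail_below_s) / (excellent_above_s - fail_below_s))


def deceleration_grade(max_decel_mps2, emergency_braking):
    """Grade max-deceleration with a context-dependent limit (assumed values)."""
    limit = 8.0 if emergency_braking else 3.0
    return clamp01(1.0 - max_decel_mps2 / limit)


print(ttc_grade(2.5))                  # 0.5
print(deceleration_grade(4.0, False))  # 0.0
print(deceleration_grade(4.0, True))   # 0.5
```

The same raw deceleration of 4 meter/second² gets a grade of 0 in normal driving but 0.5 when emergency braking is required, which is the kind of context dependence described above.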
The following example shows record() used to capture time-to-collision into the metric group associated with the end of a change-lane maneuver.
extend sut.cut_in_and_slow:
    # Sample the time-to-collision KPI at the end of change_lane
    var ttc_at_end_of_change_lane:= sample(sut.car.get_ttc_to_object(car1), @change_lane.end)
    # Record the KPIs into the cut_in_and_slow.end metric group
    record(ttc_at_end_of_change_lane,
        unit: s,
        text: "Time to collision of ego car to cut-in car at end of change_lane")
See complete example.
Cross record - Combining metrics
You can combine record and coverage metrics of two or more previously defined items, by specifying the items as a list. This creates a Cartesian product of the specified metrics, showing every combination of values of the items. Only items that belong to the same metrics group (same sample event) can be crossed.
The following example creates a cross record: a Cartesian product of the time-to-collision record item and the SUT velocity cover item, both sampled at the end of the change_lane phase. This is considered a cross-record because it includes at least one record item.
extend sut.cut_in_and_slow:
    var dut_v_cle:= sample(sut.car.state.speed, @change_lane.end) with:
        cover(dut_v_cle,
            text: "Speed of dut at change_lane end (in kph)",
            unit: kph,
            range: [10..200],
            every: 10)
    record(ttc_dut_vel, items: [ttc_at_end_of_change_lane, dut_v_cle],
        text: "Cross record of TTC and absolute SUT velocity")
cover() and record() parameters
Cover and record members accept a comma-separated list of zero or more of the following:
unit: <unit>
Specifies a unit for a physical quantity such as time, length, or speed. The field’s value is converted into the specified unit, and that value is used as the coverage value.
Note
You must specify a unit for cover items that have a physical type. For example:
extend top.main:
    speed1: speed
    cover(speed1, unit: kph)
range: <range>
Specifies a range of values for the physical quantity in the unit specified with unit.
OSC2 code: range
extend top.main:
    speed2: speed
    cover(speed2, unit: kph, range: [10..130])
See complete example.
Note
The compiler issues an error if you specify buckets with every or range.
every: <value>
Specifies the size of the subranges into which the range is sliced. If the range is large, for example [0..200], you might want to slice that range into subranges every 10 or 20 units.
OSC2 code: every
extend top.main:
    speed3: speed
    cover(speed3, unit: kph, range: [10..130], every: 10)
See complete example.
Notes
- Buckets are always open on the right end, so [1..2] includes 1 but not 2.
- The compiler issues an error if you specify buckets with every or range.
event: <event-name>
Specifies the event when the field is sampled. The default is the @end event of the scenario. You can specify one of the other events defined in every scenario, @start or @fail, or a user-defined event. Note that if the scenario fails, the end event is not emitted and any cover() or record() defined with the @end event is not collected. If you want to collect data even if a scenario fails, it is better to create a user-defined event or, for coverage used as part of global modifiers, to use @top.end.
Items that have the same sampling event are aggregated into a metric group. Note that both coverage and performance items can be collected in the same metric group.
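The grouping rule can be modeled in a few lines of Python. This is an illustrative sketch only: the item list reuses names from the examples on this page, the assignment of each item to an event is assumed for the illustration, and the helpers build_metric_groups and can_cross are hypothetical.

```python
from collections import defaultdict

# (kind, name, sampling event); the event assignments are illustrative.
items = [
    ("cover", "rel_d_cls", "change_lane_start"),
    ("cover", "dut_v_cls", "change_lane_start"),
    ("cover", "dut_v_cle", "end"),
    ("record", "ttc_at_end_of_change_lane", "end"),
]


def build_metric_groups(items):
    """Group cover and record items by their sampling event."""
    groups = defaultdict(list)
    for kind, name, event in items:
        groups[event].append((kind, name))
    return dict(groups)


def can_cross(names, items):
    """Items can be crossed only if they all share one sampling event."""
    events = {event for _, name, event in items if name in names}
    return len(events) == 1


groups = build_metric_groups(items)
print(sorted(groups))                                               # ['change_lane_start', 'end']
print(can_cross(["ttc_at_end_of_change_lane", "dut_v_cle"], items))  # True
print(can_cross(["dut_v_cls", "dut_v_cle"], items))                  # False
```

The "end" group holds both a cover item and a record item, which is why the two can be combined into a cross record.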
You can sample a field value on one event and cover it on another. This way you can capture the speed of a car when changing lanes but associate the cover item with the scenario end metric group. In the following example, the relative lane position is sampled on the change_lane_sut.start event and associated with the scenario's end event.
OSC2 code: cover() with @start event
var sut_lane: lane_relative_side = sample(
    sut.car.get_lane_position(), @change_lane_sut.start)
cover(sut_lane, text: "Relative SUT lane within road (innermost/middle/outermost)")
The cover event must be local (an event defined in the enclosing struct/actor/scenario). No dotted path expressions are allowed. If necessary, you can define a local event to be derived from some path expression and use that for coverage, for example:
OSC2 code: cover() with local event
extend top.main:
    speed1: speed
    event sim_clock is @top.clk
    cover(speed1, unit: kph, event: sim_clock, range: [10..130], every: 10)
See complete example.
text: <string>
Is explanatory text, enclosed in double quotes, about this metric point. For example:
OSC2 code: cover() with explanatory text
extend top.main:
    speed1: speed
    cover(speed1, unit: kph, event: change_lane_start,
        text: "Absolute speed of ego at change_lane start (in km/h)",
        range: [10..130], every: 10)
See complete example.
items: <list>
Declares a list of identifiers whose metrics are combined and displayed as a single cross item. The identifiers must be separated by commas and enclosed in square brackets. The list can include a mix of cover and record identifiers. If a record identifier is included, the resulting item is a cross record item. For example:
OSC2 code: cross items
extend sut.cut_in_and_slow:
    cover(cross_dist_vel, items: [rel_d_cls, dut_v_cls],
        text: "Cross coverage of relative distance and absolute velocity")
See complete example.
buckets: <list-of-bucket-boundaries>
Provides a way to declare different-sized buckets by specifying bucket boundaries. A list of N values defines N-1 buckets. Each value must be greater than or equal to the previous one; otherwise, an error is issued. For example:
OSC2 code: bucket boundaries
extend top.main:
    speed1: speed
    cover(speed1, unit: kph, buckets: [1, 2, 6.5, 10])
See complete example.
Creates the following three buckets:
1..2, 2..6.5, 6.5..10
Notes
- Buckets are always open on the right end, so [1..2] includes 1 but not 2.
- The compiler issues an error if you specify buckets with every or range.
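The boundary-list rule (N values define N-1 right-open buckets) can be sketched in Python as follows. The helper names are hypothetical, and returning None for values outside the outermost boundaries is an assumption made for the illustration.

```python
from bisect import bisect_right


def buckets_from_boundaries(boundaries):
    """Turn N non-decreasing boundary values into N-1 (start, end) buckets."""
    if any(b < a for a, b in zip(boundaries, boundaries[1:])):
        raise ValueError("each boundary must be >= the previous one")
    return list(zip(boundaries, boundaries[1:]))


def find_bucket(value, boundaries):
    """Return the right-open bucket that holds `value`, or None if none does."""
    if value < boundaries[0] or value >= boundaries[-1]:
        return None
    index = bisect_right(boundaries, value) - 1
    return (boundaries[index], boundaries[index + 1])


boundaries = [1, 2, 6.5, 10]
print(buckets_from_boundaries(boundaries))  # [(1, 2), (2, 6.5), (6.5, 10)]
print(find_bucket(2, boundaries))           # (2, 6.5)
print(find_bucket(10, boundaries))          # None
```

The value 2 falls into the 2..6.5 bucket rather than 1..2, consistent with buckets being open on the right end.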
buckets: <list-of-explicit-buckets>
The syntax of <list-of-explicit-buckets> is a list of bucket definitions separated by commas, where each bucket definition is:
OSC2 code: bucket syntax
bucket([values:][<from-value>..<to-value>][,[target:] <value>])
To create a bucket for a single value, use that value as both the <from-value> and <to-value>. For example, the following code creates three buckets, each with a single value:
OSC2 code: bucket example
buckets: [1..1], [6..6], [250..250])
target is a positive integer specifying how many hits (samples) are required for each bucket in order to consider the item covered. The default is 1.
OSC2 code: target for bucket
cover(c5, expression: x,
    buckets: [bucket(values: [1..4], target: 5), bucket([4..8]), bucket([8..50], 2)])
See complete example.
If you specify a target for the item and a target for some buckets, the larger target is taken. For example:
OSC2 code: target for item and bucket
cover(sut_speed_at_slow, expression: sut_speed_at_slow, unit: meter_per_second, target:3,
    buckets: [bucket(values: [1..20]), bucket([20..70],target:5)])
The target for the [1..20] bucket is 3, while the target for the [20..70] bucket is 5.
Notes
- Buckets are always open on the right end, so [1..2] includes 1 but not 2.
- The compiler issues an error if you specify buckets with every or range.
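The rule that the larger of the item target and the bucket target is taken can be modeled as below. This is a minimal Python sketch; the helper names effective_target and is_covered are hypothetical, and the hit counts are invented for the illustration.

```python
def effective_target(item_target=1, bucket_target=None):
    """Resolve a bucket's target: the larger of the item and bucket targets."""
    if bucket_target is None:
        return item_target
    return max(item_target, bucket_target)


def is_covered(hits, targets):
    """An item is covered when every bucket has reached its target."""
    return all(hits.get(bucket, 0) >= target for bucket, target in targets.items())


# Mirrors the sut_speed_at_slow example: item target 3, one bucket target 5.
targets = {
    (1, 20): effective_target(item_target=3),
    (20, 70): effective_target(item_target=3, bucket_target=5),
}
print(targets)                                      # {(1, 20): 3, (20, 70): 5}
print(is_covered({(1, 20): 3, (20, 70): 4}, targets))  # False
print(is_covered({(1, 20): 3, (20, 70): 5}, targets))  # True
```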
If neither range nor buckets are provided
The behavior is as follows if neither a range nor buckets are provided:
| Item type | Resolved buckets - cover | Resolved buckets - record |
|---|---|---|
| int, uint, float, physical | A single [MIN_VALUE..MAX_VALUE] bucket, where MIN_VALUE and MAX_VALUE are the minimal and maximal values that can be represented in the type. | A new bucket will be opened by the runtime for each sampled value. |
| enum | A single-value bucket for each member. | A single-value bucket for each member. |
| bool | Single-value 'true' and 'false' buckets. | Single-value 'true' and 'false' buckets. |
| string | A new bucket will be opened by the runtime for each distinct value. | A new bucket will be opened by the runtime for each sampled value. |
ignore: <item-bool-exp>
Defines values that are to be completely ignored. The expression is a Boolean expression that can contain only the item name and constants. If the ignore expression is true when the data is sampled, the sampled value is ignored (not added to the bucket count). Ignored buckets will not be visible for this coverage item in Foretify Manager. For example, the following cover definition does not add to the bucket count if the lane is the closest to the divider on a divided highway:
OSC2 code: cover() with ignore
extend sut.cut_in_and_slow:
    cover(dut_lane,
        text: "Relative dut lane within road (leftmost/center/rightmost)",
        ignore: (dut_lane == center))
See complete example.
sample_if: <bool-exp>
Defines a condition that must be true for a sample to be collected. The expression is a Boolean expression that is evaluated at the sampling event. If the sample_if expression evaluates to false when the sampling event occurs, no sample is collected for that event occurrence. This is useful for conditional sampling based on runtime conditions. For example, the following cover definition only samples the speed when the vehicle is in a specific lane:
OSC2 code: cover() with sample_if
cover(sut_speed, unit: kph, range: [0..200], every: 10,
    sample_if: sut.car.get_lane_position() == middle)
When used with cross coverage or cross record items, if any item's sample_if condition is false, the entire cross sample is not collected for that event.
See the complete example here.
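The all-or-nothing behavior of sample_if within a cross can be sketched in Python. The function collect_cross_sample and the tuple layout are hypothetical and serve only to illustrate the rule stated above.

```python
def collect_cross_sample(items):
    """Collect one cross sample, or None if any item's sample_if is false.

    `items` is a list of (name, value, sample_if_result) tuples evaluated
    at a single occurrence of the shared sampling event.
    """
    if not all(condition for _, _, condition in items):
        return None  # one false sample_if drops the entire cross sample
    return tuple(value for _, value, _ in items)


# Both conditions hold: the combination is collected.
print(collect_cross_sample([("sut_speed", 42, True), ("sut_lane", "middle", True)]))
# (42, 'middle')

# The sut_speed condition is false: nothing is collected for this event.
print(collect_cross_sample([("sut_speed", 42, False), ("sut_lane", "middle", True)]))
# None
```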
disable: <bool>
Must be set to either true or false. When set to true, the metric group is completely disabled. This parameter is used in conjunction with override to disable an existing metric item. The default is false. For example, the following override prevents coverage collection for dut_v_cls:
OSC2 code: cover() with disable
extend sut.cut_in_and_slow:
    cover(override: dut_v_cls, disable: true)
See complete example.
target: <value>
Is a positive integer specifying how many hits (samples) are required for each bucket in order to consider the item covered. The default is 1. For example, the following cover definition creates two buckets (right and left) and requires twenty hits in each bucket:
OSC2 code: target for item
extend top.main:
    side: av_side
    cover(side, target: 20)
See complete example.
Note
The target is applied to an aggregation of runs (a test suite). If multiple test suites with different targets for the same item are included in the same workspace, the target specified in the original test suite is applied to subsequent test suites.
override option
The override option is used to override a previously defined name or a parameter in an already defined cover() or record() member.
cover | record(override:<name>
[,event: <event-name>]
[, <param>* ])
cover | record(override:<name>,
rename: <new-name>)
The override feature works as follows:
- (Required) <name> must be the name of an existing cover or record item.
- (Optional) <event-name> is the name of the event as specified for the cover or record item. If an event is not specified for the metric group, no event should be specified in override.
- (Required) <new-name> is an alias that you want to apply to an existing cover or record item. This lets you name items differently in different projects without changing the scenario file. You must define the alias in the same type where the coverage item was originally declared; otherwise, an error message is issued. From the point in the code where you defined the alias, the previous name of the coverage point is replaced with the new name, including cross coverage items that were defined with it. Foretify Manager knows only the alias, not the original name.
- Any other specified parameter in the <param> list overrides the corresponding original parameter. Parameters not provided in the override retain their original values.
- Multiple overrides are allowed. The last value provided for each parameter prevails.
- The override automatically percolates to any cross metric using this item.
# Original definition
cover(speed_diff, unit: kph, range: [1..20], every: 5)
# New definition adds ignore
cover(override: speed_diff, ignore: (speed_diff in [10..13]))
# Now speed_diff is completely disabled
cover(override: speed_diff, disable: true)
See complete example.
# Original definition
cover(sut_start_speed, unit: kph, range: [1..20], every: 5)
# New definition renames item
cover(override: sut_start_speed, rename: ego_start_speed)
See complete example.
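The way successive overrides combine can be modeled as a dictionary merge. This Python sketch is illustrative only: apply_overrides is a hypothetical helper, parameters are represented as plain dictionary entries, and the check for expression and unit reflects the restriction that those two parameters cannot be overridden.

```python
def apply_overrides(original, overrides):
    """Merge override parameter sets onto the original definition, in order.

    Parameters missing from an override keep their current values, and the
    last value provided for each parameter prevails.
    """
    params = dict(original)
    for override in overrides:
        for locked in ("expression", "unit"):
            if locked in override:
                raise ValueError(f"'{locked}' cannot be overridden")
        params.update(override)
    return params


# Mirrors the speed_diff example: add ignore, then disable the item.
original = {"unit": "kph", "range": (1, 20), "every": 5}
result = apply_overrides(
    original,
    [{"ignore": "speed_diff in [10..13]"}, {"disable": True}],
)
print(result["every"])    # 5
print(result["ignore"])   # speed_diff in [10..13]
print(result["disable"])  # True
```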
- The name of the item and the event specification must be consistent with the original declaration. The following override generates a compile-time error.
OSC2 code: invalid cover() definition
# Original definition
cover(speed_diff, event: sim_clock, unit: kph, range: [1..20], every: 5)
# [ERROR] cover item 'speed_diff' does not exist for event 'end'
cover(override: speed_diff, ignore: (speed_diff in [10..13]))
- You cannot override the expression or the unit as originally defined.
- Override can only be used in the same type where the cover or record modifier was originally defined. It cannot be used to override a cover or record modifier in a subtype.
trace()
Define values to be collected while a scenario is running.
Actor or scenario member
trace([name: ] <name>
[, expression: <exp>]
[, <param>* ])
<name>- (Required) Is a user-defined identifier composed of any number of characters A–Z, a-z, 0-9, and underscore (_). Identifiers beginning with a digit or an underscore are not allowed. This name is used in all targets (the graphical timeline, the CSV file, and the log file) unless a title is specified. You can also use it for filtering.
<exp>- (Optional) Is an expression defining the value you want to trace. Typically the expression is the name of a field or an expression referring to multiple fields, such as field1 - field2. If <exp> is not provided, the expression is derived from the name.
<param>*- (Optional) Is a comma-separated list of zero or more of the following:
- unit: <unit> specifies a unit for a physical quantity such as time, length, or speed. The expression's value is converted into the specified unit, and that value is used as the trace value. You must specify a unit for trace expressions that have a physical type. Conversely, do not specify a unit for any other items.
- title: <string> defines an explanatory title for the expression to be used in all targets. The default title is the <name>.
- ignore: <bool_exp> specifies that the trace is not collected if <bool_exp> evaluates to true. The check is done once, at the start of the run, after generation.
- category: <signal_category> specifies the category this signal belongs to. This parameter allows activating multiple signals that have a common meaning in a single interactive trace command in Foretify. For example, executing trace sut at the Foretify prompt displays all the predefined trace signals for the SUT as well as others that you have added to that category. The allowed values for this parameter are those that are defined in the enum signal_category. The default values for this enum are npc, npc_ego_relation and sut. If a category is not specified, a signal does not belong to any category.
- enabled: <bool> specifies whether the signal is recorded by default. The default is false.
Tracing an item is useful because you can see exactly when the value of an expression changes during the scenario execution. Foretify defines many standard traces for you. See the trace command for more information.
Note
Defining a trace with trace() only declares the items to be traced. To activate the trace, you can issue interactive trace commands in Foretify, or set the flag enabled: true to automatically record the signal with the timeline target. See trace command.
Note
The name or category of the defined trace must be enabled in Foretify with the trace command. See trace command.
extend signal_category: [cut_in_params] # optional, when new category is needed to enable grouping of multiple signals together
extend sut.vehicle_cut_in_and_slow:
    # cut_in_vehicle_speed is added to the 'npc' category
    trace(cut_in_vehicle_speed, expression: cut_in_vehicle.state.speed, unit: kph, category: npc)
    # dut_speed and speed_diff are not added to any category
    trace(dut_speed, expression: sut.car.state.speed, unit: kph)
    trace(speed_diff, expression: cut_in_vehicle.state.speed - sut.car.state.speed, unit: kph)
    # cutin_side is added to the 'cut_in_params' category
    trace(cutin_side, expression: vehicle_cut_in_at_essence.gen_cut_in_side, category: cut_in_params, enabled: true)
extend top.main:
    do c: sut.vehicle_cut_in_and_slow()
See complete example.
Failure response constructs
sut_error()
Report an SUT error and print message to STDOUT.
Method
sut_error(<kind>
, <string>)
<kind>-
(Required) Is of type issue_kind. The predefined values are described in kind.
You can extend issue_kind to include project-specific kinds of issues.
<string>- (Required) Is a message that describes an SUT error that occurred, enclosed in double quotes.
The sut_error() method is one of a set of methods you can use to define a response to a failure or issue, including:
- sut_warning()
- sut_issue()
- scenario_completion_error()
- scenario_completion_warning()
- other_issue()
- other_warning()
- other_error()
Note
These methods must be invoked by the call directive. Alternatively you can invoke similar scenarios within the do block of a scenario.
extend top.main:
    on sut.car.state.speed > 100kph:
        call sut_error(assertion, "SUT drove too fast")
See complete example.
log*()
Print message to STDOUT.
Action
log(<string>)
log_info(<string>)
log_debug(<string>)
log_trace(<string>)
<string>- Is an informational message, enclosed in double quotes.
The following constructs are used to print messages at various levels of verbosity:
| Name | Description |
|---|---|
| log() | Used to report major events and messages |
| log_info() | More detailed reporting |
| log_debug() | Verbose information that may be useful for debug |
| log_trace() | Most detailed information used to trace execution |
These actions can be invoked within the do block of a scenario, but not within an on modifier. However, you can call similar methods from an on block:
- logger.log_info(<string>)
- logger.log_debug(<string>)
- logger.log_trace(<string>)
import "$FTX_BASIC/exe_platforms/sumo_ssp/config/sumo_config.osc"
extend test_config:
    set map = "$FTX_PACKAGES/maps/straight_long_road.xodr"
struct my_ints:
    x: int
    y: int
extend top.main:
    var z: my_ints = new
    do serial:
        sut.car.drive() with: duration([3..7]s)
        call logger.log_info("z.x = $(z.x), z.y = $(z.y)")
See complete example.
Statistics modifier
The statistics modifier performs a statistical computation on samples.
Scenario modifier
[<label>:] statistics(sample_type: <numeric type>, measurement: <measurement type>)
<sample_type>- The type of the value to sample must be numeric. It can be of type 'int' or any physical type.
<measurement>- The statistical computation to perform: minimum, maximum, average, standard_deviation, or average_absolute_deviation.
add_sample(<expression>)- A method that adds an expression as a new sample.
compute() -> <numeric type>- A method that computes the statistical value and returns a value of type sample_type.
reset()- Resets the modifier’s state.
consume() -> <numeric type>- A method that computes the statistical value, resets the state, and returns a value of type sample_type, equivalent to calling compute() followed by reset().
To perform the desired statistical computation, add samples with the add_sample() method and then call compute(), or call consume() to compute the value and reset the state in one step. The supported measurements are:
- minimum: the smallest sample value
- maximum: the largest sample value
- average: the average of the sample values
- standard_deviation: the standard deviation of the sample values
- average_absolute_deviation: the average absolute deviation of the sample values
extend top.main:
    v1: vehicle
    do v1.drive(duration: 10sec)
    avg_speed: statistics(sample_type: speed, measurement: average)
    on @top.clk:
        avg_speed.add_sample(v1.state.speed)
    on @end:
        var avg := avg_speed.consume()
        logger.log_info("v1 average speed: $(avg)")
See complete example.
extend top.main:
    avg_acc_stats: statistics(sample_type: acceleration, measurement: average)
    watcher above_50(my_interval_data) is above_w(sample_type: speed, sample_expression: my_car.state.speed, threshold: 50kph, tolerance: 5kph)
    on @above_50.i_clock:
        avg_acc_stats.add_sample(my_car.state.road_acceleration.lon)
    on @above_50.i_end:
        above_50.data.average_acceleration = avg_acc_stats.consume()
struct my_interval_data inherits above_w_speed_data:
    var average_acceleration: acceleration
    record(average_acceleration, expression: average_acceleration, unit: mpsps)
See complete example.
The float data type is not supported yet for the sample_type.
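The add_sample(), compute(), reset(), and consume() methods of the statistics modifier can be modeled with a small Python class. This is a sketch of the described behavior, not the Foretify implementation: the class name Statistics is hypothetical, the standard deviation is assumed to be the population form (dividing by the number of samples), and raising an error when no samples exist is also an assumption.

```python
import math


class Statistics:
    """Illustrative model of a statistics modifier for one measurement."""

    def __init__(self, measurement):
        self.measurement = measurement
        self.samples = []

    def add_sample(self, value):
        self.samples.append(value)

    def compute(self):
        if not self.samples:
            raise ValueError("no samples added")  # assumed behavior
        count = len(self.samples)
        mean = sum(self.samples) / count
        if self.measurement == "minimum":
            return min(self.samples)
        if self.measurement == "maximum":
            return max(self.samples)
        if self.measurement == "average":
            return mean
        if self.measurement == "standard_deviation":
            # Population form is an assumption of this sketch.
            return math.sqrt(sum((x - mean) ** 2 for x in self.samples) / count)
        if self.measurement == "average_absolute_deviation":
            return sum(abs(x - mean) for x in self.samples) / count
        raise ValueError(f"unknown measurement: {self.measurement}")

    def reset(self):
        self.samples = []

    def consume(self):
        """Equivalent to compute() followed by reset()."""
        value = self.compute()
        self.reset()
        return value


avg_speed = Statistics("average")
for speed in [2, 4, 4, 4, 5, 5, 7, 9]:
    avg_speed.add_sample(speed)
print(avg_speed.consume())  # 5.0
print(avg_speed.samples)    # []
```

Using consume() at the end of each interval, as in the watcher example above, keeps samples from one interval from leaking into the next.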