
98. Invoking the Evaluation Pipeline flow

To invoke the Evaluation Pipeline flow, you must use the Evaluation Solution Reference (ESR) Kit. This involves setting up the ESR Kit environment, preparing the required configuration, and running the provided scripts.

98.1 Unified ODD coverage on simulation runs

Unified coverage is collected by monitoring execution and detecting behaviors that match the conditions specified in evaluation scenarios. Evaluation scenarios are written in the same OSC2 language as generation scenarios but are more abstract: they are designed to detect behavior rather than trigger it. Evaluation scenarios are located in the Evaluation V-Suite scenario library.

Coverage can be collected for both physical and simulated drives. The following image shows how both flows contribute to a comprehensive coverage view.

Figure 1: Unified coverage collected from physical and simulated drives

The Evaluation Pipeline process enables you to collect unified ODD coverage on simulation runs. The flow uses the same OSC2 code to collect coverage data from both simulation and drive logs. Foretify Manager consolidates the coverage results from these two sources, providing a comprehensive view of ODD coverage.

See Introducing the Evaluation Pipeline for more information about evaluation scenarios and the Evaluation Pipeline flow.

98.2 Combined Foretify flow

The combined flow is a streamlined method for running simulation followed by the Evaluation Pipeline, as illustrated in the diagram below.

Figure 2: The combined flow runs simulation followed by matching

98.3 Run the combined Foretify and Evaluation Pipeline flow

The combined flow requires two top-level files: one that contains the generation definitions used during Foretify scenario generation and another that provides the evaluation definitions used for scenario detection.

While both generation and evaluation scenario definitions can include coverage and KPI collection constructs, only the constructs defined in the evaluation scenarios are collected uniformly on both simulation and drive-log runs.

To run the combined Foretify and Evaluation Pipeline flow:

Invoke the combined flow by running Foretify with the --match flag on the command line, followed by the evaluation OSC2 top-level file:

foretify --load $FTX/logiq/smoke/generative_test/generative_vehicle_cut_in.osc --match $FTX/logiq/smoke/generative_test/scenarios.osc --run --batch
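As a minimal sketch of how the two top-level files map onto the command line, the helper below builds the same invocation from a generation file and an evaluation file. The function name combined_flow_cmd is hypothetical and not part of Foretify; it only prints the command (a dry run) and uses no flags beyond those shown above.

```shell
# Hypothetical helper (not part of Foretify): prints the combined-flow
# command for a generation top-level file and an evaluation top-level file.
combined_flow_cmd() {
    local generation_top="$1"   # generation definitions, passed to --load
    local evaluation_top="$2"   # evaluation definitions, passed to --match
    printf 'foretify --load %s --match %s --run --batch\n' \
        "$generation_top" "$evaluation_top"
}

# Dry run: print the command for the smoke-test files used above.
combined_flow_cmd \
    "$FTX/logiq/smoke/generative_test/generative_vehicle_cut_in.osc" \
    "$FTX/logiq/smoke/generative_test/scenarios.osc"
```

Printing the command before running it makes it easy to confirm that the generation file follows --load and the evaluation file follows --match, since swapping the two is an easy mistake to make.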

98.3.1 Simulation input file

The invocation example above uses the following input file, which imports a generative library scenario called vehicle_cut_in.

# Copyright (c) 2025 Foretellix Ltd. All Rights Reserved.

import "$FTX_PACKAGES/base_scenarios/scenarios/vehicle_cut_in/vehicle_cut_in/vehicle_cut_in_top.osc"

# To run the test with a different simulator,
# replace this import with the appropriate simulator configuration file
import "$FTX/config/sim/sumo_default.osc"

extend map_config:
    set export_msp_topology=true

extend sim_config:
    set enable_gui=false

# The map can be replaced with any OpenDRIVE map that contains a two-lane road
extend test_config:
    set map = "highway.xodr"

extend top.main:
    do sut.vehicle_cut_in()

98.3.2 Evaluation Pipeline input file

The following file defines the evaluation scenario used during the Evaluation Pipeline execution. It imports an evaluation scenario from the scenario library and instantiates it with specific parameters.

import "$FTX/logiq/scenario_library/vehicle_cut_in/vehicle_cut_in/vehicle_cut_in_top.osc"

match:
    vehicle_cut_in: sut.vehicle_cut_in(min_distance_from_sut_in_time_units: 0s, max_distance_from_sut_in_time_units: 6s,
                           min_init_drive_phase_duration: 0s, max_init_drive_phase_duration: 3s,
                           min_change_lane_phase_duration: 0s, max_change_lane_phase_duration: 3s,
                           min_post_phase_duration: 0s, max_post_phase_duration: 3s, speed_gap_threshold: 5kph)

The Evaluation Pipeline detects matching behavior in the simulation execution. For every match, it collects coverage according to the coverage definitions in the evaluation scenario.