97. Evaluation object list proto definition
The object list lets the evaluation pipeline ingest customer data with the minimal set of fields required for correct and accurate evaluation scenarios.
The denoiser tool in the evaluation pipeline can fill in or fix much of the missing data, but when data is available from the customer, we recommend providing it for better accuracy.
Note
Your original data does not have to comply with all the rules of the object list; the denoiser can resolve issues such as inconsistent sampling rates. Any field not listed in the tables below is needed only in special cases and can be ignored.
97.1 Overview of the object list
At the highest level, the Root message defines your log's timing (step interval and optional start timestamp), versioning, global key-value metadata (custom_data), and an optional region-of-interest (RoiConfig).
Root contains a repeated list of TimeSlot entries, each capturing the world state at a specific frame. Within each TimeSlot, you'll find the Ego, a list of objects (all other actors), and any per-frame traffic_lights showing each signal's current state.
Each Object record carries a unique tracking_id and broad classification (ObjectKind), its 3D position plus optional motion vectors (velocity, acceleration, jerk, angular speed), and orientation angles (yaw, pitch, roll). It also includes dimensions (length, width, height) or an eight-point BoundingBox for precise shape. Additional fields include a custom_data array, flags like is_stationary and is_emergency_mode, and a utility enum for special roles.
97.2 Field description
97.2.1 Root message
The Root message is the top-level container for a log file. It defines the timing parameters, versioning, and global metadata for the log, and holds the full sequence of per-frame world states.
| Field | Description | Required? |
|---|---|---|
| step_time | Sampling interval between frames in milliseconds. The denoiser can resolve mismatches between Ego and object sampling rates. | Yes |
| start_time | Log start time in the AV's time domain. Use it to cross-reference evaluation timing with other tools. | No |
| times | List of TimeSlot messages representing the per-frame world state, with strictly increasing timestamps. | Yes |
| custom_data | List of Pair messages for log-level metadata (for example, software version, mission type, hardware profile). | No |
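To make the Root structure concrete, here is a minimal illustrative sketch in Python using plain dictionaries whose keys mirror the Protobuf field names. This is a readable stand-in, not the binary wire format, and the `custom_data` keys (`sw_version`, `mission`) are hypothetical examples.

```python
# Minimal Root-level structure; dictionary keys mirror the Protobuf field names.
# Illustrative stand-in only, not the binary wire format.
minimal_root = {
    "step_time": 100,                   # 100 ms between frames (10 Hz)
    "start_time": 1_700_000_000_000.0,  # optional, AV time domain (ms)
    "times": [                          # one TimeSlot per frame
        {"time": 0,   "ego": {"tracking_id": "ego", "kind": "KIND_VEHICLE",
                              "position": {"x": 0.0, "y": 0.0, "z": 0.0}}},
        {"time": 100, "ego": {"tracking_id": "ego", "kind": "KIND_VEHICLE",
                              "position": {"x": 1.4, "y": 0.0, "z": 0.0}}},
    ],
    "custom_data": [                    # log-level key-value metadata (hypothetical keys)
        {"key": "sw_version", "value": "1.2.3"},
        {"key": "mission", "value": "highway_loop"},
    ],
}

# Every frame timestamp should land on the step_time grid.
assert all(t["time"] % minimal_root["step_time"] == 0 for t in minimal_root["times"])
```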
97.2.2 TimeSlot message
Each TimeSlot represents the full state of the world at a single point in time. It contains the Ego state, all detected actors, and optionally lane markings and traffic light states.
| Field | Description | Required? |
|---|---|---|
| time | Frame timestamp in milliseconds, relative to log start. Must align to Root.step_time; the denoiser can correct misalignment. | Yes |
| ego | Ego state for this frame. | Yes |
| objects | List of all non-Ego actors (vehicles, VRUs, and other objects) in this frame. | No (omit if the Ego is driving without other actors) |
| traffic_lights | List of all traffic lights visible to the Ego. Required for scenarios where match or coverage depends on traffic light state (for example, protected vs. unprotected turns). | No (without it, the matcher cannot infer traffic light state, affecting junction scenarios and coverage) |
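Since TimeSlot timestamps must be strictly increasing and aligned to Root.step_time, a simple pre-ingestion sanity check can catch problems the denoiser would otherwise have to repair. A minimal sketch (the function name is ours, not part of the pipeline):

```python
def check_timeslot_times(times_ms, step_time_ms):
    """Return a list of problems found in a sequence of TimeSlot timestamps.

    times_ms: per-frame timestamps in milliseconds, relative to log start.
    step_time_ms: Root.step_time, the sampling interval in milliseconds.
    """
    problems = []
    for i, t in enumerate(times_ms):
        if t % step_time_ms != 0:
            problems.append(f"frame {i}: time {t} not aligned to step_time {step_time_ms}")
        if i > 0 and t <= times_ms[i - 1]:
            problems.append(f"frame {i}: time {t} not strictly increasing")
    return problems

print(check_timeslot_times([0, 100, 200, 300], 100))  # -> []
print(check_timeslot_times([0, 100, 100, 250], 100))  # duplicate and misaligned frames reported
```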
97.2.3 Object message
The Object message represents the state of a single actor at a specific frame. It is used for both the Ego and all other detected actors (vehicles, VRUs, and other objects).
| Field | Description | Required? |
|---|---|---|
| tracking_id | A unique identifier that tracks the object across frames, typically derived from the perception stack's track ID. Noise in this value across frames is permitted and can be fixed by the denoiser. | Yes |
| kind | A basic object classification corresponding to the ObjectKind enum. Noise in this value across frames is permitted and can be fixed by the denoiser. | Yes |
| position | The x, y, z coordinates of the object. | Yes (x and y required; z is optional, as elevation is ignored) |
| velocity | The x, y, z velocity values. | No (can be computed from position) |
| acceleration | The x, y, z acceleration values. | No (can be computed from velocity) |
| yaw | Yaw in radians. | No (can be computed from velocity for forward-moving objects) |
| length, width, height | Size of the object. | No (the denoiser can infer size per object type, but providing actual values improves match and metric accuracy) |
| confidence_info | Structured confidence scores for detection, classification, and localization, each in the range [0.0, 1.0]. | No (if omitted, all samples are weighted equally) |
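The table notes that velocity and yaw can be derived from position when omitted. The sketch below illustrates that derivation with simple forward differences; the denoiser's actual reconstruction adds smoothing on top of this, and the function name is ours.

```python
import math

def derive_motion(positions, step_time_ms):
    """Estimate per-frame velocity (m/s) and yaw (rad) from (x, y) positions.

    Uses forward differences and holds the previous estimate for the
    final frame; real denoising applies smoothing on top of this.
    """
    dt = step_time_ms / 1000.0  # sampling interval in seconds
    out = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
        out.append({"vx": vx, "vy": vy, "yaw": math.atan2(vy, vx)})
    if out:
        out.append(dict(out[-1]))  # reuse the previous estimate for the last frame
    return out

# An object moving along +x at 14 m/s, sampled every 500 ms:
motion = derive_motion([(0.0, 0.0), (7.0, 0.0), (14.0, 0.0)], 500)
print(motion[0])  # {'vx': 14.0, 'vy': 0.0, 'yaw': 0.0}
```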
97.3 Object list Protobuf definition
The following Protobuf definition is used to encode log data for Evaluation Pipeline ingestion.
```protobuf
// Copyright (c) 2021-2026 Foretellix Ltd. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in
// compliance with the License. You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing permissions and limitations under the
// License.

syntax = "proto3";

package ftx_re.proto.object_list;

message Data3d {
  double x = 1;
  double y = 2;
  double z = 3;
}

// Rough classification of objects.
// Detailed classification can be provided using the Object custom_data array.
enum ObjectKind {
  KIND_OBJECT = 0; // Unclassified object
  KIND_PERSON = 2;
  KIND_CYCLIST = 3;
  KIND_VEHICLE = 4;
  KIND_TRUCK = 5;
  KIND_TRAILER = 6;
  KIND_FOD = 7;
  KIND_ANIMAL = 8;
  KIND_SIGN = 10;
  KIND_BUS = 11;
  KIND_MOTORCYCLE = 12;
}

enum Utility {
  NONE = 0;
  EMERGENCY = 100; // Emergency actor such as an ambulance, fire truck, or police car
  SCHOOL = 200;    // School bus
}

message BoundingBox {
  // An 8-point vector.
  // Order convention:
  //   bottom-front-left, bottom-front-right, bottom-back-right, bottom-back-left,
  //   top-front-left, top-front-right, top-back-right, top-back-left
  repeated Data3d points = 1; // (x,y,z) coordinates in meters
}

message Pair {
  string key = 1;
  string value = 2;
}

message Confidence {
  optional double detection = 1;      // Confidence for object size, range [0.0, 1.0]
  optional double classification = 2; // Confidence for object type, range [0.0, 1.0]
  optional double localization = 3;   // Confidence for x,y location, range [0.0, 1.0]
}

message Object {
  // Object is anything detected by perception, except lane markings.
  // tracking_id is a unique identifier for the object identity, not to be reused within a trace.
  string tracking_id = 2;
  ObjectKind kind = 3;
  Data3d position = 4; // (x,y,z) coordinates in meters
  // Velocity, acceleration, jerk and angular_speed data are optional and can be omitted.
  // For moving objects these properties can be computed from position data.
  Data3d velocity = 5;      // in m/s
  Data3d acceleration = 6;  // in m/s^2
  Data3d jerk = 7;          // in m/s^3
  Data3d angular_speed = 8; // yaw rate of change in rad/s
  // Only yaw is mandatory for moving objects; all angles can be omitted for static objects.
  double yaw = 10;   // in radians
  double pitch = 11; // in radians
  double roll = 12;  // in radians
  // Lane number, if the object is on a road; optional.
  // Lane numbering is relative to the ego:
  //   0      - same as ego
  //   -1, -2 - slower than ego lane (right of ego in the US)
  //   1, 2   - faster than ego lane (left of ego in the US)
  //   100 represents an uninitialized field
  int32 lane = 15;
  double position_in_lane = 16; // in meters, measured from the center line; positive direction is toward the faster lane
  // Used in case bounding box data is omitted.
  double length = 17; // in meters
  double width = 18;  // in meters
  double height = 19; // in meters
  // Bounding box data is optional.
  // If present, length, width and height are computed from the bounding box data.
  BoundingBox bbox = 20;
  // Custom data passes object-specific information.
  // Used for checking, coverage collection and in support of custom modifiers.
  repeated Pair custom_data = 21; // Key-value string pairs. Keys should follow variable naming conventions
  reserved 22;
  bool is_stationary = 23;     // true if the object remains stationary throughout all time slots
  bool is_emergency_mode = 24; // true if the object is in emergency mode, e.g. lights flashing or siren on; also a school bus stopped for passengers
  Utility utility = 25;        // The utility of the object, e.g. an emergency vehicle
  string internal_id = 26;     // Internal field, should not be used
  string child_tracking_id = 27; // The tracking ID of the attached object being towed (e.g. a trailer)
  double front_hitch_point = 28; // Position of the hitch point on the attached object, in meters
  double back_hitch_point = 29;  // Position of the hitch point on the current object, in meters
  optional double confidence = 30 [deprecated = true]; // DEPRECATED: Use confidence_info instead. Confidence score for the object detection, in range [0.0, 1.0]
  optional Confidence confidence_info = 31; // Structured confidence scores for detection, classification, and localization, each in range [0.0, 1.0]
}

enum LaneKind {
  // Provide the kind if known from perception.
  LANE_UNKNOWN = 0;
  LANE_FORWARD_DRIVING = 1;
  LANE_ONCOMING_DRIVING = 2;
  LANE_ANY_DRIVING = 3; // E.g. a lane in a junction or parking lot
  LANE_SHOULDER = 4;    // Any non-drivable lane
}

enum LaneBoundaryKind {
  // Provide the kind if known from perception.
  BOUNDARY_UNKNOWN = 0;
  BOUNDARY_SOLID = 1;  // Solid line or divider - no passing
  BOUNDARY_DASHED = 2; // Passing allowed
}

enum TrafficLightState {
  TL_STATE_UNKNOWN = 0;        // Unknown state
  TL_STATE_INACTIVE = 1;       // Light is off
  TL_STATE_STOP_SIGN = 2;      // Light is flashing red (in the US) - like a stop sign
  TL_STATE_YIELD_SIGN = 3;     // Light is flashing yellow (US) - like a yield sign; also a flashing yellow arrow
  TL_STATE_GO = 4;             // Green light, unprotected (no arrow)
  TL_STATE_PROTECTED_GO = 5;   // Green arrow
  TL_STATE_STOP = 6;           // Red light
  TL_STATE_SLOW = 7;           // Yellow light
  TL_STATE_CHANGE_TO_GO = 8;   // Light is about to turn green (red and yellow together in IL)
  TL_STATE_CHANGE_TO_SLOW = 9; // Light is about to turn yellow (flashing green in IL)
}

enum TrafficLightDirection {
  TL_DIRECTION_UNKNOWN = 0;
  TL_DIRECTION_ALL = 1;
  TL_DIRECTION_STRAIGHT = 2;
  TL_DIRECTION_STRAIGHT_AND_LEFT = 3;
  TL_DIRECTION_STRAIGHT_AND_RIGHT = 4;
  TL_DIRECTION_LEFT = 5;
  TL_DIRECTION_RIGHT = 6;
  TL_DIRECTION_U_TURN = 7;
}

enum TrafficLightType {
  TL_TYPE_UNKNOWN = 0;
  TL_TYPE_VEHICLE = 1;
  TL_TYPE_PED = 2;
  TL_TYPE_BICYCLE = 3;
  TL_TYPE_RAILROAD = 4;
}

message LaneBoundary {
  LaneBoundaryKind kind = 1;
  Data3d boundary = 2; // (x,y,z) coordinate of the closest point on the boundary
  double distance = 3; // Lateral distance in meters from the boundary
}

message Lane {
  // Lane data from markings.
  // Lane ID: 0 is the ego lane,
  //   1 is the adjacent faster lane; 2, 3 are faster still
  //   -1 is the adjacent slower lane; -2, -3 are slower still
  int32 id = 1;
  LaneKind kind = 2;
  Data3d center = 3; // (x,y,z) coordinates in meters
  double width = 4;  // in meters
  LaneBoundary boundary_fast = 5; // Boundary with the faster lane (left in LHS driving)
  LaneBoundary boundary_slow = 6; // Boundary with the slower lane (right in LHS driving)
}

message TrafficLight {
  // Traffic light state in a specific time slot - the same TL ID can show up multiple times, once for each direction.
  // The traffic light ID needs to map to the lane the vehicle is on.
  // The traffic light direction needs to match the direction the vehicle is traveling in.
  string id = 1;                       // TL identifier on the map
  TrafficLightDirection direction = 2; // With respect to the vehicle direction
  TrafficLightState state = 3;         // Current TL state
  TrafficLightType type = 4;
}

message TimeSlot {
  uint32 time = 1;             // in milliseconds relative to start time
  Object ego = 2;              // The ego object
  repeated Object objects = 3; // The objects in the time slot, not including the ego
  repeated Lane lanes = 4;     // Optional perception lane detection data
  repeated TrafficLight traffic_lights = 5; // TL states for the time slot
}

message GlobalPosition {
  // WGS84 LLA format (e.g. GPS)
  double latitude = 1;
  double longitude = 2;
  double altitude = 3;
}

message LocalFrameOriginPosition {
  // Used to position the local frame origin and rotation with respect to the global frame.
  GlobalPosition lla = 1;
  double yaw = 2;
}

message RoiConfig {
  // Region of Interest (ROI) configuration,
  // defined by lists of rectangles to include in and exclude from the ROI.
  message Rect {
    double min_x = 1;
    double max_x = 2;
    double min_y = 3;
    double max_y = 4;
  }
  // [optional] List of rectangles to include in the ROI
  repeated Rect positive_rectangles = 1;
  // [optional] List of rectangles to exclude from the ROI
  repeated Rect negative_rectangles = 2;
}

message Root {
  bool is_absolute = 1;  // When true, all coordinates are global; when false, all coordinates are ego-relative
  uint32 step_time = 2;  // Time in milliseconds between cycles
  double start_time = 3; // The absolute time in milliseconds at which the data starts
  repeated TimeSlot times = 4; // Times start at 0 and are relative to start_time
  LocalFrameOriginPosition local_frame = 5; // Localization data of the local frame in the world
  int32 version = 10;
  double origin_start_time = 11; // DEPRECATED - use the custom_data array for all metadata
  repeated Pair custom_data = 12; // Global metadata describing log-level properties. Keys should follow variable naming conventions
  RoiConfig roi_config = 13; // Optional Region of Interest (ROI) configuration. Filled by the denoiser if ROI config is enabled
}
```
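The comment on `bbox` states that when bounding-box data is present, `length`, `width`, and `height` are computed from it. Under the documented 8-point corner ordering, that computation can be sketched as follows (the helper function is ours, not pipeline code):

```python
def dims_from_bbox(points):
    """Compute (length, width, height) from the 8-point BoundingBox.

    points follow the documented order:
      0 bottom-front-left, 1 bottom-front-right,
      2 bottom-back-right, 3 bottom-back-left,
      4..7 the same corners on the top face.
    Each point is an (x, y, z) tuple in meters.
    """
    def dist(a, b):
        return sum((pa - pb) ** 2 for pa, pb in zip(a, b)) ** 0.5
    length = dist(points[0], points[3])  # front-left to back-left edge
    width = dist(points[0], points[1])   # front-left to front-right edge
    height = dist(points[0], points[4])  # bottom to top edge
    return length, width, height

# Axis-aligned 4.5 m x 1.8 m x 1.5 m box:
bottom = [(4.5, 0.9, 0.0), (4.5, -0.9, 0.0), (0.0, -0.9, 0.0), (0.0, 0.9, 0.0)]
top = [(x, y, 1.5) for x, y, _ in bottom]
print(tuple(round(d, 6) for d in dims_from_bbox(bottom + top)))  # (4.5, 1.8, 1.5)
```

Because the distances are taken along box edges, the same computation also works for rotated (non-axis-aligned) boxes, as long as the corner ordering is respected.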