24. Error Classification, Severity Levels, and SLA Escalation Rules
This section classifies errors by category and assigns each error a priority based on its severity level.
24.1 Error log locations
- Primary Log: $ESR_KIT_WORK_DIR/logs/ref_kit_commands_{TIMESTAMP}.log
- Foretify Logs: Location specified in Foretify execution output
- Script Output: Standard error and standard output streams
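To locate the most recent primary log, the timestamped filename pattern above can be resolved programmatically. The following is a minimal sketch, assuming the `ref_kit_commands_{TIMESTAMP}.log` naming convention under `$ESR_KIT_WORK_DIR/logs`; the helper name `latest_primary_log` is illustrative, not part of the kit.

```python
import glob
import os


def latest_primary_log(work_dir=None):
    """Return the newest ref_kit_commands log under <work_dir>/logs, or None.

    Falls back to $ESR_KIT_WORK_DIR when work_dir is not given, mirroring
    the kit's documented log location.
    """
    work_dir = work_dir or os.environ.get("ESR_KIT_WORK_DIR")
    if not work_dir:
        return None  # corresponds to the "ESR_KIT_WORK_DIR env var not set" error below
    logs = glob.glob(os.path.join(work_dir, "logs", "ref_kit_commands_*.log"))
    return max(logs, key=os.path.getmtime) if logs else None
```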
24.2 SLA priority definitions
The following table describes each SLA priority level, its meaning, and the expected system impact.
| Priority Level | Name | Classification |
|---|---|---|
| SS | Show Stopper Priority | Service outage. Production workflows are unavailable; services must be restarted, or changes reverted and services restarted. |
| P0 | High Priority | A critical problem impacting the software where major functionality is unavailable or deviates significantly from expected behavior; workarounds may exist to allow progress. |
| P1 | Medium Priority | A problem that impacts the system in a minor or non-critical way. Functionality is not available or deviates from expected behavior. A workaround may be available and/or the problem occurs occasionally. |
| P2 | Low Priority | A minor problem. Functionality is available but deviates from expected behavior. A workaround is available and implemented, and/or the problem occurs infrequently. |
24.2.1 Volume-based escalation rules
To account for unusually high volumes of errors, P2 failures are escalated as indicated in the following table.
| % of P2 Failures (Relative to All Failures) | Escalated Category | Explanation |
|---|---|---|
| 0–10% | P2 | Normal volume of failures; treated as low priority issues. |
| 11–40% | P1 | Elevated failure volume; treated as medium priority issues due to increased impact. |
| > 40% | SS (Show Stopper) | High failure volume; considered service-impacting and categorized as Show Stopper. |
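The escalation table above can be expressed as a small lookup. This is a sketch of the rule only; the function name and the treatment of zero total failures (no escalation) are assumptions, not defined by the SLA.

```python
def escalate_p2(p2_failures: int, total_failures: int) -> str:
    """Map the share of P2 failures to an escalated category per the
    volume-based escalation table (0-10% -> P2, 11-40% -> P1, >40% -> SS)."""
    if total_failures == 0:
        return "P2"  # assumption: no failures means nothing to escalate
    pct = 100.0 * p2_failures / total_failures
    if pct <= 10:
        return "P2"
    if pct <= 40:
        return "P1"
    return "SS"
```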
24.3 Error messages
For ease of use, error messages are organized by category, with abstracted causes and reporting mechanisms. Priorities align with the SLA definitions above.
- Error messages include file paths, environment variable names, and log file locations
- Some errors provide additional context messages (e.g., "Check detailed logs at: {LOG_FILE}")
- Execution errors may include exit codes and duration information
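Because many errors append a context line of the form `Check detailed logs at: {LOG_FILE}`, that pointer can be extracted mechanically when triaging. A minimal sketch, assuming the context message format shown in the tables below; the regex and helper name are illustrative.

```python
import re

# Matches the context message "Check detailed logs at: {LOG_FILE}"
# as documented for parse/denoise/ingestion task failures.
CONTEXT_RE = re.compile(r"Check detailed logs at:\s*(\S+)")


def find_detailed_logs(log_text: str) -> list:
    """Return every detailed-log path referenced in the given log text."""
    return CONTEXT_RE.findall(log_text)
```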
25. Error Reference Documentation
This section categorizes errors encountered during execution, input validation, license checks, evaluation pipeline processing, and resource management.
25.1 Input Data Errors
| Error | Error Message in Logs | Impact | Priority |
|---|---|---|---|
| Missing required environment variable | "ESR_KIT_DRIVE_LOGS env var not set" "ESR_KIT_WORK_DIR env var not set" "FTX_ESR_KIT_HOME environment variable is not set" "FTX environment variable not set" | Script exits with non-zero code | P2 |
| Invalid environment variable path | "Directory {ESR_KIT_DRIVE_LOGS} does not exist" "ESR_KIT_DRIVE_LOGS does not exist: '{ESR_KIT_DRIVE_LOGS}'" | Script exits with non-zero code | P2 |
| Map file not found | "Error: Map file not found: {MAP_PATH}" "Map file {MAP_FILE} not found at: {MAP_FILE}" | Task fails, script may continue or exit | P2 |
| Invalid object list format | "Failed to read object list: {e}" "Object list step time is 0" | Task fails, may indicate missing .proto fields (object ID, timestamp, world state) | P2 |
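The missing-environment-variable errors above can be caught before launch with a pre-flight check. A sketch under the assumption that the four variables named in the table are the complete required set; the helper name is illustrative.

```python
import os

# Required variables, taken from the "Missing required environment
# variable" row above.
REQUIRED_ENV_VARS = (
    "ESR_KIT_DRIVE_LOGS",
    "ESR_KIT_WORK_DIR",
    "FTX_ESR_KIT_HOME",
    "FTX",
)


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty, so callers
    can fail fast instead of hitting the P2 errors at runtime."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```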
25.2 License Errors
These errors are logged in Foretify execution logs.
| Error | Error Message in Logs | Impact | Priority |
|---|---|---|---|
| License server connection failure | "[ERROR] License server error: Can't Connect to License Server (-15,3002) (connect timed out)" | Execution fails before test runs | SS |
| License checkout failure | "[ERROR] Foretify license error: Failed to checkout FTLX_EVALUATE_RUNTIME" | Execution fails before test runs | SS |
25.3 Evaluation Pipeline Errors
| Error | Error Message in Logs | Impact | Priority |
|---|---|---|---|
| Missing configuration file | "Configuration file {CONFIG_FILE} not found" | Script exits with non-zero code | P2 |
| Invalid configuration file format | "The configuration file must be a YAML file" | Script exits before execution | P2 |
| Missing required executable/binary | "FTX_EVAL_YAML_CONFIG_PARSER_EXE is not set" "$FTX_EVAL_YAML_CONFIG_PARSER_EXE does not exist at: {path}" "logiq_ingest binary not found" "foretify command not found" "interval_to_run_data binary not found at {path}" | Script exits with non-zero code | P2 |
| Invalid command-line arguments | "Error: Unknown option {option}" "Error: --input parameter not provided" "Error: --map must be specified" "Error: Either --object_list or --object_list_dir must be specified" "Error: --ingestion_runs must be specified" | Script exits with non-zero code | P2 |
| No tasks specified in configuration | "No tasks specified in configuration. Please specify at least one task." | Script exits with non-zero code | P2 |
| Object list file not found | "Object list directory not found: {DRIVE_LOGS_PATH}" "Object list file {OBJ_LIST} does not exist" "Error: Object list file not found: {OBJECT_LIST}" | Task fails, script may continue or exit | P2 |
| No object lists found in directory | "Error: No object lists found in directory: {OBJECT_LIST_DIR}" "Ingestion step failed - no object lists found to process" | Task fails | P2 |
| OSC configuration file not found | "Error: OSC configuration file not found: {OSC_FILE_PATH}" "OSC source file {INGESTION_OSC_PATH} does not exist" | Task fails | P2 |
| Evaluation scenarios file not found | "Error: Evaluation scenarios file not found: {EVAL_SCENARIOS_PATH}" | Task fails | P2 |
| ROI file not found | "ROI file {ROI} does not exist" | Task fails | P2 |
| MSP file not found | "MSP file {MSP} does not exist" | Task fails | P2 |
| Invalid denoise configuration | "Error: Config file not found at {DENOISE_CONFIG_PATH}" "Failed to setup configuration: {e}" "Invalid denoise level provided (not 0-4 or YAML file path)" | Task fails | P2 |
| Invalid ROI configuration | "Error: Config file not found at {ROI_CONFIG_PATH}" | Task fails | P2 |
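The two configuration-file errors at the top of this table (missing file, non-YAML file) can be checked before invoking the pipeline. A minimal sketch; the function name is illustrative, the extension check is an assumption (the real validation is done by the configuration parser), and the returned strings mirror the messages in the table.

```python
import os


def validate_config_path(config_file: str):
    """Pre-flight check for the evaluation configuration file.

    Returns an error string matching the messages documented above,
    or None when the path looks usable.
    """
    if not os.path.isfile(config_file):
        return "Configuration file {} not found".format(config_file)
    if not config_file.endswith((".yaml", ".yml")):
        return "The configuration file must be a YAML file"
    return None
```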
25.4 Resource Errors
| Error | Error Messages / Logs | Impact / Notes | Priority |
|---|---|---|---|
| Insufficient disk space | "No space left on device" "Directory creation failures" | Task fails | P2 |
| Permission denied | "Permission denied" errors "File write/read failures" | Task fails | P2 |
| Directory creation failure | "Error: Failed to create destination directory: {DEST_DIR}" "Error: Failed to create directory: {POST_MATCH_WORK_DIR}" | Task fails | P2 |
| Evaluate-service endpoint failure | Unable to establish a connection to the dedicated endpoint https://evaluate-nvidia.tests.foretellix.com/docs#/ | — | SS |
| Foretify Manager access failure | Unable to access Foretify Manager dedicated to NVIDIA users https://fmanager-nvidia.tests.foretellix.com/login | — | SS |
| Compute Cluster Crash (insufficient resources) | Unable to provision the desired number of resources to process pending jobs | — | P1 |
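The "No space left on device" failures above can often be avoided with a disk-space pre-flight check on the work directory. A sketch; the required-bytes threshold is caller-chosen (an assumption, not prescribed by the kit), and the helper name is illustrative.

```python
import shutil


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has at least
    `required_bytes` free, guarding against mid-task disk exhaustion."""
    return shutil.disk_usage(path).free >= required_bytes
```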
25.5 Execution Errors
| Error | Error Messages / Logs | Impact / Notes | Priority |
|---|---|---|---|
| Dispatcher failure | Failed to copy object list file from S3: <s3://file-url> | — | P1 |
| Parse task failure | "Error: Parsing failed" Context: "Check detailed logs at: {LOG_FILE}" | Task fails, evaluation may continue | P2 |
| Denoise task failure | "Error: Denoising failed" "Failed to denoise object list: {e}" "Denoiser error encountered. Exiting" Context: "Check detailed logs at: {LOG_FILE}" | Task fails | P2 |
| Ingestion task failure | "Error: Ingestion failed" "Command failed after {duration} seconds with exit code {exit_code}" "Failed to ingest: {object_list_file}" "All ingestions failed" Context: "Check detailed logs at: {LOG_FILE}" | Task fails, may report individual failures | P2 |
| EXT Match task failure | "Error: EXT Matching failed" "Command failed after {duration} seconds with exit code {exit_code}" "Failed to process: {dir_name}" "All processing operations failed" Context: "Check detailed logs at: {LOG_FILE}" | Task fails | P2 |
| Post-match task failure | "Error: Post-match failed" "Error: Failed processing {LAST_RUN} (exit code {exit_code})" "[ERROR] ActivatorPostMatchRun: No scenario found for label {label} interval_id={id}" Context: "Check detailed logs at: {LOG_FILE}" | Task fails | P2 |
| Foretify execution failure | "Foretify run failed" "Test failed." "Test is not executed. Output files will be written to {path}" Context: "Foretify log can be found at: {log_path}" | Task fails | P2 |
| No valid ingestion runs found | "Error: No valid ingestion run directories found in {INGESTION_RUNS}" "Error: No run directories found in {INGESTION_RUNS}" | Task fails | P2 |
| Ingestion runs directory not found | "Error: Ingestion runs directory not found: {INGESTION_RUNS}" | Task fails | P2 |
| Unknown task type | "Unknown task: {task}" | Task skipped, evaluation may continue | P2 |
| Overall evaluation failure | "Evaluation tasks failed, LOG: {LOG_FILE}" | Script exits with non-zero code | P2 |
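Several execution errors above report a duration and exit code ("Command failed after {duration} seconds with exit code {exit_code}"). A wrapper producing messages in that style can be sketched as follows; the function name and message wording are modelled on the table, not taken from the kit's scripts.

```python
import subprocess
import time


def run_task(cmd):
    """Run a task command, reporting failures in the style of the
    execution-error messages above, and return its exit code."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    duration = round(time.monotonic() - start, 1)
    if result.returncode != 0:
        print("Command failed after {} seconds with exit code {}".format(
            duration, result.returncode))
    return result.returncode
```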