524. Dispatcher Configuration
The dispatcher is configured using a configuration file that sets environment variables within the dispatcher process. This file is named dispatcher.env, and its location is provided to the dispatcher executable using the -c command-line option.
The following table shows the dispatcher configuration settings.
| Name | Description |
|---|---|
| HTTP_PORT | The port used for the REST API. Foretify Manager connects to the dispatcher on this port. Default: 8081. |
| GRPC_PORT | The port used for the orchestrator API. Orchestrators, such as the Kubernetes orchestrator, communicate with the dispatcher through this port. Default: 8085. |
| DISCONNECTED_JOB_TIMEOUT | The time (in seconds) a job can remain active without being connected to any orchestrator before its status changes to "error." Default: 3600. |
| DEFAULT_ORCHESTRATOR_ID | The ID of the default orchestrator. If not specified, the first orchestrator in the list is used. |
| LOG_LEVEL | Specifies the logging verbosity level (error, warn, info, debug). Default: info. |
| LOG_DIRECTORY | The directory where log files are stored. Default: log. |
| LOG_MAX_DAYS | Determines the maximum number of days to retain log files. Log files are rotated daily. Leave blank to keep all log files. Default: 14. |
| LOG_TO_CONSOLE | Specifies whether logs should be written to the console. Default: true. |
| DATABASE_TYPE | Defines the type of database to use (postgresql, none). Default: postgresql. |
| DATABASE_URL | The connection string/URL of the database, typically pointing to the Foretify Manager PostgreSQL database. |
| SKIP_SCHEMA_VERSION_VALIDATION | A flag to indicate whether schema version validation for test runs uploaded by the dispatcher should be skipped. Setting this to true allows uploads without schema version restrictions. Default: false. |
| DISPATCHER_ID | The ID used to identify the dispatcher among others. This value must be unique per dispatcher. It is a required field. Used in the license permission flow. |
| SOFT_LICENSE_LIMIT | The soft license limit for the dispatcher. When the number of used licenses is below this value, the dispatcher is prioritized to acquire a permission license. The default value is 80. It must be a positive number less than or equal to TOTAL_LICENSES. This value can be the same or different for dispatchers sharing the licenses. Used in the license permission flow. |
| TOTAL_LICENSES | The total number of licenses available for all dispatchers. The default value is 100. It must be a positive number greater than or equal to the highest SOFT_LICENSE_LIMIT. This value must be the same for all dispatchers sharing the licenses. Used in the license permission flow. |
| LICENSE_USAGE_DATABASE_URL | The URL pointing to the shared database used by all dispatchers for license granting logic. This value must be the same for all dispatchers sharing the licenses. It is a required field. Used in the license permission flow. |
| LICENSE_USAGE_TIMEOUT | The idle timeout in seconds before resetting license usage for a dispatcher. The default value is 14400 (4 hours). It must be a positive number. This value must be the same for all dispatchers sharing the licenses. Used in the license permission flow. |
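The constraints on the license settings in the table above can be checked before a dispatcher is started. The following is a minimal sketch, not part of the product. It assumes that dispatcher.env holds simple NAME=VALUE lines and that the numeric settings are whole numbers; the function names parse_env and check_license_settings are hypothetical.

```python
def parse_env(text):
    """Parse NAME=VALUE lines (assumed file format).

    Blank lines, '#' comments, and lines without '=' are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def check_license_settings(settings):
    """Return a list of problems with the license-related settings."""
    problems = []
    # Both fields are described as required in the table.
    for required in ("DISPATCHER_ID", "LICENSE_USAGE_DATABASE_URL"):
        if not settings.get(required):
            problems.append(f"{required} is required")
    try:
        # Defaults are the documented ones: 100, 80, and 14400.
        total = int(settings.get("TOTAL_LICENSES") or 100)
        soft = int(settings.get("SOFT_LICENSE_LIMIT") or 80)
        timeout = int(settings.get("LICENSE_USAGE_TIMEOUT") or 14400)
    except ValueError:
        problems.append("license settings must be whole numbers")
        return problems
    if total <= 0:
        problems.append("TOTAL_LICENSES must be positive")
    if soft <= 0 or soft > total:
        problems.append("SOFT_LICENSE_LIMIT must be positive and <= TOTAL_LICENSES")
    if timeout <= 0:
        problems.append("LICENSE_USAGE_TIMEOUT must be positive")
    return problems
```

A check like this only covers a single file. Settings that must match across dispatchers sharing the licenses (TOTAL_LICENSES, LICENSE_USAGE_DATABASE_URL, LICENSE_USAGE_TIMEOUT) still need to be compared between the files of all dispatchers.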
524.1 Kubernetes Orchestrator Configuration
The Kubernetes orchestrator is configured using a configuration file in JSON format. The structure of the configuration file is defined in configuration.proto, which is provided with the orchestrator installation.
524.1.1 General configuration options
The following table shows the general configuration settings.
| Name | Description |
|---|---|
| name | The name of the orchestrator. |
| id | A unique identifier for the orchestrator. If multiple orchestrators are configured, each must have a distinct ID. |
| path | A colon-separated list of paths to add to the PATH environment variable for all jobs. |
| foretifyJobRunTimeout | The default run timeout (in seconds) for Foretify jobs. If set to 0, no run timeout is applied. This value can be overridden by individual Foretify job definitions. |
| pluginJobRunTimeout | The default run timeout (in seconds) for plugin jobs. If set to 0, no run timeout is applied. This value can be overridden by individual plugin job definitions. |
| foretifyJobMaximumRunTime | The maximum run time (in seconds) allowed for Foretify or plugin jobs by default. |
| maximumJobs | The maximum number of jobs that can run simultaneously in the cluster. |
| maximumJobRetries | The maximum number of retry attempts for a disrupted job. |
| licenseLimits | A mapping of licensed feature strings to their respective maximum number of simultaneous jobs allowed in the cluster. This prevents exceeding the limits of licensed features. |
| askLicensePermission | Enables the dispatcher license permission flow. When set to true, the maximumJobs setting is ignored, because job limits are managed by the dispatcher's SOFT_LICENSE_LIMIT. The orchestrator still enforces job type limits and license limits. Default: false. Used in the license permission flow. |
| toolsImage | The fully qualified name of the ftx_tools Docker image. See the Tools Image section below for more information. |
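The interaction between maximumJobs, licenseLimits, and askLicensePermission described above can be illustrated with a small sketch. This is an illustration of the documented rules only, not the orchestrator's actual implementation: the function can_start_job, its arguments, and the feature name feat_a are hypothetical, and the permission request to the dispatcher is not modeled.

```python
def can_start_job(config, running_jobs, running_by_feature, job_features):
    """Return True if a new job fits within the configured limits.

    config: dict holding the general options (maximumJobs,
            licenseLimits, askLicensePermission).
    running_jobs: number of jobs currently running in the cluster.
    running_by_feature: feature string -> number of running jobs using it.
    job_features: feature strings the new job requires.
    """
    # maximumJobs is ignored when the license permission flow is enabled.
    if not config.get("askLicensePermission", False):
        if running_jobs >= config["maximumJobs"]:
            return False
    # License limits are enforced in both modes.
    limits = config.get("licenseLimits", {})
    for feature in job_features:
        if feature in limits and running_by_feature.get(feature, 0) >= limits[feature]:
            return False
    return True


config = {"maximumJobs": 2, "licenseLimits": {"feat_a": 1}}
print(can_start_job(config, 1, {}, ["feat_a"]))             # True
print(can_start_job(config, 1, {"feat_a": 1}, ["feat_a"]))  # False
```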
524.1.2 Kubernetes configuration options
The following table shows the Kubernetes configuration options.
| Name | Description |
|---|---|
| kubeconfig | The path to the kubeconfig file used to access the Kubernetes API. If running within the cluster, set this to "in-cluster" to load the in-cluster kubeconfig. |
| namespace | The Kubernetes namespace that the orchestrator should use for all jobs and associated resources. |
| systemNodeLabel | Specifies a node selector label to prevent Foretify and plugin jobs from being assigned to system nodes. Choose a label that is used on all system nodes, and set both the key and value. See the example below for reference. |
| gpuNodeLabel | Specifies a node selector label to ensure that Foretify or plugin jobs requiring a GPU are assigned to nodes with GPUs. Choose a label that is applied to all GPU-equipped nodes, and set both the key and value. See the example below for reference. |
| cpuTolerations | Defines tolerations for jobs that do not require a GPU. |
| gpuTolerations | Defines tolerations for jobs that require a GPU. |
| numberOfPriorities | The number of Kubernetes priorities to use. |
| basePriority | The base priority value. The orchestrator uses priority values in the range [basePriority, basePriority + numberOfPriorities). |
| maximumImagePullAttempts | The maximum number of retry attempts for pulling an image after an image pull error occurs. |
| cpuLimitMultiplier | A multiplier used to set the CPU limit for all containers. A value of 0 means no limit is set. A value >= 1 sets the limit as cpu_request * value. |
| runAsNonRoot | If set to true, all containers will have their security context's run_as_non_root field set to true. |
| privileged | If set to true, all containers will have their security context's privileged field set to true. |
"systemNodeLabel": {
    "key": "mode",
    "value": "system"
}

"gpuNodeLabel": {
    "key": "nvidia.com/gpu",
    "value": "true"
}
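The arithmetic behind basePriority, numberOfPriorities, and cpuLimitMultiplier can be sketched as follows. The function names are hypothetical, and the handling of multiplier values between 0 and 1 (rejected here) is an assumption, since the table only describes 0 and values >= 1.

```python
def priority_values(base_priority, number_of_priorities):
    """Priority values in [basePriority, basePriority + numberOfPriorities)."""
    return list(range(base_priority, base_priority + number_of_priorities))


def cpu_limit(cpu_request, multiplier):
    """Return the CPU limit for a container, or None when no limit is set."""
    if multiplier == 0:
        return None  # 0 means no CPU limit
    if multiplier < 1:
        # Assumption: values between 0 and 1 are treated as invalid.
        raise ValueError("cpuLimitMultiplier must be 0 or >= 1")
    return cpu_request * multiplier


print(priority_values(1000, 3))  # [1000, 1001, 1002]
print(cpu_limit(2.0, 1.5))       # 3.0
```

The upper bound of the priority range is exclusive, so numberOfPriorities set to 3 yields exactly three values.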
524.1.3 Volumes options
The volumes section of the configuration file holds an array of volume definitions. The following are the typical volumes used:
- jobs: The jobs directory (required).
- shared: The shared directory (required).
A volume definition consists of the following options:
| Name | Description |
|---|---|
| name | The name of the volume. |
| localPath | The path to the directory on the Kubernetes orchestrator machine (if applicable). |
| podPath | The path where the directory will be mounted inside the pod. For jobs and shared volumes, this value should match the localPath value. |
| gpuOnly | Indicates whether the volume should be used exclusively with GPU nodes. |
| nfs | For volumes mounted using NFS, this section specifies the directory path on the NFS server and the server's hostname or IP address. |
| hostPath | For volumes mounted from the node, this section specifies the directory path on the node. |
| persistentVolumeClaim | For volumes that use a persistent volume claim (PVC), this section specifies the PVC. |
| configMap | For volumes that use a config map, this section specifies the config map. |
| s3 | For volumes referring to an AWS S3 bucket, this section specifies the AWS S3 bucket. |
| gcs | For volumes referring to a Google Cloud Storage (GCS) bucket, this section specifies the GCS bucket. |
| azureBlob | For volumes referring to an Azure Blob Storage, this section specifies the Azure blob. |
| azureFile | For volumes referring to an Azure File system, this section specifies the Azure File system. |
{
    "name": "jobs",
    "localPath": "/clusters/prod01/jobs",
    "podPath": "/clusters/prod01/jobs",
    "nfs": {
        "path": "/jobs",
        "server": "nfs-server-01"
    }
}

{
    "name": "nvidia",
    "podPath": "/usr/local/nvidia",
    "gpuOnly": true,
    "hostPath": {
        "path": "/usr/local/nvidia"
    }
}
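Two rules from this section lend themselves to an automated check: the jobs and shared volumes are required, and for both of them podPath should match localPath. The sketch below assumes the volumes array has already been loaded from the JSON configuration file into a list of dicts; check_volumes is a hypothetical helper, not a shipped tool.

```python
REQUIRED_VOLUMES = ("jobs", "shared")


def check_volumes(volumes):
    """Return a list of problems found in an array of volume definitions."""
    problems = []
    by_name = {volume.get("name"): volume for volume in volumes}
    for name in REQUIRED_VOLUMES:
        volume = by_name.get(name)
        if volume is None:
            problems.append(f"missing required volume: {name}")
        elif volume.get("podPath") != volume.get("localPath"):
            # For jobs and shared, podPath should match localPath.
            problems.append(f"{name}: podPath should match localPath")
    return problems


volumes = [
    {
        "name": "jobs",
        "localPath": "/clusters/prod01/jobs",
        "podPath": "/clusters/prod01/jobs",
        "nfs": {"path": "/jobs", "server": "nfs-server-01"},
    },
]
print(check_volumes(volumes))  # ['missing required volume: shared']
```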
524.1.4 Environment Variables
The environment variables section holds an array of environment variable definitions. These environment variables will be set in the environment when a job runs.
| Name | Description |
|---|---|
| name | The name of the environment variable. |
| value | The value of the environment variable. |
| jobType | The job type (foretify or plugin) that the environment variable is used with. An empty string indicates that the environment variable is used with both foretify and plugin jobs. |
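The jobType rule can be illustrated with a short sketch that selects the variables applying to a given job. The function environment_for and the variable names in the example are hypothetical; treating a missing jobType field like an empty string is also an assumption.

```python
def environment_for(job_type, definitions):
    """Collect the variables that apply to a job of the given type."""
    env = {}
    for definition in definitions:
        # An empty jobType applies to both foretify and plugin jobs.
        # Assumption: a missing jobType is treated like an empty string.
        if definition.get("jobType", "") in ("", job_type):
            env[definition["name"]] = definition["value"]
    return env


definitions = [
    {"name": "SHARED_DIR", "value": "/clusters/prod01/shared", "jobType": ""},
    {"name": "PLUGIN_MODE", "value": "batch", "jobType": "plugin"},
]
print(environment_for("foretify", definitions))
# {'SHARED_DIR': '/clusters/prod01/shared'}
```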
524.1.5 Results
The Results options are as follows.
| Name | Description |
|---|---|
| useLocalDirectory | Set to true or false. Indicates whether Foretify job results should be written locally while the job is running, then copied to a permanent location. Enabling this option may reduce load on the shared file system. |
| volume | Defines the volume where job results should be copied once a job completes. Currently, the volume name should be set to jobs to indicate the jobs volume. |
524.1.6 Docker Registries
The Docker registries section defines one or more Docker registries. Each Foretify and plugin job definition names the Docker registry from which its Docker image is pulled.
| Name | Description |
|---|---|
| name | The name of the Docker registry. This value is used in Foretify and plugin job definitions. |
| url | The URL of the Docker registry. |
| secret | If authentication is required for the Docker registry, a Kubernetes Docker registry secret should be created, and its name should be specified here. |
524.1.7 Foretify
The Foretify options are as follows.
| Name | Description |
|---|---|
| licenseServer | The hostname or IP address and port of the Foretify license server in the format: port@host. |
| defaultImage | The default Docker image name and tag to be used for Foretify jobs if not specified in the Foretify job definition. |
| computeRequirements | The default compute requirements for the Foretify container if not specified in the Foretify job definition. |
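Note that the licenseServer format puts the port first. A minimal sketch of validating such a value is shown below; parse_license_server is a hypothetical helper, and the host name and port in the example are placeholders.

```python
def parse_license_server(value):
    """Split a 'port@host' string into (host, port)."""
    port_text, separator, host = value.partition("@")
    if not separator or not host or not port_text.isdecimal():
        raise ValueError(f"expected port@host, got {value!r}")
    port = int(port_text)
    if not 0 < port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


print(parse_license_server("27000@license-host"))  # ('license-host', 27000)
```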
524.1.8 Plugin
The Plugin options are as follows.
| Name | Description |
|---|---|
| computeRequirements | The default compute requirements for the plugin job container, if not specified in the plugin job definition. |
524.1.9 Logging
The logging options are as follows.
| Name | Description |
|---|---|
| logToConsole | Set to true or false. Indicates whether the Kubernetes orchestrator should log to the console. |
| logDirectory | The path where log files will be written. Log files are rotated daily. |
| maximumDays | The maximum number of log files to keep. Set to 0 to keep all files. |
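Because log files are rotated daily, maximumDays effectively limits how many days of logs are retained. The retention rule can be sketched as follows; expired_logs is a hypothetical function that works on the dates of the daily log files, since the actual log file naming is not described here.

```python
from datetime import date


def expired_logs(log_dates, maximum_days):
    """Return the dates of daily log files that exceed the retention limit."""
    if maximum_days == 0:
        return []  # 0 keeps all files
    newest_first = sorted(log_dates, reverse=True)
    # Keep the newest maximum_days files; everything older is expired.
    return newest_first[maximum_days:]


dates = [date(2024, 1, day) for day in range(1, 6)]
print(expired_logs(dates, 3))
# [datetime.date(2024, 1, 2), datetime.date(2024, 1, 1)]
```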
524.1.10 Job Configuration
Job configurations are defined as follows.
| Name | Description |
|---|---|
| name | The name of the job configuration. |
| tolerations | Tolerations applied for this configuration. |
| nodeLabels | Node selector labels applied for this configuration. |
| nodeLabelRequirements | Node selector requirements applied for this configuration. |
| maximumJobs | The maximum number of simultaneous jobs that can be created for this configuration. |
| podLabels | Labels to add to pods that use this configuration. |
| disabled | Indicates whether jobs using this configuration are prevented from running. Useful for disabling specific job types. |
524.1.11 Appendix
524.1.11.1 Tools Image
The tools image holds various utilities used by jobs, such as utilities for reading and writing to cloud storage. To build the tools image, run the following from within the Foretify Manager installation:
cd dispatcher/k8s-orchestrator/tools/ftx_tools
./build.sh
After building the Docker image, tag the image with the Docker registry used by Kubernetes, and then push it to the registry.
docker tag ftx_tools:latest <registry>/ftx_tools:latest
docker push <registry>/ftx_tools:latest
Finally, set the toolsImage field of the configuration file to the fully qualified image name (e.g. <registry>/ftx_tools:latest).