Task submission mechanism

The picture illustrates the complete task submission mechanism, which has three parts:

  • the forge of the scripts
  • the submission as-the-user
  • the batch submission

The first part of the submission mechanism is the forge.

The forge produces a task-specific submission script for each phase (prolog, execute, epilog) of a task. Each of these scripts results in a separate job.

Submission script

The submission script is a standalone shell script that can be submitted to the batch scheduler.

It does not depend on any other file (such as the application script): it contains both the mechanisms that configure the environment for the application script and the application script itself.

The submission script is part of the scheduler configuration and is stored on GridFS (e.g. clusters/rnd01/forge).
See script_execute and script_common for the templates.
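As an illustration of how a standalone submission script could be assembled, here is a minimal Python sketch. It is not the actual forge: the function name forge, the heredoc layout, and the HPCG_APP_EOF delimiter are assumptions made for this example, and the real templates are script_execute and script_common.

```python
# Hypothetical sketch of the forge step; names and layout are assumptions.
PHASES = ("prolog", "execute", "epilog")


def forge(environment_setup, app_scripts):
    """Return one standalone shell script per phase.

    environment_setup: shell text that configures the environment.
    app_scripts: dict mapping a phase name to its application script text.
    """
    scripts = {}
    for phase in PHASES:
        if phase not in app_scripts:
            continue  # no application script defined for this phase
        scripts[phase] = "\n".join([
            "#!/bin/sh",
            "# --- environment configuration ---",
            environment_setup.rstrip("\n"),
            "# --- embedded application script, written out then executed ---",
            "APP_SCRIPT=$(mktemp)",
            # The quoted delimiter keeps the shell from expanding the body.
            "cat > \"$APP_SCRIPT\" <<'HPCG_APP_EOF'",
            app_scripts[phase].rstrip("\n"),
            "HPCG_APP_EOF",
            'chmod +x "$APP_SCRIPT"',
            '"$APP_SCRIPT"',
            "",
        ])
    return scripts


scripts = forge(
    "export APP_HOME=/opt/app",
    {"execute": "#!/bin/sh\necho running"},
)
print(scripts["execute"])
```

Embedding the application script in a quoted heredoc keeps the result a single file, so the batch scheduler needs nothing else at run time. The application script keeps its own interpreter line, which allows languages other than shell.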

Application script

The application script is the script that contains the code that actually launches an application.

Each application has a separate script for each phase. These scripts can be written in shell or in any other interpreted language.

The application script is part of the application definition and is stored on GridFS (e.g. applications/675A865A765A65765A/app_execute).

Task parameters and inputs

The task parameters and inputs are the parameters that are given by the user when submitting the task.

The parameters and inputs are stored in the task JSON, which is stored in the database.

The parameters and inputs are used to replace the markup that appears in the submission scripts and the application scripts (e.g. @@HPCG_RUN_DIR@@).

  "inputs": [
      {
          "autoCheck": false,
          "description": "The time the test script will sleep before exiting",
          "editable": true,
          "hidden": false,
          "label": "Sleep value",
          "labelVisible": true,
          "name": "sleep",
          "order": 1,
          "required": true,
          "strict": false,
          "type": "int",
          "valid": true,
          "value": 20
      },
      {
          "autoCheck": false,
          "description": "Select input files that will be copied in the run directory",
          "editable": true,
          "files": [
              {
                  "action": "COPY",
                  "mods": "----------",
                  "path": "/home/hpcgadmin/test",
                  "server": {
                      "id": "59b1302f6eea4a7e979b2c56",
                      "name": "rnd01"
                  },
                  "size": 42
              }
          ],
          "hidden": false,
          "label": "Input files",
          "labelVisible": true,
          "name": "input_files",
          "order": 2,
          "required": true,
          "type": "file",
          "valid": true
      }
  ]
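The markup replacement could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the actual implementation: it assumes that each marker is the upper-cased input name wrapped in @@ (for example @@SLEEP@@ for the input "sleep") and that file inputs expand to their paths. The functions build_values and render are hypothetical names.

```python
import re

# Assumption: markers look like @@NAME@@; optional whitespace inside the
# delimiters is tolerated.
MARKER = re.compile(r"@@\s*([A-Za-z0-9_]+)\s*@@")


def build_values(task):
    """Map upper-cased input names to string values (hypothetical convention)."""
    values = {}
    for item in task.get("inputs", []):
        if item["type"] == "file":
            # For file inputs, join the selected paths with spaces.
            value = " ".join(f["path"] for f in item.get("files", []))
        else:
            value = str(item.get("value", ""))
        values[item["name"].upper()] = value
    return values


def render(template, values):
    """Replace every known marker and leave unknown markers untouched."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))
    return MARKER.sub(substitute, template)


# Reduced version of the task JSON shown above.
task = {"inputs": [
    {"name": "sleep", "type": "int", "value": 20},
    {"name": "input_files", "type": "file",
     "files": [{"path": "/home/hpcgadmin/test"}]},
]}
print(render("cp @@INPUT_FILES@@ . && sleep @@SLEEP@@", build_values(task)))
# prints: cp /home/hpcgadmin/test . && sleep 20
```

Leaving unknown markers in place, rather than replacing them with an empty string, makes a missing parameter easy to spot in the generated script.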

Cluster environment

The cluster environment is a shell script stored in GridFS that sets up the environment needed to run the application on a specific cluster (typically the application home directory).

The cluster environment is part of the application configuration and is stored on GridFS (e.g. applications/675A865A765A65765A/cluster/rnd01).

Submission mediator

The submission mediator is a Python script that performs the actual submission of the submission script to the batch system. The mediator waits for the submission to complete and returns the job ID.
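The mediator's behaviour (submit, wait, return the job ID) can be sketched as below. This is an illustrative sketch, not the real mediator: the function submit, its parameters, and the job ID pattern are assumptions. The batch command is passed in as an argument because this document does not name the scheduler, and a stand-in command replaces it so that the example runs anywhere.

```python
import re
import subprocess
import sys


def submit(script_text, submit_cmd, job_id_pattern=r"(\d+)"):
    """Pipe the script to submit_cmd on stdin, wait, and return the job ID."""
    result = subprocess.run(
        submit_cmd, input=script_text, capture_output=True, text=True
    )
    if result.returncode != 0:
        raise RuntimeError("submission failed: " + result.stderr.strip())
    # Assumption: the job ID is the first group matched in the command output.
    match = re.search(job_id_pattern, result.stdout)
    if match is None:
        raise RuntimeError("no job ID in output: " + result.stdout.strip())
    return match.group(1)


# Stand-in for a real batch submission command (hypothetical output format).
fake_scheduler = [
    sys.executable, "-c",
    "import sys; sys.stdin.read(); print('Submitted job 4242')",
]
print(submit("#!/bin/sh\nsleep 20\n", fake_scheduler))  # prints: 4242
```

Passing the script on standard input avoids writing a temporary file. Whether the real batch command reads its script from standard input is an assumption of this sketch.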