Task Monitor

Applications submitted through the HPC Gateway are tracked with the Task Monitor tool. Observe the progress of each job through the system, and plot simulation results with the dynamic Data Monitor feature.



Select Task Monitor from the start menu tool list.

This window displays all jobs that have been run by the connected account. Use the controls at the bottom of the pane to move through the collection of job entries.

ASIDE:
HPC Gateway maintains a permanent store of all activity that passes through its platform. This provides the raw material for comprehensive analytics and accumulation of business intelligence information.

Authorised accounts may also have restricted views on the jobs of other users. Such a view shows at most the run status of the job. The Task Monitor does not grant access to the data files of jobs not owned by the connected account.

The columns of the Task Monitor can be hidden and reordered. Entries can be sorted according to the values in a selected column.



Clicking on the arrowhead next to the task name will reveal further components of the full task. In practice each HPC Gateway task can comprise up to three individual batch jobs representing phases of the overall task.

  1. Prolog phase – A serial job generally used for validation of inputs and environment, checking available licenses, etc. If this stage generates an abort condition then the following stages will not be run.
  2. Execute phase – The main body of the task comprising the core application execution. This job can be run in parallel mode.
  3. Epilog phase – A serial job for cleanup, file archiving, automated reporting, etc.

The Execute phase is generally the minimum component of a task. Other phases can be considered optional.
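The phase ordering described above can be illustrated with a minimal Python sketch. It is not the Gateway's implementation: the function name run_task, the phase keys and the convention that a phase callable returns False to signal an abort condition are all assumptions made for illustration. Only the rule stated above is modelled, namely that an abort in the Prolog phase prevents the later phases from running.

```python
from typing import Callable, Dict, List

# Fixed order of the three phases described in the text.
PHASE_ORDER = ["prolog", "execute", "epilog"]


def run_task(phases: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run the supplied phases in order and return the names of those that ran.

    Hypothetical convention: each callable returns True on success and
    False to signal an abort condition.
    """
    ran: List[str] = []
    for name in PHASE_ORDER:
        if name not in phases:
            continue  # phases that are not defined are simply skipped
        ok = phases[name]()
        ran.append(name)
        if name == "prolog" and not ok:
            break  # Prolog abort: the following stages are not run
    return ran
```

For example, a task whose Prolog check fails (say, no license available) would report only the Prolog phase as having run.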

Note that the ID for the overall task is different from that of its component jobs, and that each phase ID is different.

  • The task ID is an attribute assigned by the Gateway.
  • The component phase IDs are the job identifiers assigned by the underlying batch resource manager.
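The relationship between the two kinds of identifier can be pictured as a small record. The sketch below is illustrative only: the class name TaskRecord, its field names and the ID values used with it are hypothetical and do not reflect the Gateway's internal data model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TaskRecord:
    # Identifier assigned by the Gateway to the overall task.
    task_id: str
    # Phase name -> job identifier assigned by the batch resource manager.
    job_ids: Dict[str, str] = field(default_factory=dict)

    def ids_are_distinct(self) -> bool:
        """Check that the task ID and every phase job ID differ from each other."""
        all_ids = [self.task_id, *self.job_ids.values()]
        return len(set(all_ids)) == len(all_ids)
```

A record such as TaskRecord("T-1001", {"prolog": "48210", "execute": "48211", "epilog": "48212"}) (made-up values) carries one Gateway task ID and three separate batch job IDs.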



Right-clicking on a Task Monitor entry shows the actions that can be performed.

The number of options can vary, though there are two basic actions:

  • Open Task Details reveals a new window that gives detailed system information for the jobs of the task. A double-click on the task entry performs the same action.
  • Open Data Monitor displays a table and charting tools for specific data items extracted from result files of the job. This feature is detailed separately below.

The Task Details window has several sections.

Information section

Summary attributes of the complete task, including the name and description that were provided in the task profile.

Application Parameters section

Shows the input parameters as provided in the task profile. This is useful to validate that the task being monitored corresponds to what was expected at submission time.

File type parameters come with a direct file selector to further validate that the input data was as intended.

Output parameters are also shown where defined. Whether explicit output streams are present depends on how the method was designed. Output file streams can be useful to copy result files to a specific location at the end of the task, e.g. an archive.

Execution Details section

Jobs:
Lists the component phases of the task.

  • Click the Open Job Details icon (right) to see the full system details for the individual job.
  • Click the Open <name>.out icon (left) to reveal the actual standard output file from the job script.

Context:
Provides the job timing and scheduler options. It also gives a file selector to open the directory containing all job log files – batch logs, standard output, standard error, status files.

Running information:
Gives low-level details that are not normally relevant to standard job monitoring.



Application methods developed for the HPC Gateway can incorporate a data extraction script to locate and tabulate key metrics from result files of the simulation executable. This tabular data is displayed directly within the Data Monitor. For running jobs the table is automatically updated at regular intervals with any new data detected.
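As a rough idea of what such an extraction script might do, the following Python sketch tabulates metrics from the text of a result file. The log format (lines containing name=value pairs such as "iter=1 residual=1.0e-01") and the function name extract_metrics are assumptions for illustration; real methods define their own formats and scripts.

```python
import re
from typing import Dict, List

# Matches hypothetical "name=number" pairs, e.g. "iter=10" or "residual=1.2e-03".
PAIR = re.compile(r"(\w+)=([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)")


def extract_metrics(text: str) -> List[Dict[str, float]]:
    """Return one table row (metric name -> value) per line that holds metrics."""
    rows: List[Dict[str, float]] = []
    for line in text.splitlines():
        pairs = PAIR.findall(line)
        if pairs:  # lines without any name=value pair are ignored
            rows.append({name: float(value) for name, value in pairs})
    return rows
```

Running the same function again on a growing result file would pick up any newly written rows, which is one simple way a table could be refreshed while a job is still running.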

The user can then easily set up charts to compare one or more metrics. Usually a simulated property would be plotted on the vertical axis against iteration or time on the horizontal axis. Multiple curves may be defined in the same plot, though they should have comparable scales.
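One informal way to judge whether curves have comparable scales is to compare their peak magnitudes before placing them on the same plot. The Python sketch below shows this heuristic; the function name comparable_scales and the default threshold of 100 are arbitrary assumptions and are not part of the Data Monitor.

```python
from typing import Dict, List


def comparable_scales(curves: Dict[str, List[float]], max_ratio: float = 100.0) -> bool:
    """Return True if the peak magnitudes of all curves lie within max_ratio of each other."""
    # Peak absolute value of each non-empty curve.
    peaks = [max(abs(v) for v in values) for values in curves.values() if values]
    # Curves that are identically zero carry no scale information.
    peaks = [p for p in peaks if p > 0]
    if len(peaks) < 2:
        return True
    return max(peaks) / min(peaks) <= max_ratio
```

Under this heuristic a temperature around 300 and a residual around 0.001 would not be considered comparable, so they would be better shown in separate plots.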




Return to topics page if followed as part of the training programme.

Return to Quickstart page.