Developer.TACC.cloud

Developer documentation for Agave, Abaco, and other TACC APIs

Wrapper Templates

To run your application, you will need to create a wrapper template that calls your executable code. The wrapper template is a simple script that Agave will filter and execute to start your app. Filtering means that Agave injects runtime values from a job request into the script, replacing the template variables that represent the inputs and parameters of your app.

The order in which wrapper templates are processed in HPC and Condor apps is as follows.

  1. Environment variables are injected.
  2. The startupScript is run.
  3. Scheduler directives are prepended to the wrapper template.
  4. additionalDirectives are concatenated after the scheduler directives.
  5. Custom modules are concatenated after the additionalDirectives.
  6. Input and parameter template variables are replaced with values from the job request.
  7. Blacklisted commands, if present, are disabled in the script.
  8. The resulting script is written to the remote job execution folder and executed.
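
The ordering above can be pictured as simple concatenation. The following sketch only mimics steps 3 to 5 and 8 with hypothetical content (SLURM-style directives, a samtools module, a one-line wrapper body). It is an illustration under those assumptions, not Agave's actual implementation.

```shell
# Illustrative only: all file contents below are hypothetical.
workdir=$(mktemp -d)

# Step 3: scheduler directives (hypothetical SLURM-style lines).
printf '%s\n' '#SBATCH -N 1' '#SBATCH -t 01:00:00' > "${workdir}/directives"
# Step 4: additionalDirectives (hypothetical).
printf '%s\n' '#SBATCH -A my-allocation' > "${workdir}/additional"
# Step 5: custom modules (hypothetical module name).
printf '%s\n' 'module load samtools' > "${workdir}/modules"
# The wrapper body, as it might look once step 6 has replaced its variables.
printf '%s\n' 'samtools sort input.bam' > "${workdir}/wrapper"

# Step 8: the pieces end up in a single script, in this order.
cat "${workdir}/directives" "${workdir}/additional" \
    "${workdir}/modules" "${workdir}/wrapper" > "${workdir}/job.submit"

first_line=$(head -n 1 "${workdir}/job.submit")
last_line=$(tail -n 1 "${workdir}/job.submit")
line_count=$(wc -l < "${workdir}/job.submit" | tr -d ' ')
cat "${workdir}/job.submit"
```

Reading the printed script from top to bottom shows why directives and module loads always run before any command in your wrapper body.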

The order in which wrapper templates are processed in CLI apps is as follows.

  1. The shell environment is sourced.
  2. Environment variables are injected.
  3. The startupScript is run.
  4. Custom modules are prepended to the top of the wrapper.
  5. Input and parameter template variables are replaced with values from the job request.
  6. Blacklisted commands, if present, are disabled in the script.
  7. The resulting script is forked into the background immediately.

Environment

The environment comes from the system definition. If you cannot change the system definition to suit your needs, handle the differences in your wrapper script and ship whatever you need with your app’s assets.

Modules

See the Modules and Lmod documentation for more information. Modules can be used to customize your environment, locate your application, and improve portability between systems. Agave does not install or manage the module installation on a particular system; however, it does know how to interact with it. Specifying the modules needed to run your app, either in your wrapper template or in your system definition, can greatly help you during the development process.
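
As a minimal sketch of loading a module from inside a wrapper template, the fragment below guards the call so the script still runs on systems without a module command. The module name and version are hypothetical; substitute whatever your execution system actually provides.

```shell
# Hypothetical module name and version.
required_module="samtools/1.9"

# Only call `module` when the command exists on this system.
if command -v module >/dev/null 2>&1; then
    if module load "${required_module}"; then
        module_status="loaded"
    else
        echo "could not load ${required_module}" >&2
        module_status="failed"
    fi
else
    echo "module command not found; relying on PATH instead" >&2
    module_status="skipped"
fi
echo "module status: ${module_status}"
```

Recording the outcome in a variable lets later steps in the wrapper decide whether to fall back to binaries shipped with the app's assets.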

Default job macros

Agave provides information about the job, system, and user as predefined macros you can use in your wrapper templates. The full list of runtime job macros is given in the following table.

| Variable | Description |
| --- | --- |
| AGAVE_JOB_APP_ID | The appId for which the job was requested. |
| AGAVE_JOB_ARCHIVE | Binary boolean value indicating whether the current job will be archived after the wrapper template exits. |
| AGAVE_JOB_ARCHIVE_SYSTEM | The system to which the job will be archived after the wrapper template exits. |
| AGAVE_JOB_ARCHIVE_URL | The fully qualified URL to the archive folder where the job output will be copied if archiving is enabled, or the URL of the output listing. |
| AGAVE_JOB_ARCHIVE_PATH | The path on the archiveSystem where the job output will be copied if archiving is enabled. |
| AGAVE_JOB_BATCH_QUEUE | The batch queue on the AGAVE_JOB_EXECUTION_SYSTEM to which the job was submitted. |
| AGAVE_JOB_EXECUTION_SYSTEM | The Agave execution system id where this job is running. |
| AGAVE_JOB_ID | The unique identifier of the job. |
| AGAVE_JOB_MEMORY_PER_NODE | The amount of memory per node requested at submit time. |
| AGAVE_JOB_NAME | The slugified version of the name of the job. See the section on Special Characters for more information about slugs. |
| AGAVE_JOB_NAME_RAW | The name of the job as given at submit time. |
| AGAVE_JOB_NODE_COUNT | The number of nodes requested at submit time. |
| AGAVE_JOB_OWNER | The username of the job owner. |
| AGAVE_JOB_PROCESSORS_PER_NODE | The number of cores per node requested at submit time. |
| AGAVE_JOB_SUBMIT_TIME | The time at which the job was submitted in ISO-8601 format. |
| AGAVE_JOB_TENANT | The id of the tenant to which the job was submitted. |
| AGAVE_JOB_CALLBACK_RUNNING | Represents a callback to the API stating the job has started. |
| AGAVE_JOB_CALLBACK_CLEANING_UP | Represents a callback to the API stating the job is cleaning up. |
| AGAVE_JOB_CALLBACK_ALIVE | Represents a callback to the API stating the job is still alive. This will essentially update the timestamp on the job and add an entry to the job's history record. |
| AGAVE_JOB_CALLBACK_NOTIFICATION | Represents a callback to the API telling it to forward a notification to the registered endpoint for that job. If no notification is registered, this will be ignored. |
| AGAVE_JOB_CALLBACK_FAILURE | Represents a callback to the API stating the job failed. Use this with caution as it will tell the API the job failed even if it has not yet completed. Upon receiving this callback, Agave will abandon the job and skip any archiving that may have been requested. Think of this as kill -9 for the job lifecycle. |
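
Because the macros use the same ${NAME} form as ordinary shell variables, a wrapper fragment can be dry-run locally by assigning stand-in values first. The sketch below assumes that approach. All values are hypothetical, and the stand-in assignments are only for local testing, since inside Agave the macros are replaced in the script text before it runs.

```shell
# Stand-in values for a local dry run only (all hypothetical).
AGAVE_JOB_ID="demo-0001"
AGAVE_JOB_NAME="demo-job"
AGAVE_JOB_OWNER="demo-user"
AGAVE_JOB_NODE_COUNT=2
AGAVE_JOB_PROCESSORS_PER_NODE=4

# From here on, the lines look as they would in a wrapper template.
# The ${...} form is kept everywhere so that each macro is recognizable.
total_cores=$(( ${AGAVE_JOB_NODE_COUNT} * ${AGAVE_JOB_PROCESSORS_PER_NODE} ))
summary="job=${AGAVE_JOB_ID} name=${AGAVE_JOB_NAME} owner=${AGAVE_JOB_OWNER} cores=${total_cores}"
echo "${summary}"
```

Writing a short summary line like this at the top of a job's output can make it easier to match log files to job records later.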

Input data

Agave will stage the files and folders you specify as inputs to your app. These will be available in the top level of your job directory at runtime. Additionally, the names of each of the inputs will be injected into your wrapper template for you to use in your application logic. Please be aware that Agave will not attempt to resolve namespace conflicts between your app inputs. If a job specifies two inputs with the same name, one will overwrite the other during the input staging phase of the job. The variable names will still be correctly injected into the wrapper script, but your job will most likely fail due to missing data.
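
Since a name collision can leave a staged input missing, a wrapper may want to fail early with a clear message. The helper below is a minimal sketch of such a check. The function name check_inputs and the file names are hypothetical and are not part of Agave.

```shell
# Report every expected input that is absent from the job directory.
# Returns the number of missing files (0 means all inputs are present).
check_inputs() {
    missing=0
    for staged in "$@"; do
        if [ ! -e "${staged}" ]; then
            echo "Missing staged input: ${staged}" >&2
            missing=$((missing + 1))
        fi
    done
    return "${missing}"
}

# Demonstration with a temporary directory standing in for the job directory.
demo_dir=$(mktemp -d)
touch "${demo_dir}/reads.fastq"
check_inputs "${demo_dir}/reads.fastq" "${demo_dir}/reference.fa" 2>/dev/null
missing_count=$?
echo "missing inputs: ${missing_count}"
```

In a real wrapper template, the arguments would be the injected input variables, and a non-zero return would typically be followed by an exit.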

See the table below for the fields that can be defined for an app’s inputs. An X in the Mandatory column marks a field that must be defined:

| Field | Mandatory | Type | Description |
| --- | --- | --- | --- |
| id | X | string | This is the "name" of the file. You will use this in your wrapper script later whenever you need to refer to the input (for example, the BAM file being sorted) |
| value.default | | string | The path, relative to X, of the default value for the input |
| value.order | | integer | Ignore for now |
| value.required | X | boolean | Is specification of this input mandatory to run a job? |
| value.validator | | string | Perl-format regular expression to restrict valid values |
| value.visible | | boolean | When automatically generating a UI, should this field be visible to end users? |
| semantics.ontology | | array[string] | List of ontology terms (or URIs pointing to ontology terms) applicable to the input format |
| semantics.minCardinality | | integer | Minimum number of values accepted for this input |
| semantics.maxCardinality | | integer | Maximum number of values accepted for this input |
| semantics.fileTypes | X | array[string] | List of Agave file types accepted. Always use "raw-0" for the time being |
| details.description | | string | Human-readable description of the input. Often implemented as contextual help in automatically generated UI |
| details.label | | string | Human-readable label for the input. Often implemented as text label next to the field in automatically generated UI |
| details.argument | | string | The command-line argument associated with specifying this input at run time |
| details.showArgument | | boolean | Include the argument in the substitution done by Agave when a run script is generated |

Variable injection

If you refer back to the app definition we used in the App Management Tutorial, you will see there are multiple inputs and parameters defined for that app. Each input and parameter object had an id attribute. That id value is the attribute name you use to associate runtime values with app inputs and parameters. When a job is submitted to Agave, prior to physically running the wrapper template, all instances of that id are replaced with the actual value from the job request. The example below shows our app description, a job request, and the resulting wrapper template at run time.

Type declarations

During the job submission process, Agave will store your inputs and parameters as serialized JSON. At the point that variable injection occurs, Agave will replace all occurrences of your input and parameter ids with the values provided in the job request. In order for Agave to properly identify your input and parameter ids, wrap them in curly braces and prepend a dollar sign. For example, if you have a parameter with id param1, you would include it in your wrapper script as ${param1}. Case sensitivity is honored at all times.
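
To illustrate the replacement described above, the sketch below uses sed as a stand-in for Agave's filtering step. The ids inputBam and outputName, the samtools command line, and the job values are all hypothetical, and Agave's own substitution is not implemented with this code.

```shell
# A hypothetical wrapper template. The quoted heredoc delimiter keeps the
# shell from expanding ${...}, so the text stays a template.
template=$(cat <<'EOF'
samtools sort -o ${outputName} ${inputBam}
echo "done: ${OUTPUTNAME}"
EOF
)

# Hypothetical values that a job request might carry for those ids.
input_value="sample.bam"
output_value="sorted.bam"

# Stand-in for the filtering step: replace each ${id} with its value.
# [\$] matches a literal dollar sign; the match is case sensitive, so
# ${OUTPUTNAME} on the second line is left untouched.
rendered=$(printf '%s\n' "${template}" \
    | sed -e "s|[\$]{inputBam}|${input_value}|g" \
          -e "s|[\$]{outputName}|${output_value}|g")

echo "${rendered}"
```

The second template line shows the effect of case sensitivity: an id that differs only in case is treated as a different name and is not replaced.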

Boolean values

Boolean values are passed in as truthy values: true is replaced with 1, and false is replaced with an empty value.
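
Given that convention, a non-empty test is one simple way to branch on a boolean parameter. In this sketch the parameter id verbose is hypothetical, and the assignment on the first line is only a local stand-in for the value Agave would inject.

```shell
# Local stand-in: inside Agave, ${verbose} would already have been replaced
# with 1 (true) or with nothing (false).
verbose="1"

# -n is true only for a non-empty string, which matches the 1/empty convention.
if [ -n "${verbose}" ]; then
    log_level="debug"
else
    log_level="info"
fi
echo "log level: ${log_level}"
```

Quoting the variable matters here, because an unquoted empty value would leave the test with no operand at all.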

Cardinality

Cardinality is not used in resolving wrapper template variables.

Parameter Flags

If your parameter is of type “flag”, Agave will replace all occurrences of the template variable with the value you provided for the argument field.
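
As a small sketch of how a flag parameter might be used, suppose a hypothetical flag with id ignoreCase whose argument field is -i. The example assumes the variable is empty when the flag is not set, which is an assumption rather than something stated above, and the assignment is only a local stand-in for the injected value.

```shell
# Local stand-in for the injected argument value of the hypothetical flag.
ignoreCase="-i"

# The variable is deliberately left unquoted so that an empty value
# disappears from the command line instead of becoming an empty argument.
matches=$(printf '%s\n' 'Alpha' 'beta' 'ALPHA' | grep -c ${ignoreCase} 'alpha')
echo "matching lines: ${matches}"
```

With the flag present, grep ignores case and counts both Alpha and ALPHA.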

App packaging

Agave API apps have a generalized structure that allows them to carry dependencies around with them. In the case below, package-name-version.dot.dot is a folder that you build on your local system, then store in your Agave Cloud Storage in a designated location (we recommend /home/username/applications/app_folder_name). It contains binaries, support scripts, test data, etc. all in one package. Agave basically uses a very rough form of containerized applications (more on this later). We suggest you set your apps up to look something like the following:

package-name-version.dot.dot
|--system_name
|----bin.tgz (optional)
|----lib.tgz (optional)
|----include.tgz (optional)
|----test.sh
|----script.template
|----test_data (optional)
|----app.json

Agave runs a job by first transferring a copy of this directory into a temporary directory on the target executionSystem. Then, the input data files (we’ll show you how to specify those later) are staged into place automatically. Next, Agave writes a scheduler submit script (using a template you provide, i.e. script.template) and puts it in the queue on the target system. The Agave service then monitors progress of the job and, assuming it completes, copies all newly created files to the location specified when the job was submitted. Along the way, critical milestones and metadata are recorded in the job’s history.

Agave app development proceeds via the following steps:

  1. Build the application locally on the executionSystem
  2. Ensure that you are able to run it directly on the executionSystem
  3. Describe the application using an Agave app description
  4. Create a shell template for running the app
  5. Upload the application directory to a storageSystem
  6. Post the app description to the Agave apps service
  7. Debug your app by running jobs and updating the app until it works as intended
  8. (Optional) Share the app with some friends to let them test it

Application metadata

| Field | Mandatory | Type | Description |
| --- | --- | --- | --- |
| checkpointable | X | boolean | Application supports checkpointing |
| defaultMemoryPerNode | | integer | Default RAM (GB) to request per compute node |
| defaultProcessorsPerNode | | integer | Default processor count to request per compute node |
| defaultMaxRunTime | | string | Default maximum run time (hours:minutes:seconds) to request for a job |
| defaultNodeCount | | integer | Default number of compute nodes per job |
| defaultQueue | | string | On HPC systems, default batch queue for jobs |
| deploymentPath | X | string | Path relative to homeDir on deploymentSystem where application bundle will reside |
| deploymentSystem | X | string | The Agave-registered STORAGE system upon which you have write permissions where the app bundle resides |
| executionSystem | X | string | An Agave-registered EXECUTION system upon which you have execute and app registration permissions where jobs will run |
| helpURI | X | string | A URL pointing to help or description for the app you are deploying |
| label | X | string | Human-readable title for the app |
| longDescription | | string | A short paragraph describing the functionality of the app |
| modules | | array[string] | Ordered list of modules on systems that use lmod or modules |
| name | X | string | Unique, URL-compatible (no special chars or spaces) name for the app |
| ontology | X | array[string] | List of ontology terms (or URIs pointing to ontology terms) associated with the app |
| parallelism | X | string | Is your application capable of using more than a single compute node? (SERIAL or PARALLEL) |
| shortDescription | X | string | Brief description of the app |
| storageSystem | X | string | The Agave-registered STORAGE system upon which you have write permissions. Default source of and destination for data consumed and emitted by the app |
| tags | | array[string] | List of human-readable tags for the app |
| templatePath | X | string | Path to the shell template file, relative to deploymentPath |
| testPath | X | string | Path to the shell test file, relative to deploymentPath |
| version | X | string | Preferred format: Major.minor.point integer values for app |

:warning: The combination of name and version must be unique across the entire iPlant API namespace.

Parameter metadata

| Field | Mandatory | Type | Description |
| --- | --- | --- | --- |
| id | X | string | This is the "name" of the parameter. At runtime, it will be replaced in your script template based on the value passed as part of the job specification |
| value.default | | string | Default value for the parameter |
| value.order | | integer | Ignore for now. Supports automatic generation of command lines. |
| value.required | | boolean | Is specification of this parameter mandatory to run a job? |
| value.type | | string | JSON type for this parameter (used to generate and validate UI). Valid values: "string", "number", "enumeration", "bool", "flag" |
| value.validator | | string | Perl-format regular expression to restrict valid values |
| value.visible | | boolean | When automatically generating a UI, should this field be visible to end users? |
| semantics.ontology | | array[string] | List of ontology terms (or URIs pointing to ontology terms) applicable to the parameter. We recommend at least specifying an XML Schema (XSD) simple type. |
| details.description | | string | Human-readable description of the parameter. Often used to create contextual help in automatically generated UI |
| details.label | | string | Human-readable label for the parameter. Often implemented as text label next to the field in automatically generated UI |
| details.argument | | string | The command-line argument associated with specifying this parameter at run time |
| details.showArgument | | boolean | Include the argument in the substitution done by Agave when a run script is generated |

Output metadata

| Field | Mandatory | Type | Description |
| --- | --- | --- | --- |
| id | X | string | This is the "name" of the output. It is not currently used by the wrapper script but may be in the future |
| value.default | | string | If your app has a fixed-name output, specify it here |
| value.order | | integer | Ignore for now |
| value.required | X | boolean | Is specification of this output mandatory to run a job? |
| value.validator | | string | Perl-format regular expression used to match output files |
| value.visible | | boolean | When automatically generating a UI, should this field be visible to end users? |
| semantics.ontology | | array[string] | List of ontology terms (or URIs pointing to ontology terms) applicable to the output format |
| semantics.minCardinality | | integer | Minimum number of values expected for this output |
| semantics.maxCardinality | | integer | Maximum number of values expected for this output |
| semantics.fileTypes | X | array[string] | List of Agave file types that may apply to the output. Always use "raw-0" for the time being |
| details.description | | string | Human-readable description of the output |
| details.label | | string | Human-readable label for the output |
| details.argument | | string | The command-line argument associated with specifying this output at run time (not currently used) |
| details.showArgument | | boolean | Include the argument in the substitution done by Agave when a run script is generated (not currently used) |

:information_source: Note: If the app you are working on doesn’t natively produce output with a predictable name, one thing you can do is add extra logic to your script to take the existing output and rename it to something you can control or predict.
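
The renaming approach from the note can be sketched as follows. The timestamped file name, the result_*.txt pattern, and the fixed name output.txt are all hypothetical, and the touch command only stands in for whatever your app actually writes.

```shell
# Temporary directory standing in for the job directory.
work=$(mktemp -d)
touch "${work}/result_20240101T120000.txt"   # stand-in for the app's real output

# Rename the first file matching the unpredictable pattern to a fixed name.
for produced in "${work}"/result_*.txt; do
    # If nothing matched, the pattern is left unexpanded and this test fails.
    if [ -e "${produced}" ]; then
        mv "${produced}" "${work}/output.txt"
        break
    fi
done
ls "${work}"
```

The fixed name can then be declared as value.default in the app's output metadata.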

Tools and Utilities

  1. Stumped for ontology terms to apply to your Agave app inputs, outputs, and parameters? You can search EMBL-EBI for ontology terms, and BioPortal can provide links to EDAM.
  2. Need to validate JSON files? Try JSONlint or JSONparser