# Task descriptor reference
WARNING
App Engine and its Task system are still in BETA. Therefore, all information presented in this documentation is subject to potentially breaking changes.
The Task descriptor is a YAML file that documents in a declarative way what the Task is and does. It serves as a contract between the developer and the App Engine, detailing:
- How the App Engine should run the Task
- What the Task expects as inputs
- What the App Engine should expect as outputs
Additionally, it provides general information about the Task, such as its name, author, and identification details.
The structure of the Task is dictated by a JSON schema available at https://cytomine.com/schemas/appengine/task.latest.json.
# Table of contents
- General information
- Technical configuration
- Inputs and Outputs
  - inputs
  - inputs.{PARAM}.description
  - inputs.{PARAM}.display_name
  - inputs.{PARAM}.default
  - inputs.{PARAM}.optional
  - inputs.{PARAM}.type
  - inputs.{PARAM}.dependencies.matching
  - outputs
  - outputs.{PARAM}.description
  - outputs.{PARAM}.display_name
  - outputs.{PARAM}.type
  - outputs.{PARAM}.dependencies.derived_from
  - outputs.{PARAM}.dependencies.matching
- Types
- Parameter reference
- Parameter dependencies
- Memory unit
# General information
# $schema
Description: Base JSON schema reference. Should be one of the Cytomine Task reference schemas.
Type | string |
Required | true |
Format | uri |
# name
Description: The name of the Task.
Type | string |
Required | true |
Format | Must match regex ^[a-zA-Z0-9_\s\-]+$ |
Min length | 3 |
Max length | 30 |
# namespace
Description: The namespace of the Task. It must follow the reverse domain name notation. Combined with the Task version, it must uniquely identify the Task.
Type | string |
Required | true |
Format | Must match regex ^[a-zA-Z0-9_]*(\.[a-zA-Z0-9_-]+)+$ |
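The `name` and `namespace` constraints above can be checked before submitting a descriptor. The following is a minimal sketch, not the App Engine's own validation: the function names are hypothetical, and it assumes the documented patterns must match the whole string.

```python
import re

# Patterns copied from the `name` and `namespace` entries above.
NAME_RE = re.compile(r"^[a-zA-Z0-9_\s\-]+$")
NAMESPACE_RE = re.compile(r"^[a-zA-Z0-9_]*(\.[a-zA-Z0-9_-]+)+$")


def is_valid_name(name: str) -> bool:
    """Check the documented length bounds (3 to 30) and the allowed characters."""
    return 3 <= len(name) <= 30 and NAME_RE.fullmatch(name) is not None


def is_valid_namespace(namespace: str) -> bool:
    """Check the documented namespace pattern (requires at least one dot)."""
    return NAMESPACE_RE.fullmatch(namespace) is not None
```

For instance, a hypothetical namespace such as `com.cytomine.my_task` passes, while `cytomine` alone is rejected because the pattern requires at least one dot-separated part.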
# version
Description: The version of the Task. Combined with the Task namespace, it must uniquely identify the Task.
Type | string |
Required | true |
Format | Semantic versioning |
# description
Description: A text description of the Task.
Type | string |
Required | false |
Default | "" |
Max length | 2048 |
# authors
Description: Lists the authors of the Task.
Type | array |
Required | true |
Min items | 1 |
Example:
```yaml
authors:
  - first_name: John
    last_name: Doe
    email: jd@email.com
  - first_name: Jane
    last_name: Doe
    email: jad@email.com
    organization: JaD Corp.
    is_contact: true
```
Each author object must define at least one property other than `is_contact`.
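As an illustration of this rule, the sketch below flags author entries that define nothing besides `is_contact`. It is only an assumption about how a pre-check could look (the helper name is hypothetical) and operates on the `authors` list once the YAML has been loaded into Python objects.

```python
from typing import Any, Dict, List


def invalid_author_indexes(authors: List[Dict[str, Any]]) -> List[int]:
    """Return the indexes of author entries with no property other than is_contact."""
    return [
        index
        for index, author in enumerate(authors)
        if not (set(author) - {"is_contact"})
    ]
```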
# authors.[].first_name
Description: This author's first name.
Type | string |
Required | false |
# authors.[].last_name
Description: This author's last name.
Type | string |
Required | false |
# authors.[].organization
Description: This author's organization name.
Type | string |
Required | false |
# authors.[].email
Description: This author's email address.
Type | string |
Required | false |
# authors.[].is_contact
Description: Whether this author is a contact person for the Task.
Type | boolean |
Required | false |
# external
Description: Lists external references (source code, scientific articles, etc.).
Type | object |
Required | false |
Example:
```yaml
external:
  source_code: https://github.com/cytomine/my-awesome-task
  dois:
    - https://doi.org/10.1093/bioinformatics/btw013
```
# external.source_code
Description: URI pointing to the source code of the Task.
Type | string |
Required | false |
Format | uri |
# external.doi
Description: List of DOIs relevant to the Task.
Type | array of string[uri] |
Required | false |
# Technical configuration
All configuration properties are under a `configuration` object.
# configuration.input_folder
Description: Full path where the input folder must be mounted inside the container.
Type | string |
Required | false |
Default | "/inputs" |
Format | path |
# configuration.output_folder
Description: Full path where the output folder must be mounted inside the container.
Type | string |
Required | false |
Default | "/outputs" |
Format | path |
# configuration.image
Description: Information about where the Docker image can be found and in which format.
Default behavior if unspecified: the App Engine looks for an `image.tar` file at the root of the Task archive bundle.
Type | object |
Required | false |
Example:
```yaml
configuration:
  image:
    file: /directory/image.tar
```
# configuration.image.file
Description: The path to the Docker image archive, given as an absolute path whose root is the root of the Task archive bundle.
Type | string |
Required | false |
Default | "/image.tar" |
Format | path |
# configuration.resources.ram
Description: The minimum amount of RAM that the Task requires.
Type | integer |
Required | false |
Default | "1GiB" |
Format | Memory unit |
# configuration.resources.gpus
Description: The number of GPUs that the Task requires.
Type | integer |
Required | false |
Default | 0 |
# configuration.resources.cpus
Description: The number of CPU cores that the Task requires.
Type | integer |
Required | false |
Default | 1 |
# configuration.resources.internet
Description: Whether the Task requires internet access.
Type | boolean |
Required | false |
Default | false |
# Inputs and outputs
# inputs
Description: The set of input parameters of the Task.
Type | object |
Required | true |
Each input parameter must be provided as a property of `inputs`, where the property name is the identifier of the parameter (`PARAM`) and must match the following regex: `^[a-zA-Z0-9_]+$`.
# inputs.{PARAM}.description
Description: A text description of the parameter.
Type | string |
Required | true |
# inputs.{PARAM}.display_name
Description: A human-readable name for the parameter.
Type | string |
Required | false |
# inputs.{PARAM}.default
Description: Default value for the parameter. Only available for some types (see Types).
Type | depends on inputs.{PARAM}.type |
Required | false |
# inputs.{PARAM}.optional
Description: Whether or not this parameter is optional.
Type | boolean |
Required | false |
Default | false |
# inputs.{PARAM}.type
Description: Type of the parameter.
Type | object|string |
Required | true |
Format | Types |
# inputs.{PARAM}.dependencies.matching
Description: References to other parameters (of type `array`) that match this parameter. See `matching` dependency.
Note: Only supported for parameters of type `array`.
Type | array of string |
Required | false |
Format | Parameter reference |
# outputs
Description: The set of output parameters of the Task.
Type | object |
Required | true |
Each output parameter must be provided as a property of `outputs`, where the property name is the identifier of the parameter (`PARAM`) and must match the following regex: `^[a-zA-Z0-9_]+$`.
# outputs.{PARAM}.description
Description: A text description of the parameter.
Type | string |
Required | true |
# outputs.{PARAM}.display_name
Description: A human-readable name for the parameter.
Type | string |
Required | false |
# outputs.{PARAM}.type
Description: Parameter type, see Types.
Type | object|string |
Required | true |
Format | Types |
# outputs.{PARAM}.dependencies.derived_from
Description: Reference to an input parameter this output parameter is derived from. See `derived_from` dependency.
Type | string |
Required | false |
Format | Parameter reference (input only) |
# outputs.{PARAM}.dependencies.matching
Description: References to other parameters (of type `array`) that match this parameter. See `matching` dependency.
Note: Only supported for parameters of type `array`.
Type | array of string |
Required | false |
Format | Parameter reference |
# Types
The App Engine Task system supports a wide array of types for inputs and outputs. This section presents how these types are documented in the descriptor.
In general, types have a string identifier (e.g. `integer`). They may or may not support default values; typically, default values are only supported by simple primitive types. They may or may not support short forms: a short form is a way to define the type with only its identifier rather than a whole YAML object. A short form can be used when all type properties of a parameter can be left to their defaults.
Example of short form vs. long form:
```yaml
inputs:
  short_form_param:
    # ...
    type: integer
  long_form_param:
    # ...
    type:
      id: integer
```
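A consumer of the descriptor usually wants a single representation for both forms. The sketch below expands a short form into its long form once the YAML has been loaded into Python objects. The function name is hypothetical, and the set of types without a short form is taken from the "Short form" rows of the type tables in this section.

```python
from typing import Any, Dict, Union

# Types documented with "Short form | false" in this reference.
NO_SHORT_FORM = {"enumeration", "array"}


def normalize_type(type_spec: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Expand a short-form type (a bare identifier) into its long form."""
    if isinstance(type_spec, str):
        if type_spec in NO_SHORT_FORM:
            raise ValueError(f"type {type_spec!r} has no short form")
        return {"id": type_spec}
    if isinstance(type_spec, dict) and isinstance(type_spec.get("id"), str):
        return dict(type_spec)  # shallow copy so the caller's data is untouched
    raise ValueError(f"invalid type specification: {type_spec!r}")
```

With this helper, `type: integer` and `type: {id: integer}` both normalize to the same dictionary.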
# Type boolean
Description: Type for a boolean parameter.
Identifier | boolean |
Supports default | true |
YAML type for default | boolean |
Short form | true |
Examples:
# {BOOLEAN}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "boolean" |
# Type integer
Description: Type for an integer number.
Identifier | integer |
Supports default | true |
YAML type for default | integer |
Short form | true |
Available constraints:
- `lt`: less than
- `leq`: less or equal
- `gt`: greater than
- `geq`: greater or equal
Examples:
# {INTEGER}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "integer" |
# {INTEGER}.type.lt
Description: Constrains the integer parameter to an exclusive upper bound. This constraint is mutually exclusive with `leq`.
Type | integer |
Required | false |
# {INTEGER}.type.leq
Description: Constrains the integer parameter to an inclusive upper bound. This constraint is mutually exclusive with `lt`.
Type | integer |
Required | false |
# {INTEGER}.type.gt
Description: Constrains the integer parameter to an exclusive lower bound. This constraint is mutually exclusive with `geq`.
Type | integer |
Required | false |
# {INTEGER}.type.geq
Description: Constrains the integer parameter to an inclusive lower bound. This constraint is mutually exclusive with `gt`.
Type | integer |
Required | false |
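To make the semantics of the four bounds concrete, here is a small sketch that checks a value against a long-form integer type. It illustrates the rules described above and is not the App Engine's validator; the function name and error messages are hypothetical.

```python
from typing import Any, Dict, List


def check_integer(value: int, type_spec: Dict[str, Any]) -> List[str]:
    """Return the violated constraints (an empty list means the value is valid)."""
    # lt/leq and gt/geq are documented as mutually exclusive pairs.
    if "lt" in type_spec and "leq" in type_spec:
        raise ValueError("lt and leq are mutually exclusive")
    if "gt" in type_spec and "geq" in type_spec:
        raise ValueError("gt and geq are mutually exclusive")

    errors = []
    if "lt" in type_spec and not value < type_spec["lt"]:
        errors.append(f"must be < {type_spec['lt']}")
    if "leq" in type_spec and not value <= type_spec["leq"]:
        errors.append(f"must be <= {type_spec['leq']}")
    if "gt" in type_spec and not value > type_spec["gt"]:
        errors.append(f"must be > {type_spec['gt']}")
    if "geq" in type_spec and not value >= type_spec["geq"]:
        errors.append(f"must be >= {type_spec['geq']}")
    return errors
```

For a type declared with `geq: 0` and `lt: 10`, the accepted values are 0 through 9.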
# Type number
Description: Type for a floating point number.
Identifier | number |
Supports default | true |
YAML type for default | number |
Short form | true |
Available constraints:
- `lt`: less than
- `leq`: less or equal
- `gt`: greater than
- `geq`: greater or equal
- `infinity_allowed`
- `nan_allowed`
Examples:
# {NUMBER}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "number" |
# {NUMBER}.type.lt
Description: Constrains the number parameter to an exclusive upper bound. This constraint is mutually exclusive with `leq`.
Type | integer |
Required | false |
# {NUMBER}.type.leq
Description: Constrains the number parameter to an inclusive upper bound. This constraint is mutually exclusive with `lt`.
Type | integer |
Required | false |
# {NUMBER}.type.gt
Description: Constrains the number parameter to an exclusive lower bound. This constraint is mutually exclusive with `geq`.
Type | integer |
Required | false |
# {NUMBER}.type.geq
Description: Constrains the number parameter to an inclusive lower bound. This constraint is mutually exclusive with `gt`.
Type | integer |
Required | false |
# {NUMBER}.type.infinity_allowed
Description: Whether or not this parameter accepts `inf` (infinity) as a valid floating point number.
Type | boolean |
Required | false |
Default | false |
# {NUMBER}.type.nan_allowed
Description: Whether or not this parameter accepts `nan` (not a number) as a valid floating point number.
Type | boolean |
Required | false |
Default | false |
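The two flags above can be illustrated with a short check. This is a sketch under stated assumptions: the function name is hypothetical, and negative infinity is assumed to be governed by `infinity_allowed` in the same way as positive infinity.

```python
import math
from typing import Any, Dict


def accepts_special_value(value: float, type_spec: Dict[str, Any]) -> bool:
    """Apply infinity_allowed and nan_allowed, which both default to false."""
    if math.isnan(value):
        return bool(type_spec.get("nan_allowed", False))
    if math.isinf(value):
        # Assumption: -inf is handled like +inf.
        return bool(type_spec.get("infinity_allowed", False))
    return True  # finite values are not affected by these two flags
```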
# Type string
Description: Type for a sequence of characters.
Identifier | string |
Supports default | true |
YAML type for default | string |
Short form | true |
Available constraints:
- `min_length`
- `max_length`
Examples:
# {STRING}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "string" |
# {STRING}.type.min_length
Description: Constrains the length of this string parameter to be at least a minimum number of characters (inclusive bound). If omitted, the string can be empty.
Type | integer |
Required | false |
Default | 0 |
# {STRING}.type.max_length
Description: Constrains the length of this string parameter to be at most a maximum number of characters (inclusive bound).
Type | integer |
Required | false |
# Type enumeration
Description: Type for an enumeration of fixed values.
Identifier | enumeration |
Supports default | true |
YAML type for default | string |
Short form | false |
Note: the default value of the parameter, if any, must match one of the enumeration values.
# {ENUM}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "enumeration" |
# {ENUM}.type.values
Description: List of accepted values for this enumeration parameter.
Type | array of string |
Required | true |
Constraints for the enumeration values:
- Min length: 1
- Max length: 256
- Format: matching regex `^[^\\r\\n]+$`
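The sketch below shows one way to pre-check an enumeration declaration. It rests on assumptions that should be verified against the JSON schema: the length bounds are taken to apply to each individual value, and the pattern is read as "no carriage return or line feed". The function name and messages are hypothetical.

```python
import re
from typing import List, Optional

# Assumed reading of the documented pattern: no CR or LF inside a value.
VALUE_RE = re.compile(r"[^\r\n]+")


def check_enumeration(values: List[str], default: Optional[str] = None) -> List[str]:
    """Return a list of problems found in an enumeration declaration."""
    errors = []
    for value in values:
        if not 1 <= len(value) <= 256:
            errors.append(f"{value!r}: length must be between 1 and 256")
        elif VALUE_RE.fullmatch(value) is None:
            errors.append(f"{value!r}: must not contain line breaks")
    # The default, if any, must be one of the declared values.
    if default is not None and default not in values:
        errors.append(f"default {default!r} is not one of the values")
    return errors
```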
# Type geometry
Description: Type for a geometry parameter.
Identifier | geometry |
Supports default | false |
Short form | true |
No available constraint.
Examples:
# Type file
Description: Type for a file parameter.
Identifier | file |
Supports default | false |
Short form | true |
Available constraints:
- `max_file_size`
Examples:
# {FILE}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "file" |
# {FILE}.type.max_file_size
Description: Constrains the file to a maximum size. If omitted, no file size constraint is applied.
Type | string |
Required | false |
Format | Memory unit |
# Type image
Description: Type for an image parameter.
Identifier | image |
Supports default | false |
Short form | true |
Available constraints:
- `max_file_size`
- `max_width`
- `max_height`
- `formats`
Examples:
# {IMG}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "image" |
# {IMG}.type.max_file_size
Description: Constrains the image file to a maximum size. If omitted, no file size constraint is applied.
Type | string |
Required | false |
Format | Memory unit |
# {IMG}.type.max_width
Description: Constrains the maximum accepted image width, in pixels. If omitted, no width constraint is applied.
Type | integer |
Required | false |
# {IMG}.type.max_height
Description: Constrains the maximum accepted image height, in pixels. If omitted, no height constraint is applied.
Type | integer |
Required | false |
# {IMG}.type.formats
Description: Constrains the image format to one of the listed formats. If omitted, all plain image formats supported by the App Engine are accepted:
- `png`
- `jpeg`
- `tiff` (RGB, planar)
Type | array of string |
Required | false |
Default | ['png', 'jpeg', 'tiff'] |
# Type wsi
Description: Type for a whole slide image parameter.
Identifier | wsi |
Supports default | false |
Short form | true |
Available constraints:
- `max_file_size`
- `max_width`
- `max_height`
- `formats`
Examples:
# {WSI}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "wsi" |
# {WSI}.type.max_file_size
Description: Constrains the whole slide image to a maximum size. If omitted, no file size constraint is applied.
Type | string |
Required | false |
Format | Memory unit |
# {WSI}.type.max_width
Description: Constrains the maximum accepted whole slide image width, in pixels. If omitted, no width constraint is applied.
Type | integer |
Required | false |
# {WSI}.type.max_height
Description: Constrains the maximum accepted whole slide image height, in pixels. If omitted, no height constraint is applied.
Type | integer |
Required | false |
# {WSI}.type.formats
Description: Constrains the whole slide image format to one of the listed formats. If omitted, all whole slide image formats supported by the App Engine are accepted:
- `dicom`
- `tiff` (Pyramidal)
Type | array of string |
Required | false |
Default | ['dicom', 'tiff'] |
# Type array
Description: Type for an array parameter.
Identifier | array |
Supports default | false |
Short form | false |
Available constraints:
- `min_size`
- `max_size`
Examples:
# {ARRAY}.type.id
Description: Identifier of the type.
Type | string |
Required | true |
Format | constant "array" |
# {ARRAY}.type.subtype
Description: Type of the underlying array items.
Type | object|string |
Required | true |
Format | Types |
Note: the `subtype` field follows the same specifications as the parameter `type` field.
# {ARRAY}.type.min_size
Description: Constrains the number of items in the array to a minimum number (inclusive bound). If omitted, the minimum size is `0`.
Type | integer |
Required | false |
Default | 0 |
# {ARRAY}.type.max_size
Description: Constrains the number of items in the array to a maximum number (inclusive bound). If omitted, no maximum size constraint is applied.
Type | integer |
Required | false |
# Parameter reference
Some descriptor properties require a reference to another input or output parameter. A reference is encoded using the following format:
(inputs|outputs)/{PARAMETER_ID}
Examples:
- `inputs/my_param`: a reference to an input parameter `my_param`
- `outputs/my_param2`: a reference to an output parameter `my_param2`
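A reference in this format can be split into its two parts with a regular expression. The sketch below assumes the parameter identifier follows the same `^[a-zA-Z0-9_]+$` pattern as the property names under `inputs` and `outputs`; the function name is hypothetical.

```python
import re
from typing import Tuple

# (inputs|outputs)/{PARAMETER_ID}, with the identifier pattern used for parameter names.
REFERENCE_RE = re.compile(r"(inputs|outputs)/([a-zA-Z0-9_]+)")


def parse_reference(reference: str) -> Tuple[str, str]:
    """Split a parameter reference into (section, parameter identifier)."""
    match = REFERENCE_RE.fullmatch(reference)
    if match is None:
        raise ValueError(f"invalid parameter reference: {reference!r}")
    return match.group(1), match.group(2)
```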
# Parameter dependencies
Because the relations between the parameters of a Task are hidden in the Task implementation, these dependencies cannot be documented without explicitly declaring them in the descriptor. Therefore, the descriptor specification supports a set of pre-defined dependencies:
- `derived_from`
- `matching`
In general, the interpretation of these dependencies is left to the platform (e.g. Cytomine), which can decide whether or not to use them.
# Dependency: derived_from
The `derived_from` dependency can only be associated with an output parameter and indicates that this parameter is derived from a specified input parameter. The declared dependency should reference this input parameter using a parameter reference (see specification in descriptor reference).
Use case: for Cytomine to properly display the annotations generated for an input image, it needs to know that the geometries produced by the Task are related to a certain input image.
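As an illustration, the sketch below checks that every `derived_from` dependency in a descriptor points to an existing input parameter. It assumes the descriptor has already been loaded into nested Python dictionaries; the function name and the sample parameter names are hypothetical.

```python
from typing import Any, Dict, List


def check_derived_from(descriptor: Dict[str, Any]) -> List[str]:
    """Return an error for each derived_from that does not target a declared input."""
    inputs = descriptor.get("inputs", {})
    errors = []
    for name, output in descriptor.get("outputs", {}).items():
        reference = output.get("dependencies", {}).get("derived_from")
        if reference is None:
            continue  # the dependency is optional
        section, _, param_id = reference.partition("/")
        if section != "inputs" or param_id not in inputs:
            errors.append(f"outputs/{name}: invalid derived_from reference {reference!r}")
    return errors
```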
# Dependency: matching
The `matching` dependency can only exist between `array` parameters. It indicates that:
- array `A` and linked array `B` must have the same size (i.e. number of items) when the Task is executed
- the item at index `i` in array `A` and the item at index `i` in array `B` are matched together
The `matching` dependency is:
- bidirectional: the dependency only needs to be defined for one of the matching parameters to be considered valid for all matched parameters
- transitive: matching parameters `A` and `B` and matching parameters `B` and `C` results in parameters `A` and `C` being matched as well
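Because the dependency is bidirectional and transitive, the declared links partition the array parameters into groups of mutually matched parameters. The sketch below computes these groups with a small union-find structure. It is an illustration only: the function name is hypothetical, and the input is assumed to map each parameter reference to the references listed in its `matching` property.

```python
from typing import Dict, List, Set


def matching_groups(declared: Dict[str, List[str]]) -> List[Set[str]]:
    """Group parameter references linked by `matching`, directly or transitively."""
    parent: Dict[str, str] = {}

    def find(ref: str) -> str:
        parent.setdefault(ref, ref)
        while parent[ref] != ref:
            parent[ref] = parent[parent[ref]]  # path halving keeps lookups short
            ref = parent[ref]
        return ref

    for ref, others in declared.items():
        find(ref)  # register the parameter even if its list is empty
        for other in others:
            parent[find(ref)] = find(other)  # merge the two groups

    groups: Dict[str, Set[str]] = {}
    for ref in list(parent):
        groups.setdefault(find(ref), set()).add(ref)
    return list(groups.values())
```

All arrays in one group must then have the same size at execution time, and items sharing an index are matched together.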
This dependency is declared in the descriptor as documented in:
- inputs.{PARAM}.dependencies.matching
- outputs.{PARAM}.dependencies.matching
This dependency will help external systems to further interpret inputs/outputs that are matched together. Use cases:
- an array of tuples encoded as matched arrays: an array of geometries, their probabilities as an array of floats, and their classes as another output array of enumerations. Each item of each array contains one piece of information for one prediction of the system
- an input array mapped to its corresponding outputs: one input is an array of images and one output is an array of arrays of geometries, where each array of geometries holds the predictions for one image of the input array
# Memory unit
Some descriptor properties expect a value that represents an amount of memory (e.g. RAM availability, maximum file size). This section describes how such a value must be formatted. The general form is:
`{number} {unit}`
where the `unit` part is optional and the spacing between `number` and `unit` is also optional.
# Numeric value
The App Engine accepts both floating-point and integer numbers as the numeric value preceding the unit. After performing the unit conversion, the resulting value is rounded up to the nearest integer. The rounding occurs either at the bit or byte level, depending on the context.
Example (see units below):
Notation | In bytes | In bytes (rounded) | In bits | In bits (rounded) |
---|---|---|---|---|
2.25455 Kib | 2.25455 * 2^10 / 8 = 288.5824 | 289 | 2.25455 * 2^10 = 2308.6592 | 2309 |
# Units
The unit is composed of an optional (decimal or binary) prefix and a suffix.
Supported decimal prefixes:
Symbol | Name | Amount |
---|---|---|
none | / | 10^0 |
k | kilo | 10^3 |
M | mega | 10^6 |
G | giga | 10^9 |
T | tera | 10^12 |
P | peta | 10^15 |
E | exa | 10^18 |
Z | zetta | 10^21 |
Y | yotta | 10^24 |
R | ronna | 10^27 |
Q | quetta | 10^30 |
Supported binary prefixes:
Symbol | Name | Amount |
---|---|---|
Ki | kibi | 2^10 |
Mi | mebi | 2^20 |
Gi | gibi | 2^30 |
Ti | tebi | 2^40 |
Pi | pebi | 2^50 |
Ei | exbi | 2^60 |
Zi | zebi | 2^70 |
Yi | yobi | 2^80 |
The supported unit suffixes indicate whether the value is in bytes or bits:
- `B`, `Byte`, `byte` → byte
- `b`, `bit` → bit
Examples:
Notation | In bytes | In bits |
---|---|---|
25KiB | 25 * 2^10 = 25600 | 25 * 2^13 = 204800 |
25Kib | 25 * 2^10 / 8 = 3200 | 25 * 2^10 = 25600 |
25KB | 25 * 10^3 = 25000 | 25 * 8 * 10^3 = 200000 |
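The conversion rules of this section can be summarized in a short parser. This is a sketch under stated assumptions and not the App Engine's implementation: the function name is hypothetical, a value without a unit is assumed to be in bytes, and an uppercase `K` is accepted as kilo only because the `25KB` example above uses it. Exact fractions are used so that rounding up is not affected by floating-point error.

```python
import math
import re
from fractions import Fraction

# Exponents from the prefix tables; "K" is an assumed alias of "k" (see the 25KB example).
DECIMAL = {"": 0, "k": 3, "K": 3, "M": 6, "G": 9, "T": 12, "P": 15,
           "E": 18, "Z": 21, "Y": 24, "R": 27, "Q": 30}
BINARY = {"Ki": 10, "Mi": 20, "Gi": 30, "Ti": 40, "Pi": 50, "Ei": 60, "Zi": 70, "Yi": 80}
BYTE_SUFFIXES = {"B", "Byte", "byte"}

MEMORY_RE = re.compile(
    r"\s*(\d+(?:\.\d+)?)\s*"                          # numeric value
    r"(?:(Ki|Mi|Gi|Ti|Pi|Ei|Zi|Yi|[kKMGTPEZYRQ])?"    # optional prefix
    r"(Byte|byte|bit|B|b))?\s*"                       # suffix (the whole unit is optional)
)


def parse_memory(text: str, unit: str = "byte") -> int:
    """Convert a memory notation to whole bytes (or bits), rounding up."""
    match = MEMORY_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid memory value: {text!r}")
    number, prefix, suffix = match.groups()
    prefix = prefix or ""
    if prefix in BINARY:
        factor = Fraction(2) ** BINARY[prefix]
    else:
        factor = Fraction(10) ** DECIMAL[prefix]
    bits = Fraction(number) * factor
    if suffix is None or suffix in BYTE_SUFFIXES:  # assumption: no unit means bytes
        bits *= 8
    return math.ceil(bits if unit == "bit" else bits / 8)
```

For example, `parse_memory("25KiB")` returns 25600 and `parse_memory("2.25455 Kib", unit="bit")` returns 2309, in line with the tables above.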