Package Software in Cytomine-Core

Interactions between components

Data on Cytomine (annotations, properties, terms, ...) generally come from two different sources:

  1. In human usage, a user (e.g. an expert) identifies interesting structures on images and adds so-called user annotations or other content through the web client interface.
  2. It is sometimes very useful to get the help of a program to annotate or create new content automatically. 

In the second case, a human user asks for the execution of an available program (Software). This execution (Job) is determined by its arguments (JobParameter), that is, values for the parameters required by the program (SoftwareParameter). The job runs on a particular execution platform (ProcessingServer) and interacts with Cytomine through the REST API, which means that the job must authenticate itself. To this end, a new special user (UserJob) is created, which inherits the authorities of the human user who requested the execution. To be available in a project, a software must be linked to it (SoftwareProject).

The files required to execute the software (algorithm's code, ...) are retrieved from a VCS (version control system, such as Git) provider (e.g. GitHub) (SoftwareUserRepository).

The annotations produced by jobs are called algo annotations and belong to a user job.


Domain | Description | Example
Software | A representation of an executable program. | Sample Detector: a script using adaptive thresholding to identify samples in a whole-slide image
SoftwareParameter | A parameter of the given software. | Integer imageID
SoftwareParameterConstraint | A constraint that an instance of the software parameter (i.e. a job parameter) must satisfy. | imageID > 0
ParameterConstraint | A generic constraint to apply to a software parameter. | x > y
SoftwareUserRepository | A repository from a provider (e.g. GitHub) that contains the software code. | A GitHub account with the code of SampleDetector
SoftwareProject | A link that makes a given software available in a specific project. | A link between SampleDetector and MyProject
Project | A Cytomine project, containing images, annotations, users and jobs. | MyProject: a project with whole-slide images
Job | A particular execution of a software launched by a user. | An execution at time T of SampleDetector
JobParameter | A value given for a specific parameter for a specific job. | imageID=123456
ProcessingServer | An execution server where the job will be launched. | Local-server
UserJob | A special user created for a job. All data generated by the job belongs to this special user. This user inherits authorities from the human user who launched the job. |



Tip

In short, it is essential to understand the distinction between a Software and a Job. Think of a Software as a program or a function, and of a Job as an instantiation of this program or function with arguments (parameter values), which is run on a particular machine, i.e. a ProcessingServer. This can be the local server (the Cytomine server) or a distant one in a data center, in the cloud, etc.
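The function/call analogy can be sketched in a few lines of Python. This is an illustration only: the class and field names mirror the domain names in the table above but are hypothetical simplifications, not Cytomine-Core's actual classes, and a constraint is modelled as a plain predicate.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class SoftwareParameter:
    """A parameter declared by a software (hypothetical simplification)."""
    name: str
    value_type: type
    # Constraints every job parameter value must satisfy, e.g. imageID > 0.
    constraints: List[Callable[[Any], bool]] = field(default_factory=list)


@dataclass
class Software:
    """The program itself: comparable to a function definition."""
    name: str
    version: str
    parameters: List[SoftwareParameter]


@dataclass
class Job:
    """One execution of a software: comparable to a function call."""
    software: Software
    processing_server: str
    values: Dict[str, Any]  # job parameters: one value per software parameter
    status: str = "not launched"


def create_job(software: Software, processing_server: str,
               values: Dict[str, Any]) -> Job:
    """Check each value against its parameter, then instantiate the job."""
    for param in software.parameters:
        if param.name not in values:
            raise ValueError(f"missing value for parameter {param.name}")
        value = values[param.name]
        if not isinstance(value, param.value_type):
            raise TypeError(f"{param.name} must be a {param.value_type.__name__}")
        if not all(check(value) for check in param.constraints):
            raise ValueError(f"{param.name}={value!r} violates a constraint")
    return Job(software, processing_server, dict(values))


sample_detector = Software(
    name="SampleDetector",
    version="1.0",
    parameters=[SoftwareParameter("imageID", int, [lambda v: v > 0])],
)
job = create_job(sample_detector, "Local-server", {"imageID": 123456})
```

The same `sample_detector` object can back any number of jobs, each with its own parameter values and processing server.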

Software details

Software uniqueness is enforced on the pair (name, version): an algorithm released in several versions results in several software entries with the same name but different versions. Each software should be tagged with a version number to ensure reproducibility of results. When a new version of a software becomes available, the previous version is marked as deprecated and the new release is added to Cytomine as a new software. This new software is automatically added to the projects that use the previous version.

A software without a version number is only accepted while you are developing a new software. In this case, the software is not executable, because its code is stored on your local machine.

Example:

Software name | Software version | Comment
MySoft | null | Development version. Not executable.
MySoft | 1.0 | The first release of MySoft. Executable.
MySoft | 1.1 | The second release of MySoft. Executable. Deprecates MySoft (1.0).
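The versioning rules above can be sketched as a small in-memory registry. This is a minimal sketch, not Cytomine's implementation: `SoftwareRegistry` and its attributes are hypothetical names, and it assumes releases are always added in increasing version order.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple

Key = Tuple[str, Optional[str]]  # (name, version)


@dataclass
class Software:
    name: str
    version: Optional[str]  # None marks a development version
    deprecated: bool = False

    @property
    def executable(self) -> bool:
        # A development version only exists on the developer's machine.
        return self.version is not None


class SoftwareRegistry:
    """Hypothetical registry enforcing uniqueness on (name, version)."""

    def __init__(self) -> None:
        self.softwares: Dict[Key, Software] = {}
        self.projects: Dict[str, Set[Key]] = {}  # project name -> linked software

    def add_release(self, name: str, version: Optional[str]) -> Software:
        key = (name, version)
        if key in self.softwares:
            raise ValueError(f"{name} {version} already exists")
        if version is not None:
            for old_key, old in self.softwares.items():
                is_previous_release = (
                    old.name == name
                    and old.version is not None
                    and not old.deprecated
                )
                if is_previous_release:
                    old.deprecated = True
                    # Projects using the previous release also get the new one.
                    for linked in self.projects.values():
                        if old_key in linked:
                            linked.add(key)
        self.softwares[key] = Software(name, version)
        return self.softwares[key]


registry = SoftwareRegistry()
dev = registry.add_release("MySoft", None)
v1_0 = registry.add_release("MySoft", "1.0")
registry.projects["MyProject"] = {("MySoft", "1.0")}
v1_1 = registry.add_release("MySoft", "1.1")
```

After these calls, the registry holds the three rows of the example table: the development version is not executable, 1.0 is deprecated, and MyProject is linked to both 1.0 and 1.1.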


A software can have the following statuses:

Status | Explanation
executable | True if the software resource has an executable command. In practice, only software under development should be non-executable.
deprecated | True if the software with a given version is not the latest release.

Job details

The lifecycle of a Cytomine job can be summarized with the following statuses:

Status | Explanation
not launched | True if the job execution has been requested from Cytomine-Core but Cytomine-software-router has not handled the request yet.
waiting | True if the job is being transferred to its processing server.
in queue | True if the job is in the queue of its processing server.
running | True if the job is running. It is the responsibility of the software maintainer to set the status to running in the running script.
success | True if the job has finished successfully.
error | True if an error occurred during execution (in queue or running).
killed | True if the job has been killed. Killing a job is only possible if it is in queue or running.
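The lifecycle can be read as a small state machine. The transition table below is an assumption inferred from the explanations above (the source does not list the transitions explicitly, for instance whether a job can fail while waiting), and the function names are hypothetical.

```python
# Allowed status transitions, inferred from the status table (assumption).
ALLOWED_TRANSITIONS = {
    "not launched": {"waiting"},
    "waiting": {"in queue"},
    "in queue": {"running", "error", "killed"},
    "running": {"success", "error", "killed"},
    "success": set(),   # terminal
    "error": set(),     # terminal
    "killed": set(),    # terminal
}


def next_status(current: str, requested: str) -> str:
    """Return the requested status if the transition is allowed, else raise."""
    if requested not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot go from {current!r} to {requested!r}")
    return requested


def can_kill(status: str) -> bool:
    """A job can only be killed while it is in queue or running."""
    return "killed" in ALLOWED_TRANSITIONS.get(status, set())


status = "not launched"
for step in ("waiting", "in queue", "running", "success"):
    status = next_status(status, step)
```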


Tip

When the status switches from running to success, error or killed, the log (the standard output of the running job) is linked to the job as an attached file and can be downloaded from the web interface.

Software router architecture

The software router is a component external to Cytomine-Core and is responsible for:

  • automating the addition of new software from trusted repositories and managing their new versions
  • communicating with distant servers to manage running jobs

Communication between Cytomine-Core and Cytomine-software-router goes through message queues using AMQP, an open standard protocol for message-oriented middleware.

Software management in software router

One of the major roles of the software router is to automate the addition of new algorithms. To accomplish this, a thread polls, at a defined interval, the list of built Docker containers (each container represents an algorithm). Older versions are replaced by the newer versions detected by this polling mechanism. When a new version is detected, the algorithm descriptor is retrieved from the corresponding GitHub repository and used to add the algorithm (execution command, arguments, …) to the Cytomine interface. The descriptor is an adaptation of the existing Boutiques descriptors, with additional features specific to Cytomine.
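One polling round can be sketched as follows. This is a minimal sketch under stated assumptions: version tags are assumed to be dotted numbers (optionally prefixed with "v"), and `list_images`, `fetch_descriptor` and `register` are hypothetical callables standing in for the container registry, the GitHub repository and Cytomine-Core.

```python
from typing import Callable, Dict, List, Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn '1.10' or 'v1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def detect_updates(known: Dict[str, str],
                   available: List[Tuple[str, str]]) -> Dict[str, str]:
    """Return, per algorithm, the newest available tag that is not yet known."""
    updates: Dict[str, str] = {}
    for name, tag in available:
        current = updates.get(name, known.get(name))
        if current is None or parse_version(tag) > parse_version(current):
            updates[name] = tag
    return updates


def poll_once(known: Dict[str, str],
              list_images: Callable[[], List[Tuple[str, str]]],
              fetch_descriptor: Callable[[str, str], dict],
              register: Callable[[str, str, dict], None]) -> None:
    """One polling round: register every newly detected version."""
    for name, tag in detect_updates(known, list_images()).items():
        register(name, tag, fetch_descriptor(name, tag))
        known[name] = tag


# Usage with stub callables in place of the real services.
known_versions = {"SampleDetector": "1.0"}
registered = []
poll_once(
    known_versions,
    list_images=lambda: [("SampleDetector", "1.0"), ("SampleDetector", "1.10"),
                         ("SampleDetector", "1.2"), ("NewAlgo", "0.1")],
    fetch_descriptor=lambda name, tag: {"name": name, "version": tag},
    register=lambda name, tag, descriptor: registered.append((name, tag)),
)
```

In the router, a background thread would call a function like `poll_once` at the configured interval.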

The images used to run a job are pulled from Docker Hub and converted to Singularity images, a technology that brings the benefits of containers to the HPC world. The images are refreshed at the same time as the algorithms and transferred to the relevant processing server via SCP.
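As an illustration, the conversion and transfer steps could be expressed as two command lines. The source does not say which exact commands the router issues, so this sketch is an assumption: it relies on the standard `singularity build <output> docker://<image>:<tag>` and `scp <file> <user>@<host>:<dir>` invocations, and the image name, user and host are made-up examples. The commands are only built here, not executed.

```python
from typing import List


def singularity_build_command(docker_image: str, tag: str,
                              output_path: str) -> List[str]:
    """Command converting a Docker Hub image into a Singularity image file."""
    return ["singularity", "build", output_path,
            f"docker://{docker_image}:{tag}"]


def scp_command(local_path: str, user: str, host: str,
                remote_dir: str) -> List[str]:
    """Command copying the image file to a processing server."""
    return ["scp", local_path, f"{user}@{host}:{remote_dir}"]


# Hypothetical image name and server, for illustration only.
build = singularity_build_command("example/sample-detector", "1.1",
                                  "sample-detector-1.1.simg")
copy = scp_command("sample-detector-1.1.simg", "cytomine",
                   "processing.example.org", "/data/images/")
```

Each list could be passed to `subprocess.run(..., check=True)` on a machine where Singularity and SSH access are available.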

Job management in software router

Algorithms can be executed on the default processing server or on another one. The architecture provides a SLURM container instance for local execution. For each execution request, a message is sent from Cytomine-Core to the software router through a specific queue associated with the chosen processing server. A processing method then transforms the execution command so that it can be understood by the specific type of processing server (GPU, CPU-only, …). The log file is retrieved via SCP and added as an attached file to the job domain.
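The per-server routing can be pictured with one queue per processing server. The sketch below uses Python's in-process `queue.Queue` as a stand-in for the AMQP queues, and the message fields (`jobId`, `command`) are hypothetical; the actual message format is not described here.

```python
import json
import queue
from typing import Dict, List


class RequestRouter:
    """Stand-in for the broker: one queue per processing server."""

    def __init__(self) -> None:
        self.queues: Dict[str, queue.Queue] = {}

    def declare_server(self, server_name: str) -> None:
        self.queues[server_name] = queue.Queue()

    def publish(self, server_name: str, job_id: int,
                command: List[str]) -> None:
        """Core side: send an execution request to the chosen server's queue."""
        if server_name not in self.queues:
            raise KeyError(f"unknown processing server {server_name!r}")
        message = json.dumps({"jobId": job_id, "command": command})
        self.queues[server_name].put(message)

    def consume(self, server_name: str) -> dict:
        """Router side: take the next request for that server."""
        return json.loads(self.queues[server_name].get_nowait())


router = RequestRouter()
router.declare_server("Local-server")
router.declare_server("gpu-cluster")
router.publish("Local-server", 42, ["python", "run.py", "--imageID", "123456"])
request = router.consume("Local-server")
```

Keeping one queue per server means a request never has to carry routing logic itself: choosing the queue is choosing the server.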

The package ProcessingMethod contains all the implemented processing methods associated with processing servers. Its role is to build a specific command that the targeted processing server can understand. Currently, only the SLURM processing method is implemented, meaning that the router is able to convert a Cytomine job execution request into a regular SLURM job. The package has been designed so that other processing methods, such as other job schedulers or related systems (Kubernetes, ...), can easily be added.
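The design can be sketched as an abstract base class with one subclass per scheduler. This is an illustration, not the router's actual code: apart from the name ProcessingMethod, the class and method names are hypothetical. The `sbatch` options used (`--job-name`, `--output`, `--gres`, `--wrap`) are standard SLURM options.

```python
import shlex
from abc import ABC, abstractmethod
from typing import List


class ProcessingMethod(ABC):
    """Builds a command that a given type of processing server understands."""

    @abstractmethod
    def build_command(self, job_name: str, command: List[str],
                      log_path: str) -> List[str]:
        ...


class SlurmProcessingMethod(ProcessingMethod):
    """Wraps the job command into a regular SLURM batch submission."""

    def __init__(self, use_gpu: bool = False) -> None:
        self.use_gpu = use_gpu

    def build_command(self, job_name: str, command: List[str],
                      log_path: str) -> List[str]:
        sbatch = ["sbatch", f"--job-name={job_name}", f"--output={log_path}"]
        if self.use_gpu:
            sbatch.append("--gres=gpu:1")  # request one GPU
        # --wrap lets sbatch run a plain command line instead of a script.
        sbatch.append(f"--wrap={shlex.join(command)}")
        return sbatch


job_command = ["singularity", "run", "sample_detector.simg",
               "--imageID", "123456"]
submission = SlurmProcessingMethod(use_gpu=True).build_command(
    "job-42", job_command, "job-42.log")
```

Supporting another scheduler would then amount to adding a new subclass that implements `build_command`, without touching the code that receives execution requests.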