Testing Playground
The Testing Playground provides a flexible and robust framework for validating and profiling CLI commands that are being used for managing Ersilia models from various sources, such as GitHub, DockerHub, and local directories.
The Testing Playground only works with Linux systems, and is oriented towards Ersilia developers with experience in our tools.
TL;DR
To use the Testing Playground, you need Ersilia installed in test mode and the Ersilia repository cloned locally.
The Playground runs in a nox environment, isolated from the local source. Each session is started with nox from the test/playground folder of the Ersilia repository.
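A session start might look like the sketch below. The clone location is an assumption: ERSILIA_REPO is a hypothetical variable, defaulting here to ~/ersilia, so point it at wherever the repository lives on your machine. The helper only prints the commands so they can be reviewed before anything runs.

```shell
# Assumption: ERSILIA_REPO is a hypothetical variable pointing at your local clone.
ERSILIA_REPO="${ERSILIA_REPO:-$HOME/ersilia}"

# Print the commands that start a default playground session (dry run).
playground_start() {
  echo "cd $ERSILIA_REPO/test/playground"
  echo "nox -s execute"
}

playground_start
```

To actually start the session, copy the printed lines into your terminal, or pipe them to a shell with playground_start | sh.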
The command nox -s execute initiates a session. A few built-in nox flags can be passed to it:
-s (required): Specifies the session to be run by Nox.
-p (optional): Specifies the Python version(s). If none is specified, everything from py3.8 to py3.12 is tested. Several versions can be passed at once, for example nox -s execute -p 3.8 3.9.
-fb (optional): Changes the environment backend (conda, mamba, micromamba, virtualenv, venv, uv, none). Defaults to conda.
-v (optional): Prints verbose output in the terminal.
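The built-in flags can be combined in one invocation. The helper below is a hypothetical convenience (nox_execute_cmd is not part of Ersilia or nox) that assembles such a command from a backend and a list of Python versions and prints it without running it. Whether a given backend works with the playground's noxfile is an assumption to verify on your setup.

```shell
# Build a nox invocation from the built-in flags described above.
# Usage: nox_execute_cmd BACKEND PYTHON_VERSION...
nox_execute_cmd() {
  backend="$1"
  shift
  # "$*" joins the remaining arguments (the Python versions) with spaces.
  echo "nox -s execute -p $* -fb $backend -v"
}

nox_execute_cmd venv 3.10 3.12
# prints: nox -s execute -p 3.10 3.12 -fb venv -v
```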
The first time you use Nox on your system, you will be asked to grant sudo privileges so that actions like DockerHub activation can be performed.
Structure overview
The idea behind the playground is to cover every kind of test we might want to run on a model, more extensively than the test command itself. It is therefore highly customizable and addressed only to Ersilia developers.
Nox will create an isolated environment and store the files used for testing under ~/eos/playground/files and the generated logs under ~/eos/playground/logs. These are removed with the nox -s clean command.
The playground is adapted to many use cases, for example:
Test several models on Python 3.12, fetching them from GitHub
Test a single model across all Python versions, fetching it from DockerHub
Evaluate whether a model seems to have become slower
...
Below we describe the flags you can use to combine all these custom-made tests.
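As a preview, the three use cases listed above could map to invocations like the ones printed below, using flags documented in the sections that follow. The specific combinations are assumptions inferred from the flag descriptions, not commands taken from the Ersilia test suite, and the function only prints them.

```shell
# Print one candidate command per use case (dry run; nothing is executed).
playground_use_cases() {
  # Several models on Python 3.12, fetched from GitHub
  echo "nox -s execute -p 3.12 -- --runner multiple --fetch from_github"
  # One model across all default Python versions, fetched from DockerHub
  echo "nox -s execute -- --runner single --single eos3b5e --fetch from_dockerhub"
  # Check whether a model's run stays within a 5-minute limit
  echo "nox -s execute -- --cli fetch serve run --max_runtime_minutes 5"
}

playground_use_cases
```

Everything before the standalone -- is interpreted by nox itself; everything after it is passed on to the playground session.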
Ersilia playground flags
--cli
Default: all
Specifies the ersilia commands to run, in order (fetch, serve, run, catalog, example, test, close, delete). The default, all, executes the commands in this order: "fetch", "serve", "run", "close", "catalog", "example", "delete", "test".
Examples:
nox -s execute -- --cli fetch serve run
nox -s execute -- --cli run

--fetch
Default: --from_github
Fetches models from a given source (from_github, from_dockerhub, from_s3, version).
Examples:
nox -s execute -- --fetch from_s3
nox -s execute -- --fetch from_dockerhub version dev

--run
Default: None
Runs a model. It will try to fetch it from_dockerhub, as this is Ersilia's default (this depends on the activate_docker flag, see below). Best combined with the input and output flags (see below).
Example:
nox -s execute -- --run

--example
Default: ["-n", 10, "--random"]
Generates example input for a model (-n, --random, -f). If we specify a
Example:
nox -s execute -- --example -n 10 random/predefined -c -f example.csv

--catalog
Default: ["--more", "--local", "--as-json"]
Retrieves the model catalog from local or hub.
Example:
nox -s execute -- --catalog hub

--test
Default: ["--shallow", "--from_github"]
Tests models at different levels (shallow and deep), fetching from_github, from_dockerhub or from_s3.
Example:
nox -s execute -- --test deep from_dockerhub/from_s3/from_github

--delete
Default: None
Deletes models. It takes a single option: all.
Example:
nox -s execute -- --delete all
Additional flags for Ersilia's CLI
--outputs
Default: [results.{csv, json, h5}]
Used with the run command to specify the output file types. Only the file name is given; the path is set automatically to ~/eos/playground/files/{file_name.{csv, json, h5}}
Example:
nox -s execute -- --outputs result.csv result.h5

--input_types
Default: List of (str, list, csv)
Also used with the run command, to define the input formats (str, list, csv).
Example:
nox -s execute -- --input_types str list csv

--runner
Default: single
Specifies the execution mode (single, multiple). The single mode executes the commands with one model, whereas the multiple mode executes them with several models.
Example:
nox -s execute -- --runner multiple

--single
Default: eos3b5e
Specifies or overrides the default model ID used in single running mode.
Example:
nox -s execute -- --single model_id

--multiple
Default: [eos5axz, eos4e40, eos2r5a, eos4zfy, eos8fma]
Specifies or overrides the default model IDs used in multiple running mode.
Example:
nox -s execute -- --multiple model_id1 model_id2
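Since --outputs takes only file names and the playground places them under ~/eos/playground/files, a quick check after a run can confirm that the expected files were written. The sketch below assumes that location; PLAYGROUND_FILES and check_outputs are hypothetical names introduced for illustration.

```shell
# Directory where the playground stores its files, as stated for --outputs.
# PLAYGROUND_FILES is a hypothetical override, useful for testing the helper.
PLAYGROUND_FILES="${PLAYGROUND_FILES:-$HOME/eos/playground/files}"

# Report which of the given output file names exist and are non-empty.
check_outputs() {
  for name in "$@"; do
    if [ -s "$PLAYGROUND_FILES/$name" ]; then
      echo "ok: $name"
    else
      echo "missing or empty: $name"
    fi
  done
}

# Names matching: nox -s execute -- --outputs result.csv result.h5
check_outputs result.csv result.h5
```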
General setting flags
--activate_docker
Default: true
Activates or deactivates Docker. This makes it possible to test, for example, whether the autofetcher decides not to fetch from Docker (the default) when Docker is not active.
Example:
nox -s execute -- --activate_docker false

--log_error
Default: true
Enables or disables logging of errors to file, stored in ~/eos/playground/logs/. Each command failure creates a standalone file whose name includes the date and time, for instance catalog_20250129_145802.txt
Example:
nox -s execute -- --log_error false

--silent
Default: true
Enables or disables logs from ersilia command execution.
Example:
nox -s execute -- --silent false

--show_remark
Default: false
Displays a remark column in the final execution summary table, which makes it easy to see whether the ersilia commands executed successfully.
Example:
nox -s execute -- --show_remark true

--max_runtime_minutes
Default: 10
Sets the maximum execution time for a run command, to check a model's speed if it seems slow.
Example:
nox -s execute -- --max_runtime_minutes 5

--num_samples
Default: 10
Sets the sample size used to create input for the run command.
Example:
nox -s execute -- --num_samples 5
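When --log_error is enabled, failures accumulate as files such as catalog_20250129_145802.txt. Because the date and time in the name sort chronologically as plain text, the newest log for a command can be found with a simple glob. This is a sketch assuming that naming pattern holds for every command; PLAYGROUND_LOGS and latest_log are hypothetical names.

```shell
# Log directory stated in the --log_error description.
# PLAYGROUND_LOGS is a hypothetical override, useful for testing the helper.
PLAYGROUND_LOGS="${PLAYGROUND_LOGS:-$HOME/eos/playground/logs}"

# Print the newest error log for a command (e.g. catalog), or nothing if none exists.
latest_log() {
  latest=""
  # Glob results come back in sorted order, so the last match is the newest.
  for f in "$PLAYGROUND_LOGS/$1"_*.txt; do
    if [ -e "$f" ]; then
      latest="$f"
    fi
  done
  if [ -n "$latest" ]; then
    basename "$latest"
  fi
}

latest_log catalog
```

Running nox -s clean removes the playground logs, so inspect them before cleaning.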