Experiments and Experiment Runs¶

palaestrAI’s Execution Model¶

Whenever agents act in palaestrAI, they do so as part of an experiment. palaestrAI aims to make the actions of agents and the states of environments reproducible. To that end, it supports defined sets of (hyper-)parameters, studies on parameters, conditions for terminating runs, and multi-phase setups: palaestrAI’s philosophy is less that of coding and more that of experimentation. This also means that you will usually not have to write extensive (Python) code to set up agent training or test runs. Instead, you will mostly be able to configure palaestrAI through experiment documents, whose purpose and syntax this section describes.

There is a distinction in palaestrAI between:

experiments, and
experiment runs.

Experiment runs describe what palaestrAI will do: Environments with their configurations, in which agents with concrete hyperparameter settings act, termination conditions, etc. An experiment run is a concrete recipe for training policies and generating data. Each execution of an experiment run will yield the same results, i.e., experiment runs are reproducible.

However, an experiment run by itself is not a complete answer to a scientific question. Perhaps you’d like to train the agent on a different set of hyperparameters, or in an environment with a slightly different configuration? Perhaps you’d like to evaluate different pairings of agents and algorithms? All these variations are part of one big experiment.

Experiments let you design series of runs. Following the design of experiments philosophy, an experiment defines variations of factors. Therefore, one experiment document spawns at least one, but typically many experiment run documents.
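As a rough illustration of how quickly the number of runs grows, consider a full factorial design, in which every level of every factor is combined with every level of all other factors. This is the general design-of-experiments formula, not a statement about arsenAI's internals. With k factors, l_i levels for factor i, and r repetitions per configuration (see the repetitions attribute below), the number of experiment run documents is as follows; the example numbers (two levels for one factor, three for another, two repetitions) are purely hypothetical.

```latex
N_{\text{runs}} = r \cdot \prod_{i=1}^{k} l_i
\qquad\text{e.g.}\qquad
N_{\text{runs}} = 2 \cdot (2 \cdot 3) = 12
```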

An experiment document is fed to the arsenAI command. Running arsenai generate experiment.yaml reads the experiment definition and creates the appropriate number of experiment run documents. These are YAML files, too, and are fed to palaestrAI: palaestrai experiment-start _outputs/experiment-*.yaml.

Schedules¶

Experiment runs are subdivided into phases. Each phase describes the execution of a set of agents with their sensor/actuator assignments in one or more environments. A popular two-phase setup is a training phase followed by a testing phase. Phases are executed sequentially and are identified by user-given names, which must be unique within a schedule.

Loadable Modules: The “name-params” Block¶

Experiment documents (and experiment run documents, for that matter) make heavy use of referencing existing code modules, which are dynamically loaded by palaestrAI. For example, a particular environment from the Quickstart Guide implements the game of Tic-Tac-Toe; it is available via the palaestrai_environments.tictactoe.TicTacToeEnvironment class. To instruct palaestrAI to load this environment, the following YAML block is used:

name: palaestrai_environments.tictactoe:TicTacToeEnvironment
params:
    twoplayer: true

The name key specifies the class with its module path. Module and class names are separated by a colon (:).

Under params, all keys/values are passed to the class’s constructor (as **kwargs). For example, the constructor of TicTacToeEnvironment has an argument called twoplayer, which would receive the value True upon initialization. All parameters are documented as the initialization parameters of the respective class and can be looked up in its documentation.

These blocks are common in experiment (run) documents and are referred to as “name-params” blocks.

Experiment Documents¶

Top-Level Attributes¶

uid: A user-defined name of the experiment. Must be unique across all experiments stored in a particular database instance.
seed: A number to seed random-number generators with. Specifying the seed ensures reproducibility, as pseudo random number generators that are given the same seed will produce exactly the same sequence of random numbers.
version: The version of palaestrAI the experiment document was designed for. Ensures that the user is alerted to changes in the software; only the major and minor parts of the version number are relevant, e.g., 3.5.
output: Directory to which all generated experiment run documents are written. Relative to the working directory from which arsenai generate is called.
repetitions: How many repetitions are scheduled for each parameter/factor configuration, generated as separate experiment run documents. Any integer number ≥ 1 is valid.
max_runs: Depending on the factors defined, the number of concrete experiment run documents can be quite high. This setting controls how many experiment run documents are generated. If set high enough, a full factorial experimentation plan is generated; otherwise, Latin hypercube sampling or a minimax space-filling scheme is used.
definitions: Under the definitions key, blocks defining agents, environments, etc. are specified. (See next section for details.)
schedule: Combines the building blocks from the definitions section. (See the section on experiment schedules for details.)
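
Put together, the top-level attributes could form a skeleton like the following. This is a minimal sketch, not a complete or validated experiment document: the uid, the seed, and the numeric values are placeholders chosen for illustration, the _outputs directory is assumed from the experiment-start example earlier in this section, and the definitions and schedule sections are deliberately left empty.

```yaml
# Illustrative skeleton only; all values are placeholders.
uid: tictactoe-hyperparameter-study  # hypothetical; must be unique per database
seed: 42              # any fixed number; the same seed gives the same random sequence
version: 3.5          # palaestrAI version (major.minor) the document targets
output: _outputs      # assumed; relative to where `arsenai generate` is called
repetitions: 2        # integer >= 1, per parameter/factor configuration
max_runs: 50          # controls how many experiment run documents are generated
definitions:
  # agents, environments, sensors, actuators, simulation,
  # phase_config, run_config (left empty in this sketch)
schedule:
  # phases that combine the definitions (left empty in this sketch)
```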

Definitions¶

This section defines the building blocks for an experiment. It consists of the following elements, corresponding to the subsections below:

agents (Agents)
environments
sensors
actuators
simulation (Simulation)
phase_config
run_config

Agents¶

Defines agents, which includes the algorithms they employ, their respective objective function, as well as whether existing models should be loaded or offline-training should be attempted (the latter must be supported by the respective algorithm). This section does not define sensor/actuator assignments, as this is part of the design of experiments.

Each agent definition lives in its own block, which has a unique identifier, e.g.:

definitions:
  agents:
    agent_1:
      # (Agent definition)
    "Agent Two":
      # (...)

Agent definitions contain the following configuration options:

name

User-visible name of the agent; can be freely chosen and does not have to be unique, although it helps.

brain

The learning algorithm part of the agent; a name-params block

muscle

The rollout part of the agent algorithm; a name-params block

objective

The particular objective function of the agent; a name-params block

load

Specifies an existing model to be loaded; must be compatible with the algorithms defined in brain and muscle. The load section is optional: if not given, a new agent model will be created from scratch. This section contains the following keys:

agent: Unique identifier (not name) of the agent to be loaded
experiment_run: An experiment run during which the model was created;
phase: Number of the phase after which the model was saved.

replay

Allows loading trajectories from one or more other runs for the agent to train on offline. The replay section is optional. Under the replay key, a list of objects with the following keys can be given:

agent: Unique identifier (not name) of the agent whose trajectories are to be loaded
experiment_run: An experiment run during which the trajectories were recorded;
phase: Number of the phase during which the trajectories were recorded.
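
Taken together, a complete agent definition might look like the sketch below. Only the key names are taken from the descriptions above. The module paths and class names under brain, muscle, and objective are hypothetical placeholders, not classes shipped with palaestrAI, and their params are invented for illustration. The run identifier and the phase numbers under load and replay are likewise assumptions.

```yaml
definitions:
  agents:
    agent_1:
      name: Player One
      brain:
        name: my_package.my_algorithm:MyBrain     # hypothetical class
        params:
          learning_rate: 0.001                    # hypothetical parameter
      muscle:
        name: my_package.my_algorithm:MyMuscle    # hypothetical class
        params: {}
      objective:
        name: my_package.objectives:MyObjective   # hypothetical class
        params: {}
      load:                                       # optional section
        agent: agent_1                            # unique identifier, not the name
        experiment_run: earlier-experiment-run    # hypothetical identifier
        phase: 0                                  # assumed phase number
      replay:                                     # optional section; a list
        - agent: agent_1
          experiment_run: earlier-experiment-run  # hypothetical identifier
          phase: 0                                # assumed phase number
```

If load is omitted, a new agent model is created from scratch, as described above; replay can be omitted in the same way.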

Simulation¶

Within an episode, at least one agent acts within one environment; but usually, there is more than one actor. When agents act and environments are updated is determined by a simulation controller; flow control is handled by termination conditions.

Flow control definitions can be factors as part of an experiment design. Therefore, each definition has a (per-experiment) unique name. E.g.:

definitions:
  simulation:
    taking_turns:
      name: palaestrai.simulation:TakingTurns
      conditions:
        - name: palaestrai.experiment:EnvironmentTerminationCondition
          params: {}

Simulation definitions take the following parameters:

name: The “name” part of a name-params block. Note that there is no separate “params” block, though.
conditions: A list of termination conditions, as name-params blocks.

Read the chapter on Simulation Flow Control to learn more about simulation controllers and termination conditions.