Experiments and Experiment Runs¶
palaestrAI’s Execution Model¶
Whenever agents act in palaestrAI, they do so as part of an experiment. palaestrAI aims to make the actions of agents and the states of environments reproducible through defined sets of (hyper-)parameters, parameter studies, conditions for terminating runs, and multi-phase setups. Its philosophy is less that of coding and more that of experimentation. This also means that you will usually not have to write extensive (Python) code in order to set up agent trainings or test runs. Instead, you will mostly be able to configure palaestrAI through experiment documents, whose purpose and syntax this section describes.
There is a distinction in palaestrAI between:
experiments, and
experiment runs.
Experiment runs describe what palaestrAI will do: Environments with their configurations, in which agents with concrete hyperparameter settings act, termination conditions, etc. An experiment run is a concrete recipe for training policies and generating data. Each execution of an experiment run will yield the same results, i.e., experiment runs are reproducible.
However, an experiment run by itself is not a complete answer to a scientific question. Perhaps you’d like to train the agent on a different set of hyperparameters, or in an environment with a slightly different configuration? Perhaps you’d like to evaluate different pairings of agents and algorithms? All these variations are part of one big experiment.
Experiments allow you to design series of runs. Following the design of experiments philosophy, an experiment lets you define variations of factors. Therefore, one experiment document spawns at least one, but typically many, experiment run documents.
An experiment document is fed to the arsenAI command. Running
arsenai generate experiment.yaml
will read the experiment definition and create the appropriate number of experiment run documents. Those are YAML files, too, and are fed to palaestrAI:
palaestrai experiment-start _outputs/experiment-*.yaml
Schedules¶
Experiment runs are subdivided into phases. Each phase describes the execution of a set of agents with their sensor/actuator assignments in one or more environments. A popular choice for two phases is to have a training phase followed by a testing phase. Phases are executed sequentially and are identified by user-given names, which must be unique within the scope of a schedule.
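The uniqueness rule for phase names can be illustrated with a small check. This is only a sketch: it assumes a schedule is represented as an ordered list of single-key mappings (phase name to phase configuration), and the function name phase_names is hypothetical, not part of palaestrAI.

```python
from collections import Counter
from collections.abc import Mapping
from typing import Any, List


def phase_names(schedule: List[Mapping]) -> List[str]:
    """Return phase names in execution order and reject duplicates."""
    # Assumed layout: each list entry maps one phase name to its config.
    names = [name for entry in schedule for name in entry]
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates:
        raise ValueError(f"duplicate phase names: {duplicates}")
    return names


# Hypothetical two-phase schedule: training first, then testing.
schedule: List[Any] = [{"training": {}}, {"testing": {}}]
order = phase_names(schedule)
print(order)
```

Because the list is ordered, the returned names also reflect the sequential execution order of the phases.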
Loadable Modules: The “name-params” Block¶
Experiment documents (and experiment run documents, for that matter) make
heavy use of referencing existing code modules, which are dynamically loaded
by palaestrAI. For example, a particular environment from the
Quickstart Guide implements the game of Tic-Tac-Toe; it is available via the
palaestrai_environments.tictactoe.TicTacToeEnvironment
class.
To instruct palaestrAI to load this environment, the following YAML block is
used:
name: palaestrai_environments.tictactoe:TicTacToeEnvironment
params:
  twoplayer: true
The name key specifies the class with its module path. Module and class names are separated by a colon (:).
Under params, all keys/values are passed to the class’s constructor (as **kwargs). For example, the constructor of TicTacToeEnvironment has an argument called twoplayer, which would receive the value True upon initialization. All parameters are documented as the initialization parameters of the respective class and can be looked up in the documentation.
These blocks are common in experiment (run) documents and are referred to as “name-params” blocks.
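How such a block could be turned into an object can be sketched with Python's importlib. This is an illustration of the idea and not palaestrAI's actual loader; the function name load_name_params is hypothetical, and a standard-library class stands in for an environment so that the sketch runs without palaestrAI installed.

```python
import importlib
from collections.abc import Mapping
from typing import Any


def load_name_params(block: Mapping) -> Any:
    """Instantiate the class described by a name-params block."""
    # Split "package.module:ClassName" at the colon.
    module_path, _, class_name = block["name"].partition(":")
    if not module_path or not class_name:
        raise ValueError(f"expected 'module:Class', got {block['name']!r}")
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name)
    # Everything under "params" is passed on as keyword arguments;
    # a missing or empty "params" key means no arguments.
    return cls(**(block.get("params") or {}))


# Stand-in block: fractions.Fraction takes numerator/denominator keywords.
half = load_name_params(
    {"name": "fractions:Fraction", "params": {"numerator": 1, "denominator": 2}}
)
print(half)
```

A misspelled key under params would surface here as a TypeError from the constructor, which is why parameter names have to match the class's initialization parameters exactly.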
Experiment Documents¶
Top-Level Attributes¶
uid
A user-defined name of the experiment. Must be unique across all experiments stored in a particular database instance.
seed
A number to seed random-number generators with. Specifying the seed ensures reproducibility, as pseudo random number generators that are given the same seed will produce exactly the same sequence of random numbers.
version
The version of palaestrAI the experiment document was designed for. Ensures that the user is alerted to changes in the software; only the major and minor parts of the version number are relevant, e.g., 3.5.
output
Directory to which all generated experiment run documents are written. Relative to the working directory arsenai generate is called from.
repetitions
How many repetitions are scheduled for each parameter/factor configuration, generated as separate experiment run documents. Any integer number ≥ 1 is valid.
max_runs
Depending on the factors defined, the number of concrete experiment run documents can be quite high. This setting limits how many experiment run documents are generated. If it is set high enough, a full factorial experimentation plan is generated; otherwise, Latin hypercube sampling or a minimax space-filling scheme is used.
definitions
Under the definitions key, blocks defining agents, environments, etc. are specified. (See next section for details.)
schedule
Combines the building blocks from the definitions section. (See the section on experiment schedules for details.)
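The interplay of seed, repetitions, and max_runs can be sketched as follows. This is a simplified illustration, not arsenAI's implementation: the function expand_runs and the factor names are hypothetical, the cap is assumed to apply to factor combinations before repetitions are added, and a seeded random subsample stands in for the Latin hypercube or minimax scheme.

```python
import itertools
import random
from collections.abc import Mapping
from typing import Any, Dict, List


def expand_runs(
    factors: Mapping, repetitions: int, max_runs: int, seed: int
) -> List[Dict[str, Any]]:
    """Expand factor levels into a list of run configurations."""
    names = list(factors)
    # Full factorial plan: every combination of factor levels.
    plan = [
        dict(zip(names, levels))
        for levels in itertools.product(*(factors[n] for n in names))
    ]
    if len(plan) > max_runs:
        # Stand-in only: seeded random subsample, NOT Latin hypercube/minimax.
        plan = random.Random(seed).sample(plan, max_runs)
    # Every remaining configuration is repeated as a separate run.
    return [
        {**config, "repetition": rep}
        for config in plan
        for rep in range(repetitions)
    ]


# Hypothetical factors: 3 x 2 = 6 combinations in the full factorial plan.
factors = {"learning_rate": [0.001, 0.01, 0.1], "environment": ["env_a", "env_b"]}
full = expand_runs(factors, repetitions=2, max_runs=100, seed=42)
capped = expand_runs(factors, repetitions=2, max_runs=4, seed=42)
print(len(full), len(capped))
```

Calling expand_runs twice with the same seed yields the same capped plan, which mirrors why a fixed seed makes the generated set of runs reproducible.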
Definitions¶
This section defines the building blocks for an experiment. It consists of the following elements, corresponding to the subsections below:
agents (Agents)
environments
sensors
actuators
simulation (Simulation)
phase_config
run_config
Agents¶
Defines agents, which includes the algorithms they employ, their respective objective function, as well as whether existing models should be loaded or offline-training should be attempted (the latter must be supported by the respective algorithm). This section does not define sensor/actuator assignments, as this is part of the design of experiments.
Each agent definition lives in its own block, which has a unique identifier, e.g.:
definitions:
  agents:
    agent_1:
      # (Agent definition)
    "Agent Two":
      # (...)
Agent definitions contain the following configuration options:
name
User-visible name of the agent; can be freely chosen and does not have to be unique, although it helps.
brain
The learning algorithm part of the agent; a name-params block
muscle
The rollout part of the agent algorithm; a name-params block
objective
The particular objective function of the agent; a name-params block
load
Specifies an existing model to be loaded; must be compatible with the algorithms defined in brain and muscle. The load section is optional: if not given, a new agent model will be created from scratch. This section contains the following keys:
agent
Unique identifier (not name) of the agent to be loaded
experiment_run
An experiment run during which the model was created
phase
Number of the phase after which the model was saved.
replay
Allows trajectories from one or more other runs to be loaded for the agent to train on offline. The replay section is optional. Under the replay key, a list of objects with the following keys can be given:
agent
Unique identifier (not name) of the agent to be loaded
experiment_run
An experiment run during which the model was created
phase
Number of the phase after which the model was saved.
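The shape of an agent definition can be made concrete with a small structural check. This is a sketch under assumptions: the function check_agent_definition, the class paths under my_package, and the example values are hypothetical, and treating brain, muscle, and objective as mandatory is an assumption rather than something palaestrAI is known to enforce this way.

```python
from collections.abc import Mapping
from typing import List

NAME_PARAMS_KEYS = ("brain", "muscle", "objective")
SOURCE_KEYS = ("agent", "experiment_run", "phase")


def check_agent_definition(definition: Mapping) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    for key in NAME_PARAMS_KEYS:
        block = definition.get(key)
        # A name-params block needs a "module:Class" style name.
        if not isinstance(block, Mapping) or ":" not in str(block.get("name", "")):
            problems.append(f"{key}: expected a name-params block")
    # "load" is a single mapping, "replay" a list of mappings; both optional.
    sources = []
    if "load" in definition:
        sources.append(("load", definition["load"]))
    for index, item in enumerate(definition.get("replay", [])):
        sources.append((f"replay[{index}]", item))
    for label, source in sources:
        missing = [k for k in SOURCE_KEYS if k not in source]
        if missing:
            problems.append(f"{label}: missing {missing}")
    return problems


# Hypothetical agent definition, written as the dict a YAML parser would yield.
agent_1 = {
    "name": "My Agent",
    "brain": {"name": "my_package.brain:MyBrain", "params": {}},
    "muscle": {"name": "my_package.muscle:MyMuscle", "params": {}},
    "objective": {"name": "my_package.objective:MyObjective", "params": {}},
    "load": {"agent": "agent_1", "experiment_run": "earlier-run", "phase": 0},
}
problems = check_agent_definition(agent_1)
print(problems)
```

Note that load and replay refer to the agent by its unique identifier (the key of the definition block, here agent_1), not by the user-visible name.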
Simulation¶
Within an episode, at least one agent acts within one environment; usually, however, there is more than one actor. When agents act and environments are updated is determined by a simulation controller; flow control is handled by termination conditions.
Flow control definitions can be factors as part of an experiment design. Therefore, each definition has a (per-experiment) unique name. E.g.:
definitions:
  simulation:
    taking_turns:
      name: palaestrai.simulation:TakingTurns
      conditions:
        - name: palaestrai.experiment:EnvironmentTerminationCondition
          params: {}
Simulation definitions take the following parameters:
name
The “name” part of a name-params block. Note that there is no separate “params” block, though.
conditions
A list of termination conditions, as name-params blocks.
Read the chapter on Simulation Flow Control to learn more about simulation controllers and termination conditions.