1. Getting Started: Executing a Dummy Experiment Run
This tutorial gently guides you through your first end-to-end run with palaestrAI. Here, you will learn what an experiment run file looks like, how to execute it, and how to query the store for data. We will stay within palaestrAI core and neither import any hARL agent nor use an extended palaestrAI environment. As such, the agents will perform only random actions in a dummy environment. That might not look super exciting, but then again, it’s good to start with baby steps and train the mighty agents later on…!
This tutorial will call the palaestrAI API directly from the notebook. The command-line interface (CLI) does exactly that under the hood, too: There is no difference in the general usage or the layout of the experiment run files. But with the Jupyter notebook, we can have everything neatly in one place.
So sit back and follow us through your first experiment run… Have a lot of fun!
1.1. Imports
Let’s start by importing necessary modules. This will be what we need for palaestrAI, namely the entrypoint, the runtime config, and the database access stuff:
[45]:
import palaestrai  # Will provide palaestrai.execute
import palaestrai.core # RuntimeConfig
import palaestrai.store # store.Session for database connectivity
import palaestrai.store.database_util
import palaestrai.store.database_model as paldb
The typical data science analysis toolstack uses pandas and matplotlib, so let’s import those, too.
[46]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
We will need jsonpickle to inspect the reward information objects later on. Here, we also need to use the jsonpickle extension for numpy:
[47]:
import jsonpickle
import jsonpickle.ext.numpy as jsonpickle_numpy
jsonpickle_numpy.register_handlers()
There are also some of the usual suspects from Python’s standard library, which we’ll import here without further comment:
[48]:
import io
import os
import pprint
import tempfile
from pathlib import Path
1.2. Experiment Run Document
Everything palaestrAI does depends on its configuration, or rather, experiments. When you do real design of experiments, you first create an experiment document, in which you define strategies for sampling your factors. Each sample is an experiment run, which will be executed by palaestrAI. We won’t do the full DoE dance here, but rather provide an experiment run document directly.
Experiments and experiment runs have unique names (uid). When they’re not given, they are auto-generated, but usually the user wants to set them in order to find them in the store later on. Choosing a good name might seem hard (it isn’t, any string will do); being forced to choose a unique name might seem an unnecessary constraint. However, it isn’t: Each experiment run must be repeatable, i.e., always have the same result, no matter how often it is run. A change in an experiment run definition can yield different results. Therefore, each experiment run is unique, and so should its name be. We will define the experiment run name as a separate variable so that we don’t have to remember it later on when we query the store:
[49]:
experiment_run_name = "Tutorial Experiment Run"
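Repeatability in practice hinges on the seed that the run document carries. As a minimal illustration, using only Python's standard random module rather than palaestrAI itself, two generators seeded with the same value produce the same sequence. The helper name sample_actions is made up for this sketch.

```python
import random


def sample_actions(seed, n=5):
    """Draw n pseudo-random 'actions' from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator; global state is untouched
    return [rng.random() for _ in range(n)]


first = sample_actions(47)
second = sample_actions(47)
other = sample_actions(48)

print(first == second)  # same seed, same sequence
print(first == other)   # a different seed gives a different sequence
```

This is only an analogy for why a fixed uid and seed belong together: change either the definition or the seed, and the results may differ.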
Experiment (run) documents also have a version. It serves as a discriminator to catch semantic changes in the document. It is an additional safeguard: a mismatch emits a log message, but it does not stop the run.
For this tutorial, we set the document’s version to palaestrAI’s version. That is okay here since we need to keep this document up to date in any case. When experiment runs are archived, the version number (and its immutability!) becomes more important.
[50]:
experiment_run_version = "3.4"
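The idea of a version check that warns without aborting can be pictured with a few lines of standard-library Python. This is a hypothetical sketch, not palaestrAI's actual check: the function name check_document_version and the choice to compare only the major and minor parts are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("version_check_sketch")


def check_document_version(document_version, installed_version):
    """Return True if major.minor match; otherwise log a warning, return False.

    Hypothetical helper: it never raises, so a mismatch cannot stop a run.
    """
    same = document_version.split(".")[:2] == installed_version.split(".")[:2]
    if not same:
        logger.warning(
            "Installed version %s differs from run file version %s.",
            installed_version,
            document_version,
        )
    return same


ok = check_document_version("3.4", "3.5.0")
print(ok)
```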
And now to the document itself. Apart from the uid, the version, and the random seed (seed), it provides the configuration of the experiment run. Experiment runs have phases, so the most important key here is the experiment schedule.
A schedule defines the phases of an experiment run. A phase comprises environments, agents, simulation parameters such as the termination condition, as well as general configuration flags. Schedule configurations are cascading: Values defined in the previous phase are applied to following phases, too, unless they are explicitly overwritten.
In our example, we have three phases in our schedule. The first phase trains only one agent, the second trains two in the same environment, and finally, there is a third phase as testing stage.
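To make the cascading rule concrete, here is a rough sketch with plain dictionaries. It is not palaestrAI's loader: the function cascade_phases is hypothetical, and it replaces whole top-level keys, so whether nested keys are merged individually is outside the scope of this sketch.

```python
import copy


def cascade_phases(schedule):
    """Resolve a list of single-key phase dicts so that each phase inherits
    top-level keys from its predecessor unless it redefines them."""
    resolved = {}
    previous = {}
    for entry in schedule:
        (phase_name, phase_def), = entry.items()  # exactly one key per entry
        current = copy.deepcopy(previous)
        current.update(copy.deepcopy(phase_def))  # redefined keys win
        resolved[phase_name] = current
        previous = current
    return resolved


schedule = [
    {"phase_0": {"agents": ["mighty_defender"],
                 "phase_config": {"mode": "train", "episodes": 5}}},
    {"phase_1": {"agents": ["mighty_defender", "evil_attacker"],
                 "phase_config": {"mode": "train", "episodes": 2}}},
    {"phase_2": {"phase_config": {"mode": "test", "episodes": 3}}},
]

phases = cascade_phases(schedule)
print(phases["phase_2"]["agents"])        # inherited from phase_1
print(phases["phase_2"]["phase_config"])  # redefined in phase_2
```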
(Please note that we’re using an f-string here, and hence the YAML dict {} becomes {{}}.)
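A quick check of the brace escaping, independent of palaestrAI: inside an f-string, single braces interpolate a value, while doubled braces produce literal ones. The key objective_params below is only a placeholder for this demonstration.

```python
name = "Tutorial Experiment Run"

snippet = f"""
uid: "{name}"
params: {{ }}
objective_params: {{"params": 1}}
"""

print(snippet)
```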
[51]:
experiment_run_document = f"""
uid: "{experiment_run_name}"
seed: 47  # Not quite Star Trek, but...
version: "{experiment_run_version}"
schedule:  # The schedule for this run; it is a list
  - phase_0:  # Name of the current phase. Can be any user-chosen name
      environments:  # Definition of the environments for this phase
        - environment:
            name: palaestrai.environment.dummy_environment:DummyEnvironment
            uid: denv
            params: {{ }}
      agents:  # Definition of agents for this phase
        - name: mighty_defender
          brain:
            name: palaestrai.agent.dummy_brain:DummyBrain
            params: {{ }}
          muscle:
            name: palaestrai.agent.dummy_muscle:DummyMuscle
            params: {{ }}
          objective:
            name: palaestrai.agent.dummy_objective:DummyObjective
            params: {{"params": 1}}
          sensors: [denv.0, denv.1, denv.2, denv.3, denv.4]
          actuators: [denv.0, denv.1, denv.2, denv.3, denv.4]
      simulation:  # Definition of the simulation controller for this phase
        name: palaestrai.simulation:VanillaSimulationController
        conditions:
          - name: palaestrai.simulation:VanillaSimControllerTerminationCondition
            params: {{ }}
      phase_config:  # Additional config for this phase
        mode: train
        worker: 1
        episodes: 5
  - phase_1:  # Name of the current phase. Can be any user-chosen name
      agents:  # Definition of agents for this phase
        - name: mighty_defender
          brain:
            name: palaestrai.agent.dummy_brain:DummyBrain
            params: {{ }}
          muscle:
            name: palaestrai.agent.dummy_muscle:DummyMuscle
            params: {{ }}
          objective:
            name: palaestrai.agent.dummy_objective:DummyObjective
            params: {{"params": 1}}
          sensors: [denv.0, denv.1, denv.2, denv.3, denv.4]
          actuators: [denv.0, denv.1, denv.2, denv.3, denv.4]
        - name: evil_attacker
          brain:
            name: palaestrai.agent.dummy_brain:DummyBrain
            params: {{ }}
          muscle:
            name: palaestrai.agent.dummy_muscle:DummyMuscle
            params: {{ }}
          objective:
            name: palaestrai.agent.dummy_objective:DummyObjective
            params: {{"params": 1}}
          sensors: [denv.5, denv.6, denv.7, denv.8, denv.9]
          actuators: [denv.5, denv.6, denv.7, denv.8, denv.9]
      simulation:  # Definition of the simulation controller for this phase
        name: palaestrai.simulation:VanillaSimulationController
        conditions:
          - name: palaestrai.simulation:VanillaSimControllerTerminationCondition
            params: {{ }}
      phase_config:  # Additional config for this phase
        mode: train
        worker: 1
        episodes: 2
  - phase_2:  # Definition of the third phase. Keeps all information
              # from the previous phase except for those keys that are
              # redefined here.
      phase_config:
        mode: test
        episodes: 3
run_config:  # Not a runTIME config
  condition:
    name: palaestrai.experiment:VanillaRunGovernorTerminationCondition
    params: {{ }}
"""
1.3. Runtime Config
With the experiment run neatly defined, there is something else that defines how palaestrAI behaves: Its runtime config. It has nothing to do with an experiment run, but defines the behavior of palaestrAI on a certain machine. This includes log levels or the URI defining how to connect to the database. Usually, one does not touch it once the framework is installed.
In this case, we’re playing it safe and provide some sane defaults that are only relevant for the scope of this notebook. For example, we’ll resort to using SQLite in a temporary directory instead of PostgreSQL + TimescaleDB (speed is not of importance here).
Let’s create the database in a temporary location:
[52]:
store_dir = tempfile.TemporaryDirectory()
store_dir
[52]:
<TemporaryDirectory '/tmp/tmp4ke3ntu0'>
[53]:
runtime_config = palaestrai.core.RuntimeConfig()
runtime_config.reset()
runtime_config.load(
{
"store_uri": "sqlite:///%s/palaestrai.db" % store_dir.name,
"executor_bus_port": 4747,
"logger_port": 4748,
}
)
pprint.pprint(runtime_config.to_dict())
{'data_path': './_outputs',
'executor_bus_port': 4747,
'logger_port': 4748,
'logging': {'filters': {'debug_filter': {'()': 'palaestrai.core.runtime_config.DebugLogFilter'}},
'formatters': {'debug': {'format': '%(asctime)s '
'%(name)s[%(process)d]: '
'%(levelname)s - %(message)s '
'(%(module)s.%(funcName)s in '
'%(filename)s:%(lineno)d)'},
'simple': {'format': '%(asctime)s '
'%(name)s[%(process)d]: '
'%(levelname)s - '
'%(message)s'}},
'handlers': {'console': {'class': 'logging.StreamHandler',
'formatter': 'simple',
'level': 'INFO',
'stream': 'ext://sys.stdout'},
'console_debug': {'class': 'logging.StreamHandler',
'filters': ['debug_filter'],
'formatter': 'debug',
'level': 'DEBUG',
'stream': 'ext://sys.stdout'}},
'loggers': {'palaestrai.agent': {'level': 'ERROR'},
'palaestrai.agent.agent_conductor': {'level': 'ERROR'},
'palaestrai.agent.brain': {'level': 'ERROR'},
'palaestrai.agent.muscle': {'level': 'ERROR'},
'palaestrai.core': {'level': 'ERROR'},
'palaestrai.environment': {'level': 'ERROR'},
'palaestrai.experiment': {'level': 'ERROR'},
'palaestrai.simulation': {'level': 'ERROR'},
'palaestrai.store': {'level': 'ERROR'},
'palaestrai.types': {'level': 'ERROR'},
'palaestrai.util': {'level': 'ERROR'},
'palaestrai.visualization': {'level': 'ERROR'},
'sqlalchemy.engine': {'level': 'ERROR'}},
'root': {'handlers': ['console', 'console_debug'],
'level': 'ERROR'},
'version': 1},
'major_domo_client_retries': 3,
'major_domo_client_timeout': 300000,
'profile': False,
'public_bind': False,
'store_uri': 'sqlite:////tmp/tmp4ke3ntu0/palaestrai.db',
'time_series_store_uri': 'influx+localhost:8086'}
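The store_uri in the output above shows four slashes after sqlite:. That follows from plain string formatting: the scheme prefix sqlite:/// is joined with an absolute directory path that itself starts with a slash. A small stand-alone sketch of that step, using its own throwaway directory:

```python
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # tmp_dir is an absolute path, e.g. /tmp/tmpabc123 on Linux
    store_uri = "sqlite:///%s/palaestrai.db" % tmp_dir
    print(store_uri)
```

Note that the with block deletes the directory on exit; the notebook keeps the TemporaryDirectory object around instead so that the database file survives for the following cells.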
The nice thing about the RuntimeConfig is that it is a singleton available everywhere in the framework. So whatever we set here applies throughout the run.
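For readers unfamiliar with the pattern, the following sketch shows what being a singleton means in general. The class TinyConfig is a hypothetical stand-in; how RuntimeConfig is implemented internally is not shown in this tutorial.

```python
class TinyConfig:
    """Hypothetical stand-in that always hands out the same instance."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.values = {}
        return cls._instance


a = TinyConfig()
a.values["logger_port"] = 4748

b = TinyConfig()  # requested somewhere else, e.g. in another module
print(b.values["logger_port"])
print(a is b)
```

Both names refer to the same object, which is why a value set in one place is visible wherever the configuration is requested again.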
1.4. Database Initialization
Since we’ve opted to start fresh with a new SQLite database in a temporary directory, we will have to create and initialize it. Usually, one does this once (e.g., from the CLI with palaestrai database-create) and is then done with it, but in this case we do it every time we run the notebook; it is a one-shot tutorial, after all. :-)
Luckily, palaestrAI has just the function we need to do it for us:
[54]:
palaestrai.store.database_util.setup_database(runtime_config.store_uri)
Could not create extension timescaledb and create hypertables: (sqlite3.OperationalError) near "EXTENSION": syntax error
[SQL: CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;]
(Background on this error at: https://sqlalche.me/e/14/e3q8). Your database setup might lead to noticeable slowdowns with larger experiment runs. Please upgrade to PostgreSQL with TimescaleDB for the best performance.
You will see a warning regarding the TimescaleDB extension. That is okay and just a warning. Since we’re not running a big, sophisticated experiment, we can live with a bit of a performance penalty.
1.5. Experiment Run Execution
Next up: Actually executing the experiment run! It consists of just one line: a call to palaestrai.execute(). This function can cope with three types of parameters:
- An ExperimentRun object. Nice in cases where one has already loaded it (e.g., deserialized it).
- A str. palaestrai.execute() interprets this as a path to a file, one of the most common use cases.
- A TextIO object: Any stream that delivers text. Useful when the experiment run document is not yet deserialized, and exactly what we need.
To turn a str into a TextIO, we simply wrap it into a StringIO object. Make it so!
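The distinction between a path string and a text stream can be sketched with the standard library alone. The function describe_source is hypothetical and only illustrates the kind of type dispatch described above; it makes no claim about how palaestrai.execute() is written.

```python
import io


def describe_source(source):
    """Hypothetical dispatcher: classify how a run document was passed in."""
    if isinstance(source, io.TextIOBase):
        return "text stream"
    if isinstance(source, str):
        return "file path"
    return "already-loaded object"


yaml_text = 'uid: "demo"'
stream = io.StringIO(yaml_text)

print(describe_source(stream))        # a StringIO is a text stream
print(describe_source("my_run.yml"))  # a bare str is treated as a path
print(stream.read() == yaml_text)     # the stream delivers the wrapped text
```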
[55]:
rc = palaestrai.execute(io.StringIO(experiment_run_document))
Your palaestrAI installation has version 3.5.0 but your run file uses version 3.4, which may be incompatible.
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
The execution should yield no errors (and, apart from the version notice above, no warnings either).
[56]:
assert rc[1].name == "EXITED"
1.6. Querying the Store
Let’s get a custom session to the database first:
[57]:
dbh = palaestrai.store.Session()
palaestrAI has no special database access features, only nice object-relational mapper (ORM) bindings provided by SQLAlchemy. This means that we can use all the nice magic SQLAlchemy gives us. So let’s first import it:
[58]:
import sqlalchemy as sa
Do you remember the name of our experiment run? We can now use it to look it up. To do so, we first create a query using sqlalchemy.select, which we then execute.
[59]:
q = sa.select(paldb.ExperimentRun).where(
paldb.ExperimentRun.uid == experiment_run_name
)
str(q)
[59]:
'SELECT experiment_runs.document, experiment_runs.document_json, experiment_runs.id, experiment_runs.uid, experiment_runs.experiment_id \nFROM experiment_runs \nWHERE experiment_runs.uid = :uid_1'
palaestrAI ensures through the uid that each experiment run is stored only once in the database. one() not only retrieves a single element from the query, it also raises an exception if there is no row or more than one row in the result set. Thus:
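The contract of one() is easy to restate in plain Python. The function below is a stand-in written for this explanation and not SQLAlchemy code; SQLAlchemy raises its own exception types (NoResultFound and MultipleResultsFound) rather than the LookupError used here.

```python
def one(rows):
    """Return the single element of `rows`; raise if there are 0 or 2+."""
    rows = list(rows)
    if len(rows) != 1:
        raise LookupError(f"expected exactly one row, got {len(rows)}")
    return rows[0]


print(one(["Tutorial Experiment Run"]))

for bad in ([], ["a", "b"]):
    try:
        one(bad)
    except LookupError as err:
        print("rejected:", err)
```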
[60]:
result = dbh.execute(q).one()
experiment_run_record = result[paldb.ExperimentRun]
experiment_run_record.id, experiment_run_record.uid
[60]:
(1, 'Tutorial Experiment Run')
…yes, that’s us.
No matter how often an experiment run is executed, there will be only one entry for the same UID in the table. Each execution, however, creates a new experiment run instance. Here, since we ran it only once, we will see only one experiment run instance.
Through the SQLAlchemy ORM, we can access the experiment run instances directly:
[61]:
experiment_run_record.experiment_run_instances
[61]:
[<palaestrai.store.database_model.ExperimentRunInstance at 0x7f6df5a04a60>]
If we executed the run again, we would see two entries in the list here:
[62]:
rc = palaestrai.execute(io.StringIO(experiment_run_document))
assert rc[1].name == "EXITED"
Your palaestrAI installation has version 3.5.0 but your run file uses version 3.4, which may be incompatible.
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
1
2
3
4
5
6
7
8
9
10
[63]:
dbh.refresh(experiment_run_record)
experiment_run_record.experiment_run_instances
[63]:
[<palaestrai.store.database_model.ExperimentRunInstance at 0x7f6df5a04a60>,
<palaestrai.store.database_model.ExperimentRunInstance at 0x7f6df5b57fa0>]
[64]:
assert len(experiment_run_record.experiment_run_instances) > 1
Now let’s focus on the run phases. Each instance will have several of them: three, to be precise. Remember our experiment run document? It defines three phases, so let’s find them in the database:
[65]:
experiment_run_record.experiment_run_instances[1]
[65]:
<palaestrai.store.database_model.ExperimentRunInstance at 0x7f6df5b57fa0>
[66]:
assert (
len(
experiment_run_record.experiment_run_instances[0].experiment_run_phases
)
== 3
)
Next up: Who participated in each run phase? We can define participants for each run phase separately. In our experiment run document, we decided that first one agent may train on its own, then two agents train together, and finally there is a test phase for both. So that is what we want to see now.
However, simply exploring the ORM object by object is not much fun in a Jupyter notebook. Thankfully, SQLAlchemy and pandas interface nicely: We can construct a query in SQLAlchemy with our ORM and then hand it over to pandas to construct a dataframe out of it:
[67]:
pd.read_sql(
sa.select(paldb.Agent).where(
paldb.Agent.experiment_run_phase_id.in_(
phase.id
for phase in experiment_run_record.experiment_run_instances[
0
].experiment_run_phases
)
),
dbh.bind,
)
[67]:
  | id | uid | name | muscles | configuration | experiment_run_phase_id |
---|---|---|---|---|---|---|
0 | 1 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:0fde7238-1e1b-4c2e-... | {'name': 'mighty_defender', 'brain': {'name': ... | 1 |
1 | 2 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:814e074f-1182-41c1-... | {'name': 'mighty_defender', 'brain': {'name': ... | 2 |
2 | 3 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:914d21dd-fd8b-43cf-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 2 |
3 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 |
4 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 |
Okay, now that we have explored many things, let’s find out how good our agents were! Let us start by looking at how well the first agent trained when it was alone. Each agent gets a new ID when it enters a new experiment run phase, regardless of whether it’s the same agent as before or a new one. (The discriminating element is the agent’s name.)
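Since the name is what ties the per-phase records together, grouping by name recovers each agent's trail through the phases. The sketch below uses plain tuples copied from the id, name, and experiment_run_phase_id columns of the agent table above, so it runs without a database.

```python
from collections import defaultdict

# (agent id, agent name, experiment_run_phase_id), copied from the table above
agent_rows = [
    (1, "mighty_defender", 1),
    (2, "mighty_defender", 2),
    (3, "evil_attacker", 2),
    (4, "mighty_defender", 3),
    (5, "evil_attacker", 3),
]

# Collect (phase id, agent id) pairs per agent name
ids_by_name = defaultdict(list)
for agent_id, name, phase_id in agent_rows:
    ids_by_name[name].append((phase_id, agent_id))

for name, entries in ids_by_name.items():
    print(name, entries)
```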
We first need the ID of the first experiment run phase:
[68]:
run_phase_id = min(
phase.id
for phase in experiment_run_record.experiment_run_instances[
0
].experiment_run_phases
)
run_phase_id
[68]:
1
Okay, which agent is it?
[69]:
agent_record = dbh.execute(
sa.select(paldb.Agent).where(
paldb.Agent.experiment_run_phase_id == run_phase_id
)
).one()[paldb.Agent]
assert agent_record.name == "mighty_defender"
agent_record.id, agent_record.name
[69]:
(1, 'mighty_defender')
[70]:
actions = pd.read_sql(
sa.select(paldb.MuscleAction).where(
paldb.MuscleAction.agent_id == agent_record.id
),
dbh.bind,
)
actions
[70]:
  | id | walltime | agent_id | simtimes | sensor_readings | actuator_setpoints | rewards | objective |
---|---|---|---|---|---|---|---|---|
0 | 1 | 2023-12-13 14:04:42.238246 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [] | [] | [{'py/object': 'palaestrai.agent.reward_inform... | 0.0 |
1 | 2 | 2023-12-13 14:04:42.269778 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 1.0 |
2 | 3 | 2023-12-13 14:04:42.298354 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 3.0 |
3 | 4 | 2023-12-13 14:04:42.324810 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 6.0 |
4 | 5 | 2023-12-13 14:04:42.354093 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 10.0 |
5 | 6 | 2023-12-13 14:04:42.382757 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 15.0 |
6 | 7 | 2023-12-13 14:04:42.410707 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 21.0 |
7 | 8 | 2023-12-13 14:04:42.439630 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 28.0 |
8 | 9 | 2023-12-13 14:04:42.469556 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 36.0 |
9 | 10 | 2023-12-13 14:04:42.497725 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 45.0 |
10 | 11 | 2023-12-13 14:04:42.531033 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [] | 45.0 |
11 | 12 | 2023-12-13 14:04:42.559694 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 45.0 |
12 | 13 | 2023-12-13 14:04:42.589040 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 46.0 |
13 | 14 | 2023-12-13 14:04:42.618250 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 48.0 |
14 | 15 | 2023-12-13 14:04:42.647749 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 51.0 |
15 | 16 | 2023-12-13 14:04:42.675242 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 55.0 |
16 | 17 | 2023-12-13 14:04:42.706640 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 60.0 |
17 | 18 | 2023-12-13 14:04:42.734883 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 66.0 |
18 | 19 | 2023-12-13 14:04:42.763309 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 73.0 |
19 | 20 | 2023-12-13 14:04:42.791586 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 81.0 |
20 | 21 | 2023-12-13 14:04:42.820377 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 90.0 |
21 | 22 | 2023-12-13 14:04:42.852897 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [] | 90.0 |
22 | 23 | 2023-12-13 14:04:42.882183 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 90.0 |
23 | 24 | 2023-12-13 14:04:42.914938 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 91.0 |
24 | 25 | 2023-12-13 14:04:42.949641 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 93.0 |
25 | 26 | 2023-12-13 14:04:42.989857 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 96.0 |
26 | 27 | 2023-12-13 14:04:43.023344 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 100.0 |
27 | 28 | 2023-12-13 14:04:43.055970 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 105.0 |
28 | 29 | 2023-12-13 14:04:43.086548 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 111.0 |
29 | 30 | 2023-12-13 14:04:43.124622 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 118.0 |
30 | 31 | 2023-12-13 14:04:43.156242 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 126.0 |
31 | 32 | 2023-12-13 14:04:43.187564 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 135.0 |
32 | 33 | 2023-12-13 14:04:43.224615 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [] | 135.0 |
33 | 34 | 2023-12-13 14:04:43.255582 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 135.0 |
34 | 35 | 2023-12-13 14:04:43.287002 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 136.0 |
35 | 36 | 2023-12-13 14:04:43.321605 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 138.0 |
36 | 37 | 2023-12-13 14:04:43.353263 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 141.0 |
37 | 38 | 2023-12-13 14:04:43.383604 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 145.0 |
38 | 39 | 2023-12-13 14:04:43.413991 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 150.0 |
39 | 40 | 2023-12-13 14:04:43.446556 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 156.0 |
40 | 41 | 2023-12-13 14:04:43.476132 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 163.0 |
41 | 42 | 2023-12-13 14:04:43.505711 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 171.0 |
42 | 43 | 2023-12-13 14:04:43.536784 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 180.0 |
43 | 44 | 2023-12-13 14:04:43.576547 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [] | 180.0 |
44 | 45 | 2023-12-13 14:04:43.607640 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 180.0 |
45 | 46 | 2023-12-13 14:04:43.638726 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 181.0 |
46 | 47 | 2023-12-13 14:04:43.671791 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 183.0 |
47 | 48 | 2023-12-13 14:04:43.704690 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 186.0 |
48 | 49 | 2023-12-13 14:04:43.739740 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 190.0 |
49 | 50 | 2023-12-13 14:04:43.774656 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 195.0 |
50 | 51 | 2023-12-13 14:04:43.821513 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 201.0 |
51 | 52 | 2023-12-13 14:04:43.857055 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 208.0 |
52 | 53 | 2023-12-13 14:04:43.893855 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 216.0 |
53 | 54 | 2023-12-13 14:04:43.930349 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 225.0 |
Okay, but how do we get rewards out of this? The rewards column contains a list of RewardInformation objects. In our case, we know that there will only ever be one (more than one is a special case). We also know that its value will always be a float. This comes from our knowledge of the reward, i.e., it is really domain knowledge that an experimenter will have.
At this point, we need to modify the dataframe a bit. The full object could be restored with jsonpickle; here, we read the reward value directly from its serialized state instead. Series.apply() serves us well here. To make it more readable, we provide a function for this.
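Before applying the function to the whole column, it can help to see the access pattern on a single cell. The sample below is a hand-written stand-in for one serialized entry and contains only the keys that the function touches; a real record carries more fields.

```python
def unpack_reward(cell):
    """Return the first reward value as float, or 0.0 for an empty list."""
    return float(cell[0]["py/state"]["value"]) if cell else 0.0


# Hand-written stand-in for one cell of the rewards column (assumption:
# only the keys read by unpack_reward are included).
sample_cell = [{"py/state": {"value": 3}}]

print(unpack_reward(sample_cell))
print(unpack_reward([]))
```

The fallback to 0.0 covers rows whose rewards list is empty, such as the very first row of the table.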
[76]:
def unpack_reward(x):
    # x is the list stored in the rewards column; an empty list counts as 0.0
    return float(x[0]["py/state"]["value"]) if x else 0.0


actions.rewards = actions.rewards.apply(unpack_reward)
actions
[76]:
id | walltime | agent_id | simtimes | sensor_readings | actuator_setpoints | rewards | objective | |
---|---|---|---|---|---|---|---|---|
0 | 1 | 2023-12-13 14:04:42.238246 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [] | [] | 0.0 | 0.0 |
1 | 2 | 2023-12-13 14:04:42.269778 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 1.0 | 1.0 |
2 | 3 | 2023-12-13 14:04:42.298354 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 2.0 | 3.0 |
3 | 4 | 2023-12-13 14:04:42.324810 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 3.0 | 6.0 |
4 | 5 | 2023-12-13 14:04:42.354093 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 4.0 | 10.0 |
5 | 6 | 2023-12-13 14:04:42.382757 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 5.0 | 15.0 |
6 | 7 | 2023-12-13 14:04:42.410707 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 6.0 | 21.0 |
7 | 8 | 2023-12-13 14:04:42.439630 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 7.0 | 28.0 |
8 | 9 | 2023-12-13 14:04:42.469556 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 8.0 | 36.0 |
9 | 10 | 2023-12-13 14:04:42.497725 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 9.0 | 45.0 |
10 | 11 | 2023-12-13 14:04:42.531033 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 45.0 |
11 | 12 | 2023-12-13 14:04:42.559694 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 45.0 |
12 | 13 | 2023-12-13 14:04:42.589040 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 1.0 | 46.0 |
13 | 14 | 2023-12-13 14:04:42.618250 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 2.0 | 48.0 |
14 | 15 | 2023-12-13 14:04:42.647749 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 3.0 | 51.0 |
15 | 16 | 2023-12-13 14:04:42.675242 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 4.0 | 55.0 |
16 | 17 | 2023-12-13 14:04:42.706640 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 5.0 | 60.0 |
17 | 18 | 2023-12-13 14:04:42.734883 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 6.0 | 66.0 |
18 | 19 | 2023-12-13 14:04:42.763309 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 7.0 | 73.0 |
19 | 20 | 2023-12-13 14:04:42.791586 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 8.0 | 81.0 |
20 | 21 | 2023-12-13 14:04:42.820377 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 9.0 | 90.0 |
21 | 22 | 2023-12-13 14:04:42.852897 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 90.0 |
22 | 23 | 2023-12-13 14:04:42.882183 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 90.0 |
23 | 24 | 2023-12-13 14:04:42.914938 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 1.0 | 91.0 |
24 | 25 | 2023-12-13 14:04:42.949641 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 2.0 | 93.0 |
25 | 26 | 2023-12-13 14:04:42.989857 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 3.0 | 96.0 |
26 | 27 | 2023-12-13 14:04:43.023344 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 4.0 | 100.0 |
27 | 28 | 2023-12-13 14:04:43.055970 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 5.0 | 105.0 |
28 | 29 | 2023-12-13 14:04:43.086548 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 6.0 | 111.0 |
29 | 30 | 2023-12-13 14:04:43.124622 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 7.0 | 118.0 |
30 | 31 | 2023-12-13 14:04:43.156242 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 8.0 | 126.0 |
31 | 32 | 2023-12-13 14:04:43.187564 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 9.0 | 135.0 |
32 | 33 | 2023-12-13 14:04:43.224615 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 135.0 |
33 | 34 | 2023-12-13 14:04:43.255582 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 135.0 |
34 | 35 | 2023-12-13 14:04:43.287002 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 1.0 | 136.0 |
35 | 36 | 2023-12-13 14:04:43.321605 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 2.0 | 138.0 |
36 | 37 | 2023-12-13 14:04:43.353263 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 3.0 | 141.0 |
37 | 38 | 2023-12-13 14:04:43.383604 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 4.0 | 145.0 |
38 | 39 | 2023-12-13 14:04:43.413991 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 5.0 | 150.0 |
39 | 40 | 2023-12-13 14:04:43.446556 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 6.0 | 156.0 |
40 | 41 | 2023-12-13 14:04:43.476132 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 7.0 | 163.0 |
41 | 42 | 2023-12-13 14:04:43.505711 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 8.0 | 171.0 |
42 | 43 | 2023-12-13 14:04:43.536784 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 9.0 | 180.0 |
43 | 44 | 2023-12-13 14:04:43.576547 | 1 | {} | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 180.0 |
44 | 45 | 2023-12-13 14:04:43.607640 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 0.0 | 180.0 |
45 | 46 | 2023-12-13 14:04:43.638726 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 1.0 | 181.0 |
46 | 47 | 2023-12-13 14:04:43.671791 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 2.0 | 183.0 |
47 | 48 | 2023-12-13 14:04:43.704690 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 3.0 | 186.0 |
48 | 49 | 2023-12-13 14:04:43.739740 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 4.0 | 190.0 |
49 | 50 | 2023-12-13 14:04:43.774656 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 5.0 | 195.0 |
50 | 51 | 2023-12-13 14:04:43.821513 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 6.0 | 201.0 |
51 | 52 | 2023-12-13 14:04:43.857055 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 7.0 | 208.0 |
52 | 53 | 2023-12-13 14:04:43.893855 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 8.0 | 216.0 |
53 | 54 | 2023-12-13 14:04:43.930349 | 1 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 9.0 | 225.0 |
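The unpacking step can also be tried without a database. The following is a minimal sketch on hand-written stand-in data: only the py/state and value keys that the function reads are reproduced, and fake_reward() as well as the py/object placeholder string are hypothetical, not real palaestrAI output. It also checks an observation from the table above, namely that in this dummy run the objective column equals the running sum of the rewards (0, 1, 3, 6, 10, ...); that is a property of this particular output, not a general rule.

```python
import pandas as pd


def unpack_reward(cell):
    """Return the value of the first reward in a cell, or 0.0 if it is empty."""
    return float(cell[0]["py/state"]["value"]) if cell else 0.0


def fake_reward(value):
    """Build a stand-in for one serialized reward (hypothetical helper)."""
    return {
        "py/object": "placeholder.RewardInformation",
        "py/state": {"value": value},
    }


# Stand-in for the queried frame; the numbers mimic the first rows shown above
demo = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "rewards": [[], [fake_reward(1)], [fake_reward(2)], [fake_reward(3)]],
        "objective": [0.0, 1.0, 3.0, 6.0],
    }
)

# Replace each list of serialized rewards by a plain float
demo["rewards"] = demo["rewards"].apply(unpack_reward)

# In this dummy data the objective is the cumulative sum of the rewards
objective_is_running_sum = bool(
    (demo["rewards"].cumsum() == demo["objective"]).all()
)
print(demo)
print("objective == cumsum(rewards):", objective_is_running_sum)
```

The empty list in the first row exercises the 0.0 fallback of the function.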
Plotting is relatively easy now, as pandas already provides us with everything we need.
[77]:
actions.plot(x="id", y="rewards", kind="scatter")
[77]:
<Axes: xlabel='id', ylabel='rewards'>
Okay, but what if we want to compare the agents’ performance during the testing phase? First we need to find out which agents participated in the last experiment run phase. So let’s return to the experiment run phases table:
[78]:
# Despite its name, this frame lists the agents of each phase, latest phase first
experiment_run_phases = pd.read_sql(
    sa.select(paldb.Agent)
    .where(
        paldb.Agent.experiment_run_phase_id.in_(
            phase.id
            for phase in experiment_run_record.experiment_run_instances[
                0
            ].experiment_run_phases
        )
    )
    .order_by(paldb.Agent.experiment_run_phase_id.desc()),
    dbh.bind,
)
experiment_run_phases
[78]:
id | uid | name | muscles | configuration | experiment_run_phase_id | |
---|---|---|---|---|---|---|
0 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 |
1 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 |
2 | 2 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:814e074f-1182-41c1-... | {'name': 'mighty_defender', 'brain': {'name': ... | 2 |
3 | 3 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:914d21dd-fd8b-43cf-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 2 |
4 | 1 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:0fde7238-1e1b-4c2e-... | {'name': 'mighty_defender', 'brain': {'name': ... | 1 |
Okay, the top two rows are the ones we want to look at.
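Rather than reading the rows off the table, the agents of the last phase can also be selected programmatically. This is a minimal sketch on a hand-built stand-in frame whose ids and phase ids are copied from the table above; it assumes that the highest experiment_run_phase_id belongs to the last phase, which matches the descending order used in the query.

```python
import pandas as pd

# Stand-in for the agents frame; values copied from the table above
agents = pd.DataFrame(
    {
        "id": [4, 5, 2, 3, 1],
        "name": [
            "mighty_defender",
            "evil_attacker",
            "mighty_defender",
            "evil_attacker",
            "mighty_defender",
        ],
        "experiment_run_phase_id": [3, 3, 2, 2, 1],
    }
)

# Assumption: the highest phase id marks the last phase of the run
last_phase_id = agents["experiment_run_phase_id"].max()

# Keep only the agents that took part in that phase
last_phase_agents = agents[agents["experiment_run_phase_id"] == last_phase_id]
print(last_phase_agents[["id", "name"]])
```

This keeps working if a phase has more or fewer than two agents, where a fixed slice of the first two rows would not.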
[79]:
# Join every agent of the last phase with the actions it performed
muscle_actions = pd.read_sql(
    sa.select(paldb.Agent, paldb.MuscleAction)
    .join(paldb.Agent.muscle_actions)
    .where(
        paldb.Agent.experiment_run_phase_id.in_(
            experiment_run_phases.experiment_run_phase_id[0:2]
        )
    ),
    dbh.bind,
)
assert len(muscle_actions) > 2
muscle_actions
[79]:
id | uid | name | muscles | configuration | experiment_run_phase_id | id_1 | walltime | agent_id | simtimes | sensor_readings | actuator_setpoints | rewards | objective | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 97 | 2023-12-13 14:05:15.992059 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [] | [] | [{'py/object': 'palaestrai.agent.reward_inform... | 0.0 |
1 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 98 | 2023-12-13 14:05:16.049575 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [] | [] | [{'py/object': 'palaestrai.agent.reward_inform... | 0.0 |
2 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 99 | 2023-12-13 14:05:16.079000 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 1.0 |
3 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 100 | 2023-12-13 14:05:16.079462 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 1.0 |
4 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 101 | 2023-12-13 14:05:16.110210 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 3.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 156 | 2023-12-13 14:05:17.008338 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 118.0 |
60 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 157 | 2023-12-13 14:05:17.041415 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 126.0 |
61 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 158 | 2023-12-13 14:05:17.044670 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 126.0 |
62 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 159 | 2023-12-13 14:05:17.076680 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 135.0 |
63 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 160 | 2023-12-13 14:05:17.077074 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | [{'py/object': 'palaestrai.agent.reward_inform... | 135.0 |
64 rows × 14 columns
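The join-and-filter pattern of the query above can be tried in isolation. The following is a sketch with SQLAlchemy Core (version 1.4 or later is assumed) against an in-memory SQLite database; the two toy tables, their columns, and the inserted rows are hypothetical and much smaller than the palaestrAI store schema. The explicit label on the action id avoids the automatically generated id_1 column name seen in the output above.

```python
import sqlalchemy as sa

# In-memory SQLite database with two toy tables (hypothetical schema)
engine = sa.create_engine("sqlite://")
metadata = sa.MetaData()
agents = sa.Table(
    "agents",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.String),
    sa.Column("phase_id", sa.Integer),
)
actions = sa.Table(
    "actions",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("agent_id", sa.Integer, sa.ForeignKey("agents.id")),
    sa.Column("objective", sa.Float),
)
metadata.create_all(engine)

# Made-up rows: one agent in phase 2, two agents in phase 3
with engine.begin() as conn:
    conn.execute(
        agents.insert(),
        [
            {"id": 2, "name": "mighty_defender", "phase_id": 2},
            {"id": 4, "name": "mighty_defender", "phase_id": 3},
            {"id": 5, "name": "evil_attacker", "phase_id": 3},
        ],
    )
    conn.execute(
        actions.insert(),
        [
            {"id": 1, "agent_id": 2, "objective": 7.0},
            {"id": 2, "agent_id": 4, "objective": 0.0},
            {"id": 3, "agent_id": 5, "objective": 0.0},
            {"id": 4, "agent_id": 4, "objective": 1.0},
        ],
    )

# Join agents to their actions and keep only agents of the selected phases
stmt = (
    sa.select(
        agents.c.name,
        actions.c.id.label("action_id"),
        actions.c.objective,
    )
    .join_from(agents, actions, agents.c.id == actions.c.agent_id)
    .where(agents.c.phase_id.in_([3]))
    .order_by(actions.c.id)
)
with engine.connect() as conn:
    rows = [tuple(row) for row in conn.execute(stmt)]
print(rows)
```

The action of the phase 2 agent is filtered out, and the remaining rows alternate between the two agents in the order the actions were stored.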
Let’s do the reward conversion dance:
[80]:
# Same conversion as before: one float per row instead of a serialized list
muscle_actions.rewards = muscle_actions.rewards.apply(unpack_reward)
muscle_actions
[80]:
id | uid | name | muscles | configuration | experiment_run_phase_id | id_1 | walltime | agent_id | simtimes | sensor_readings | actuator_setpoints | rewards | objective | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 97 | 2023-12-13 14:05:15.992059 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [] | [] | 0.0 | 0.0 |
1 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 98 | 2023-12-13 14:05:16.049575 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [] | [] | 0.0 | 0.0 |
2 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 99 | 2023-12-13 14:05:16.079000 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 1.0 | 1.0 |
3 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 100 | 2023-12-13 14:05:16.079462 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 1.0 | 1.0 |
4 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 101 | 2023-12-13 14:05:16.110210 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 2.0 | 3.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 156 | 2023-12-13 14:05:17.008338 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 7.0 | 118.0 |
60 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 157 | 2023-12-13 14:05:17.041415 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 8.0 | 126.0 |
61 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 158 | 2023-12-13 14:05:17.044670 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 8.0 | 126.0 |
62 | 4 | mighty_defender | mighty_defender | [Tutorial Experiment Run_A:4bb49b68-0870-474b-... | {'name': 'mighty_defender', 'brain': {'name': ... | 3 | 159 | 2023-12-13 14:05:17.076680 | 4 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 9.0 | 135.0 |
63 | 5 | evil_attacker | evil_attacker | [Tutorial Experiment Run_A:89555042-cabb-4077-... | {'name': 'evil_attacker', 'brain': {'name': 'p... | 3 | 160 | 2023-12-13 14:05:17.077074 | 5 | {'denv': {'py/object': 'palaestrai.types.simti... | [{'py/object': 'palaestrai.agent.sensor_inform... | [{'py/object': 'palaestrai.agent.actuator_info... | 9.0 | 135.0 |
64 rows × 14 columns
The table contains rewards, alternating, for both agents. You can see that from the simtime_ticks entry as well as the name column. So let’s plot them; it’s easy now:
[81]:
# Split the rewards by agent and give each series its own column name
defender_actions = muscle_actions[muscle_actions.name == "mighty_defender"][
    ["rewards"]
].rename(columns={"rewards": "defender_rewards"})
attacker_actions = muscle_actions[muscle_actions.name == "evil_attacker"][
    ["rewards"]
].rename(columns={"rewards": "attacker_rewards"})
pd.concat([attacker_actions, defender_actions]).plot()
[81]:
<Axes: >
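Concatenating the two frames row-wise leaves NaN in each column wherever the other agent acted. As an alternative, the rewards can be brought into a wide layout with one column per agent and a shared per-agent step counter. This is only a sketch on made-up values, not part of the tutorial run; the step column is an assumption that the row order per agent reflects the order of that agent's actions.

```python
import pandas as pd

# Stand-in for the joined frame after reward unpacking; reward values are made up
demo_actions = pd.DataFrame(
    {
        "name": [
            "mighty_defender",
            "evil_attacker",
            "mighty_defender",
            "evil_attacker",
            "evil_attacker",
            "mighty_defender",
        ],
        "rewards": [0.0, 0.5, 1.0, 1.5, 2.5, 2.0],
    }
)

# Number each agent's actions 0, 1, 2, ... in order of appearance
demo_actions["step"] = demo_actions.groupby("name").cumcount()

# One row per step, one column per agent
wide = demo_actions.pivot(index="step", columns="name", values="rewards")
print(wide)
```

Calling wide.plot() on such a frame draws one line per agent over the same step axis, which makes the two curves directly comparable.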
1.7. Conclusion#
This concludes our first tutorial. We hope you enjoyed the whole run. If you encountered any errors, head over to the palaestrAI issue tracker on GitLab and let us know!