Access results#
The Store hooks into the global communication between several classes, most notably environment, muscle, and brain. It extracts interesting information and saves it to a database.
Query API#
palaestrAI’s database backend offers a simple (but growing) API for querying. It is accessible from the ::palaestrai.store.query module. The module integrates with the runtime configuration, i.e., the global database credentials configuration is also used here. The query API is convenient as it returns pandas or dask dataframes, which can then be easily used in Jupyter notebooks or for plotting.
Store Query API
palaestrAI provides a hierarchical database schema where experiments, experiment runs, agent/environment configurations, as well as states, actions, and rewards are stored. The ::palaestrai.store.query module provides convenience methods for the data that is requested most of the time, such as experiment configuration or agent rewards. All data is returned either as pandas DataFrame or dask DataFrame, depending on the expected size of the data and query options.
All query functions offer some parameters to filter the query. The easiest option is to pass a data frame via the like_dataframe parameter. This constructs the query according to the contents of the data frame: the column names are the table’s attributes, and all data frame contents are used as filter predicates to construct a query following the schema of WHERE column_name IN [cell1, cell2, cell3] AND column_name_2 IN .... More colloquially, this means that the data frame passed via the like_dataframe parameter contains the data used for filtering.
If a data frame contains the columns experiment_run_uid and agent_name, and the contents are 1, 2, and a1 and a2, respectively, then the results from the database contain only those rows where the experiment run UID is either 1 or 2, and the agent name is either a1 or a2.
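The filtering behaviour of like_dataframe can be illustrated with plain pandas. This sketch only mimics the WHERE … IN … AND … IN … semantics described above; it does not touch the database, and the table contents are made up for illustration:

```python
import pandas as pd

# Hypothetical table contents, as they might come back from the store
table = pd.DataFrame({
    "experiment_run_uid": [1, 2, 3, 1],
    "agent_name": ["a1", "a2", "a1", "a3"],
    "reward": [0.5, 0.7, 0.1, 0.9],
})

# The filter data frame, as one would pass it via like_dataframe
like = pd.DataFrame({
    "experiment_run_uid": [1, 2],
    "agent_name": ["a1", "a2"],
})

# Equivalent of:
#   WHERE experiment_run_uid IN (1, 2) AND agent_name IN ('a1', 'a2')
mask = pd.Series(True, index=table.index)
for column in like.columns:
    mask &= table[column].isin(like[column])
result = table[mask]

# Rows 0 and 1 match; row 2 fails the experiment_run_uid check,
# row 3 fails the agent_name check.
print(result)
```

Note that each column contributes one IN predicate, and all predicates are combined with a logical AND.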
In addition, each query function also has explicitly spelled-out parameters for filtering. E.g., with experiment_run_uids, only the mentioned experiment run UIDs are queried. If multiple parameters are specified, they are joined with an implicit logical AND. E.g., if both experiment_run_uids and experiment_run_phase_uids are specified, then the database is queried for data that belongs to the specified experiment runs AND the specified experiment run phases. The resulting query that is rendered is equivalent to ... experiment_run.uid IN ... AND experiment_run_phase.uid IN ....
In addition, each query function allows for a user-defined predicate to be passed. This parameter, predicate, is expected to be a callable (e.g., a lambda expression) that receives the query object after all other query options are applied. It is expected to return the (modified) query object. For example:

df: pandas.DataFrame = experiments_and_runs_configurations(
    predicate=lambda query: query.limit(5)
)

This would select only five entries from the database.
All results are ordered in descending order, such that the newest entries are always first. I.e., the limit(5) example above would automatically select the five newest entries.
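The interplay of the descending default ordering and a limit(5) predicate can be mimicked with plain pandas. This is only a sketch of the semantics; the id column and the data are invented for illustration:

```python
import pandas as pd

# Ten hypothetical entries; a higher id means a newer entry
entries = pd.DataFrame({
    "id": range(1, 11),
    "value": [v * 0.1 for v in range(10)],
})

# Descending order first, then the limit: limit(5) therefore
# returns the five *newest* entries, not an arbitrary five
newest_five = entries.sort_values("id", ascending=False).head(5)

print(newest_five["id"].tolist())  # [10, 9, 8, 7, 6]
```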
In order to avoid confusion, relation names and attributes are joined by an underscore (_). E.g., the ID of an environment is represented by the environment_id attribute in the resulting data frame.
Each function expects a Session object as first argument, i.e., the access credentials will be those that are stored in the current runtime configuration. If the session parameter is not supplied, a new session will be created automatically. However, the store API does not take care of cleaning up sessions, i.e., running more than one query function without explicitly supplying a session object will most likely lead to dangling open connections to the database.
The best solution is to use a context manager, e.g.:
from palaestrai.store import Session
import palaestrai.store.query as palq

with Session() as session:
    ers = palq.experiments_and_runs_configurations(session)
Warning
The query API is currently in beta state. That means that it is still subject to impending API changes. This includes the way the SensorInformation, ActuatorInformation, and RewardInformation classes are serialized. If you encounter bugs, please report them at the palaestrAI issue tracker for the store subsystem.
Low-Level Access#
When the query API is not sufficient, the database can be accessed using the ORM classes in ::palaestrai.store.database_model. An example would be:
import json

import pandas
import sqlalchemy
import sqlalchemy.orm

import palaestrai.core
import palaestrai.store.database_model as dbm

# pandapower is used by MIDAS environments; this might differ by use case
import pandapower as pp
from pandapower.plotting import simple_plot

# Create a SQLAlchemy engine
alchemy_engine = sqlalchemy.create_engine(
    'postgresql+psycopg2://postgres:passwordforpalaestrai@127.0.0.1:5432/Palaestrai',
    pool_recycle=3600,
)

# Create a session
session_maker = sqlalchemy.orm.sessionmaker()
session_maker.configure(bind=alchemy_engine)
db_session = session_maker()

# Query experiments
query = db_session.query(dbm.Experiment)

# Get the newest experiment: IDs are sorted descending, so index 0 is the newest
exps = query.order_by(dbm.Experiment.id.desc())
exp = exps[0]

# Get all runs
runs = exp.experiment_runs
run = runs[0]

# Get all simulation runs
sims = run.simulation_instances
sim = sims[0]

# Get all muscles
muscles = sim.muscles
muscle1 = muscles[0]
muscle2 = muscles[1]

# Get all muscle actions
m1_actions = muscle1.muscle_actions

# Get rewards from the muscle actions
rewards = [a.reward for a in muscle1.muscle_actions]

# Get environment conductors
ecs = sim.environment_conductors

# Get the first environment conductor
ec = ecs[0]

# Get world states
world_states = ec.world_states

# Get the last ten states, but not the very last one, because it is empty
states = world_states[-10:-1]

# For every state: load the JSON dump, extract the external grid state,
# load the pandapower net, and collect the values
external_grids = None
for state in states:
    world_state_json = json.loads(state.state_dump)
    s = [
        x
        for x in world_state_json
        if x["sensor_id"] == "Powergrid-0.Grid-0.grid_json"
    ]
    net = pp.from_json_string(s[0]["sensor_value"])
    if external_grids is None:
        external_grids = net.res_ext_grid
    else:
        # DataFrame.append was removed in pandas 2.0; use pandas.concat instead
        external_grids = pandas.concat([external_grids, net.res_ext_grid])

# Since the data are present in their original form, all functions from the
# pandapower framework are applicable for data analysis, for example the
# built-in plotting functions:
simple_plot(net)
To get a full overview of what can be done with the database model, visit the SQLAlchemy documentation. An overview of the data structure can be found below.