`gym_acnportal.gym_acnsim`¶

OpenAI Gym plugin for ACN-Sim. Provides several customizable environments for training reinforcement learning (RL) agents. See tutorial X for usage examples.

Subpackages¶

Submodules¶

gym_acnportal.gym_acnsim.interfaces

Package Contents¶

Classes¶

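`GymTrainedInterface`	Interface between OpenAI Environments and the ACN Simulation Environment.
`GymTrainingInterface`	Interface between OpenAI Environments and the ACN Simulation Environment; allows an agent to step the Simulator by a single iteration.
`BaseSimEnv`	Abstract base class meant to be inherited from to implement new ACN-Sim Environments.
`CustomSimEnv`	A simulator environment with customizable observations, action spaces, and rewards.
`RebuildingEnv`	A simulator environment that subclasses CustomSimEnv, with the extra property that the entire simulation is rebuilt when __init__ or reset is called.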

Functions¶

`make_default_sim_env`(interface: Optional[GymTrainedInterface] = None) → gym_acnportal.gym_acnsim.envs.custom_envs.CustomSimEnv	A simulator environment with the following characteristics:
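`make_rebuilding_default_sim_env`(interface_generating_function: Optional[Callable[[], GymTrainedInterface]]) → gym_acnportal.gym_acnsim.envs.custom_envs.RebuildingEnv	A simulator environment with the same characteristics as the environment returned by make_default_sim_env, except that on every reset the simulation is rebuilt using interface_generating_function.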

class gym_acnportal.gym_acnsim.GymTrainedInterface¶

Bases: acnportal.acnsim.Interface

Interface between OpenAI Environments and the ACN Simulation Environment.

classmethod from_interface(cls, interface: acnportal.acnsim.Interface) → gym_acnportal.gym_acnsim.interfaces.GymTrainedInterface ¶

Generates an instance of this class from an Interface-like object. Note that the _simulator of the input interface is not copied into the new GymTrainedInterface-like object generated here, so changes from elsewhere to the _simulator will be reflected in this object’s _simulator.

Args: interface (Interface)
Returns: GymTrainedInterface
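The sharing behaviour of from_interface can be illustrated with stand-in classes. This is a minimal sketch, not the library's implementation: `FakeSimulator`, `PlainInterface`, and `TrainedInterface` are hypothetical names, and the only thing assumed is that the new object keeps a reference to the same `_simulator` rather than a copy.

```python
class FakeSimulator:
    """Hypothetical stand-in for a simulator; holds one mutable field."""

    def __init__(self):
        self.iteration = 0


class PlainInterface:
    """Hypothetical stand-in for acnportal.acnsim.Interface."""

    def __init__(self, simulator):
        self._simulator = simulator


class TrainedInterface(PlainInterface):
    """Hypothetical stand-in for GymTrainedInterface."""

    @classmethod
    def from_interface(cls, interface):
        # Reuse the same simulator object rather than copying it.
        return cls(interface._simulator)


simulator = FakeSimulator()
original = PlainInterface(simulator)
converted = TrainedInterface.from_interface(original)

# A change made through the original interface's simulator is visible
# through the converted interface, because both hold the same object.
original._simulator.iteration = 5
shared = converted._simulator is original._simulator
seen_iteration = converted._simulator.iteration
```

If independent state is needed, the input interface would have to be deep-copied before conversion.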

property station_ids(self) → List[str]¶

Return a list of space ids of the stations in the network.

Returns: List[str]: List of station ids in the network.

property active_station_ids(self) → List[str]¶

Returns a list of active EVSE station ids for use by the algorithm.

Returns:

List[str]: List of EVSE station ids with an EV plugged in that is not finished charging.

property is_done(self) → bool¶

Returns whether the simulation is complete (i.e., the event queue is empty).

Returns: bool: True if the simulation is complete.

property charging_rates(self) → numpy.ndarray¶

Returns the charging_rates of the simulator at all times.

Returns:

np.ndarray: numpy array of all charging rates. Each row represents the charging rates of a station; each column represents the charging rates at an iteration.
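The row and column layout can be pictured with plain nested lists. This is an illustrative sketch only: the station ids and rates are made-up data, and it assumes that the row order follows the order of station_ids, which the text above does not state.

```python
# Hypothetical data: 2 stations (rows) by 3 iterations (columns).
station_ids = ["EVSE-001", "EVSE-002"]
charging_rates = [
    [16.0, 16.0, 0.0],   # rates of EVSE-001 at iterations 0, 1, 2
    [0.0, 32.0, 32.0],   # rates of EVSE-002 at iterations 0, 1, 2
]


def rates_for_station(rates, ids, station_id):
    """Return the row of rates for one station (assumes row order = ids order)."""
    return rates[ids.index(station_id)]


def rates_at_iteration(rates, iteration):
    """Return the column of rates across all stations at one iteration."""
    return [row[iteration] for row in rates]


history = rates_for_station(charging_rates, station_ids, "EVSE-002")
snapshot = rates_at_iteration(charging_rates, 1)
```

With a real numpy array the same selections would be written as `charging_rates[i, :]` for a station and `charging_rates[:, k]` for an iteration.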

is_feasible_evse(self, load_currents: Dict[str, List[float]]) → bool¶

Return whether each EVSE in load_currents can accept the pilots assigned to it.

Args:

load_currents (Dict[str, List[number]]): Dictionary mapping load_ids to schedules of charging rates.

Returns:

bool: True if all pilots are valid for the EVSEs to which they are sent.

is_feasible(self, load_currents: Dict[str, List[float]], linear: bool = False, violation_tolerance: Optional[float] = None, relative_tolerance: Optional[float] = None) → bool¶

Overrides Interface.is_feasible with extra feasibility checks. These include:

Checking for stations in a schedule but not in the network.
Checking that the schedule doesn’t violate any constraints on EVSE charging rates.
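The two extra checks listed above can be sketched with plain dictionaries. This is an assumption-laden illustration, not the library's code: `extra_feasibility_checks` and `evse_limits` are hypothetical names, and each EVSE is assumed to accept any rate between a minimum and a maximum, whereas a real EVSE may only allow a discrete set of rates.

```python
from typing import Dict, List, Tuple


def extra_feasibility_checks(
    load_currents: Dict[str, List[float]],
    evse_limits: Dict[str, Tuple[float, float]],
) -> bool:
    """Return True if the schedule passes both extra checks.

    evse_limits is a hypothetical mapping from station id to a
    (min_rate, max_rate) pair; the real class reads limits from the network.
    """
    for station_id, schedule in load_currents.items():
        # Check 1: every scheduled station must exist in the network.
        if station_id not in evse_limits:
            return False
        # Check 2: every pilot must respect the EVSE's rate limits.
        min_rate, max_rate = evse_limits[station_id]
        if any(pilot < min_rate or pilot > max_rate for pilot in schedule):
            return False
    return True


limits = {"EVSE-001": (0.0, 32.0), "EVSE-002": (0.0, 16.0)}
ok = extra_feasibility_checks({"EVSE-001": [16.0, 32.0]}, limits)
unknown_station = extra_feasibility_checks({"EVSE-999": [8.0]}, limits)
too_high = extra_feasibility_checks({"EVSE-002": [24.0]}, limits)
```

The sketch covers only the extra checks; the network-level constraint check inherited from Interface.is_feasible is not reproduced here.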

last_energy_delivered(self) → float¶

Return the actual energy delivered in the last period, in amp-periods.

TODO: This is known to produce a warning in acnportal 0.2.2

Returns:

float: Total energy delivered in the last period, in amp-periods.

current_constraint_currents(self, input_schedule: object) → object¶

TODO Args:

input_schedule:

Returns:

class gym_acnportal.gym_acnsim.GymTrainingInterface¶

Bases: gym_acnportal.gym_acnsim.interfaces.GymTrainedInterface

Interface between OpenAI Environments and the ACN Simulation Environment.

This class of interface facilitates training by allowing an agent to step the Simulator by a single iteration.

step(self, new_schedule: Dict[str, List[float]], force_feasibility: bool = True) → Tuple[bool, bool]¶

Step the simulation using the input new_schedule until the simulator requires a new charging schedule. If the provided schedule is infeasible, steps the simulation only if force_feasibility is False, otherwise doesn’t step the simulation.

Args:

new_schedule (Dict[str, List[float]]): Dictionary mapping station ids to a schedule of pilot signals.
force_feasibility (bool): If True, the simulation is not stepped when the provided schedule is infeasible.

Returns:

bool: True if the simulation is completed.
bool: True if the schedule was feasible.

Warns:

UserWarning: If the length of the new schedule is less than the Simulator’s max_recompute parameter. This warning is raised because stepping the Simulator with a schedule of length less than max_recompute could cause the pilot signals to be updated with zeros after the schedule runs out of entries.
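The control flow described for step can be sketched as a standalone function. This is a hedged illustration rather than the actual method: `step_sketch`, `is_feasible`, and `advance` are hypothetical names, schedule length is assumed to be measured per station, and the sketch assumes the simulation is not already complete when a step is refused.

```python
import warnings
from typing import Callable, Dict, List, Tuple

Schedule = Dict[str, List[float]]


def step_sketch(
    new_schedule: Schedule,
    is_feasible: Callable[[Schedule], bool],
    advance: Callable[[Schedule], bool],
    max_recompute: int,
    force_feasibility: bool = True,
) -> Tuple[bool, bool]:
    """Return (done, feasible), mirroring the documented return order."""
    # Warn when any station's schedule is shorter than max_recompute.
    if any(len(pilots) < max_recompute for pilots in new_schedule.values()):
        warnings.warn("schedule shorter than max_recompute", UserWarning)
    feasible = is_feasible(new_schedule)
    if force_feasibility and not feasible:
        # Infeasible and enforcement is on: do not step the simulation.
        return False, False
    # Otherwise step; advance() reports whether the simulation finished.
    return advance(new_schedule), feasible


applied = []


def advance(schedule: Schedule) -> bool:
    applied.append(schedule)  # record that a step actually happened
    return False              # the toy simulation never finishes


bad = {"EVSE-001": [99.0]}
blocked = step_sketch(bad, lambda s: False, advance, max_recompute=1)
steps_after_blocked = len(applied)
forced = step_sketch(bad, lambda s: False, advance, max_recompute=1,
                     force_feasibility=False)
steps_after_forced = len(applied)
```

With force_feasibility left at True the infeasible schedule never reaches the simulator; with False it is applied anyway and the caller learns from the second return value that it was infeasible.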

class gym_acnportal.gym_acnsim.BaseSimEnv(interface: Optional[GymTrainedInterface])¶

Bases: gym.Env

Abstract base class meant to be inherited from to implement new ACN-Sim Environments.

Subclasses must implement the following methods: action_to_schedule, observation_from_state, reward_from_state, and done_from_state.

Subclasses must also specify observation_space and action_space, either as class or instance variables.

Optionally, subclasses may implement info_from_state, which here returns an empty dict.

Subclasses may override __init__, step, and reset functions.

Currently, no render function is implemented, though this function is not required for internal functionality.
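The subclassing contract above can be sketched without importing gym. The base class below is a hypothetical stand-in that only mirrors the hook names of BaseSimEnv; `SketchBaseEnv` and `ConstantRateEnv` are illustrative names, and observation_space and action_space are omitted because they require gym's space classes.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchBaseEnv(ABC):
    """Hypothetical stand-in that mirrors the hook names of BaseSimEnv."""

    @abstractmethod
    def action_to_schedule(self) -> Dict[str, List[float]]: ...

    @abstractmethod
    def observation_from_state(self) -> Any: ...

    @abstractmethod
    def reward_from_state(self) -> float: ...

    @abstractmethod
    def done_from_state(self) -> bool: ...

    def info_from_state(self) -> Dict[Any, Any]:
        # Optional hook: defaults to an empty dict.
        return {}


class ConstantRateEnv(SketchBaseEnv):
    """Toy subclass: charges every station at the rate given by the action."""

    def __init__(self, station_ids: List[str], horizon: int):
        self.station_ids = station_ids
        self.horizon = horizon
        self.timestep = 0
        self._action = 0.0

    def action_to_schedule(self) -> Dict[str, List[float]]:
        return {sid: [self._action] for sid in self.station_ids}

    def observation_from_state(self) -> Dict[str, int]:
        return {"timestep": self.timestep}

    def reward_from_state(self) -> float:
        return self._action * len(self.station_ids)

    def done_from_state(self) -> bool:
        return self.timestep >= self.horizon


env = ConstantRateEnv(["EVSE-001", "EVSE-002"], horizon=3)
env._action = 8.0
schedule = env.action_to_schedule()
```

Leaving out any of the four required hooks makes the subclass impossible to instantiate, which is the behaviour an abstract base class is meant to enforce.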

Attributes:

_interface (GymTrainedInterface): An interface to a simulation to be stepped by this environment, or None. If None, an interface must be set later.
_init_snapshot (GymTrainedInterface): A deep copy of the initial interface, used for environment resets.
_prev_interface (GymTrainedInterface): A deep copy of the interface at the previous time step; used for calculating action rewards.
_action (object): The action taken by the agent in this agent-environment loop iteration.
_schedule (Dict[str, List[number]]): Dictionary mapping station ids to a schedule of pilot signals.
_observation (np.ndarray): The observation given to the agent in this agent-environment loop iteration.
_done (object): An object representing whether or not the execution of the environment is complete.
_info (object): An object that gives info about the environment.

_interface :Optional[GymTrainedInterface]¶

_init_snapshot :GymTrainedInterface¶

_prev_interface :GymTrainedInterface¶

_action :Optional[np.ndarray]¶

_schedule :Dict[str, List[float]]¶

_observation :Optional[np.ndarray]¶

_reward :Optional[float]¶

_done :Optional[bool]¶

_info :Optional[Dict[Any, Any]]¶

property interface(self) → gym_acnportal.gym_acnsim.interfaces.GymTrainedInterface ¶

property prev_interface(self) → gym_acnportal.gym_acnsim.interfaces.GymTrainedInterface ¶

property action(self) → numpy.ndarray¶

property schedule(self) → Dict[str, List[float]]¶

property observation(self) → numpy.ndarray¶

property reward(self) → float¶

property done(self) → bool¶

property info(self) → Dict[Any, Any]¶

update_state(self) → None¶

Update the state of the environment, namely the observation, reward, done, and info attributes.

Returns: None.

store_previous_state(self) → None¶

Store the previous state of the simulation in the _prev_interface environment attribute.

Returns: None.

step(self, action: numpy.ndarray) → Tuple[np.ndarray, float, bool, Dict[Any, Any]]¶

Step the simulation one timestep with an agent’s action.

Accepts an action and returns a tuple (observation, reward, done, info).

Implements gym.Env.step()

Args:

action (object): an action provided by the agent

Returns:

observation (np.ndarray): agent’s observation of the current environment
reward (float): amount of reward returned after previous action
done (bool): whether the episode has ended, in which case further step() calls will return undefined results
info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
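A typical way to consume the (observation, reward, done, info) tuple is an episode loop. The sketch below uses a hypothetical stub, `CountdownEnv`, that only imitates the reset/step shape of a gym.Env; it is not an ACN-Sim environment, and `run_episode` is an illustrative helper.

```python
from typing import Any, Callable, Dict, Tuple


class CountdownEnv:
    """Hypothetical stub with the same step/reset shape as a gym.Env."""

    def __init__(self, length: int):
        self.length = length
        self.remaining = length

    def reset(self) -> int:
        self.remaining = self.length
        return self.remaining

    def step(self, action: float) -> Tuple[int, float, bool, Dict[Any, Any]]:
        self.remaining -= 1
        reward = float(action)       # toy reward: echo the action
        done = self.remaining == 0   # episode ends after `length` steps
        return self.remaining, reward, done, {}


def run_episode(env: CountdownEnv, policy: Callable[[int], float]) -> float:
    """Run one episode and return the sum of rewards."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        # Stop calling step() once done is True; later calls are undefined.
        observation, reward, done, info = env.step(policy(observation))
        total_reward += reward
    return total_reward


total = run_episode(CountdownEnv(length=4), policy=lambda obs: 2.0)
```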

reset(self) → Dict[str, np.ndarray]¶

Resets the state of the simulation and returns an initial observation. Resetting is done by setting the interface to the simulation to an interface to the simulation in its initial state.

Implements gym.Env.reset()

Returns: observation (np.ndarray): the initial observation.
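Resetting from a stored initial state can be sketched with copy.deepcopy. `SnapshotEnv` is a hypothetical stand-in that uses a plain dict in place of an interface; copying the snapshot again on each reset is an assumption made here so that the snapshot itself is never mutated.

```python
import copy


class SnapshotEnv:
    """Hypothetical stand-in showing reset from a deep-copied snapshot."""

    def __init__(self, interface: dict):
        self._interface = interface
        # Keep an independent copy of the initial state for resets.
        self._init_snapshot = copy.deepcopy(interface)

    def reset(self) -> dict:
        # Copy again so the snapshot is never mutated by later steps.
        self._interface = copy.deepcopy(self._init_snapshot)
        return self._interface


env = SnapshotEnv({"iteration": 0, "rates": [0.0, 0.0]})

# Simulate some progress, including a change to a nested list.
env._interface["iteration"] = 7
env._interface["rates"][0] = 16.0

restored = env.reset()
```

A shallow copy would not be enough here, because the nested list of rates would still be shared with the mutated state.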

abstract render(self, mode='human')¶

Renders the environment. Implements gym.Env.render().

abstract action_to_schedule(self) → Dict[str, List[float]]¶

Convert an agent action to a schedule to be input to the simulator.

Returns:

schedule (Dict[str, List[float]]): Dictionary mapping station ids to a schedule of pilot signals.

abstract observation_from_state(self) → Dict[str, np.ndarray]¶

Construct an environment observation from the state of the simulator.

Returns:

observation (Dict[str, np.ndarray]): an environment observation generated from the simulation state

abstract reward_from_state(self) → float¶

Calculate a reward from the state of the simulator.

Returns: reward (float): a reward generated from the simulation state

abstract done_from_state(self) → bool¶

Determine if the simulation is done from the state of the simulator.

Returns: done (bool): True if the simulation is done, False if not

abstract info_from_state(self) → Dict[Any, Any]¶

Give information about the environment using the state of the simulator.

Returns: info (dict): dict of environment information

class gym_acnportal.gym_acnsim.CustomSimEnv(interface: Optional[GymTrainedInterface], observation_objects: List[SimObservation], action_object: gym_acnportal.gym_acnsim.envs.action_spaces.SimAction, reward_functions: List[Callable[[BaseSimEnv], float]])¶

Bases: gym_acnportal.gym_acnsim.envs.base_env.BaseSimEnv

A simulator environment with customizable observations, action spaces, and rewards.

Observations are specified as objects, where each object specifies a function to generate a space from a simulation interface and a function to generate an observation from a simulation interface.

Action spaces are specified as functions that generate a space from a simulation interface.

Rewards are specified as functions that generate a number (reward) from an environment.

Users may define their own objects/functions to input to this environment, use the objects/functions defined in the gym_acnsim package, or use an environment factory function defined in the sim_prototype_env module.
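The composition described above can be sketched with plain callables. This is an illustration under stated assumptions, not the package's classes: `SketchObservation`, `build_observation`, and `total_reward` are hypothetical names, a dict stands in for the interface and the environment, and the individual reward terms are assumed to be summed.

```python
from typing import Callable, Dict, List


class SketchObservation:
    """Hypothetical stand-in for an observation object: a name plus a
    function that builds the observation value from an interface."""

    def __init__(self, name: str, obs_function: Callable[[dict], list]):
        self.name = name
        self.obs_function = obs_function


def build_observation(
    interface: dict, objects: List[SketchObservation]
) -> Dict[str, list]:
    # One dict entry per observation object, keyed by its name.
    return {obj.name: obj.obs_function(interface) for obj in objects}


def total_reward(
    env_state: dict, reward_functions: List[Callable[[dict], float]]
) -> float:
    # Assumption: the individual reward terms are summed.
    return sum(fn(env_state) for fn in reward_functions)


interface = {"demand": [10.0, 0.0], "delivered": 12.0, "violation": 2.0}
objects = [SketchObservation("demand", lambda i: i["demand"])]
rewards = [lambda s: s["delivered"], lambda s: -s["violation"]]

observation = build_observation(interface, objects)
reward = total_reward(interface, rewards)
```

Keeping observations and rewards as separate, swappable pieces is what lets one environment class serve many experiments.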

observation_objects :List[SimObservation]¶

observation_space :spaces.Dict¶

action_object :SimAction¶

action_space :spaces.Space¶

reward_functions :List[Callable[[BaseSimEnv], float]]¶

property interface(self) → gym_acnportal.gym_acnsim.interfaces.GymTrainedInterface ¶

Get the current Interface of the environment.

Returns: GymTrainedInterface: The current Interface of the environment.

abstract render(self, mode='human')¶

Renders the environment. Implements gym.Env.render().

action_to_schedule(self) → Dict[str, List[float]]¶

Convert an agent action to a schedule to be input to the simulator.

Returns:

schedule (Dict[str, List[float]]): Dictionary mapping station ids to a schedule of pilot signals.

observation_from_state(self) → Dict[str, np.ndarray]¶

Construct an environment observation from the state of the simulator using the environment’s observation construction functions.

Returns:

observation (Dict[str, np.ndarray]): An environment observation generated from the simulation state

reward_from_state(self) → float¶

Calculate a reward from the state of the simulator.

Returns:

reward (float): a reward generated from the simulation state

done_from_state(self) → bool¶

Determine if the simulation is done from the state of the simulator.

Returns: done (bool): True if the simulation is done, False if not

info_from_state(self) → Dict[Any, Any]¶

Give information about the environment using the state of the simulator. In this case, all the info about the simulator is given by returning a dict containing the simulator’s interface.

Returns:

info (Dict[str, GymTrainedInterface]): The interface between the environment and Simulator.

class gym_acnportal.gym_acnsim.RebuildingEnv(interface: Optional[GymTrainedInterface], observation_objects: List[SimObservation], action_object: gym_acnportal.gym_acnsim.envs.action_spaces.SimAction, reward_functions: List[Callable[[BaseSimEnv], float]], interface_generating_function: Optional[Callable[[], GymTrainedInterface]] = None)¶

Bases: gym_acnportal.gym_acnsim.envs.custom_envs.CustomSimEnv

A simulator environment that subclasses CustomSimEnv, with the extra property that the entire simulation is rebuilt within the environment when __init__ or reset is called.

This is especially useful if the network or event queue has stochastic elements.
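The rebuild-on-reset idea can be sketched with a generating function that produces a fresh object each time. `RebuildingSketch` and `generate_interface` are hypothetical names, a dict stands in for the interface, and the random arrival count is made-up data chosen to show a stochastic element.

```python
import random
from typing import Callable


class RebuildingSketch:
    """Hypothetical stand-in: rebuilds its interface on init and on reset."""

    def __init__(self, interface_generating_function: Callable[[], dict]):
        self._generate = interface_generating_function
        self._interface = self._generate()

    def reset(self) -> dict:
        # Build a fresh interface instead of restoring a stored snapshot.
        self._interface = self._generate()
        return self._interface


rng = random.Random(0)


def generate_interface() -> dict:
    # Stochastic element: a random number of arriving EVs per episode.
    return {"num_arrivals": rng.randint(1, 100)}


env = RebuildingSketch(generate_interface)
first = env._interface
second = env.reset()
```

Because each reset calls the generating function again, every episode can draw a different realization of the random elements, which a snapshot-based reset would not provide.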

classmethod from_custom_sim_env(cls, env: gym_acnportal.gym_acnsim.envs.custom_envs.CustomSimEnv, interface_generating_function: Optional[Callable[[], GymTrainedInterface]] = None) → gym_acnportal.gym_acnsim.envs.custom_envs.RebuildingEnv ¶

reset(self) → Dict[str, np.ndarray]¶

Resets the state of the simulation and returns an initial observation. Resetting is done by setting the interface to the simulation to an interface to the simulation in its initial state.

Returns: observation (np.ndarray): the initial observation.

abstract render(self, mode='human')¶

Renders the environment. Implements gym.Env.render().

gym_acnportal.gym_acnsim.make_default_sim_env(interface: Optional[GymTrainedInterface] = None) → gym_acnportal.gym_acnsim.envs.custom_envs.CustomSimEnv ¶

A simulator environment with the following characteristics:

The action and observation spaces are continuous.

An action in this environment is a pilot signal for each EVSE, within the minimum and maximum EVSE rates.

An observation is a dict consisting of fields (times are 1-indexed in the observations):

arrivals: arrival time of the EV at each EVSE (or 0 if there’s no EV plugged in)

departures: departure time of the EV at each EVSE (or 0 if there’s no EV plugged in)

demand: energy demand of the EV at each EVSE (unoccupied EVSEs have demand 0)

constraint_matrix: matrix of aggregate current coefficients

magnitudes: magnitude vector constraining aggregate currents

timestep: timestep of the simulation

The reward is calculated as follows:

If no constraints (on the network or on the EVSEs) were violated by the action, a reward equal to the total charge delivered (in A) is returned.

If any constraint violation occurred, a negative reward equal to the magnitude of the violation is returned.

Network constraint violations are scaled by the number of EVs.

Finally, a user-input reward function is added to the total reward.

The simulation is considered done if the event queue is empty.
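The reward rule described above can be written as a small function. This is one reading of the description, not the package's implementation: `default_reward_sketch` and its parameters are hypothetical, and "scaled by the number of EVs" is assumed to mean multiplication, which the text does not specify.

```python
def default_reward_sketch(
    charge_delivered: float,
    evse_violation: float,
    network_violation: float,
    num_evs: int,
    user_reward: float = 0.0,
) -> float:
    """Combine the documented reward terms under the stated assumptions."""
    if evse_violation == 0 and network_violation == 0:
        # No violation: reward the total charge delivered.
        base = charge_delivered
    else:
        # Violation: negative reward equal to its magnitude.
        # Assumption: network violations are multiplied by the EV count.
        base = -(evse_violation + network_violation * num_evs)
    # The user-supplied reward term is added on top in both cases.
    return base + user_reward


clean = default_reward_sketch(20.0, 0.0, 0.0, num_evs=3)
violated = default_reward_sketch(20.0, 1.0, 2.0, num_evs=3)
with_user_term = default_reward_sketch(20.0, 0.0, 0.0, num_evs=3, user_reward=1.5)
```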

gym_acnportal.gym_acnsim.make_rebuilding_default_sim_env(interface_generating_function: Optional[Callable[[], GymTrainedInterface]]) → gym_acnportal.gym_acnsim.envs.custom_envs.RebuildingEnv ¶

A simulator environment with the same characteristics as the environment returned by make_default_sim_env, except that on every reset the simulation is completely rebuilt using interface_generating_function.

See make_default_sim_env for more info.

gym_acnportal.gym_acnsim.default_observation_objects :List[SimObservation]¶

gym_acnportal.gym_acnsim.default_action_object :SimAction¶

gym_acnportal.gym_acnsim.default_reward_functions :List[Callable[[BaseSimEnv], float]]¶

gym_acnportal.gym_acnsim.all_envs :List[EnvSpec]¶

gym_acnportal.gym_acnsim.env_ids¶

gym_acnportal.gym_acnsim.gym_env_dict :Dict[str, str]¶
