`gym_acnportal.gym_acnsim.envs.base_env`

This module contains an abstract gym environment that wraps an ACN-Sim Simulation.

Module Contents

Classes

BaseSimEnv

Abstract base class meant to be inherited from to implement new ACN-Sim Environments.

class gym_acnportal.gym_acnsim.envs.base_env.BaseSimEnv(interface: Optional[GymTrainedInterface])

Bases: gym.Env

Abstract base class meant to be inherited from to implement new ACN-Sim Environments.

Subclasses must implement the following methods:

action_to_schedule
observation_from_state
reward_from_state
done_from_state

Subclasses must also specify observation_space and action_space, either as class or instance variables.

Optionally, subclasses may implement info_from_state, which here returns an empty dict.

Subclasses may override __init__, step, and reset functions.

No render function is currently implemented; rendering is not required for the environment's internal functionality.
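The subclass contract described above can be sketched as follows. This is a minimal illustration that does not import gym_acnportal: `SketchBaseEnv` is a hypothetical stand-in that only mirrors the four required methods and the optional `info_from_state` hook, and `ConstantRateEnv`, its `rate` parameter, and the station ids are likewise invented for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchBaseEnv(ABC):
    """Hypothetical stand-in for BaseSimEnv; NOT the gym_acnportal class."""

    @abstractmethod
    def action_to_schedule(self) -> Dict[str, List[float]]:
        ...

    @abstractmethod
    def observation_from_state(self) -> Any:
        ...

    @abstractmethod
    def reward_from_state(self) -> float:
        ...

    @abstractmethod
    def done_from_state(self) -> bool:
        ...

    def info_from_state(self) -> Dict[Any, Any]:
        # Optional hook: the default returns an empty dict.
        return {}


class ConstantRateEnv(SketchBaseEnv):
    """Hypothetical subclass that charges every station at a fixed rate."""

    # Placeholders only; a real subclass would assign gym.spaces objects here.
    observation_space = None
    action_space = None

    def __init__(self, station_ids: List[str], rate: float) -> None:
        self.station_ids = station_ids
        self.rate = rate
        self.timestep = 0

    def action_to_schedule(self) -> Dict[str, List[float]]:
        return {sid: [self.rate] for sid in self.station_ids}

    def observation_from_state(self) -> Dict[str, int]:
        return {"timestep": self.timestep}

    def reward_from_state(self) -> float:
        return 0.0

    def done_from_state(self) -> bool:
        return self.timestep >= 3


env = ConstantRateEnv(["station-1", "station-2"], rate=16.0)
```

Because the four methods are declared abstract in the stand-in, instantiating `SketchBaseEnv` directly raises a `TypeError`, while the subclass inherits the empty-dict `info_from_state` without overriding it.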

Attributes:

_interface (GymTrainedInterface): An interface to a simulation to be stepped by this environment, or None. If None, an interface must be set later.
_init_snapshot (GymTrainedInterface): A deep copy of the initial interface, used for environment resets.
_prev_interface (GymTrainedInterface): A deep copy of the interface at the previous time step; used for calculating action rewards.
_action (object): The action taken by the agent in this agent-environment loop iteration.
_schedule (Dict[str, List[number]]): Dictionary mapping station ids to a schedule of pilot signals.
_observation (np.ndarray): The observation given to the agent in this agent-environment loop iteration.
_done (object): An object representing whether or not the execution of the environment is complete.
_info (object): An object that gives info about the environment.

_interface: Optional[GymTrainedInterface]

_init_snapshot: GymTrainedInterface

_prev_interface: GymTrainedInterface

_action: Optional[np.ndarray]

_schedule: Dict[str, List[float]]

_observation: Optional[np.ndarray]

_reward: Optional[float]

_done: Optional[bool]

_info: Optional[Dict[Any, Any]]

property interface(self) → gym_acnportal.gym_acnsim.interfaces.GymTrainedInterface

property prev_interface(self) → gym_acnportal.gym_acnsim.interfaces.GymTrainedInterface

property action(self) → numpy.ndarray

property schedule(self) → Dict[str, List[float]]

property observation(self) → numpy.ndarray

property reward(self) → float

property done(self) → bool

property info(self) → Dict[Any, Any]

update_state(self) → None

Update the state of the environment, namely its observation, reward, done, and info attributes.

Returns:

None.

store_previous_state(self) → None

Store the previous state of the simulation in the _prev_interface environment attribute.

Returns:

None.

step(self, action: numpy.ndarray) → Tuple[np.ndarray, float, bool, Dict[Any, Any]]

Step the simulation one timestep with an agent’s action.

Accepts an action and returns a tuple (observation, reward, done, info).

Implements gym.Env.step()

Args:

action (object): an action provided by the agent

Returns:

observation (np.ndarray): agent’s observation of the current environment
reward (float): amount of reward returned after previous action
done (bool): whether the episode has ended, in which case further step() calls will return undefined results
info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
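The (observation, reward, done, info) contract is easiest to see in an agent-environment loop. The sketch below uses a hypothetical `CountdownEnv` written only for this example; it is not part of gym_acnportal, uses plain lists instead of NumPy arrays, and its toy reward (the sum of the requested rates) is an assumption chosen so the result is easy to check.

```python
from typing import Any, Dict, List, Tuple


class CountdownEnv:
    """Hypothetical stand-in environment; NOT part of gym_acnportal."""

    def __init__(self, horizon: int) -> None:
        self.horizon = horizon
        self.timestep = 0

    def reset(self) -> List[float]:
        self.timestep = 0
        return [float(self.timestep)]

    def step(
        self, action: List[float]
    ) -> Tuple[List[float], float, bool, Dict[Any, Any]]:
        self.timestep += 1
        observation = [float(self.timestep)]
        reward = float(sum(action))  # toy reward: total rate requested
        done = self.timestep >= self.horizon
        info = {"timestep": self.timestep}
        return observation, reward, done, info


def run_episode(env: CountdownEnv, action: List[float]) -> Tuple[float, int]:
    """Step the environment with a fixed action until it reports done."""
    env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        # Stop calling step() once done is True; later results are undefined.
        _observation, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


total_reward, steps = run_episode(CountdownEnv(horizon=3), action=[8.0, 16.0])
```

The loop checks `done` after every call, which matches the note above that further `step()` calls after the episode ends return undefined results.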

reset(self) → Dict[str, np.ndarray]

Resets the state of the simulation and returns an initial observation. Resetting replaces the current interface with an interface to the simulation in its initial state.

Implements gym.Env.reset()

Returns:

observation (np.ndarray): the initial observation.
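A snapshot-based reset of the kind described here can be sketched with `copy.deepcopy`. `ToyInterface` and `ResettableEnv` are hypothetical stand-ins written for this example, and whether the library copies the snapshot again on every reset is an assumption; the sketch does so to keep the stored initial state untouched across episodes.

```python
import copy


class ToyInterface:
    """Hypothetical stand-in for a simulation interface."""

    def __init__(self) -> None:
        self.current_time = 0

    def step(self) -> None:
        self.current_time += 1


class ResettableEnv:
    """Hypothetical environment that resets from a stored snapshot."""

    def __init__(self, interface: ToyInterface) -> None:
        self._interface = interface
        # Deep copy of the initial interface, kept for later resets.
        self._init_snapshot = copy.deepcopy(interface)

    def reset(self) -> int:
        # Copy the snapshot again so the stored initial state is never mutated.
        self._interface = copy.deepcopy(self._init_snapshot)
        return self._interface.current_time


env = ResettableEnv(ToyInterface())
env._interface.step()
env._interface.step()
time_before_reset = env._interface.current_time
initial_observation = env.reset()
```

After two steps the toy interface's clock has advanced, and `reset()` returns the environment to the time recorded in the snapshot.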

abstract render(self, mode='human')

Renders the environment. Implements gym.Env.render().

abstract action_to_schedule(self) → Dict[str, List[float]]

Convert an agent action to a schedule to be input to the simulator.

Returns:

schedule (Dict[str, List[float]]): Dictionary mapping station ids to a schedule of pilot signals.
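As an illustration of the return shape, the following sketch maps one pilot signal per station onto a single-period schedule. It is written as a free function rather than a method so it runs on its own; the station ids, the one-entry-per-station action layout, and the single-period schedule are all assumptions made for the example, not the library's behavior.

```python
from typing import Dict, List, Sequence


def action_to_schedule(
    station_ids: Sequence[str], action: Sequence[float]
) -> Dict[str, List[float]]:
    """Pair each station id with a one-period schedule (hypothetical layout)."""
    if len(station_ids) != len(action):
        raise ValueError("action must contain one entry per station")
    return {sid: [float(rate)] for sid, rate in zip(station_ids, action)}


# Hypothetical station ids and pilot signals.
schedule = action_to_schedule(["CA-301", "CA-302"], [16, 32])
```

Each value is a list so that the same structure could hold more than one period per station if a subclass chose to schedule further ahead.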

abstract observation_from_state(self) → Dict[str, np.ndarray]

Construct an environment observation from the state of the simulator.

Returns:

observation (Dict[str, np.ndarray]): an environment observation generated from the simulation state

abstract reward_from_state(self) → float

Calculate a reward from the state of the simulator.

Returns:

reward (float): a reward generated from the simulation state
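One way a reward can use the stored previous state is to compare it with the current state. The sketch below is a toy under stated assumptions: `ToyState` and its `energy_delivered` field are hypothetical, and rewarding the energy delivered during the last step is only an example objective, not one prescribed by the library.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyState:
    """Hypothetical stand-in for the simulation state seen through an interface."""

    energy_delivered: float = 0.0


def reward_from_states(prev: ToyState, curr: ToyState) -> float:
    """Reward the energy delivered since the previous time step."""
    return curr.energy_delivered - prev.energy_delivered


state = ToyState(energy_delivered=5.0)
prev_state = copy.deepcopy(state)  # analogous to storing the previous state
state.energy_delivered += 2.5      # the simulation advances one step
reward = reward_from_states(prev_state, state)
```

The deep copy matters: without it, `prev_state` and `state` would be the same object and the difference would always be zero.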

abstract done_from_state(self) → bool

Determine if the simulation is done from the state of the simulator.

Returns:

done (bool): True if the simulation is done, False if not

abstract info_from_state(self) → Dict[Any, Any]

Give information about the environment using the state of the simulator.

Returns:

info (dict): dict of environment information
