splendor.splendor.gym.envs package
Submodules
splendor.splendor.gym.envs.actions module
Combinatorially define all possible actions in the game.
- class splendor.splendor.gym.envs.actions.Action(type_enum: ActionEnum, collected_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None, returned_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None, position: CardPosition | None = None, noble_index: int | None = None)[source]
Bases: object
Represent an action as a dataclass with a more comfortable API.
- collected_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None
- noble_index: int | None = None
- position: CardPosition | None = None
- returned_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None
- classmethod to_action_element(action: CollectAction | ReserveAction | BuyAction, state: SplendorState, agent_index: int) ActionTypeVar [source]
Convert an action in SplendorGameRule format to Action.
- type_enum: ActionEnum
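To make the shape of Action concrete, here is a minimal sketch of a dataclass with the same fields. ActionEnum and CardPosition below are simplified stand-ins invented for illustration; the real definitions live elsewhere in the package and may differ.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Literal, Optional

Color = Literal["black", "red", "yellow", "green", "blue", "white"]


class ActionEnum(Enum):
    """Hypothetical stand-in for the package's ActionEnum."""
    COLLECT = auto()
    RESERVE = auto()
    BUY = auto()


@dataclass(frozen=True)
class CardPosition:
    """Hypothetical stand-in describing where a card sits on the board."""
    tier: int
    index: int


@dataclass
class Action:
    """Mirror of the documented fields; only type_enum is required."""
    type_enum: ActionEnum
    collected_gems: Optional[dict[Color, int]] = None
    returned_gems: Optional[dict[Color, int]] = None
    position: Optional[CardPosition] = None
    noble_index: Optional[int] = None


# Fields that do not apply to a given action type simply stay None.
collect = Action(ActionEnum.COLLECT, collected_gems={"red": 1, "green": 1, "blue": 1})
reserve = Action(ActionEnum.RESERVE, position=CardPosition(tier=0, index=2))
```

Keeping every optional field defaulted to None lets one class describe collect, reserve, and buy actions without separate subclasses.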
splendor.splendor.gym.envs.splendor_env module
Implementation of Splendor as a gym.Env.
- class splendor.splendor.gym.envs.splendor_env.SplendorEnv(agents: list[Agent], shuffle_turns: bool = True, fixed_turn: int | None = None, render_mode: str | None = None)[source]
Bases: Env
Custom gym.Env for the game Splendor.
- get_legal_actions_mask() ndarray[Any, dtype[_ScalarType_co]] [source]
Create an array of shape (len(ALL_ACTIONS),) whose values are 0’s or 1’s. If mask[i] == 1, the i’th action is legal; otherwise it is illegal (the legal actions are determined by SplendorGameRule).
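A mask of this form is typically used to restrict an agent's choice to legal actions. The sketch below uses plain Python lists in place of the ndarray returned by get_legal_actions_mask(); the helper names and the example scores are hypothetical.

```python
import random


def legal_indices(mask):
    """Return the indices i for which mask[i] == 1."""
    return [i for i, flag in enumerate(mask) if flag == 1]


def sample_legal_action(mask, rng):
    """Pick a legal action index uniformly at random."""
    candidates = legal_indices(mask)
    if not candidates:
        raise ValueError("mask marks no action as legal")
    return rng.choice(candidates)


def best_legal_action(scores, mask):
    """Pick the highest-scoring action among the legal ones."""
    candidates = legal_indices(mask)
    if not candidates:
        raise ValueError("mask marks no action as legal")
    return max(candidates, key=lambda i: scores[i])


mask = [0, 1, 0, 1, 1]               # hypothetical mask over five actions
scores = [9.0, 0.2, 7.5, 0.8, 0.5]   # hypothetical policy scores

rng = random.Random(0)
sampled = sample_legal_action(mask, rng)
# Actions 0 and 2 score highest but are illegal, so the greedy pick is index 3.
greedy = best_legal_action(scores, mask)
```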
- render() None [source]
Compute the render frames as specified by render_mode during the initialization of the environment.
The environment’s metadata render modes (env.metadata["render_modes"]) should contain the possible ways to implement the render modes. In addition, list versions of most render modes are achieved through gymnasium.make, which automatically applies a wrapper to collect rendered frames.
- Note:
As the render_mode is known during __init__, the objects used to render the environment state should be initialised in __init__.
By convention, if the render_mode is:
- None (default): no render is computed.
- “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.
- “rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
- “ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
- “rgb_array_list” and “ansi_list”: List-based versions of render modes are possible (except human) through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(..., render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.
- Note:
Make sure that your class’s metadata "render_modes" key includes the list of supported modes.
Changed in version 0.25.0: The render function was changed to no longer accept parameters; rather, these parameters should be specified when the environment is initialised, e.g., gymnasium.make("CartPole-v1", render_mode="human").
- reset(*, seed: int | None = None, options: dict | None = None) tuple[ndarray[Any, dtype[_ScalarType_co]], dict[str, int]] [source]
Reset the environment - Create a new game.
- Parameters:
seed – the seed to use in np_random.
options – ignored. Both this parameter and seed are accepted in order to comply with the gym.Env signature.
- Returns:
the initial state of a new game and the id (turn) of my agent.
- Note:
the order of turns is randomly chosen each time reset is called.
- property state: SplendorState
return the current game state itself, not the feature vector of that state.
- step(action: int) tuple[ndarray[Any, dtype[_ScalarType_co]], float, bool, bool, dict] [source]
Run one time-step of the environment’s dynamics.
- Parameters:
action – which action to take.
- Returns:
The new state (successor), the reward given, a flag indicating whether or not the game ended, truncated (will be ignored), additional information (will be ignored).
- Note:
this method returns 2 redundant variables (truncated & info) only in order to comply with gym.Env.step signature.
- property turn: int
return the turn of the player.
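The methods above combine into the usual reset/step loop. Since running the real environment requires the package and a list of agents, the sketch below drives a tiny stand-in class that only imitates the documented return shapes of reset(), step(), and get_legal_actions_mask(). ToyEnv, its dynamics, and the "my_turn" info key are all invented for illustration.

```python
import random


class ToyEnv:
    """Invented stand-in that mimics the documented method signatures."""

    N_ACTIONS = 4

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0
        self.rng = random.Random()

    def reset(self, *, seed=None, options=None):
        # Returns (observation, info), like the documented reset().
        self.rng = random.Random(seed)
        self.steps = 0
        return [0.0] * 3, {"my_turn": 0}  # "my_turn" is a hypothetical key

    def get_legal_actions_mask(self):
        # Action 0 is always legal here; the others are legal at random.
        return [1] + [self.rng.randint(0, 1) for _ in range(self.N_ACTIONS - 1)]

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info).
        self.steps += 1
        terminated = self.steps >= self.max_steps
        return [float(self.steps)] * 3, 1.0, terminated, False, {}


def play_episode(env, seed=None):
    """Run one episode, always choosing a random legal action."""
    chooser = random.Random(seed)
    obs, info = env.reset(seed=seed)
    total_reward, done = 0.0, False
    while not done:
        mask = env.get_legal_actions_mask()
        legal = [i for i, flag in enumerate(mask) if flag == 1]
        action = chooser.choice(legal)
        # truncated and info are discarded, in line with the step() note.
        obs, reward, done, _truncated, _info = env.step(action)
        total_reward += reward
    return total_reward


total = play_episode(ToyEnv(max_steps=5), seed=0)
```

With the real class, the same loop would presumably start from an instance built as SplendorEnv(agents=...), with the mask queried before every step.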
splendor.splendor.gym.envs.utils module
Collection of useful utility functions used in the implementation of SplendorEnv.
- splendor.splendor.gym.envs.utils.build_action(action_index: int, state: SplendorState, agent_index: int) dict [source]
Construct the action to be taken from its action index in the ALL_ACTIONS list.
- Returns:
the corresponding action to the action_index, in the format required by SplendorGameRule.
- Note:
when this function builds a buy action, it does not take into account the wildcard gems (yellow) or the owned cards when determining returned_gems; this can lead to a broken state in which a player has a negative number of gems.
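The caveat in this note can be illustrated with a small arithmetic example. The sketch below is not the package's code: naive_payment and careful_payment are hypothetical helpers showing why paying a card's full printed cost from the matching colors alone can drive a gem count negative, and how discounts from owned cards plus yellow wildcards avoid that.

```python
def naive_payment(cost):
    """Pay the printed cost in matching colors, ignoring discounts and wildcards."""
    return dict(cost)


def careful_payment(cost, gems, card_discounts):
    """Apply owned-card discounts first, then colored gems, then yellow gems."""
    payment = {}
    yellow_needed = 0
    for color, price in cost.items():
        due = max(price - card_discounts.get(color, 0), 0)
        from_color = min(due, gems.get(color, 0))
        if from_color:
            payment[color] = from_color
        yellow_needed += due - from_color
    if yellow_needed > gems.get("yellow", 0):
        raise ValueError("card is not affordable")
    if yellow_needed:
        payment["yellow"] = yellow_needed
    return payment


def apply_payment(gems, payment):
    """Return the gem counts left after handing over the payment."""
    colors = set(gems) | set(payment)
    return {c: gems.get(c, 0) - payment.get(c, 0) for c in colors}


cost = {"red": 3, "blue": 1}                # printed cost of a hypothetical card
gems = {"red": 1, "blue": 0, "yellow": 2}   # gems the player holds
discounts = {"blue": 1}                     # one owned blue card

after_naive = apply_payment(gems, naive_payment(cost))  # red and blue go negative
careful = careful_payment(cost, gems, discounts)
after_careful = apply_payment(gems, careful)            # every count stays >= 0
```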
- splendor.splendor.gym.envs.utils.create_action_mapping(legal_actions: list[CollectAction | ReserveAction | BuyAction], state: SplendorState, agent_index: int) dict[int, CollectAction | ReserveAction | BuyAction] [source]
Create the mapping from action indices to legal actions. This is used by both SplendorEnv and the PPO agent.
- splendor.splendor.gym.envs.utils.create_legal_actions_mask(legal_actions: list[CollectAction | ReserveAction | BuyAction], state: SplendorState, agent_index: int) ndarray[Any, dtype[_ScalarType_co]] [source]
Create an array of shape (len(ALL_ACTIONS),) whose values are 0’s or 1’s. If mask[i] == 1, the i’th action is legal; otherwise it is illegal.
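The relationship between an index-to-action mapping and a legality mask can be sketched as follows, under the assumption that every legal action corresponds to exactly one entry of a fixed ALL_ACTIONS list. The tuple-encoded actions and both sketch_* functions are hypothetical simplifications: the real helpers also take the state and the agent index, and the real mask is a NumPy array rather than a list.

```python
# Hypothetical, tiny action space; the real ALL_ACTIONS is far larger.
ALL_ACTIONS = [
    ("collect", "red"),
    ("collect", "blue"),
    ("reserve", 0),
    ("buy", 0),
]
INDEX_OF = {action: i for i, action in enumerate(ALL_ACTIONS)}


def sketch_action_mapping(legal_actions):
    """Map each legal action's index in ALL_ACTIONS to the action itself."""
    return {INDEX_OF[action]: action for action in legal_actions}


def sketch_legal_actions_mask(legal_actions):
    """Return a 0/1 list of length len(ALL_ACTIONS)."""
    mapping = sketch_action_mapping(legal_actions)
    return [1 if i in mapping else 0 for i in range(len(ALL_ACTIONS))]


legal = [("collect", "blue"), ("buy", 0)]
mapping = sketch_action_mapping(legal)
mask = sketch_legal_actions_mask(legal)
```

Under this assumption the mask is just the set of keys of the mapping written as a 0/1 vector, which is why an agent can pick an index from the mask and look the action up in the mapping.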
Module contents
Import SplendorEnv whenever someone imports splendor.splendor.gym.envs.
- class splendor.splendor.gym.envs.SplendorEnv(agents: list[Agent], shuffle_turns: bool = True, fixed_turn: int | None = None, render_mode: str | None = None)[source]
Bases: Env
Custom gym.Env for the game Splendor.