splendor.splendor.gym.envs package

Submodules

splendor.splendor.gym.envs.actions module

Combinatorially define all possible actions in the game.

class splendor.splendor.gym.envs.actions.Action(type_enum: ActionEnum, collected_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None, returned_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None, position: CardPosition | None = None, noble_index: int | None = None)[source]

Bases: object

Represent an action as a dataclass with a more comfortable API.

collected_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None

noble_index: int | None = None

position: CardPosition | None = None

returned_gems: dict[Literal['black', 'red', 'yellow', 'green', 'blue', 'white'], int] | None = None

classmethod to_action_element(action: CollectAction | ReserveAction | BuyAction, state: SplendorState, agent_index: int) → ActionTypeVar[source]: Convert an action in SplendorGameRule format to Action.

type_enum: ActionEnum

class splendor.splendor.gym.envs.actions.ActionEnum(*values)[source]

Bases: Enum

Enum for all action types.

BUY_AVAILABLE = 5

BUY_RESERVE = 6

COLLECT_DIFF = 3

COLLECT_SAME = 2

PASS = 1

RESERVE = 4

class splendor.splendor.gym.envs.actions.CardPosition(tier: int, card_index: int, reserved_index: int)[source]

Bases: object

Dataclass representing where a card is located on the board.

card_index: int

reserved_index: int

tier: int
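The signatures above can be mirrored with plain standard-library types to see how the three classes fit together. This is a minimal sketch, not the package's source: the enum values and field names are copied from this page, while the `Color` alias, the dataclass options, and the meaning of `reserved_index=-1` are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Literal

# Hypothetical alias for the gem colors listed in the Action signature.
Color = Literal["black", "red", "yellow", "green", "blue", "white"]


class ActionEnum(Enum):
    """Stand-in with the values documented on this page."""

    PASS = 1
    COLLECT_SAME = 2
    COLLECT_DIFF = 3
    RESERVE = 4
    BUY_AVAILABLE = 5
    BUY_RESERVE = 6


@dataclass
class CardPosition:
    """Stand-in for the documented CardPosition fields."""

    tier: int
    card_index: int
    reserved_index: int


@dataclass
class Action:
    """Stand-in for the documented Action fields; only type_enum is required."""

    type_enum: ActionEnum
    collected_gems: dict[Color, int] | None = None
    returned_gems: dict[Color, int] | None = None
    position: CardPosition | None = None
    noble_index: int | None = None


# Collecting three different gems needs no card position.
collect = Action(
    ActionEnum.COLLECT_DIFF,
    collected_gems={"red": 1, "green": 1, "blue": 1},
)

# Reserving a card needs a position; using -1 for "not in the reserve"
# is an assumption of this sketch, not documented behaviour.
reserve = Action(
    ActionEnum.RESERVE,
    position=CardPosition(tier=0, card_index=2, reserved_index=-1),
)
```

Fields that do not apply to a given action type simply stay `None`, which is what the optional defaults in the documented signature suggest.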

splendor.splendor.gym.envs.splendor_env module

Implementation of Splendor as a gym.Env.

class splendor.splendor.gym.envs.splendor_env.SplendorEnv(agents: list[Agent], shuffle_turns: bool = True, fixed_turn: int | None = None, render_mode: str | None = None)[source]

Bases: Env

Custom gym.Env for the game Splendor.

get_legal_actions_mask() → ndarray[Any, dtype[_ScalarType_co]][source]: Create an array of shape (len(ALL_ACTIONS),) whose values are 0’s or 1’s. If mask[i] == 1 then the i’th action is legal; otherwise it is illegal. The legal actions are determined by SplendorGameRule.
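A 0/1 mask of this kind is typically consumed by keeping only the indices whose entry is 1. The sketch below shows one way to draw a uniformly random legal action index; `sample_legal_action` is a hypothetical helper, not part of the package, and a plain list stands in for the ndarray so the example needs only the standard library.

```python
import random


def sample_legal_action(mask, rng=None):
    """Return a uniformly random index i with mask[i] == 1.

    Works with any sequence of 0/1 values, including a 1-D NumPy array.
    """
    rng = rng or random.Random()
    legal = [i for i, flag in enumerate(mask) if flag == 1]
    if not legal:
        raise ValueError("mask contains no legal action")
    return rng.choice(legal)


# Toy mask over six actions; indices 1, 4 and 5 are legal.
mask = [0, 1, 0, 0, 1, 1]
action_index = sample_legal_action(mask, random.Random(0))
```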

render() → None[source]

Compute the render frames as specified by render_mode during the initialization of the environment.

The environment’s metadata render modes (env.metadata["render_modes"]) should contain the possible ways to implement the render modes. In addition, list versions of most render modes are achieved through gymnasium.make, which automatically applies a wrapper to collect rendered frames.

Note:: As the render_mode is known during __init__, the objects used to render the environment state should be initialised in __init__.

By convention, if the render_mode is:

None (default): no render is computed.
“human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.
“rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
“ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
“rgb_array_list” and “ansi_list”: List-based versions of the render modes are possible (except “human”) through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(..., render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.

Note:: Make sure that your class’s metadata "render_modes" key includes the list of supported modes.

Changed in version 0.25.0: The render function was changed to no longer accept parameters; instead, these parameters should be specified when the environment is initialised, i.e., gymnasium.make("CartPole-v1", render_mode="human")

reset(*, seed: int | None = None, options: dict | None = None) → tuple[ndarray[Any, dtype[_ScalarType_co]], dict[str, int]][source]

Reset the environment - Create a new game.

Parameters:

seed – the seed to use in np_random.
options – ignored. Both this parameter and seed are passed in order to comply with the gym.Env signature.

Returns:

the initial state of a new game and the id (turn) of my agent.

Note:

the order of turns is randomly chosen each time reset is called.

property state: SplendorState: return the current game state itself, not the feature vector of that state.

step(action: int) → tuple[ndarray[Any, dtype[_ScalarType_co]], float, bool, bool, dict][source]

Run one time-step of the environment’s dynamics.

Param:: which action to take.
Returns:: The new state (successor), the reward given, a flag indicating whether or not the game ended, truncated (will be ignored), additional information (will be ignored).
Note:: this method returns 2 redundant variables (truncated & info) only in order to comply with gym.Env.step signature.
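The reset/step contract described above can be exercised with a short episode loop. The sketch below is illustrative only: `StubEnv` is a hypothetical stand-in that copies the documented method shapes (a two-value `reset`, a five-value `step`, and `get_legal_actions_mask`), and its observations, rewards, and the `"turn"` info key are invented so the loop can run without the package installed.

```python
import random


class StubEnv:
    """Hypothetical stand-in with the same method shapes as SplendorEnv."""

    def __init__(self, n_actions=4, max_steps=5):
        self.n_actions = n_actions
        self.max_steps = max_steps
        self._steps = 0

    def reset(self, *, seed=None, options=None):
        self._steps = 0
        # (observation, info); the "turn" key is an assumption of this stub.
        return [0.0], {"turn": 0}

    def get_legal_actions_mask(self):
        # In this stub, odd-indexed actions are always legal.
        return [i % 2 for i in range(self.n_actions)]

    def step(self, action):
        self._steps += 1
        terminated = self._steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        # (observation, reward, terminated, truncated, info)
        return [float(self._steps)], reward, terminated, False, {}


def run_episode(env, rng):
    """Play one episode with uniformly random legal actions."""
    obs, info = env.reset(seed=0)
    total_reward, steps, terminated = 0.0, 0, False
    while not terminated:
        mask = env.get_legal_actions_mask()
        legal = [i for i, flag in enumerate(mask) if flag == 1]
        action = rng.choice(legal)
        # truncated and info are unpacked only to match the 5-tuple shape.
        obs, reward, terminated, _truncated, _info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


total_reward, n_steps = run_episode(StubEnv(), random.Random(0))
```

The loop stops on the terminated flag alone, which matches the note that the truncated and info values are returned only for signature compatibility.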

property turn: int: return the turn of the player.

splendor.splendor.gym.envs.utils module

Collection of useful utility functions used in the implementation of SplendorEnv.

splendor.splendor.gym.envs.utils.build_action(action_index: int, state: SplendorState, agent_index: int) → dict[source]

Construct the action to be taken from its action index in the ALL_ACTIONS list.

Returns:: the corresponding action to the action_index, in the format required by SplendorGameRule.
Note:: when this function is used to build a buying action, it doesn’t take into account the wildcard gems (yellow) or the owned cards when computing returned_gems. This can lead to a broken state in which a player has a negative number of gems.
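The arithmetic behind that warning can be illustrated with a toy payment calculation. This is a sketch under stated assumptions, not the package's logic: `naive_payment`, `wildcard_aware_payment`, and `apply_payment` are hypothetical helpers, and the cost, gem, and bonus dictionaries are made-up values chosen to make the problem visible.

```python
def naive_payment(cost):
    """Pay the printed cost, ignoring owned-card bonuses and wildcards."""
    return dict(cost)


def wildcard_aware_payment(cost, gems, bonuses):
    """Discount by card bonuses, then cover any shortfall with yellow gems."""
    payment = {}
    wild_needed = 0
    for color, price in cost.items():
        due = max(price - bonuses.get(color, 0), 0)
        paid = min(due, gems.get(color, 0))
        if paid:
            payment[color] = paid
        wild_needed += due - paid
    if wild_needed > gems.get("yellow", 0):
        raise ValueError("card is not affordable")
    if wild_needed:
        payment["yellow"] = wild_needed
    return payment


def apply_payment(gems, payment):
    """Subtract the payment from the player's gems, color by color."""
    colors = set(gems) | set(payment)
    return {c: gems.get(c, 0) - payment.get(c, 0) for c in colors}


cost = {"red": 3, "blue": 1}                # printed card cost
gems = {"red": 1, "blue": 0, "yellow": 2}   # gems the player holds
bonuses = {"blue": 1}                       # discount from owned cards

# Ignoring bonuses and wildcards drives red and blue below zero.
broken = apply_payment(gems, naive_payment(cost))

# Accounting for both keeps every count non-negative.
sound = apply_payment(gems, wildcard_aware_payment(cost, gems, bonuses))
```

Here the naive payment leaves the player with -2 red and -1 blue gems, while the bonus- and wildcard-aware payment spends 1 red and 2 yellow and ends at zero of each.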

splendor.splendor.gym.envs.utils.create_action_mapping(legal_actions: list[CollectAction | ReserveAction | BuyAction], state: SplendorState, agent_index: int) → dict[int, CollectAction | ReserveAction | BuyAction][source]: Create the mapping from action indices to legal actions. This is used by both SplendorEnv and the PPO agent.

splendor.splendor.gym.envs.utils.create_legal_actions_mask(legal_actions: list[CollectAction | ReserveAction | BuyAction], state: SplendorState, agent_index: int) → ndarray[Any, dtype[_ScalarType_co]][source]: Create an array of shape (len(ALL_ACTIONS),) whose values are 0’s or 1’s. If mask[i] == 1 then the i’th action is legal; otherwise it is illegal.
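One plausible relationship between the index-to-action mapping and the mask is that the mask marks exactly the indices present in the mapping. The sketch below shows that idea only; `mask_from_mapping` is a hypothetical helper, `n_actions` stands in for len(ALL_ACTIONS), and the action dictionaries use made-up keys rather than the real CollectAction, ReserveAction, or BuyAction layouts.

```python
def mask_from_mapping(mapping, n_actions):
    """Return a 0/1 list with a 1 at every index that appears in mapping."""
    mask = [0] * n_actions
    for index in mapping:
        if not 0 <= index < n_actions:
            raise IndexError(f"action index {index} out of range")
        mask[index] = 1
    return mask


# Toy mapping: only actions 1 and 4 are currently legal.
mapping = {
    1: {"type": "collect_diff"},  # placeholder action payloads
    4: {"type": "reserve"},
}
mask = mask_from_mapping(mapping, n_actions=6)
```

With this construction, looking up `mapping[i]` is valid exactly when `mask[i] == 1`, so an agent that samples from the mask can always translate its choice back into a game action.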

Module contents

Import SplendorEnv whenever someone imports splendor.splendor.gym.envs.

class splendor.splendor.gym.envs.SplendorEnv(agents: list[Agent], shuffle_turns: bool = True, fixed_turn: int | None = None, render_mode: str | None = None)[source]