splendor.agents.our_agents.ppo.self_attn package
Submodules
splendor.agents.our_agents.ppo.self_attn.constants module
Constants related to PPO with Self-Attention.
splendor.agents.our_agents.ppo.self_attn.network module
Implementation of the PPO neural network with self-attention.
- class splendor.agents.our_agents.ppo.self_attn.network.PPOSelfAttention(input_dim: int, output_dim: int, hidden_layers_dims: list[int] | None = None, dropout: float = 0.2)[source]
Bases: PPOBase
PPO neural network with self-attention.
- forward(x: Float[Tensor, 'batch sequence features'] | Float[Tensor, 'batch features'] | Float[Tensor, 'features'], action_mask: Float[Tensor, 'batch actions'] | Float[Tensor, 'actions'], *args, **kwargs) tuple[Float[Tensor, 'batch actions'], Float[Tensor, 'batch 1'], None] [source]
Pass the input through the network to obtain predictions.
- Parameters:
x – the input to the network. Expected shape, one of: (features,), (batch_size, features) or (batch_size, sequence_length, features).
action_mask – a binary mask tensor in which 1 marks a valid action and 0 marks an invalid action. Expected shape: (actions,) or (batch_size, actions), where actions equals len(ALL_ACTIONS), which comes from Engine.Splendor.gym.envs.actions.
hidden_state – hidden state of the recurrent unit. Expected shape: (batch_size, num_layers, hidden_state_dim) or (num_layers, hidden_state_dim).
- Returns:
a tuple of the action probabilities, the value estimate, and None.
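The role of action_mask can be illustrated with a small masked softmax. This is a minimal sketch, not the network's actual code: it assumes the mask is applied to raw scores (logits) before normalization, and it uses plain Python lists instead of tensors so that it runs without dependencies. The name masked_softmax is hypothetical.

```python
import math


def masked_softmax(logits: list[float], action_mask: list[int]) -> list[float]:
    """Softmax over valid actions only; invalid actions get probability 0.

    Assumption: 1 in action_mask marks a valid action, 0 an invalid one,
    matching the parameter description above.
    """
    if len(logits) != len(action_mask):
        raise ValueError("logits and action_mask must have the same length")
    valid = [score for score, m in zip(logits, action_mask) if m == 1]
    if not valid:
        raise ValueError("action_mask must mark at least one valid action")
    peak = max(valid)  # subtract the maximum for numerical stability
    exps = [
        math.exp(score - peak) if m == 1 else 0.0
        for score, m in zip(logits, action_mask)
    ]
    total = sum(exps)
    return [e / total for e in exps]


# Four toy actions; the second and fourth are invalid.
probs = masked_softmax([2.0, 1.0, 0.5, 3.0], [1, 0, 1, 0])
```

Under this scheme the invalid actions receive exactly zero probability, and the remaining probabilities still sum to 1, so sampling can never return an illegal move.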
splendor.agents.our_agents.ppo.self_attn.ppo_agent module
An agent which uses PPO with self-attention.
- class splendor.agents.our_agents.ppo.self_attn.ppo_agent.PPOSelfAttentionAgent(_id: int, load_net: bool = True)[source]
Bases: PPOAgentBase
PPO agent with self-attention.
- SelectAction(actions: list[CollectAction | ReserveAction | BuyAction], game_state: SplendorState, game_rule: SplendorGameRule) CollectAction | ReserveAction | BuyAction [source]
Select an action to play from the given actions.
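One way to relate the list of legal actions to the network's output is sketched below. It is an illustration under stated assumptions, not the agent's implementation: the toy action names, build_action_mask and pick_action are all hypothetical, actions are stood in for by strings, and the choice is greedy (the real agent may sample instead).

```python
def build_action_mask(all_actions: list[str], legal_actions: list[str]) -> list[int]:
    """Return 1 for each entry of all_actions that is currently legal, else 0."""
    legal = set(legal_actions)
    return [1 if action in legal else 0 for action in all_actions]


def pick_action(
    all_actions: list[str], legal_actions: list[str], probabilities: list[float]
) -> str:
    """Greedily pick the legal action with the highest probability."""
    mask = build_action_mask(all_actions, legal_actions)
    candidates = [i for i, m in enumerate(mask) if m == 1]
    if not candidates:
        raise ValueError("no legal action is available")
    best = max(candidates, key=lambda i: probabilities[i])
    return all_actions[best]


# Toy stand-in for ALL_ACTIONS (hypothetical names).
ALL_ACTIONS_TOY = ["collect_red", "reserve_card_7", "buy_card_3"]
legal_now = ["collect_red", "buy_card_3"]

mask = build_action_mask(ALL_ACTIONS_TOY, legal_now)
chosen = pick_action(ALL_ACTIONS_TOY, legal_now, [0.1, 0.6, 0.3])
```

Here the highest-scoring action overall is illegal, so the choice falls to the best-scoring legal one.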
- splendor.agents.our_agents.ppo.self_attn.ppo_agent.myAgent
alias of PPOSelfAttentionAgent
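A module-level alias such as myAgent lets calling code look up an agent class by one fixed name, regardless of what the class is actually called. The sketch below shows that pattern with a toy module. It assumes the game framework resolves agents this way, which the reference above does not state, and ToyAgent, toy_module and load_agent are hypothetical names.

```python
import types


class ToyAgent:
    """Stand-in for an agent class; only stores its id."""

    def __init__(self, _id: int) -> None:
        self.id = _id


# Build a throwaway module that exposes the class under the fixed name `myAgent`.
toy_module = types.ModuleType("toy_agent_module")
toy_module.ToyAgent = ToyAgent
toy_module.myAgent = ToyAgent  # alias: both names refer to the same class


def load_agent(module: types.ModuleType, agent_id: int):
    """Instantiate whatever class the module exposes as `myAgent`."""
    agent_cls = getattr(module, "myAgent")
    return agent_cls(agent_id)


agent = load_agent(toy_module, 0)
```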