stochastic_start_epsilon_greedy_policy
Classes
Epsilon-Greedy Policy is a specific implementation of an epsilon-soft policy.
Module Contents
- class stochastic_start_epsilon_greedy_policy.StochasticStartEpsilonGreedyPolicy(num_actions: int, action_space: gymnasium.spaces.space.Space | None = None, epsilon: float = 0.1)
Bases: gridmind.policies.soft.base_soft_policy.BaseSoftPolicy
The epsilon-greedy policy is a specific implementation of an epsilon-soft policy. It is an action selection strategy in which, with probability ϵ, the agent selects a random action (exploration), and with probability 1-ϵ, it selects the action with the highest estimated value (the greedy action).
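The selection rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the gridmind implementation: the function names `epsilon_greedy_action` and `action_probabilities`, the `q_values` list, and the `rng` argument are all hypothetical, and the sketch assumes that the exploratory draw is uniform over all actions and that ties between equally valued actions go to the lowest index.

```python
import random


def epsilon_greedy_action(q_values, epsilon=0.1, rng=None):
    """Return an action index chosen epsilon-greedily (hypothetical helper).

    q_values: list of estimated action values, one per action.
    epsilon:  probability of taking an exploratory (random) action.
    rng:      optional random.Random instance, for reproducibility.
    """
    rng = rng or random.Random()
    if rng.random() < epsilon:
        # Explore: assumed uniform over all actions, greedy one included.
        return rng.randrange(len(q_values))
    # Exploit: highest estimated value; ties go to the lowest index.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def action_probabilities(q_values, epsilon=0.1):
    """Return the selection probability of each action under the rule above."""
    n = len(q_values)
    greedy = max(range(n), key=lambda a: q_values[a])
    # Every action receives epsilon / n from the exploratory branch ...
    probs = [epsilon / n] * n
    # ... and the greedy action additionally receives 1 - epsilon.
    probs[greedy] += 1.0 - epsilon
    return probs


if __name__ == "__main__":
    q = [0.0, 1.0, 0.0, 0.0]
    print(epsilon_greedy_action(q, epsilon=0.1, rng=random.Random(0)))
    print(action_probabilities(q, epsilon=0.1))
```

Under these assumptions every action keeps a selection probability of at least ϵ / num_actions, which is the property that makes the policy epsilon-soft.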