stochastic_start_epsilon_greedy_policy
======================================

.. py:module:: stochastic_start_epsilon_greedy_policy


Classes
-------

.. autoapisummary::

   stochastic_start_epsilon_greedy_policy.StochasticStartEpsilonGreedyPolicy


Module Contents
---------------

.. py:class:: StochasticStartEpsilonGreedyPolicy(num_actions: int, action_space: Optional[gymnasium.spaces.space.Space] = None, epsilon: float = 0.1)

   Bases: :py:obj:`gridmind.policies.soft.base_soft_policy.BaseSoftPolicy`


   An epsilon-greedy policy, which is a specific kind of epsilon-soft policy.
   With probability ϵ (``epsilon``, default ``0.1``), the agent selects a random action (exploration);
   with probability 1-ϵ, it selects the action with the highest estimated value (the greedy action).
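
   The selection rule described above can be illustrated with a minimal, standalone sketch.
   It is not the implementation of this class: the helper names ``epsilon_greedy_action`` and
   ``epsilon_greedy_prob`` and the ``q_values`` list are hypothetical, and the sketch assumes
   that the exploratory action is drawn uniformly over all actions, which the docstring does
   not state explicitly.

   .. code-block:: python

      import random
      from typing import Optional, Sequence


      def epsilon_greedy_action(
          q_values: Sequence[float],
          epsilon: float = 0.1,
          rng: Optional[random.Random] = None,
      ) -> int:
          """Pick an action index from per-action value estimates (hypothetical helper)."""
          if rng is None:
              rng = random.Random()
          # Explore: with probability epsilon, pick any action
          # (assumption: uniformly at random over all actions).
          if rng.random() < epsilon:
              return rng.randrange(len(q_values))
          # Exploit: otherwise pick the action with the highest estimated value.
          return max(range(len(q_values)), key=lambda a: q_values[a])


      def epsilon_greedy_prob(
          q_values: Sequence[float], action: int, epsilon: float = 0.1
      ) -> float:
          """Probability of selecting ``action`` under the rule above (hypothetical helper)."""
          n = len(q_values)
          greedy = max(range(n), key=lambda a: q_values[a])
          # Every action receives epsilon / n from the exploration branch.
          prob = epsilon / n
          # The greedy action additionally receives the 1 - epsilon exploitation mass.
          if action == greedy:
              prob += 1.0 - epsilon
          return prob

   Under the uniform-exploration assumption, every action keeps a probability of at least
   ``epsilon / n``, which is what makes the policy epsilon-soft, and the probabilities over
   all ``n`` actions sum to 1.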


   .. py:attribute:: action_space
      :value: None


   .. py:attribute:: num_actions


   .. py:attribute:: epsilon
      :value: 0.1


   .. py:attribute:: policy_dict


   .. py:method:: _get_random_action()


   .. py:method:: get_action(state)


   .. py:method:: get_actions(states)


   .. py:method:: _get_greedy_action(state)


   .. py:method:: convert_to_scalar(state)


   .. py:method:: get_action_prob(state, action)


   .. py:method:: get_all_action_probabilities(states)


   .. py:method:: update(state, action)


   .. py:method:: get_action_deterministic(state)


   .. py:method:: set_policy_dict(policy_dict)


   .. py:method:: get_policy_dict()