Mungojerrie 1.1

GymOptions Struct Reference
Public Types

| enum | GymRewardTypes { default_type = 0, prism = 1, zeta_reach = 2, zeta_acc = 3, zeta_discount = 4, reward_on_acc = 5, multi_discount = 6, parity = 7, pri_tracker = 8, lexo = 9, avg = 10 } |
Public Attributes

| Type | Name |
|---|---|
| unsigned int | episodeLength |
| double | zeta |
| double | gammaB |
| std::vector< double > | tolerance |
| double | priEpsilon |
| GymRewardTypes | rewardType |
| bool | noResetOnAcc |
| bool | terminalUpdateOnTimeLimit |
| bool | p1NotStrategic |
| bool | concatActionsInCSV |
| double | fInvScale |
| double | resetPenalty |
| bool | randInit |
GymRewardTypes is a strongly typed (scoped) enumeration of the available reward schemes.
| Enumerator | Description |
|---|---|
| default_type | Default type: zeta-reach for 1-1/2 player games and parity for 2-1/2 player games. |
| prism | Reward from the PRISM file. Continuing (non-episodic) setting; automaton epsilon edges receive zero reward. |
| zeta_reach | Zeta-based reachability. See "Omega-Regular Objectives in Model-Free Reinforcement Learning". |
| zeta_acc | Zeta-based reachability with reward on accepting transitions. See "Faithful and Effective Reward Schemes for Model-Free Reinforcement Learning of Omega-Regular Objectives". |
| zeta_discount | Zeta-based discounted reward on accepting transitions. See "Faithful and Effective Reward Schemes for Model-Free Reinforcement Learning of Omega-Regular Objectives". |
| reward_on_acc | Reward on each accepting transition; MAY LEAD TO INCORRECT STRATEGIES. See "A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications". |
| multi_discount | Multi-discount reward. See "Control Synthesis from Linear Temporal Logic Specifications Using Model-Free Reinforcement Learning". |
| parity | Reward from parity objectives. See "Model-Free Reinforcement Learning for Stochastic Parity Games". |
| pri_tracker | Reward from the priority-tracker gadget. See "Model-Free Reinforcement Learning for Stochastic Parity Games". |
| lexo | Reward from lexicographic objectives. See "Model-Free Reinforcement Learning for Lexicographic Omega-Regular Objectives". |
| avg | Average reward for absolute liveness properties. See "Translating Omega-Regular Specifications to Average Objectives for Model-Free Reinforcement Learning". |
bool GymOptions::noResetOnAcc

If true, the episode step count is not reset when an accepting edge is traversed under zeta-reach and zeta-acc (not recommended).
bool GymOptions::p1NotStrategic

If true, player 1 is not allowed to switch to the optimal counter-strategy against player 0 during verification of the learned strategies; instead, player 1 plays its learned strategy.
bool GymOptions::terminalUpdateOnTimeLimit

If true, episodes that end because the episode limit is reached are treated as transitioning to a zero-value sink (not recommended).