• Martin Thoma

RL Agents

Contents

  • RL Agents
    • Twitter-Style Explanations
    • Comparisons
      • CartPole-v0
      • CartPole-v1
      • All environments
      • Other

| Name | Observation Space | Action Space | Paper |
|---|---|---|---|
| SARSA | discrete or continuous | discrete | Sutton and Barto, 2011, Blog Post |
| DQN | discrete or continuous | discrete | [MKSG+13], [MKSR+15], [HGS15] |
| CEM | discrete or continuous | discrete | Szita et al., 2006, Schulman, 2016 |
| DDPG | discrete or continuous | continuous | [LHPH+15] |
| NAF | discrete or continuous | continuous | [GLSL16] |
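
The table above can be read as a compatibility rule: the type of the action space decides which agents are candidates. The following is a minimal sketch of that rule. The helper names (`action_space_kind`, `compatible_agents`) are hypothetical, and it assumes that a space printed as `Discrete(...)` is discrete and one printed as `Box(...)` is continuous.

```python
# Action-space requirement per agent, copied from the table above.
AGENT_ACTION_SPACE = {
    "SARSA": "discrete",
    "DQN": "discrete",
    "CEM": "discrete",
    "DDPG": "continuous",
    "NAF": "continuous",
}


def action_space_kind(space_repr):
    """Classify a printed Gym space (assumption: Discrete -> discrete, Box -> continuous)."""
    if space_repr.startswith("Discrete("):
        return "discrete"
    if space_repr.startswith("Box("):
        return "continuous"
    return None  # e.g. Tuple(...) spaces are not covered by the table


def compatible_agents(space_repr):
    """Return the agents from the table whose action space matches."""
    kind = action_space_kind(space_repr)
    return sorted(name for name, k in AGENT_ACTION_SPACE.items() if k == kind)


print(compatible_agents("Discrete(2)"))  # CartPole-style action space
print(compatible_agents("Box(1,)"))      # Pendulum-style action space
```

Spaces that are neither `Discrete` nor `Box` yield an empty list here, since the table makes no statement about them.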

Twitter-Style Explanations

DQN
Like Q-learning, but the Q-function is represented by a neural network that serves as a function approximator.
SARSA
Initialize the Q-function $Q: \mathcal{X} \times \mathcal{A} \rightarrow \mathbb{R}$ randomly, then adjust it over time from the observed (state, action, reward, next state, next action) tuples. See [Pseudocode](https://martin-thoma.com/probabilistische-planung/#sarsa)
DDPG
?
NAF
?
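
To make the difference between SARSA and the Q-learning rule behind DQN concrete, here is a minimal tabular sketch of both updates. It is an illustration under assumptions, not the code used for the experiments: the function names, the learning rate `alpha` and the discount `gamma` are chosen for this example, and the table is initialized with zeros instead of random values so the numbers are easy to check by hand.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually taken next."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the best action in next_state."""
    target = reward + gamma * max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (target - q[(state, action)])


# Two identical toy Q-tables (hypothetical states "s0", "s1").
q_sarsa = defaultdict(float, {("s1", "left"): 1.0, ("s1", "right"): 2.0})
q_qlearn = defaultdict(float, {("s1", "left"): 1.0, ("s1", "right"): 2.0})

# Same transition for both; the agent happens to pick "left" in s1.
sarsa_update(q_sarsa, "s0", "left", 1.0, "s1", "left", alpha=0.5, gamma=0.9)
q_learning_update(q_qlearn, "s0", "left", 1.0, "s1", ["left", "right"],
                  alpha=0.5, gamma=0.9)

print(q_sarsa[("s0", "left")], q_qlearn[("s0", "left")])
```

SARSA uses the value of the action that was actually chosen in the next state, while Q-learning uses the maximum over all actions, so the two tables diverge as soon as the chosen action is not the greedy one.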

Comparisons

The code is on GitHub.

CartPole-v0

The CartPole-v0 environment has 2 actions: move the cart to the right or to the left. A reward of +1 is given for every time step in which the pole is upright. The episode is finished when the pole is more than 15 degrees from vertical or the cart moves more than 2.4 units from the center.

CartPole-v0 defines "solving" as getting an average reward of 195.0 over 100 consecutive trials.
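
The "solved" criterion can be checked with a sliding window over the per-episode rewards. The sketch below is one way to do it; the function name `is_solved` is hypothetical, and it assumes that an average of exactly 195.0 already counts as solved.

```python
def is_solved(episode_rewards, threshold=195.0, window=100):
    """Return True if any `window` consecutive episodes average >= threshold."""
    if len(episode_rewards) < window:
        return False
    # Sum of the first window, then slide it one episode at a time.
    running_sum = sum(episode_rewards[:window])
    if running_sum / window >= threshold:
        return True
    for i in range(window, len(episode_rewards)):
        running_sum += episode_rewards[i] - episode_rewards[i - window]
        if running_sum / window >= threshold:
            return True
    return False


print(is_solved([200.0] * 100))                 # every episode hits the cap
print(is_solved([200.0] * 50 + [100.0] * 50))   # average is only 150
```

With fewer than 100 recorded episodes the function returns False, because the criterion cannot be evaluated yet.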

| Agent | NN Parameters | Configuration | Time | Test reward |
|---|---|---|---|---|
| CEM | 10 | steps=1000 | 9s | mean=9.49, std=0.79, min=8.00, max=11.00 |
| CEM | 10 | (default, steps=10000) | 39s | mean=77.14, std=44.18, min=41.00, max=200.00 |
| CEM | 10 | steps=100000 | 284s | mean=106.21, std=19.99, min=71.00, max=185.00 |
| CEM | 658 | bigger NN | 60s | mean=42.61, std=36.36, min=10.00, max=200.00 |
| CEM | 658 | steps=10000, bigger NN | 60s | mean=200.00, std=0.00, min=200.00, max=200.00 |
| DQN | | (default) | 40s | mean=42.61, std=36.36, min=10.00, max=200.00 |

The bigger NN is

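# Imports assume the standalone Keras package.
from keras.models import Sequential
from keras.layers import Activation, Dense, Flatten

# input_shape and nb_actions have to be defined from the environment,
# e.g. nb_actions = env.action_space.n
model = Sequential()
model.add(Flatten(input_shape=input_shape))
# Three hidden layers with 16 ReLU units each
model.add(Dense(16))
model.add(Activation("relu"))
model.add(Dense(16))
model.add(Activation("relu"))
model.add(Dense(16))
model.add(Activation("relu"))
# Output layer: one probability per action
model.add(Dense(nb_actions))
model.add(Activation("softmax"))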
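
The parameter counts in the "NN Parameters" column can be reproduced by hand: a dense layer with `n_inputs` inputs and `n_units` units has `n_inputs * n_units` weights plus `n_units` biases. The sketch below does this arithmetic without Keras. It assumes 4 observation values and 2 actions (as CartPole reports) and that the default network is a single dense layer from observations to actions, which is consistent with the reported 10 parameters; the helper names are hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


def mlp_params(layer_sizes):
    """Total parameters of a chain of dense layers, e.g. [4, 16, 2]."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


n_obs, n_actions = 4, 2  # CartPole: Box(4,) observations, Discrete(2) actions

small = mlp_params([n_obs, n_actions])               # assumed default network
bigger = mlp_params([n_obs, 16, 16, 16, n_actions])  # the network shown above

print(small, bigger)
```

The activation and flatten layers add no parameters, so only the dense layers appear in the calculation.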

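So the bigger network is important: with 658 parameters and 10000 steps, CEM reaches the maximum reward of 200 in every test episode, while the 10-parameter network stays well below that even with 100000 steps. Also, 1000 training steps are not enough. Let's see if we can reduce the episode memory, which holds the episodes that are used for training.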

| Agent | EpisodeParameterMemory | Time | Test reward |
|---|---|---|---|
| CEM | 1000 | 60s | mean=200.00, std=0.00, min=200.00, max=200.00 |
| CEM | 500 | 65s | mean=200.00, std=0.00, min=200.00, max=200.00 |
| CEM | 450 | 68s | mean=200.00, std=0.00, min=200.00, max=200.00 |
| CEM | 400 | 59s | mean=200.00, std=0.00, min=200.00, max=200.00 |
| CEM | 300 | 51s | mean=103.73, std=37.07, min=55.00, max=200.00 |
| CEM | 200 | 38s | mean=34.22, std=6.75, min=17.00, max=52.00 |
| CEM | 100 | 56s | mean=82.77, std=25.05, min=32.00, max=172.00 |

CartPole-v1

| Agent | Config | Time | Test reward |
|---|---|---|---|
| CEM | (default) | 100s | mean=461.70, std=66.26, min=264.00, max=500.00 |
| DQN | (default) | 30s | mean=10.62, std=4.31, min=8.00, max=30.00 |

All environments

You can list all environments with

#!/usr/bin/env python

"""Print OpenAI Gym Environment data."""

import gym
from gym import envs

# envs.registry.all() is the registry API of the gym version used here.
envids = [spec.id for spec in envs.registry.all()]
print('<table class="table">')
for i, envid in enumerate(sorted(envids), start=1):
    try:
        env = gym.make(envid)
        observations = env.observation_space
        actions = env.action_space
        reward_range = env.reward_range
    except Exception:
        # Some environments cannot be created, e.g. due to missing dependencies
        observations = "Error"
        actions = "Error"
        reward_range = "Error"
    print(
        '<tr><td id="env-{i}">{i}</td>'
        '<td><a href="https://gym.openai.com/envs/{envid}/" id="{envid}">'
        "{envid}</a></td>"
        "<td>{observations}</td>"
        "<td>{actions}</td>"
        "<td>{reward_range}</td></tr>".format(
            i=i,
            envid=envid,
            observations=str(observations),
            actions=str(actions),
            reward_range=str(reward_range),
        )
    )
print("</table>")

which gives

| # | Environment | Observation Space | Action Space | Reward Range |
|---|---|---|---|---|
| 1 | Acrobot-v1 | Box(6,) | Discrete(3) | (-inf, inf) |
| 2 | AirRaid-ram-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 3 | AirRaid-ram-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 4 | AirRaid-ramDeterministic-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 5 | AirRaid-ramDeterministic-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 6 | AirRaid-ramNoFrameskip-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 7 | AirRaid-ramNoFrameskip-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 8 | AirRaid-v0 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 9 | AirRaid-v4 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 10 | AirRaidDeterministic-v0 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 11 | AirRaidDeterministic-v4 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 12 | AirRaidNoFrameskip-v0 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 13 | AirRaidNoFrameskip-v4 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 14 | Alien-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 15 | Alien-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 16 | Alien-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 17 | Alien-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 18 | Alien-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 19 | Alien-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 20 | Alien-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 21 | Alien-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 22 | AlienDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 23 | AlienDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 24 | AlienNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 25 | AlienNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 26 | Amidar-ram-v0 | Box(128,) | Discrete(10) | (-inf, inf) |
| 27 | Amidar-ram-v4 | Box(128,) | Discrete(10) | (-inf, inf) |
| 28 | Amidar-ramDeterministic-v0 | Box(128,) | Discrete(10) | (-inf, inf) |
| 29 | Amidar-ramDeterministic-v4 | Box(128,) | Discrete(10) | (-inf, inf) |
| 30 | Amidar-ramNoFrameskip-v0 | Box(128,) | Discrete(10) | (-inf, inf) |
| 31 | Amidar-ramNoFrameskip-v4 | Box(128,) | Discrete(10) | (-inf, inf) |
| 32 | Amidar-v0 | Box(250, 160, 3) | Discrete(10) | (-inf, inf) |
| 33 | Amidar-v4 | Box(250, 160, 3) | Discrete(10) | (-inf, inf) |
| 34 | AmidarDeterministic-v0 | Box(250, 160, 3) | Discrete(10) | (-inf, inf) |
| 35 | AmidarDeterministic-v4 | Box(250, 160, 3) | Discrete(10) | (-inf, inf) |
| 36 | AmidarNoFrameskip-v0 | Box(250, 160, 3) | Discrete(10) | (-inf, inf) |
| 37 | AmidarNoFrameskip-v4 | Box(250, 160, 3) | Discrete(10) | (-inf, inf) |
| 38 | Ant-v1 | Error | Error | Error |
| 39 | Assault-ram-v0 | Box(128,) | Discrete(7) | (-inf, inf) |
| 40 | Assault-ram-v4 | Box(128,) | Discrete(7) | (-inf, inf) |
| 41 | Assault-ramDeterministic-v0 | Box(128,) | Discrete(7) | (-inf, inf) |
| 42 | Assault-ramDeterministic-v4 | Box(128,) | Discrete(7) | (-inf, inf) |
| 43 | Assault-ramNoFrameskip-v0 | Box(128,) | Discrete(7) | (-inf, inf) |
| 44 | Assault-ramNoFrameskip-v4 | Box(128,) | Discrete(7) | (-inf, inf) |
| 45 | Assault-v0 | Box(250, 160, 3) | Discrete(7) | (-inf, inf) |
| 46 | Assault-v4 | Box(250, 160, 3) | Discrete(7) | (-inf, inf) |
| 47 | AssaultDeterministic-v0 | Box(250, 160, 3) | Discrete(7) | (-inf, inf) |
| 48 | AssaultDeterministic-v4 | Box(250, 160, 3) | Discrete(7) | (-inf, inf) |
| 49 | AssaultNoFrameskip-v0 | Box(250, 160, 3) | Discrete(7) | (-inf, inf) |
| 50 | AssaultNoFrameskip-v4 | Box(250, 160, 3) | Discrete(7) | (-inf, inf) |
| 51 | Asterix-ram-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 52 | Asterix-ram-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 53 | Asterix-ramDeterministic-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 54 | Asterix-ramDeterministic-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 55 | Asterix-ramNoFrameskip-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 56 | Asterix-ramNoFrameskip-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 57 | Asterix-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 58 | Asterix-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 59 | AsterixDeterministic-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 60 | AsterixDeterministic-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 61 | AsterixNoFrameskip-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 62 | AsterixNoFrameskip-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 63 | Asteroids-ram-v0 | Box(128,) | Discrete(14) | (-inf, inf) |
| 64 | Asteroids-ram-v4 | Box(128,) | Discrete(14) | (-inf, inf) |
| 65 | Asteroids-ramDeterministic-v0 | Box(128,) | Discrete(14) | (-inf, inf) |
| 66 | Asteroids-ramDeterministic-v4 | Box(128,) | Discrete(14) | (-inf, inf) |
| 67 | Asteroids-ramNoFrameskip-v0 | Box(128,) | Discrete(14) | (-inf, inf) |
| 68 | Asteroids-ramNoFrameskip-v4 | Box(128,) | Discrete(14) | (-inf, inf) |
| 69 | Asteroids-v0 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 70 | Asteroids-v4 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 71 | AsteroidsDeterministic-v0 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 72 | AsteroidsDeterministic-v4 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 73 | AsteroidsNoFrameskip-v0 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 74 | AsteroidsNoFrameskip-v4 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 75 | Atlantis-ram-v0 | Box(128,) | Discrete(4) | (-inf, inf) |
| 76 | Atlantis-ram-v4 | Box(128,) | Discrete(4) | (-inf, inf) |
| 77 | Atlantis-ramDeterministic-v0 | Box(128,) | Discrete(4) | (-inf, inf) |
| 78 | Atlantis-ramDeterministic-v4 | Box(128,) | Discrete(4) | (-inf, inf) |
| 79 | Atlantis-ramNoFrameskip-v0 | Box(128,) | Discrete(4) | (-inf, inf) |
| 80 | Atlantis-ramNoFrameskip-v4 | Box(128,) | Discrete(4) | (-inf, inf) |
| 81 | Atlantis-v0 | Box(210, 160, 3) | Discrete(4) | (-inf, inf) |
| 82 | Atlantis-v4 | Box(210, 160, 3) | Discrete(4) | (-inf, inf) |
| 83 | AtlantisDeterministic-v0 | Box(210, 160, 3) | Discrete(4) | (-inf, inf) |
| 84 | AtlantisDeterministic-v4 | Box(210, 160, 3) | Discrete(4) | (-inf, inf) |
| 85 | AtlantisNoFrameskip-v0 | Box(210, 160, 3) | Discrete(4) | (-inf, inf) |
| 86 | AtlantisNoFrameskip-v4 | Box(210, 160, 3) | Discrete(4) | (-inf, inf) |
| 87 | BankHeist-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 88 | BankHeist-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 89 | BankHeist-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 90 | BankHeist-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 91 | BankHeist-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 92 | BankHeist-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 93 | BankHeist-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 94 | BankHeist-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 95 | BankHeistDeterministic-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 96 | BankHeistDeterministic-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 97 | BankHeistNoFrameskip-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 98 | BankHeistNoFrameskip-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 99 | BattleZone-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 100 | BattleZone-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 101 | BattleZone-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 102 | BattleZone-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 103 | BattleZone-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 104 | BattleZone-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 105 | BattleZone-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 106 | BattleZone-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 107 | BattleZoneDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 108 | BattleZoneDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 109 | BattleZoneNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 110 | BattleZoneNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 111 | BeamRider-ram-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 112 | BeamRider-ram-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 113 | BeamRider-ramDeterministic-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 114 | BeamRider-ramDeterministic-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 115 | BeamRider-ramNoFrameskip-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 116 | BeamRider-ramNoFrameskip-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 117 | BeamRider-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 118 | BeamRider-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 119 | BeamRiderDeterministic-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 120 | BeamRiderDeterministic-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 121 | BeamRiderNoFrameskip-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 122 | BeamRiderNoFrameskip-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 123 | Berzerk-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 124 | Berzerk-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 125 | Berzerk-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 126 | Berzerk-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 127 | Berzerk-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 128 | Berzerk-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 129 | Berzerk-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 130 | Berzerk-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 131 | BerzerkDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 132 | BerzerkDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 133 | BerzerkNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 134 | BerzerkNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 135 | BipedalWalker-v2 | Error | Error | Error |
| 136 | BipedalWalkerHardcore-v2 | Error | Error | Error |
| 137 | Blackjack-v0 | Tuple(Discrete(32), Discrete(11), Discrete(2)) | Discrete(2) | (-inf, inf) |
| 138 | Bowling-ram-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 139 | Bowling-ram-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 140 | Bowling-ramDeterministic-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 141 | Bowling-ramDeterministic-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 142 | Bowling-ramNoFrameskip-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 143 | Bowling-ramNoFrameskip-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 144 | Bowling-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 145 | Bowling-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 146 | BowlingDeterministic-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 147 | BowlingDeterministic-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 148 | BowlingNoFrameskip-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 149 | BowlingNoFrameskip-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 150 | Boxing-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 151 | Boxing-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 152 | Boxing-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 153 | Boxing-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 154 | Boxing-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 155 | Boxing-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 156 | Boxing-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 157 | Boxing-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 158 | BoxingDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 159 | BoxingDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 160 | BoxingNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 161 | BoxingNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 162 | Breakout-ram-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 163 | Breakout-ram-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 164 | Breakout-ramDeterministic-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 165 | Breakout-ramDeterministic-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 166 | Breakout-ramNoFrameskip-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 167 | Breakout-ramNoFrameskip-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 168 | Breakout-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 169 | Breakout-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 170 | BreakoutDeterministic-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 171 | BreakoutDeterministic-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 172 | BreakoutNoFrameskip-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 173 | BreakoutNoFrameskip-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 174 | CNNClassifierTraining-v0 | Error | Error | Error |
| 175 | CarRacing-v0 | Error | Error | Error |
| 176 | Carnival-ram-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 177 | Carnival-ram-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 178 | Carnival-ramDeterministic-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 179 | Carnival-ramDeterministic-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 180 | Carnival-ramNoFrameskip-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 181 | Carnival-ramNoFrameskip-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 182 | Carnival-v0 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 183 | Carnival-v4 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 184 | CarnivalDeterministic-v0 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 185 | CarnivalDeterministic-v4 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 186 | CarnivalNoFrameskip-v0 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 187 | CarnivalNoFrameskip-v4 | Box(250, 160, 3) | Discrete(6) | (-inf, inf) |
| 188 | CartPole-v0 | Box(4,) | Discrete(2) | (-inf, inf) |
| 189 | CartPole-v1 | Box(4,) | Discrete(2) | (-inf, inf) |
| 190 | Centipede-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 191 | Centipede-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 192 | Centipede-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 193 | Centipede-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 194 | Centipede-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 195 | Centipede-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 196 | Centipede-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 197 | Centipede-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 198 | CentipedeDeterministic-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 199 | CentipedeDeterministic-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 200 | CentipedeNoFrameskip-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 201 | CentipedeNoFrameskip-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 202 | ChopperCommand-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 203 | ChopperCommand-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 204 | ChopperCommand-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 205 | ChopperCommand-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 206 | ChopperCommand-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 207 | ChopperCommand-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 208 | ChopperCommand-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 209 | ChopperCommand-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 210 | ChopperCommandDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 211 | ChopperCommandDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 212 | ChopperCommandNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 213 | ChopperCommandNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 214 | CliffWalking-v0 | Discrete(48) | Discrete(4) | (-inf, inf) |
| 215 | ConvergenceControl-v0 | Error | Error | Error |
| 216 | Copy-v0 | Discrete(6) | Tuple(Discrete(2), Discrete(2), Discrete(5)) | (-inf, inf) |
| 217 | CrazyClimber-ram-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 218 | CrazyClimber-ram-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 219 | CrazyClimber-ramDeterministic-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 220 | CrazyClimber-ramDeterministic-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 221 | CrazyClimber-ramNoFrameskip-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 222 | CrazyClimber-ramNoFrameskip-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 223 | CrazyClimber-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 224 | CrazyClimber-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 225 | CrazyClimberDeterministic-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 226 | CrazyClimberDeterministic-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 227 | CrazyClimberNoFrameskip-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 228 | CrazyClimberNoFrameskip-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 229 | DemonAttack-ram-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 230 | DemonAttack-ram-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 231 | DemonAttack-ramDeterministic-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 232 | DemonAttack-ramDeterministic-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 233 | DemonAttack-ramNoFrameskip-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 234 | DemonAttack-ramNoFrameskip-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 235 | DemonAttack-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 236 | DemonAttack-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 237 | DemonAttackDeterministic-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 238 | DemonAttackDeterministic-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 239 | DemonAttackNoFrameskip-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 240 | DemonAttackNoFrameskip-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 241 | DoubleDunk-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 242 | DoubleDunk-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 243 | DoubleDunk-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 244 | DoubleDunk-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 245 | DoubleDunk-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 246 | DoubleDunk-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 247 | DoubleDunk-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 248 | DoubleDunk-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 249 | DoubleDunkDeterministic-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 250 | DoubleDunkDeterministic-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 251 | DoubleDunkNoFrameskip-v0 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 252 | DoubleDunkNoFrameskip-v4 | Box(250, 160, 3) | Discrete(18) | (-inf, inf) |
| 253 | DuplicatedInput-v0 | Discrete(6) | Tuple(Discrete(2), Discrete(2), Discrete(5)) | (-inf, inf) |
| 254 | ElevatorAction-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 255 | ElevatorAction-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 256 | ElevatorAction-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 257 | ElevatorAction-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 258 | ElevatorAction-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 259 | ElevatorAction-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 260 | ElevatorAction-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 261 | ElevatorAction-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 262 | ElevatorActionDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 263 | ElevatorActionDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 264 | ElevatorActionNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 265 | ElevatorActionNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 266 | Enduro-ram-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 267 | Enduro-ram-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 268 | Enduro-ramDeterministic-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 269 | Enduro-ramDeterministic-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 270 | Enduro-ramNoFrameskip-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 271 | Enduro-ramNoFrameskip-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 272 | Enduro-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 273 | Enduro-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 274 | EnduroDeterministic-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 275 | EnduroDeterministic-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 276 | EnduroNoFrameskip-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 277 | EnduroNoFrameskip-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 278 | FishingDerby-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 279 | FishingDerby-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 280 | FishingDerby-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 281 | FishingDerby-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 282 | FishingDerby-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 283 | FishingDerby-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 284 | FishingDerby-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 285 | FishingDerby-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 286 | FishingDerbyDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 287 | FishingDerbyDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 288 | FishingDerbyNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 289 | FishingDerbyNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 290 | Freeway-ram-v0 | Box(128,) | Discrete(3) | (-inf, inf) |
| 291 | Freeway-ram-v4 | Box(128,) | Discrete(3) | (-inf, inf) |
| 292 | Freeway-ramDeterministic-v0 | Box(128,) | Discrete(3) | (-inf, inf) |
| 293 | Freeway-ramDeterministic-v4 | Box(128,) | Discrete(3) | (-inf, inf) |
| 294 | Freeway-ramNoFrameskip-v0 | Box(128,) | Discrete(3) | (-inf, inf) |
| 295 | Freeway-ramNoFrameskip-v4 | Box(128,) | Discrete(3) | (-inf, inf) |
| 296 | Freeway-v0 | Box(210, 160, 3) | Discrete(3) | (-inf, inf) |
| 297 | Freeway-v4 | Box(210, 160, 3) | Discrete(3) | (-inf, inf) |
| 298 | FreewayDeterministic-v0 | Box(210, 160, 3) | Discrete(3) | (-inf, inf) |
| 299 | FreewayDeterministic-v4 | Box(210, 160, 3) | Discrete(3) | (-inf, inf) |
| 300 | FreewayNoFrameskip-v0 | Box(210, 160, 3) | Discrete(3) | (-inf, inf) |
| 301 | FreewayNoFrameskip-v4 | Box(210, 160, 3) | Discrete(3) | (-inf, inf) |
| 302 | Frostbite-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 303 | Frostbite-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 304 | Frostbite-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 305 | Frostbite-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 306 | Frostbite-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 307 | Frostbite-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 308 | Frostbite-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 309 | Frostbite-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 310 | FrostbiteDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 311 | FrostbiteDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 312 | FrostbiteNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 313 | FrostbiteNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 314 | FrozenLake-v0 | Discrete(16) | Discrete(4) | (-inf, inf) |
| 315 | FrozenLake8x8-v0 | Discrete(64) | Discrete(4) | (-inf, inf) |
| 316 | Go19x19-v0 | Error | Error | Error |
| 317 | Go9x9-v0 | Error | Error | Error |
| 318 | Gopher-ram-v0 | Box(128,) | Discrete(8) | (-inf, inf) |
| 319 | Gopher-ram-v4 | Box(128,) | Discrete(8) | (-inf, inf) |
| 320 | Gopher-ramDeterministic-v0 | Box(128,) | Discrete(8) | (-inf, inf) |
| 321 | Gopher-ramDeterministic-v4 | Box(128,) | Discrete(8) | (-inf, inf) |
| 322 | Gopher-ramNoFrameskip-v0 | Box(128,) | Discrete(8) | (-inf, inf) |
| 323 | Gopher-ramNoFrameskip-v4 | Box(128,) | Discrete(8) | (-inf, inf) |
| 324 | Gopher-v0 | Box(250, 160, 3) | Discrete(8) | (-inf, inf) |
| 325 | Gopher-v4 | Box(250, 160, 3) | Discrete(8) | (-inf, inf) |
| 326 | GopherDeterministic-v0 | Box(250, 160, 3) | Discrete(8) | (-inf, inf) |
| 327 | GopherDeterministic-v4 | Box(250, 160, 3) | Discrete(8) | (-inf, inf) |
| 328 | GopherNoFrameskip-v0 | Box(250, 160, 3) | Discrete(8) | (-inf, inf) |
| 329 | GopherNoFrameskip-v4 | Box(250, 160, 3) | Discrete(8) | (-inf, inf) |
| 330 | Gravitar-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 331 | Gravitar-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 332 | Gravitar-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 333 | Gravitar-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 334 | Gravitar-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 335 | Gravitar-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 336 | Gravitar-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 337 | Gravitar-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 338 | GravitarDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 339 | GravitarDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 340 | GravitarNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 341 | GravitarNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 342 | GuessingGame-v0 | Discrete(4) | Box(1,) | (-inf, inf) |
| 343 | HalfCheetah-v1 | Error | Error | Error |
| 344 | Hero-ram-v0 | Error | Error | Error |
| 345 | Hero-ram-v4 | Error | Error | Error |
| 346 | Hero-ramDeterministic-v0 | Error | Error | Error |
| 347 | Hero-ramDeterministic-v4 | Error | Error | Error |
| 348 | Hero-ramNoFrameskip-v0 | Error | Error | Error |
| 349 | Hero-ramNoFrameskip-v4 | Error | Error | Error |
| 350 | Hero-v0 | Error | Error | Error |
| 351 | Hero-v4 | Error | Error | Error |
| 352 | HeroDeterministic-v0 | Error | Error | Error |
| 353 | HeroDeterministic-v4 | Error | Error | Error |
| 354 | HeroNoFrameskip-v0 | Error | Error | Error |
| 355 | HeroNoFrameskip-v4 | Error | Error | Error |
| 356 | Hex9x9-v0 | Error | Error | Error |
| 357 | Hopper-v1 | Error | Error | Error |
| 358 | HotterColder-v0 | Discrete(4) | Box(1,) | (-inf, inf) |
| 359 | Humanoid-v1 | Error | Error | Error |
| 360 | HumanoidStandup-v1 | Error | Error | Error |
| 361 | IceHockey-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 362 | IceHockey-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 363 | IceHockey-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 364 | IceHockey-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 365 | IceHockey-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 366 | IceHockey-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 367 | IceHockey-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 368 | IceHockey-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 369 | IceHockeyDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 370 | IceHockeyDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 371 | IceHockeyNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 372 | IceHockeyNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 373 | InvertedDoublePendulum-v1 | Error | Error | Error |
| 374 | InvertedPendulum-v1 | Error | Error | Error |
| 375 | Jamesbond-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 376 | Jamesbond-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 377 | Jamesbond-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 378 | Jamesbond-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 379 | Jamesbond-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 380 | Jamesbond-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 381 | Jamesbond-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 382 | Jamesbond-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 383 | JamesbondDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 384 | JamesbondDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 385 | JamesbondNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 386 | JamesbondNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 387 | JourneyEscape-ram-v0 | Box(128,) | Discrete(16) | (-inf, inf) |
| 388 | JourneyEscape-ram-v4 | Box(128,) | Discrete(16) | (-inf, inf) |
| 389 | JourneyEscape-ramDeterministic-v0 | Box(128,) | Discrete(16) | (-inf, inf) |
| 390 | JourneyEscape-ramDeterministic-v4 | Box(128,) | Discrete(16) | (-inf, inf) |
| 391 | JourneyEscape-ramNoFrameskip-v0 | Box(128,) | Discrete(16) | (-inf, inf) |
| 392 | JourneyEscape-ramNoFrameskip-v4 | Box(128,) | Discrete(16) | (-inf, inf) |
| 393 | JourneyEscape-v0 | Box(230, 160, 3) | Discrete(16) | (-inf, inf) |
| 394 | JourneyEscape-v4 | Box(230, 160, 3) | Discrete(16) | (-inf, inf) |
| 395 | JourneyEscapeDeterministic-v0 | Box(230, 160, 3) | Discrete(16) | (-inf, inf) |
| 396 | JourneyEscapeDeterministic-v4 | Box(230, 160, 3) | Discrete(16) | (-inf, inf) |
| 397 | JourneyEscapeNoFrameskip-v0 | Box(230, 160, 3) | Discrete(16) | (-inf, inf) |
| 398 | JourneyEscapeNoFrameskip-v4 | Box(230, 160, 3) | Discrete(16) | (-inf, inf) |
| 399 | Kangaroo-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 400 | Kangaroo-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 401 | Kangaroo-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 402 | Kangaroo-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 403 | Kangaroo-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 404 | Kangaroo-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 405 | Kangaroo-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 406 | Kangaroo-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 407 | KangarooDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 408 | KangarooDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 409 | KangarooNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 410 | KangarooNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 411 | KellyCoinflip-v0 | Tuple(Box(1,), Discrete(301)) | Discrete(25000) | (0, 250.0) |
| 412 | KellyCoinflipGeneralized-v0 | Tuple(Box(1,), Discrete(280), Discrete(280), Discrete(280), Box(1,)) | Discrete(20300) | (0, 203.0) |
| 413 | Krull-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 414 | Krull-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 415 | Krull-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 416 | Krull-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 417 | Krull-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 418 | Krull-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 419 | Krull-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 420 | Krull-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 421 | KrullDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 422 | KrullDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 423 | KrullNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 424 | KrullNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 425 | KungFuMaster-ram-v0 | Box(128,) | Discrete(14) | (-inf, inf) |
| 426 | KungFuMaster-ram-v4 | Box(128,) | Discrete(14) | (-inf, inf) |
| 427 | KungFuMaster-ramDeterministic-v0 | Box(128,) | Discrete(14) | (-inf, inf) |
| 428 | KungFuMaster-ramDeterministic-v4 | Box(128,) | Discrete(14) | (-inf, inf) |
| 429 | KungFuMaster-ramNoFrameskip-v0 | Box(128,) | Discrete(14) | (-inf, inf) |
| 430 | KungFuMaster-ramNoFrameskip-v4 | Box(128,) | Discrete(14) | (-inf, inf) |
| 431 | KungFuMaster-v0 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 432 | KungFuMaster-v4 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 433 | KungFuMasterDeterministic-v0 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 434 | KungFuMasterDeterministic-v4 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 435 | KungFuMasterNoFrameskip-v0 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 436 | KungFuMasterNoFrameskip-v4 | Box(210, 160, 3) | Discrete(14) | (-inf, inf) |
| 437 | LunarLander-v2 | Error | Error | Error |
| 438 | LunarLanderContinuous-v2 | Error | Error | Error |
| 439 | MontezumaRevenge-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 440 | MontezumaRevenge-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 441 | MontezumaRevenge-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 442 | MontezumaRevenge-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 443 | MontezumaRevenge-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 444 | MontezumaRevenge-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 445 | MontezumaRevenge-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 446 | MontezumaRevenge-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 447 | MontezumaRevengeDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 448 | MontezumaRevengeDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 449 | MontezumaRevengeNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 450 | MontezumaRevengeNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 451 | MountainCar-v0 | Box(2,) | Discrete(3) | (-inf, inf) |
| 452 | MountainCarContinuous-v0 | Box(2,) | Box(1,) | (-inf, inf) |
| 453 | MsPacman-ram-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 454 | MsPacman-ram-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 455 | MsPacman-ramDeterministic-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 456 | MsPacman-ramDeterministic-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 457 | MsPacman-ramNoFrameskip-v0 | Box(128,) | Discrete(9) | (-inf, inf) |
| 458 | MsPacman-ramNoFrameskip-v4 | Box(128,) | Discrete(9) | (-inf, inf) |
| 459 | MsPacman-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 460 | MsPacman-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 461 | MsPacmanDeterministic-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 462 | MsPacmanDeterministic-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 463 | MsPacmanNoFrameskip-v0 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 464 | MsPacmanNoFrameskip-v4 | Box(210, 160, 3) | Discrete(9) | (-inf, inf) |
| 465 | NChain-v0 | Discrete(5) | Discrete(2) | (-inf, inf) |
| 466 | NameThisGame-ram-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 467 | NameThisGame-ram-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 468 | NameThisGame-ramDeterministic-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 469 | NameThisGame-ramDeterministic-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 470 | NameThisGame-ramNoFrameskip-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 471 | NameThisGame-ramNoFrameskip-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 472 | NameThisGame-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 473 | NameThisGame-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 474 | NameThisGameDeterministic-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 475 | NameThisGameDeterministic-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 476 | NameThisGameNoFrameskip-v0 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 477 | NameThisGameNoFrameskip-v4 | Box(210, 160, 3) | Discrete(6) | (-inf, inf) |
| 478 | OffSwitchCartpole-v0 | Tuple(Discrete(2), Box(4,)) | Discrete(2) | (-inf, inf) |
| 479 | OffSwitchCartpoleProb-v0 | Tuple(Discrete(2), Box(4,)) | Discrete(2) | (-inf, inf) |
| 480 | OneRoundDeterministicReward-v0 | Discrete(1) | Discrete(2) | (-inf, inf) |
| 481 | OneRoundNondeterministicReward-v0 | Discrete(1) | Discrete(2) | (-inf, inf) |
| 482 | Pendulum-v0 | Box(3,) | Box(1,) | (-inf, inf) |
| 483 | Phoenix-ram-v0 | Box(128,) | Discrete(8) | (-inf, inf) |
| 484 | Phoenix-ram-v4 | Box(128,) | Discrete(8) | (-inf, inf) |
| 485 | Phoenix-ramDeterministic-v0 | Box(128,) | Discrete(8) | (-inf, inf) |
| 486 | Phoenix-ramDeterministic-v4 | Box(128,) | Discrete(8) | (-inf, inf) |
| 487 | Phoenix-ramNoFrameskip-v0 | Box(128,) | Discrete(8) | (-inf, inf) |
| 488 | Phoenix-ramNoFrameskip-v4 | Box(128,) | Discrete(8) | (-inf, inf) |
| 489 | Phoenix-v0 | Box(210, 160, 3) | Discrete(8) | (-inf, inf) |
| 490 | Phoenix-v4 | Box(210, 160, 3) | Discrete(8) | (-inf, inf) |
| 491 | PhoenixDeterministic-v0 | Box(210, 160, 3) | Discrete(8) | (-inf, inf) |
| 492 | PhoenixDeterministic-v4 | Box(210, 160, 3) | Discrete(8) | (-inf, inf) |
| 493 | PhoenixNoFrameskip-v0 | Box(210, 160, 3) | Discrete(8) | (-inf, inf) |
| 494 | PhoenixNoFrameskip-v4 | Box(210, 160, 3) | Discrete(8) | (-inf, inf) |
| 495 | Pitfall-ram-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 496 | Pitfall-ram-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 497 | Pitfall-ramDeterministic-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 498 | Pitfall-ramDeterministic-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 499 | Pitfall-ramNoFrameskip-v0 | Box(128,) | Discrete(18) | (-inf, inf) |
| 500 | Pitfall-ramNoFrameskip-v4 | Box(128,) | Discrete(18) | (-inf, inf) |
| 501 | Pitfall-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 502 | Pitfall-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 503 | PitfallDeterministic-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 504 | PitfallDeterministic-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 505 | PitfallNoFrameskip-v0 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 506 | PitfallNoFrameskip-v4 | Box(210, 160, 3) | Discrete(18) | (-inf, inf) |
| 507 | Pong-ram-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 508 | Pong-ram-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
| 509 | Pong-ramDeterministic-v0 | Box(128,) | Discrete(6) | (-inf, inf) |
| 510 | Pong-ramDeterministic-v4 | Box(128,) | Discrete(6) | (-inf, inf) |
511Pong-ramNoFrameskip-v0Box(128,)Discrete(6)(-inf, inf)
512Pong-ramNoFrameskip-v4Box(128,)Discrete(6)(-inf, inf)
513Pong-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
514Pong-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
515PongDeterministic-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
516PongDeterministic-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
517PongNoFrameskip-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
518PongNoFrameskip-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
519Pooyan-ram-v0Box(128,)Discrete(6)(-inf, inf)
520Pooyan-ram-v4Box(128,)Discrete(6)(-inf, inf)
521Pooyan-ramDeterministic-v0Box(128,)Discrete(6)(-inf, inf)
522Pooyan-ramDeterministic-v4Box(128,)Discrete(6)(-inf, inf)
523Pooyan-ramNoFrameskip-v0Box(128,)Discrete(6)(-inf, inf)
524Pooyan-ramNoFrameskip-v4Box(128,)Discrete(6)(-inf, inf)
525Pooyan-v0Box(250, 160, 3)Discrete(6)(-inf, inf)
526Pooyan-v4Box(250, 160, 3)Discrete(6)(-inf, inf)
527PooyanDeterministic-v0Box(250, 160, 3)Discrete(6)(-inf, inf)
528PooyanDeterministic-v4Box(250, 160, 3)Discrete(6)(-inf, inf)
529PooyanNoFrameskip-v0Box(250, 160, 3)Discrete(6)(-inf, inf)
530PooyanNoFrameskip-v4Box(250, 160, 3)Discrete(6)(-inf, inf)
531PredictActionsCartpole-v0Box(4,)Tuple(Discrete(2), Discrete(2), Discrete(2), Discrete(2), Discrete(2), Discrete(2))(-inf, inf)
532PredictObsCartpole-v0Box(4,)Tuple(Discrete(2), Box(4,), Box(4,), Box(4,), Box(4,), Box(4,))(-inf, inf)
533PrivateEye-ram-v0Box(128,)Discrete(18)(-inf, inf)
534PrivateEye-ram-v4Box(128,)Discrete(18)(-inf, inf)
535PrivateEye-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
536PrivateEye-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
537PrivateEye-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
538PrivateEye-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
539PrivateEye-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
540PrivateEye-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
541PrivateEyeDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
542PrivateEyeDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
543PrivateEyeNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
544PrivateEyeNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
545Pusher-v0ErrorErrorError
546Qbert-ram-v0Box(128,)Discrete(6)(-inf, inf)
547Qbert-ram-v4Box(128,)Discrete(6)(-inf, inf)
548Qbert-ramDeterministic-v0Box(128,)Discrete(6)(-inf, inf)
549Qbert-ramDeterministic-v4Box(128,)Discrete(6)(-inf, inf)
550Qbert-ramNoFrameskip-v0Box(128,)Discrete(6)(-inf, inf)
551Qbert-ramNoFrameskip-v4Box(128,)Discrete(6)(-inf, inf)
552Qbert-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
553Qbert-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
554QbertDeterministic-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
555QbertDeterministic-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
556QbertNoFrameskip-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
557QbertNoFrameskip-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
558Reacher-v1ErrorErrorError
559RepeatCopy-v0Discrete(6)Tuple(Discrete(2), Discrete(2), Discrete(5))(-inf, inf)
560Reverse-v0Discrete(3)Tuple(Discrete(2), Discrete(2), Discrete(2))(-inf, inf)
561ReversedAddition-v0Discrete(4)Tuple(Discrete(4), Discrete(2), Discrete(3))(-inf, inf)
562ReversedAddition3-v0Discrete(4)Tuple(Discrete(4), Discrete(2), Discrete(3))(-inf, inf)
563Riverraid-ram-v0Box(128,)Discrete(18)(-inf, inf)
564Riverraid-ram-v4Box(128,)Discrete(18)(-inf, inf)
565Riverraid-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
566Riverraid-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
567Riverraid-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
568Riverraid-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
569Riverraid-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
570Riverraid-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
571RiverraidDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
572RiverraidDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
573RiverraidNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
574RiverraidNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
575RoadRunner-ram-v0Box(128,)Discrete(18)(-inf, inf)
576RoadRunner-ram-v4Box(128,)Discrete(18)(-inf, inf)
577RoadRunner-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
578RoadRunner-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
579RoadRunner-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
580RoadRunner-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
581RoadRunner-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
582RoadRunner-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
583RoadRunnerDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
584RoadRunnerDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
585RoadRunnerNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
586RoadRunnerNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
587Robotank-ram-v0Box(128,)Discrete(18)(-inf, inf)
588Robotank-ram-v4Box(128,)Discrete(18)(-inf, inf)
589Robotank-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
590Robotank-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
591Robotank-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
592Robotank-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
593Robotank-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
594Robotank-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
595RobotankDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
596RobotankDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
597RobotankNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
598RobotankNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
599Roulette-v0Discrete(1)Discrete(38)(-inf, inf)
600Seaquest-ram-v0Box(128,)Discrete(18)(-inf, inf)
601Seaquest-ram-v4Box(128,)Discrete(18)(-inf, inf)
602Seaquest-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
603Seaquest-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
604Seaquest-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
605Seaquest-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
606Seaquest-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
607Seaquest-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
608SeaquestDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
609SeaquestDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
610SeaquestNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
611SeaquestNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
612SemisuperPendulumDecay-v0Box(3,)Box(1,)(-inf, inf)
613SemisuperPendulumNoise-v0Box(3,)Box(1,)(-inf, inf)
614SemisuperPendulumRandom-v0Box(3,)Box(1,)(-inf, inf)
615Skiing-ram-v0Box(128,)Discrete(3)(-inf, inf)
616Skiing-ram-v4Box(128,)Discrete(3)(-inf, inf)
617Skiing-ramDeterministic-v0Box(128,)Discrete(3)(-inf, inf)
618Skiing-ramDeterministic-v4Box(128,)Discrete(3)(-inf, inf)
619Skiing-ramNoFrameskip-v0Box(128,)Discrete(3)(-inf, inf)
620Skiing-ramNoFrameskip-v4Box(128,)Discrete(3)(-inf, inf)
621Skiing-v0Box(250, 160, 3)Discrete(3)(-inf, inf)
622Skiing-v4Box(250, 160, 3)Discrete(3)(-inf, inf)
623SkiingDeterministic-v0Box(250, 160, 3)Discrete(3)(-inf, inf)
624SkiingDeterministic-v4Box(250, 160, 3)Discrete(3)(-inf, inf)
625SkiingNoFrameskip-v0Box(250, 160, 3)Discrete(3)(-inf, inf)
626SkiingNoFrameskip-v4Box(250, 160, 3)Discrete(3)(-inf, inf)
627Solaris-ram-v0Box(128,)Discrete(18)(-inf, inf)
628Solaris-ram-v4Box(128,)Discrete(18)(-inf, inf)
629Solaris-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
630Solaris-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
631Solaris-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
632Solaris-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
633Solaris-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
634Solaris-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
635SolarisDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
636SolarisDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
637SolarisNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
638SolarisNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
639SpaceInvaders-ram-v0Box(128,)Discrete(6)(-inf, inf)
640SpaceInvaders-ram-v4Box(128,)Discrete(6)(-inf, inf)
641SpaceInvaders-ramDeterministic-v0Box(128,)Discrete(6)(-inf, inf)
642SpaceInvaders-ramDeterministic-v4Box(128,)Discrete(6)(-inf, inf)
643SpaceInvaders-ramNoFrameskip-v0Box(128,)Discrete(6)(-inf, inf)
644SpaceInvaders-ramNoFrameskip-v4Box(128,)Discrete(6)(-inf, inf)
645SpaceInvaders-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
646SpaceInvaders-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
647SpaceInvadersDeterministic-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
648SpaceInvadersDeterministic-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
649SpaceInvadersNoFrameskip-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
650SpaceInvadersNoFrameskip-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
651StarGunner-ram-v0Box(128,)Discrete(18)(-inf, inf)
652StarGunner-ram-v4Box(128,)Discrete(18)(-inf, inf)
653StarGunner-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
654StarGunner-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
655StarGunner-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
656StarGunner-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
657StarGunner-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
658StarGunner-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
659StarGunnerDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
660StarGunnerDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
661StarGunnerNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
662StarGunnerNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
663Striker-v0ErrorErrorError
664Swimmer-v1ErrorErrorError
665Taxi-v2Discrete(500)Discrete(6)(-inf, inf)
666Tennis-ram-v0Box(128,)Discrete(18)(-inf, inf)
667Tennis-ram-v4Box(128,)Discrete(18)(-inf, inf)
668Tennis-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
669Tennis-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
670Tennis-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
671Tennis-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
672Tennis-v0Box(250, 160, 3)Discrete(18)(-inf, inf)
673Tennis-v4Box(250, 160, 3)Discrete(18)(-inf, inf)
674TennisDeterministic-v0Box(250, 160, 3)Discrete(18)(-inf, inf)
675TennisDeterministic-v4Box(250, 160, 3)Discrete(18)(-inf, inf)
676TennisNoFrameskip-v0Box(250, 160, 3)Discrete(18)(-inf, inf)
677TennisNoFrameskip-v4Box(250, 160, 3)Discrete(18)(-inf, inf)
678Thrower-v0ErrorErrorError
679TimePilot-ram-v0Box(128,)Discrete(10)(-inf, inf)
680TimePilot-ram-v4Box(128,)Discrete(10)(-inf, inf)
681TimePilot-ramDeterministic-v0Box(128,)Discrete(10)(-inf, inf)
682TimePilot-ramDeterministic-v4Box(128,)Discrete(10)(-inf, inf)
683TimePilot-ramNoFrameskip-v0Box(128,)Discrete(10)(-inf, inf)
684TimePilot-ramNoFrameskip-v4Box(128,)Discrete(10)(-inf, inf)
685TimePilot-v0Box(210, 160, 3)Discrete(10)(-inf, inf)
686TimePilot-v4Box(210, 160, 3)Discrete(10)(-inf, inf)
687TimePilotDeterministic-v0Box(210, 160, 3)Discrete(10)(-inf, inf)
688TimePilotDeterministic-v4Box(210, 160, 3)Discrete(10)(-inf, inf)
689TimePilotNoFrameskip-v0Box(210, 160, 3)Discrete(10)(-inf, inf)
690TimePilotNoFrameskip-v4Box(210, 160, 3)Discrete(10)(-inf, inf)
691Tutankham-ram-v0Box(128,)Discrete(8)(-inf, inf)
692Tutankham-ram-v4Box(128,)Discrete(8)(-inf, inf)
693Tutankham-ramDeterministic-v0Box(128,)Discrete(8)(-inf, inf)
694Tutankham-ramDeterministic-v4Box(128,)Discrete(8)(-inf, inf)
695Tutankham-ramNoFrameskip-v0Box(128,)Discrete(8)(-inf, inf)
696Tutankham-ramNoFrameskip-v4Box(128,)Discrete(8)(-inf, inf)
697Tutankham-v0Box(250, 160, 3)Discrete(8)(-inf, inf)
698Tutankham-v4Box(250, 160, 3)Discrete(8)(-inf, inf)
699TutankhamDeterministic-v0Box(250, 160, 3)Discrete(8)(-inf, inf)
700TutankhamDeterministic-v4Box(250, 160, 3)Discrete(8)(-inf, inf)
701TutankhamNoFrameskip-v0Box(250, 160, 3)Discrete(8)(-inf, inf)
702TutankhamNoFrameskip-v4Box(250, 160, 3)Discrete(8)(-inf, inf)
703TwoRoundDeterministicReward-v0Discrete(3)Discrete(2)(-inf, inf)
704TwoRoundNondeterministicReward-v0Discrete(3)Discrete(2)(-inf, inf)
705UpNDown-ram-v0Box(128,)Discrete(6)(-inf, inf)
706UpNDown-ram-v4Box(128,)Discrete(6)(-inf, inf)
707UpNDown-ramDeterministic-v0Box(128,)Discrete(6)(-inf, inf)
708UpNDown-ramDeterministic-v4Box(128,)Discrete(6)(-inf, inf)
709UpNDown-ramNoFrameskip-v0Box(128,)Discrete(6)(-inf, inf)
710UpNDown-ramNoFrameskip-v4Box(128,)Discrete(6)(-inf, inf)
711UpNDown-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
712UpNDown-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
713UpNDownDeterministic-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
714UpNDownDeterministic-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
715UpNDownNoFrameskip-v0Box(210, 160, 3)Discrete(6)(-inf, inf)
716UpNDownNoFrameskip-v4Box(210, 160, 3)Discrete(6)(-inf, inf)
717Venture-ram-v0Box(128,)Discrete(18)(-inf, inf)
718Venture-ram-v4Box(128,)Discrete(18)(-inf, inf)
719Venture-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
720Venture-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
721Venture-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
722Venture-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
723Venture-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
724Venture-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
725VentureDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
726VentureDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
727VentureNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
728VentureNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
729VideoPinball-ram-v0Box(128,)Discrete(9)(-inf, inf)
730VideoPinball-ram-v4Box(128,)Discrete(9)(-inf, inf)
731VideoPinball-ramDeterministic-v0Box(128,)Discrete(9)(-inf, inf)
732VideoPinball-ramDeterministic-v4Box(128,)Discrete(9)(-inf, inf)
733VideoPinball-ramNoFrameskip-v0Box(128,)Discrete(9)(-inf, inf)
734VideoPinball-ramNoFrameskip-v4Box(128,)Discrete(9)(-inf, inf)
735VideoPinball-v0Box(250, 160, 3)Discrete(9)(-inf, inf)
736VideoPinball-v4Box(250, 160, 3)Discrete(9)(-inf, inf)
737VideoPinballDeterministic-v0Box(250, 160, 3)Discrete(9)(-inf, inf)
738VideoPinballDeterministic-v4Box(250, 160, 3)Discrete(9)(-inf, inf)
739VideoPinballNoFrameskip-v0Box(250, 160, 3)Discrete(9)(-inf, inf)
740VideoPinballNoFrameskip-v4Box(250, 160, 3)Discrete(9)(-inf, inf)
741Walker2d-v1ErrorErrorError
742WizardOfWor-ram-v0Box(128,)Discrete(10)(-inf, inf)
743WizardOfWor-ram-v4Box(128,)Discrete(10)(-inf, inf)
744WizardOfWor-ramDeterministic-v0Box(128,)Discrete(10)(-inf, inf)
745WizardOfWor-ramDeterministic-v4Box(128,)Discrete(10)(-inf, inf)
746WizardOfWor-ramNoFrameskip-v0Box(128,)Discrete(10)(-inf, inf)
747WizardOfWor-ramNoFrameskip-v4Box(128,)Discrete(10)(-inf, inf)
748WizardOfWor-v0Box(250, 160, 3)Discrete(10)(-inf, inf)
749WizardOfWor-v4Box(250, 160, 3)Discrete(10)(-inf, inf)
750WizardOfWorDeterministic-v0Box(250, 160, 3)Discrete(10)(-inf, inf)
751WizardOfWorDeterministic-v4Box(250, 160, 3)Discrete(10)(-inf, inf)
752WizardOfWorNoFrameskip-v0Box(250, 160, 3)Discrete(10)(-inf, inf)
753WizardOfWorNoFrameskip-v4Box(250, 160, 3)Discrete(10)(-inf, inf)
754YarsRevenge-ram-v0Box(128,)Discrete(18)(-inf, inf)
755YarsRevenge-ram-v4Box(128,)Discrete(18)(-inf, inf)
756YarsRevenge-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
757YarsRevenge-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
758YarsRevenge-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
759YarsRevenge-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
760YarsRevenge-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
761YarsRevenge-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
762YarsRevengeDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
763YarsRevengeDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
764YarsRevengeNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
765YarsRevengeNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
766Zaxxon-ram-v0Box(128,)Discrete(18)(-inf, inf)
767Zaxxon-ram-v4Box(128,)Discrete(18)(-inf, inf)
768Zaxxon-ramDeterministic-v0Box(128,)Discrete(18)(-inf, inf)
769Zaxxon-ramDeterministic-v4Box(128,)Discrete(18)(-inf, inf)
770Zaxxon-ramNoFrameskip-v0Box(128,)Discrete(18)(-inf, inf)
771Zaxxon-ramNoFrameskip-v4Box(128,)Discrete(18)(-inf, inf)
772Zaxxon-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
773Zaxxon-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
774ZaxxonDeterministic-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
775ZaxxonDeterministic-v4Box(210, 160, 3)Discrete(18)(-inf, inf)
776ZaxxonNoFrameskip-v0Box(210, 160, 3)Discrete(18)(-inf, inf)
777ZaxxonNoFrameskip-v4Box(210, 160, 3)Discrete(18)(-inf, inf)

To summarize:

  • Most environments have a continuous observation space and a discrete action space.
  • Discrete action spaces often have about 20 actions (Discrete(18) appears frequently among the Atari games). The maximum is KellyCoinflip-v0 with 25000 actions.
  • Discrete observation spaces have at least one state (e.g. OneRoundNondeterministicReward-v0) and never more than 500 (Taxi-v2).
  • The observation space shape (210, 160, 3) is so common (264 times!) because it is the RGB screen of an Atari game: 210 × 160 pixels with 3 color channels.
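
Counts like the ones in this summary can be derived directly from the space strings in the table. The following is only a minimal sketch, not the script used for this post: it works on a handful of rows copied by hand from the table, and the helper names `space_kind`, `discrete_size` and `summarize` are made up for illustration. It treats `Box(...)` as continuous and `Discrete(...)` as discrete; everything else (`Tuple(...)`, `Error`) is counted as "other".

```python
import re
from collections import Counter

# A few rows copied from the table: (name, observation space, action space).
# This is only a sample, not the full list of environments.
ROWS = [
    ("MountainCar-v0", "Box(2,)", "Discrete(3)"),
    ("MountainCarContinuous-v0", "Box(2,)", "Box(1,)"),
    ("MsPacman-v0", "Box(210, 160, 3)", "Discrete(9)"),
    ("NChain-v0", "Discrete(5)", "Discrete(2)"),
    ("Pendulum-v0", "Box(3,)", "Box(1,)"),
    ("Pong-v0", "Box(210, 160, 3)", "Discrete(6)"),
    ("Taxi-v2", "Discrete(500)", "Discrete(6)"),
    ("Zaxxon-v0", "Box(210, 160, 3)", "Discrete(18)"),
]


def space_kind(space):
    """Classify a space string as 'continuous', 'discrete' or 'other'."""
    if space.startswith("Box("):
        return "continuous"
    if space.startswith("Discrete("):
        return "discrete"
    return "other"  # e.g. Tuple(...) or Error


def discrete_size(space):
    """Return n for 'Discrete(n)' and None for any other space string."""
    match = re.fullmatch(r"Discrete\((\d+)\)", space)
    return int(match.group(1)) if match else None


def summarize(rows):
    """Count (observation kind, action kind) pairs and observation spaces."""
    kinds = Counter((space_kind(obs), space_kind(act)) for _, obs, act in rows)
    obs_spaces = Counter(obs for _, obs, _ in rows)
    sizes = [discrete_size(act) for _, _, act in rows]
    max_actions = max((n for n in sizes if n is not None), default=None)
    return kinds, obs_spaces, max_actions


kinds, obs_spaces, max_actions = summarize(ROWS)
print(kinds.most_common(1))       # most frequent (observation, action) kind pair
print(obs_spaces.most_common(1))  # most frequent observation space
print(max_actions)                # largest discrete action space in the sample
```

On this small sample the most frequent combination is a continuous observation space with a discrete action space, and the most frequent observation space is `Box(210, 160, 3)`. The numbers for the full table will of course differ from the sample.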

Other

  • Comparison of DQN performance with linear function approximator

Published

Nov 7, 2017
by Martin Thoma

Category

Machine Learning

Tags

  • Reinforcement Learning
