Tutorial

This tutorial uses Python, but everything works analogously in Java and C++. It assumes a Linux box, so the paths use "/". File extensions may also differ on other systems (e.g. dll instead of so, exe instead of no extension).


Installation

Clone ViZDoom and follow the build instructions (remember to build the Python bindings). As a result, the bin folder should be created and you should have the following files in place:

On Linux, you should also find these:

On Windows, you need to download freedoom2.wad. To be sure that everything is in place, consult the readme.

Now, go to examples/python and run basic.py. You should see the Doom marine (his hand actually) shooting randomly at a monster.


freedoom2.wad is a free resource file with textures and other visual resources used by the game engine. Its textures often differ from the original Doom's textures. Unfortunately, the original Doom2.wad file is proprietary, so we are not allowed to redistribute it. If you want nicer textures, you need to acquire the file on your own (e.g., buy it on Steam or GOG).


Shortest Working Example

This example should be run from the examples/python subdirectory.

Below is the shortest reasonable example of how to use the ViZDoom environment. It loads the configuration from a file and initializes the environment. Then it runs 10 episodes with an agent that takes random actions. In each step, the agent gets a state and prints the obtained reward. (In the scenario specified in basic.cfg, the agent gets a reward for each action: -1 for each time tick, -6 if he shoots but misses, and 100 if he kills the monster.) The example uses the synchronous mode, so the game engine waits with the next frame for the agent's decision, regardless of how long it takes. Since the random agent makes decisions very quickly, we included a short sleep for visualization purposes.

from vizdoom import *
import random
import time

game = DoomGame()
game.load_config("../config/basic.cfg")
game.init()

shoot = [0,0,1]
left = [1,0,0]
right = [0,1,0]
actions = [shoot, left, right]

episodes = 10
for i in range(episodes):
    game.new_episode()
    while not game.is_episode_finished():
        state = game.get_state()
        img = state.image_buffer
        misc = state.game_variables
        reward = game.make_action(random.choice(actions))
        print("\treward:", reward)
        time.sleep(0.02)
    print("Result:", game.get_total_reward())
    time.sleep(2)

Basic example (in details)

Configuration

Let's go through the example in basic.py, line by line. The imports are a good place to start ("from vizdoom import *" would be much more concise):

from vizdoom import DoomGame
from vizdoom import Button
from vizdoom import GameVariable
from vizdoom import ScreenFormat
from vizdoom import ScreenResolution

We import a couple of classes:

Remember that Python, by default, will look for the library to import in the working directory.

We start by creating a DoomGame object. Before it is ready to play with, it requires configuration and initialization. The following code configures the paths for all required external files:

game = DoomGame()
game.set_doom_engine_path("../../bin/vizdoom")
game.set_doom_game_path("../../scenarios/freedoom2.wad")
game.set_doom_scenario_path("../../scenarios/basic.wad")
game.set_doom_map("map01")

Where:

You can also set some rendering options. The following code sets the screen resolution, the screen buffer format, and whether to render particular visual elements such as the crosshair or the HUD:

game.set_screen_resolution(ScreenResolution.RES_640X480)
game.set_screen_format(ScreenFormat.RGB24)
game.set_render_hud(False)
game.set_render_crosshair(False)
game.set_render_weapon(True)
game.set_render_decals(False)
game.set_render_particles(False)

Now, we should determine which buttons can be used by the agent. If we skip this step, no buttons will be available, so the whole endeavour will be pretty pointless.

game.add_available_button(Button.MOVE_LEFT)
game.add_available_button(Button.MOVE_RIGHT)
game.add_available_button(Button.ATTACK)

Next, we can determine which game variables (health, ammo, weapon availability etc.) will be included in the state we get in each timestep. Any game variable can be acquired anytime during the game but having them in the state may be more convenient. Here, we include only AMMO2, which is the pistol ammo:

game.add_available_game_variable(GameVariable.AMMO2)

We also specify some other settings such as the visibility of the window, the episode timeout (in tics/frames), or the start time (the initial tics are omitted by the environment, but internally the engine still runs them). The start time is useful for skipping initial events such as monsters spawning, the weapon being drawn, etc.

game.set_episode_timeout(200)
game.set_episode_start_time(10)
game.set_window_visible(True)

In the basic world, life is mostly painful: agents get a reward of -1 for each move, no matter what happens. This is achieved by setting the living reward.

game.set_living_reward(-1)

Finally, we can initialize the game, after which the Doom's window should appear:

game.init()

Game runtime

A single Doom "game" is called an episode. Episodes are independent and finish on the player's death, on timeout, or when some custom conditions defined by the scenario are satisfied (e.g., the agent accomplishes some task). In this example, the episode finishes when the episode timeout set above (200 tics) is reached or when the monster gets killed. Each action produces a reward: -6 for shooting and missing, 100 for killing the monster, and -1 otherwise (it would be -5, 100 and 0 respectively, but the living reward is set to -1).

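from random import choice
from time import sleep

# One action per available button, in the order they were added: left, right, attack.
actions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
episodes = 10
sleep_time = 0.028  # assumed pause between actions, roughly one tic at 35 fps

for i in range(episodes):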
    print("Episode #" + str(i+1))
    game.new_episode()

    while not game.is_episode_finished():

        s = game.get_state()
        r = game.make_action(choice(actions))

        print("State #" + str(s.number))
        print("Game variables:", s.game_variables[0])
        print("Reward:", r)
        print("=====================")
        if sleep_time>0:
            sleep(sleep_time)

    print("Episode finished.")
    print("Total reward:", game.get_total_reward())
    print("************************")
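
The reward scheme described before the loop can be checked with a little arithmetic. The following is a minimal sketch that uses only the values quoted in the text (-1, -6, 100); the function name episode_total and its parameters are made up for illustration and are not part of ViZDoom.

```python
# Reward values quoted in the text for the basic scenario
# (the living reward of -1 is already included in each of them).
REWARD_TICK = -1    # an action that neither kills nor misses
REWARD_MISS = -6    # shooting and missing
REWARD_KILL = 100   # killing the monster


def episode_total(idle_tics, misses, killed):
    """Sum the per-action rewards of one hypothetical episode."""
    total = idle_tics * REWARD_TICK + misses * REWARD_MISS
    if killed:
        total += REWARD_KILL
    return total


# 20 moves, 2 missed shots, then a kill: -20 - 12 + 100
print(episode_total(idle_tics=20, misses=2, killed=True))    # 68
# 200 moves without ever hitting the monster
print(episode_total(idle_tics=200, misses=0, killed=False))  # -200
```

The value printed by game.get_total_reward() at the end of an episode should be a sum of this kind.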

Acting in Doom's world involves getting the current state (get_state), making actions (make_action) and obtaining rewards. get_state returns a GameState object, and make_action takes an action as input and returns a reward. The action should be a list of integers whose length equals the number of available buttons specified in the configuration (3 in this example: left, right, attack). Each list position maps to the corresponding button so, e.g., [0,0,1] means "not left, not right, attack!". If the input list is too short, 0 will be used in the missing positions.

Strictly speaking, actions should be specified as lists of integers, but booleans are represented internally as integers, so using boolean lists (e.g. [False,False,True] instead of [0,0,1]) is possible and semantically more accurate. However, integers should be used when specifying non-binary actions (like LOOK_LEFT_RIGHT_DELTA).
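
To make the mapping between list positions and buttons concrete, here is a small pure-Python sketch. The helper action_for and the BUTTONS list are hypothetical names introduced only for this illustration; the button order is assumed to be the one used in this tutorial (left, right, attack).

```python
from itertools import product

# Button order as configured in this tutorial: left, right, attack.
BUTTONS = ["MOVE_LEFT", "MOVE_RIGHT", "ATTACK"]


def action_for(*pressed):
    """Return a 0/1 list with a 1 at the position of every pressed button."""
    return [1 if name in pressed else 0 for name in BUTTONS]


shoot = action_for("ATTACK")
left_and_shoot = action_for("MOVE_LEFT", "ATTACK")
print(shoot)            # [0, 0, 1]
print(left_and_shoot)   # [1, 0, 1]

# Python booleans compare equal to integers, so both spellings match.
print([False, False, True] == shoot)   # True

# Every combination of the three buttons (2**3 = 8 actions).
all_actions = [list(combo) for combo in product([0, 1], repeat=len(BUTTONS))]
print(len(all_actions))   # 8
```

A list built this way can be passed to make_action directly, provided the button order matches the order of the add_available_button calls.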

Finally, you can terminate DoomGame. This should happen automatically on program exit (or when you reassign the variable holding the DoomGame object):

game.close()

Configuration Files

Instead of configuring the experiment in code, you can load it from one or more configuration files. Each file is read sequentially, so a later entry overwrites an earlier entry with the same key.

Format

Each entry in a configuration file is a key-value pair separated by an equals sign ("="). The file format should also abide by the following rules:

A line that violates any of these rules is ignored; the rest of the file is still read.

List parameters

available_buttons and available_game_variables are special parameters: instead of a single value, they expect a list of values separated by whitespace and enclosed in braces ("{" and "}"). The list can stretch across as many lines as you like, as long as all values are separated from each other by whitespace.

Each list assignment (KEY = { VALUES }) clears any values specified for this key before (in other configuration files or in the code). The append operator (KEY += { VALUES }) is also available. This makes it easier to combine multiple configuration files and tinker in code.

An example:

doom_engine_path = ../../bin/vizdoom
#doom_game_path = ../../scenarios/doom2.wad
doom_game_path = ../../scenarios/freedoom2.wad
doom_scenario_path = ../../scenarios/basic.wad
doom_map = map01

# Rewards
living_reward = -1

# Rendering options
screen_resolution = RES_320X240
screen_format = CRCGCB
render_hud = True
render_crosshair = false
render_weapon = true
render_decals = false
render_particles = false
window_visible = true

# make episodes start after 14 tics (after unholstering the gun)
episode_start_time = 14

# make episodes finish after 300 actions (tics)
episode_timeout = 300

# Available buttons 
available_buttons = 
    { 
        MOVE_LEFT 
        MOVE_RIGHT 
    }
#    
available_buttons += { ATTACK }

# Game variables that will be in the state
available_game_variables = { AMMO2}

mode = PLAYER
doom_skill = 5

#auto_new_episode = false
#new_episode_on_timeout = false
#new_episode_on_player_death = false
#new_episode_on_map_end = false

Other examples can be found here
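
To illustrate how "=" and "+=" interact across several files, here is a minimal parser sketch. It is not ViZDoom's actual parser and does not implement its error handling; parse_config is a hypothetical helper that assumes only what the example above shows: "#" starts a comment, "=" replaces a value, "+=" appends to a list, and lists may span several lines.

```python
import re


def parse_config(text, settings=None):
    """Parse 'key = value' entries and '{ ... }' lists into a dict.

    Hypothetical sketch, not ViZDoom's parser.
    """
    settings = {} if settings is None else settings
    # Drop comments, then join the lines so a list may span several of them.
    body = " ".join(line.split("#", 1)[0] for line in text.splitlines())
    pattern = re.compile(r"(\w+)\s*(\+?=)\s*(\{[^}]*\}|\S+)")
    for key, op, raw in pattern.findall(body):
        if raw.startswith("{"):
            values = raw.strip("{}").split()
            if op == "+=":
                # Append to whatever was set for this key before.
                values = settings.get(key, []) + values
            settings[key] = values
        else:
            settings[key] = raw   # scalar values are kept as strings
    return settings


base = """
living_reward = -1
available_buttons =
    {
        MOVE_LEFT
        MOVE_RIGHT
    }
available_buttons += { ATTACK }
"""
override = """
# a second file, read later, wins for plain keys
living_reward = 0
available_game_variables = { AMMO2}
"""

settings = parse_config(override, parse_config(base))
print(settings["available_buttons"])   # ['MOVE_LEFT', 'MOVE_RIGHT', 'ATTACK']
print(settings["living_reward"])       # 0
```

Reading the files in the opposite order would leave living_reward at -1, which is the "later entries overwrite earlier ones" behaviour described in this section.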


Performing Actions

make_action

The most basic way to perform an action is to use the make_action method, which takes a list of button states and returns a reward, the result of making the action. Additionally, a second argument can be specified: tics, which tells the environment to perform the same action for tics frames (we call it "frame skipping"). Skipping frames can improve performance because the skipped frames are not rendered.

advance_action and set_action

Unfortunately, the make_action method does not let you interfere with whatever happens during the skipped frames. For more fine-grained control, you can use set_action and advance_action:

...
game.set_action(my_action)
tics = 5
update_state = True # determines whether the state and reward will be updated
render_only = True  # if update_state==False, it determines whether a new frame (image only) will be rendered (can be retrieved using get_game_screen())

# action lasts 5 tics
game.advance_action(tics) 

# doesn't update the state but renders the screen buffer
game.advance_action(1, not update_state, render_only)

# skips one frame and updates the state
game.advance_action(2, update_state, render_only)

...
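
One way to use this extra control is to run a callback after every tic of a repeated action. The sketch below is an assumption-laden illustration: act_with_hook and FakeGame are hypothetical names, the helper relies only on methods named in this tutorial (set_action, advance_action, get_last_reward, is_episode_finished), and it assumes get_last_reward reports the reward of the most recent tic. FakeGame is a stand-in so the snippet runs without the engine.

```python
def act_with_hook(game, action, tics, on_tic):
    """Repeat `action` for up to `tics` tics, calling on_tic(game) after each."""
    game.set_action(action)
    total = 0
    for _ in range(tics):
        game.advance_action(1)          # one tic at a time
        total += game.get_last_reward()
        on_tic(game)
        if game.is_episode_finished():  # stop early if the episode ended
            break
    return total


class FakeGame:
    """Stand-in used only to exercise the helper without the engine."""

    def __init__(self, length):
        self.tic, self.length, self.action = 0, length, None

    def set_action(self, action):
        self.action = action

    def advance_action(self, tics=1):
        self.tic += tics

    def get_last_reward(self):
        return -1  # living reward only

    def is_episode_finished(self):
        return self.tic >= self.length


seen = []
game = FakeGame(length=3)
print(act_with_hook(game, [0, 0, 1], 5, lambda g: seen.append(g.tic)))  # -3
print(seen)  # [1, 2, 3]
```

With a real DoomGame object, the callback could log intermediate game variables, something make_action with a tics argument does not allow.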

Modes

ViZDoom has 4 modes of operation:

PLAYER

The PLAYER mode lets the agent perceive the state and make actions. PLAYER mode is fully synchronous, so the ZDoom engine waits for the agent to call make_action or advance_action.

SPECTATOR

The SPECTATOR mode is intended mainly for apprenticeship learning and allows you (the human) to play the game (using keyboard and mouse) while the agent reads the states of the game and your actions (get_last_action). The SPECTATOR mode is synchronous as well, so the engine waits until the agent gives its permission to continue (advance_action). The SPECTATOR mode is designed to run at 35 fps, so fast computations on the agent's side will not speed the game up. On the other hand, failing to keep up with ~35 fps will result in slow and/or choppy gameplay.

A snippet showing a sample episode in SPECTATOR mode:


game = DoomGame()

#CONFIGURATION
...

game.set_mode(Mode.SPECTATOR)
game.init()

episodes = 10
for i in range(episodes):
    print("Episode #" +str(i+1))

    game.new_episode()
    while not game.is_episode_finished():
        s = game.get_state()
        game.advance_action()
        a = game.get_last_action()
        r = game.get_last_reward()
        ...
...

ASYNC_PLAYER & ASYNC_SPECTATOR

The ASYNC versions of the PLAYER and SPECTATOR modes work asynchronously. This means that the engine runs at ~35 fps and does not wait for the agent to call make_action/advance_action, so being late results in missed tics. On the other hand, advance_action/make_action waits for the engine to process the next frame, so there is no risk of responding to the same state multiple times.
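
The ~35 fps figure translates into a time budget per decision. The following back-of-the-envelope sketch uses only that number from the text; the helper names are hypothetical and the missed-tic count is a rough estimate, not something ViZDoom reports.

```python
TICS_PER_SECOND = 35.0  # engine speed quoted in the text


def tics_to_seconds(tics):
    """Wall-clock duration of `tics` tics at the nominal engine speed."""
    return tics / TICS_PER_SECOND


def missed_tics(decision_seconds):
    """Rough number of whole tics the engine runs while the agent decides."""
    return int(decision_seconds * TICS_PER_SECOND)


print(round(1000 / TICS_PER_SECOND, 1))   # 28.6 (milliseconds per tic)
print(round(tics_to_seconds(300), 2))     # 8.57 (seconds for a 300-tic episode)
print(missed_tics(0.1))                   # 3
```

In other words, an agent in an ASYNC mode that needs 100 ms per decision would act roughly on every fourth tic.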

Custom Scenarios

To create a custom scenario (iwad file), you need to use a dedicated editor. Doom Builder and Slade are the software tools we recommend for this task.

Scenarios (iwad files) contain maps and ACS scripts. For starters, it is a good idea to analyze the sample scenarios, which come with ViZDoom (remember that these are binary files).

Reward

In order to use the rewarding mechanism you need to employ the global variable 0:

global int 0:reward;
...
script 1(void)
{
    ...
    reward += 100.0;
}
...

ViZDoom treats the reward as a fixed-point number, so you need to use decimal points in ACS scripts. For example, 1 is treated as an ordinary integer and 1.0 is a fixed-point number. Using ordinary integer values will most probably result in unexpected behaviour.
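
The difference between 1 and 1.0 is easier to see with numbers. This sketch assumes the usual Doom/ACS 16.16 fixed-point layout, in which 1.0 is stored as the integer 65536; to_fixed and fixed_to_float are hypothetical pure-Python helpers written for illustration, not ViZDoom functions.

```python
FRACUNIT = 1 << 16  # 65536, assuming the usual 16.16 fixed-point layout


def to_fixed(value):
    """Raw integer that a literal such as 100.0 is stored as in an ACS int."""
    return int(round(value * FRACUNIT))


def fixed_to_float(raw):
    """Interpret a raw fixed-point integer as a real number."""
    return raw / float(FRACUNIT)


print(to_fixed(100.0))                   # 6553600
print(fixed_to_float(to_fixed(100.0)))   # 100.0
# `reward += 1` (no decimal point) adds a raw 1, i.e. a tiny reward
print(fixed_to_float(1))                 # 1.52587890625e-05
```

Under this assumption, writing reward += 100 instead of reward += 100.0 would yield a reward of about 0.0015 rather than 100.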

User Variables

GameVariable represents non-visual game data such as ammo, health points or weapon readiness, which can be present in state or can be extracted at any moment. It is also possible to access user variables (USER1, USER2 ... USER32), which correspond to ACS scripts' global variables 1-32.

global int 0:reward;
global int 1:shaping_reward;
global int 2:some_int_value;
...
script 1(void)
{
    ...
    reward += 100.0;
    ...
    shaping_reward += 10.0;
    ...
    some_int_value += 1;
}
...

By default, the USER variables are treated as ordinary integers, so using fixed-point numbers inside the script will produce rubbish output. However, you can turn the rubbish into meaningful data using the doom_fixed_to_double function.

...
rubbish = game.get_game_variable(GameVariable.USER1)
legitimate_integer = game.get_game_variable(GameVariable.USER2)
meaningful_data_as_double = doom_fixed_to_double(rubbish)
...