Coevolutionary Temporal Difference Learning for Othello

by Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec
Abstract:
This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning. The coevolutionary part of the algorithm provides for exploration of the solution space, while the temporal difference learning performs its exploitation by local search. We apply CTDL to the board game of Othello, using a weighted piece counter for representing players’ strategies. The results of an extensive computational experiment demonstrate CTDL’s superiority when compared to coevolution and reinforcement learning alone, particularly when coevolution maintains an archive to provide historical progress. The paper investigates the role of the relative intensity of coevolutionary search and temporal difference search, which turns out to be an essential parameter. The formulation of CTDL also leads to the introduction of a Lamarckian form of coevolution, which we discuss in detail.
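As a rough illustration of the ideas named in the abstract, the sketch below shows a weighted piece counter (one weight per board square), a single temporal difference style weight update, and a loop that interlaces a learning phase with a coevolution phase. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`wpc_value`, `td0_update`, `ctdl`), the `tanh` squashing, the learning rate, and the two phase callbacks are all hypothetical and are not specified in the abstract.

```python
import math
from typing import Callable, List, Sequence

Board = Sequence[int]   # 64 squares: +1 own piece, -1 opponent piece, 0 empty
Weights = List[float]   # one weight per square (the weighted piece counter)


def wpc_value(weights: Weights, board: Board) -> float:
    """Evaluate a board as the weighted sum of its squares.

    The tanh squashing to (-1, 1) is an assumption made for this sketch.
    """
    return math.tanh(sum(w * s for w, s in zip(weights, board)))


def td0_update(weights: Weights, board: Board, target: float,
               alpha: float = 0.01) -> Weights:
    """Return new weights after one gradient step that moves the value of
    `board` toward `target` (e.g. the value of the next state or the final
    game result). `alpha` is an assumed learning rate."""
    v = wpc_value(weights, board)
    delta = target - v              # temporal difference error
    grad_scale = 1.0 - v * v        # derivative of tanh at the current value
    return [w + alpha * delta * grad_scale * s
            for w, s in zip(weights, board)]


def ctdl(population: List[Weights],
         td_phase: Callable[[Weights], Weights],
         coevolution_phase: Callable[[List[Weights]], List[Weights]],
         generations: int) -> List[Weights]:
    """Interlace local search (TD learning) with population-based search.

    Both phases are passed in as callbacks because the abstract does not
    specify them; how many games each phase plays controls the relative
    intensity of the two kinds of search.
    """
    for _ in range(generations):
        # Exploitation: each individual refines its own weights. Writing the
        # learned weights back into the population is what makes the scheme
        # Lamarckian in this sketch.
        population = [td_phase(individual) for individual in population]
        # Exploration: selection and variation based on games between
        # individuals (left abstract here).
        population = coevolution_phase(population)
    return population
```

In a full system, `td_phase` would play self-play games of Othello and apply `td0_update` after each move, while `coevolution_phase` would score individuals by playing them against each other (and, optionally, against an archive) before applying selection and mutation.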
Reference:
Coevolutionary Temporal Difference Learning for Othello (Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec), In IEEE Symposium on Computational Intelligence and Games, 2009.
Bibtex Entry:
@InProceedings{Szubert2009coevolutionary,
  Title                    = {Coevolutionary Temporal Difference Learning for Othello},
  Author                   = {Marcin Szubert and Wojciech Jaśkowski and Krzysztof Krawiec},
  Booktitle                = {IEEE Symposium on Computational Intelligence and Games},
  Year                     = {2009},

  Address                  = {Milano, Italy},
  Pages                    = {104--111},

  Abstract                 = {This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning. The coevolutionary part of the algorithm provides for exploration of the solution space, while the temporal difference learning performs its exploitation by local search. We apply CTDL to the board game of Othello, using a weighted piece counter for representing players' strategies. The results of an extensive computational experiment demonstrate CTDL's superiority when compared to coevolution and reinforcement learning alone, particularly when coevolution maintains an archive to provide historical progress. The paper investigates the role of the relative intensity of coevolutionary search and temporal difference search, which turns out to be an essential parameter. The formulation of CTDL also leads to the introduction of a Lamarckian form of coevolution, which we discuss in detail.},
  Keywords                 = {coevolution, coevolutionary algorithm, reinforcement learning, temporal difference learning, othello, games},
  Url                      = {http://www.cs.put.poznan.pl/wjaskowski/pub/papers/szubert09coevolutionary.pdf}
}
