On Scalability, Generalization, and Hybridization of Coevolutionary Learning: a Case Study for Othello

by Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec
Abstract:
This study investigates different methods of learning to play the game of Othello. The main questions posed concern scalability of algorithms with respect to the search space size and their capability to generalize and produce players that fare well against various opponents. The considered algorithms represent strategies as n-tuple networks, and employ self-play temporal difference learning (TDL), evolutionary and coevolutionary learning, and hybrids thereof. To assess the performance, three different measures are used: score against an a priori given opponent (a fixed heuristic strategy), against opponents trained by other methods (round-robin tournament), and against the top-ranked players from the online Othello League. We demonstrate that although evolutionary-based methods yield players that fare best against a fixed heuristic player, it is the coevolutionary temporal difference learning (CTDL), a hybrid of coevolution and TDL, that generalizes better and proves superior when confronted with a pool of previously unseen opponents. Moreover, CTDL scales well with the size of representation, attaining better results for larger n-tuple networks. By showing that a strategy learned in this way wins against the top entries from the Othello League, we conclude that it is one of the best 1-ply Othello players obtained to date without explicit use of human knowledge.
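The abstract names n-tuple networks and temporal difference learning without showing what they look like in practice. The following is a minimal illustrative sketch, not the authors' implementation. It assumes a board stored as a flat list of 64 cells encoded 0/1/2, a tanh-squashed output, and a plain TD(0)-style weight update. The class name, the cell encoding, the example tuples, and the learning rate are all hypothetical choices made for illustration.

```python
import math

# Assumed cell encoding (hypothetical; the paper may use a different one).
EMPTY, BLACK, WHITE = 0, 1, 2


class NTupleNetwork:
    """Sketch of an n-tuple network: one lookup table per tuple of board cells."""

    def __init__(self, tuples):
        # Each tuple is a sequence of board positions (0..63).
        self.tuples = [tuple(t) for t in tuples]
        # A tuple of length n sees 3**n possible cell patterns.
        self.weights = [[0.0] * (3 ** len(t)) for t in self.tuples]

    def _index(self, board, positions):
        # Read the cells as digits of a base-3 number.
        idx = 0
        for pos in positions:
            idx = idx * 3 + board[pos]
        return idx

    def value(self, board):
        # Sum the table entries selected by the board, then squash to (-1, 1).
        total = sum(
            w[self._index(board, t)] for t, w in zip(self.tuples, self.weights)
        )
        return math.tanh(total)

    def td_update(self, board, target, alpha=0.01):
        # Move the value of `board` toward `target` (for example, the value
        # of the successor state, or the final game outcome).
        v = self.value(board)
        delta = alpha * (target - v) * (1.0 - v * v)  # tanh derivative term
        for t, w in zip(self.tuples, self.weights):
            w[self._index(board, t)] += delta
        return v


# Usage: two hypothetical tuples on an otherwise empty board.
net = NTupleNetwork([(0, 1), (0, 8, 16)])
board = [EMPTY] * 64
board[0], board[1] = BLACK, WHITE
before = net.value(board)
net.td_update(board, target=1.0, alpha=0.1)
after = net.value(board)
```

In self-play TDL the target would come from the next position reached in the game. A coevolutionary hybrid such as CTDL would additionally maintain a population of such weight tables, a step this sketch leaves out.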
Reference:
On Scalability, Generalization, and Hybridization of Coevolutionary Learning: a Case Study for Othello (Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec), In IEEE Transactions on Computational Intelligence and AI in Games, volume 5, 2013.
Bibtex Entry:
@Article{Szubert2013scalability,
  Title                    = {On Scalability, Generalization, and Hybridization of Coevolutionary Learning: a Case Study for Othello},
  Author                   = {Marcin Szubert and Wojciech Ja{\'s}kowski and Krzysztof Krawiec},
  Journal                  = {IEEE Transactions on Computational Intelligence and AI in Games},
  Year                     = {2013},
  Number                   = {3},
  Pages                    = {214--226},
  Volume                   = {5},

  Abstract                 = {This study investigates different methods of learning to play the game of Othello. The main questions posed concern scalability of algorithms with respect to the search space size and their capability to generalize and produce players that fare well against various opponents. The considered algorithms represent strategies as n-tuple networks, and employ self-play temporal difference learning (TDL), evolutionary and coevolutionary learning, and hybrids thereof. To assess the performance, three different measures are used: score against an a priori given opponent (a fixed heuristic strategy), against opponents trained by other methods (round-robin tournament), and against the top-ranked players from the online Othello League. We demonstrate that although evolutionary-based methods yield players that fare best against a fixed heuristic player, it is the coevolutionary temporal difference learning (CTDL), a hybrid of coevolution and TDL, that generalizes better and proves superior when confronted with a pool of previously unseen opponents. Moreover, CTDL scales well with the size of representation, attaining better results for larger n-tuple networks. By showing that a strategy learned in this way wins against the top entries from the Othello League, we conclude that it is one of the best 1-ply Othello players obtained to date without explicit use of human knowledge.},
  Doi                      = {10.1109/TCIAIG.2013.2258919},
  Keywords                 = {Coevolution, N-tuple Systems, Othello, Temporal Difference Learning},
  Url                      = {http://www.cs.put.poznan.pl/mszubert/pub/szubert2013tciaig.pdf}
}
