Jump to content


Photo

Will artificial intelligence develop to the point where it can beat a human in stratego?


  • Please log in to reply
43 replies to this topic

#21 astros

astros

    Stratego TM

  • WC Online Team
  • 787 posts
  • Coat of arms
  • Platinum Colonel

Posted 12 December 2017 - 08:53 PM

No, programs have been designed that consistently beat the best poker players, which requires the ability to bluff. Similarly, an unbeatable Go, a game that is much for complex than chess by several orders of magnitude, has been created. If someone were to invest the resources, a similar program could be made for Stratego.

It is likely that in the near future computers will vastly surpass mankind's abilities. At this point, it would be trivially easy for a machine to beat any person in Stratego.
I'm in love with Stacy's mom.

#22 Nortrom

Nortrom

    General

  • Moderators
  • 2,011 posts
  • Coat of arms
  • Platinum Marshal

Posted 12 December 2017 - 09:33 PM

My take on AI and Stratego : http://nortromstrate...an-players.html


  • TemplateRex likes this
"Rock is overpowered, paper is fine" - scissors

#23 OuweSok

OuweSok

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 495 posts
  • Coat of arms
  • Platinum Spy

Posted 13 December 2017 - 07:11 AM

AI will have zero problems to bluff. Poker AI has brought bluffing to the next level, helped by the fact that in any spot it will have some % of bluffs and some % of good hands and it will put the human in unguessable spots when shoving allin for a very large amount compared to pot size.

The equivalent in Stratego is an unknown piece coming at you and you not having a clue what it is and what the actual plan of the AI is: either saccing it for info or capturing a nearby known or moved piece?

The intuition part will be learned by having good rules about positional values, thousands of setups in memory that will force humans to make setups that are less predictable, something where the computer will have 0 problems with.

The AI will crush humans in any thinking game and Stratego is certainly no exception.
  • TemplateRex likes this

Check out my Stratego Youtube channel ! :o


#24 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 03 January 2018 - 12:45 AM

https://www.alphagomovie.com/ is now live on Netflix worldwide (works in the Netherlands as well). Features the match of AlphaGo against Lee Sedol in March 2016. Few technical details, but worth watching to see how quickly machines can conquer a seemingly indomitable domain.


I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#25 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 26 February 2018 - 10:32 AM

I have been prototyping a Stratego program. One of the things I'd like to compute is perfect information endgame databases. In chess, 10x10 draughts, and 8x8 checkers this has been very beneficial for human analysts as well as for the playing strength of AIs (much more so in checkers and draughts than in chess). Also in Stratego, I think such databases can help people understand endgames better. However, their usefullness will be much smaller. There are two reasons for this:

  1. a much larger board (92 squares vs 64 for chess and 50 for draughts, and only 32 for checkers). This means that normal consumer pc's can't easily compute anything bigger than an endgame with 5 moving pieces and fixed bomb/flag positions. In contrast, in chess all 6 piece (and some 7 piece) endgames have been solved, in 10x10 draughts all 8 piece (some 9 piece) endgames are known, and in 8x8 checkers all 10 piece (some 11 piece) endgames are known.
  2. many more material combinations than in draughts/checkers (2 piece types) or even chess (6 piece types).

Regarding point 1): Given bombs and flags for both sides, there are between B= 78 and 90 open squares to place the moving pieces. For N non-identical pieces, the number of placements are B! / (B - N)! different ways to place them on the board. For N=5, you get between 2 and 5 billion positions, which can be solved in memory on a normal consumer pc.

 

For N identical pieces against 1 opponent piece, the number of placements is equal to B! / ((B-N-1)! * N! * 1!). For N=5, you get between 18 and 53 billion positions. On my pc with 64Gb of memory, this will just fit. So endgames of 5 pieces against 1 scout can be exhaustively solved given a fixed flag/bomb placement. You can treat any 5 non-scout (non-miner) piece as identical because of the principle of rank collapsing.

 

It is also possible to compute some other 6 piece endgames. E.g. lieut + 1 miner vs 2 sarges + 2 miners can just be done if both sides have at least 4 bombs on the board. 

 

My current estimate is that such big 5 or 6 piece endgames will probably take about a week to compute (for any given flag/bomb placement). Maybe in a day if the computation can be done in parallel on multiple cores.

 

Regarding point 2): the 10 different moving piece types give rise to a lot of material combinations. Fortunately, there is the principle of rank collapsing. But even then, the N=2 endgames already have 10 different combinations: 1_1, 1_2, 1_8, 1_9, 1_S, 8_8, 8_9, 8_S, 9_9 and 9_S. Here I denote 1 as the top rank piece, 2 as the second in rank, and 8 as miner, 9 as scout and S as spy. So 1_1 is the endgame of two equal rank pieces without special abilities (no miner, scout or spy). The 1_S endgame is only valid if the top rank piece is a marshal, otherwise this collapses to the 1_2 endgame. 

 

These 10 piece combinations have to be multiplied by the flag/bomb formations. There are 40! / (6!  *1!) = 130 million possible formations for each player, so it is impossible to do this exhaustively. Instead one could place the flag on the back row for both players, and only compute all 5 surrounding squares as either bomb or empty. This gives 2^5 * 10 = 320 possibilities for each player, or 100 thousand combinations in total. The 2 moving pieces then give only 6.6 thousand combinations (82 * 81) so about 660 million positions per endgame. This means that there are about 6.6 billion different 2 piece endgame positions with mutual back row flags to be solved. 

 

I haven't calculated it yet, but maybe also the 3 piece endgames with back row flags can be solved exhaustively. If I'm counting correctly, there are 47 different material combinations (note that due to rank collapsing e.g. 22_8 is equivalent to 11_8, etc.):

11_1, 11_2, 11_8, 11_9, 11_S,

12_1, 12_2, 12_S,

13_2,

18_1, 18_2, 18_8, 18_9, 18_S,

19_1, 19_2, 19_8, 19_9, 19_S,

1S_1, 1S_8, 1S_9, 1S_S,

22_1, 28_1, 29_1, 2S_1

88_1, 88_8, 88_9, 88_S,

89_1, 89_8, 89_9, 89_S,

8S_1, 8S_8, 8S_9, 8S_S

99_1, 99_8, 99_9, 99_S,

9S_1, 9S_8, 9S_9, 9S_S

 

Each of those endgames has about 500 thousand positions per flag/bomb formation, so about 50 billion positions per endgame (fits in memory during the computation, but could be done per flag/bomb formation anyway), or about 2.5 terabytes uncompressed storage on disk. After compression, this should be a lot smaller. This will probably equivalent in size and effort to the 6 piece chess endgames (about half a terabyte of disk storage).

 

In summary: a small but important part of Stratego endgames can be solved. All back row flag/bomb formations with up to 3 moving pieces can be fully solved, and for any particular flag/bomb formation, endgames with up to 5 non-identical or up to 6 partially identical moving pieces (with enough bombs to reduce the number of free squares) can also be solved exhaustively.

 

Positions like the 6 or 7 piece endgames on Nortrom's blog remain out of reach with current computers. However, it is possible to compute all the underlying 5 piece positions that can arise from this endgame. This might greatly simplify the analysis using a simple forward alpha-beta search (only 1 or 2 exchanges before reaching the databases). I'll report back when I have concrete results :)


Edited by TemplateRex, 26 February 2018 - 01:01 PM.

  • Nortrom and Morx like this

I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#26 El cabrón

El cabrón

    New Recruit

  • Members
  • Pip
  • 2 posts
  • Coat of arms
  • Gold Spy

Posted 26 February 2018 - 04:53 PM

Dear developer,

 

This sound nice as a first step, but beside to calculate so much i suggest to focus also on "compacting techniques".

These techniques may be represented as a set of simple functions that covers a lot of cases (game states), and can give a fast and always correct result.

Simple example: chess game 1 king in either side and a one rook. This is always a win position if the player with two pieces begins.

The total state number is quite small (32k?), but it is a good exercise to create such a compacter function (few rules only, but these has to be determined by the AI!!!). 

In this way i suppose you will be able to compact those states by at least 1000 times, and what is much more important, you can extract the proper data from such a database in few ms of time.

In my opinion with good data management a 6 piece situation is in the limit, assuming the usage of a fast PC with a large SSD (1TB is still affordable, in the presence of a large SSD the necessary memory is negligible) in it.

The selection of the correct function has to be performed by a function "pre-selection" module.

 

When such data collection and functions are performed, if we want to handle a given situation, 99% of the cases will not be used, so only a little part of the stored data will be stored in the memory too.

In this way you can give results as: win in 125 steps :)

 

Success!!!



#27 Nortrom

Nortrom

    General

  • Moderators
  • 2,011 posts
  • Coat of arms
  • Platinum Marshal

Posted 26 February 2018 - 07:47 PM

I'm curious as to how your project will unfold. While the posts you made about the theoretical things are quite interesting, they are not so easy to comprehend. I suppose I'll read it again later when fully focused :)

 

What exactly does this 'prototype' you speak of contain?


"Rock is overpowered, paper is fine" - scissors

#28 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 26 February 2018 - 08:17 PM

I'm curious as to how your project will unfold. While the posts you made about the theoretical things are quite interesting, they are not so easy to comprehend. I suppose I'll read it again later when fully focused :)

 

What exactly does this 'prototype' you speak of contain?

 

My previous post was just an warm-up exercise to see how many endgames are within reach for a regular computer. The math of counting how many ways to place pieces on a board is very straightforward (similar to the goold old school problems of drawing colored marbles from bags). Feel free to ask specific questions.

 

In other board games, endgame databases have given a lot of interesting knowledge about optimal play. Things like 1 lieut + miner vs 1 sarge + 1 or 2 miners should be quite practical. I doubt this is played perfectly over the board. If you think there are other more practical endgames for which optimal play is not well known, let me know and I can see if they can be computed.

 

My current stratego program is a rudimentary port of a draughts program that I wrote years ago. It can only play the subset of perfect information stratego, but not a legal game. :)  I used it in one of my first posts in order to solve Napoleon 1er's bomb placement puzzle. That computation was very similar in nature to solving endgame databases. Most of the tools to build them are ready, just haven't found the time to do it properly.

 

To get a proper AI, I am much more interested in developing an AlphaZero approach that will learn from self-play. I think I have a pretty good idea of the datastructures required to build it, but the algorithm side is much more challenging (because of the imperfect information), and the AlphaZero approach is very computational expensive (Deepmind used hundreds if not thousands of computers for Go, chess and Shogi). The endgame databases are just a diversion because I know it will definitely work. Plus it can teach me how to play simple endgames myself  :)


Edited by TemplateRex, 26 February 2018 - 08:21 PM.

I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#29 Fks

Fks

    Major

  • NASF Committee
  • 1,119 posts
  • Coat of arms
  • Platinum Marshal

Posted 26 February 2018 - 08:23 PM

I mean the second A.I gets better then humans there cant be any online tournaments and unless the admins start banning people from using the A.I I't will be pc vs pc for #1 spot.


Proud Member of the North American Stratego Federation (NASF)

http://forum.strateg...18/#entry461226


#30 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 26 February 2018 - 08:28 PM

I mean the second A.I gets better then humans there cant be any online tournaments and unless the admins start banning people from using the A.I I't will be pc vs pc for #1 spot.

 

Stratego is not poker: no money, no reason to cheat. Online chess is not doing too badly, is it?

 

But unless a big company will take up the challenge, I think it will be quite a few years before an AI beats the top humans. I mean, even a very well made program like Probe can be easily beaten by a low silver player like me (I literally played dozens of games against it, to playtest all kinds of setups, and only lost once against it). 


I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#31 Fks

Fks

    Major

  • NASF Committee
  • 1,119 posts
  • Coat of arms
  • Platinum Marshal

Posted 26 February 2018 - 09:01 PM

Stratego is not poker: no money, no reason to cheat. Online chess is not doing too badly, is it?
 
But unless a big company will take up the challenge, I think it will be quite a few years before an AI beats the top humans. I mean, even a very well made program like Probe can be easily beaten by a low silver player like me (I literally played dozens of games against it, to playtest all kinds of setups, and only lost once against it).

Most online games between top players are 1-3 minutes Unless they are known Gms just to prevent using A.I. For chess. People will want to cheat to raise there rating up and win tournaments since you have 15 seconds per move you will always have enough time to use A.I. Personally I don't see being able to beat someone like Hielco or Nortrom for a very long time if ever. With Hielco and Nortrom still being elite players atleast.

Proud Member of the North American Stratego Federation (NASF)

http://forum.strateg...18/#entry461226


#32 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 26 February 2018 - 09:05 PM

Most online games between top players are 1-3 minutes Unless they are known Gms just to prevent using A.I. For chess. People will want to cheat to raise there rating up and win tournaments since you have 15 seconds per move you will always have enough time to use A.I. Personally I don't see being able to beat someone like Hielco or Nortrom for a very long time if ever. With Hielco and Nortrom still being elite players atleast.

 

So change the timings to 3|4 (3 mins + 4 seconds per move, Bronstein). Good luck typing those moves that fast. You might get better return for writing an AI that can track the discovered pieces on the board :)


I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#33 MTinsley

MTinsley

    Sergeant

  • Honorary members
  • PipPipPipPipPipPip
  • 254 posts
  • Coat of arms
  • Platinum Major

Posted 26 February 2018 - 09:27 PM

I believe that one day, AI will be able to beat the top Stratego Players.

Neural Network algorithms have perfected Chess, Shogi, and Go where all the information is available at the start. These are called 'perfect information games'. Also, there's been a lot of progression for AI research in imperfect information games:-
-Limit Texas Hold Em (no limit hold em is currently unsolved) https://en.wikipedia...heus_(poker_bot)
-Scrabble AI is better than most top human players https://en.wikipedia...Maven_(Scrabble)
-OpenAI played various PC games better than most humans https://techcrunch.c...dota-2-players/

Edited by MTinsley, 26 February 2018 - 09:29 PM.


#34 Nortrom

Nortrom

    General

  • Moderators
  • 2,011 posts
  • Coat of arms
  • Platinum Marshal

Posted 26 February 2018 - 09:29 PM

I've seen the OpenAI play dota2, but it was a 1v1 matchup and the actual game is a 5v5 which requires coordination.


  • MTinsley likes this
"Rock is overpowered, paper is fine" - scissors

#35 MTinsley

MTinsley

    Sergeant

  • Honorary members
  • PipPipPipPipPipPip
  • 254 posts
  • Coat of arms
  • Platinum Major

Posted 26 February 2018 - 09:33 PM

I don't play those games myself. I read the news article and it implied it was the complete game. Thanks for clarifying!

It is not unreasonable to assume, with the current developments in research, a top-level Stratego AI is possible. However, Stratego isn't as popular as the other games mentioned in my initial post. While I believe the AI to be feasibly possible, I doubt it we will get to see one for some time.

I'm very interested in seeing what TemplateRex can produce. He seems interested in developing his own program. The current programs available -- Probe and MasterOfTheFlag -- are fairly weak and do not have a consistent plan.

#36 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 26 February 2018 - 09:37 PM

I'm very interested in seeing what TemplateRex can produce. He seems interested in developing his own program. The current programs available -- Probe and MasterOfTheFlag -- are fairly weak and do not have a consistent plan.

 

The best way is probably to start on a much smaller board with fewer pieces. E.g. 8-piece barrage on a 6x6 board with 2 small or 1 big lake. Just to get something off the ground (a bit like the 1vs1 Dota player). 

 

Also the setup part of the game is conceptually different from other games. Learning to do it without human input (i.e. the AlphaZero way) is not entirely clear. The reinforcement learning approach learns how to assign credit to individual moves, and gradually updates the parameter weights. The setup are 40 simultaneous moves for both players. Maybe it will just work, but it might be hard for these learning algorithms to figure out how exactly the setup placement contributed to a win or loss.


Edited by TemplateRex, 26 February 2018 - 09:43 PM.

I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#37 MTinsley

MTinsley

    Sergeant

  • Honorary members
  • PipPipPipPipPipPip
  • 254 posts
  • Coat of arms
  • Platinum Major

Posted 26 February 2018 - 09:46 PM

Perhaps the AI can randomly create setups, but given a few key principles, such as:

-Never enclose moving pieces behind bombs
-Never 'block' one of the 3 lanes with bombs
-Flag should be on the bottom row.
-Marshals, Generals, and Colonels should not be placed on the opening 6 squares
-And so on

Edited by MTinsley, 26 February 2018 - 09:46 PM.


#38 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 26 February 2018 - 09:56 PM

Perhaps the AI can randomly create setups, but given a few key principles, such as:

-Never enclose moving pieces behind bombs
-Never 'block' one of the 3 lanes with bombs
-Flag should be on the bottom row.
-Marshals, Generals, and Colonels should not be placed on the opening 6 squares
-And so on

 

All eminently reasonable principles, but the AlphaZero approach would consider it cheating :) More seriously, they tried to add human knowledge (still automated though). They fitted initial parameters to human grandmaster games, then updated with self-play learning. It turns out that learning everything from scratch eventually surpassed their initial efforts. All constraints you put on setups should be discoverable through self-play, and if not, they are probably a human weakness (at least this is what AlphaGo experienced).


I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#39 TemplateRex

TemplateRex

    Sergeant

  • Members
  • PipPipPipPipPipPip
  • 453 posts
  • Coat of arms
  • Silver Captain

Posted 11 June 2018 - 07:38 PM



I'm curious as to how your project will unfold. While the posts you made about the theoretical things are quite interesting, they are not so easy to comprehend. I suppose I'll read it again later when fully focused :)

 

What exactly does this 'prototype' you speak of contain?

 

Baby steps: a somewhat primitive small cursor-based front-end (C++ based, interfaces with my analysis engine that I am developing. No ETA, hopefully more to come during this summer, with some interesting analyses):

 

d6DFoZh.jpg


  • Nortrom likes this

I hereby grant explicit permission to all my opponents to record and publish my games as they see fit.


#40 Wogomite

Wogomite

    Lieutenant

  • NASF Committee
  • 717 posts
  • Coat of arms
  • Gold Lieutenant

Posted 11 June 2018 - 08:01 PM

This is my question to the 'know hows', is it possible to teach an ai to respond to known knowledge of his opponent? If so then the combinations of information a computer could learn could indeed beat the best of humans if it was designed by the best players. I'm sure it would take an atrocious amount of time to create but it could be done. Imagine an ai Nortrom that never suffered any type of mental fatigue or over stimulation. Even more amazing, think of the code? If Nortrom made this, the code would hold his deepest Stratego secrets. A genuine chest of Stratego treasure.




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users