Monte-Carlo CFR

Paper link.

MCCFR avoids traversing the entire game tree on each iteration by sampling only part of it, while keeping the immediate counterfactual regrets unchanged in expectation.
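The "unchanged in expectation" property comes from importance weighting: a sampled value is divided by the probability with which it was sampled. The following is a minimal sketch of that idea on a flat list of outcomes rather than a real game tree. The names `outcomes`, `sample_probs`, and the three functions are illustrative assumptions, not taken from the paper.

```python
import random

def full_value(outcomes):
    """Exact value: sum of probability * payoff over every outcome."""
    return sum(p * u for p, u in outcomes)

def sampled_value(outcomes, sample_probs, rng):
    """One-sample estimate: draw outcome k with probability q_k and
    return p_k * u_k / q_k (the importance-weighted contribution)."""
    k = rng.choices(range(len(outcomes)), weights=sample_probs, k=1)[0]
    p, u = outcomes[k]
    return p * u / sample_probs[k]

def expected_sampled_value(outcomes, sample_probs):
    """Exact expectation of the one-sample estimator, by enumeration."""
    return sum(q * (p * u / q) for (p, u), q in zip(outcomes, sample_probs))

# Hypothetical data: (true probability, payoff) pairs and an arbitrary
# sampling distribution that differs from the true probabilities.
outcomes = [(0.5, 2.0), (0.3, -1.0), (0.2, 4.0)]
sample_probs = [0.2, 0.3, 0.5]

rng = random.Random(0)
estimates = [sampled_value(outcomes, sample_probs, rng) for _ in range(10000)]
print(full_value(outcomes), sum(estimates) / len(estimates))
```

Any sampling distribution that gives every outcome a nonzero probability leaves the expectation unchanged; the choice only affects the variance of the estimate.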

Ways to Sample:

  • Chance Sampling (CS)
    • Samples a single outcome at each chance node (in poker, the deal at the root of the tree) instead of enumerating every chance outcome.
  • External Sampling (ES)
    • Samples only the actions of the opponent and of chance; every action of the player being updated is still explored.
    • The opponent's actions are therefore sampled according to how likely the opponent is to play them, which is sensible: regret values tied to frequently played lines are updated more often.
  • Outcome Sampling (OS)
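    • Samples a single action at every node (both players and chance), so each iteration follows one path down the tree to a single terminal outcome.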
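The three schemes differ only in which nodes are sampled and which are fully enumerated. The sketch below makes that explicit. The function name `actions_to_explore`, the `CHANCE` marker, and the scheme labels are illustrative assumptions. It also simplifies Outcome Sampling by drawing from the supplied probabilities directly, whereas practical implementations typically mix in some exploration at the updating player's nodes.

```python
import random

CHANCE = "chance"  # hypothetical marker for chance nodes

def actions_to_explore(scheme, owner, traverser, actions, probs, rng):
    """Return the actions a traversal recurses into at one node.

    scheme    : "CS", "ES", or "OS"
    owner     : who acts at this node (CHANCE or a player id)
    traverser : the player whose regrets are being updated
    probs     : probabilities used to sample at this node
    """
    if scheme == "CS":
        sample = owner == CHANCE      # sample chance, enumerate players
    elif scheme == "ES":
        sample = owner != traverser   # sample chance and the opponent
    elif scheme == "OS":
        sample = True                 # sample at every node
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    if sample:
        return rng.choices(actions, weights=probs, k=1)
    return list(actions)

rng = random.Random(0)
acts, probs = ["fold", "call", "raise"], [0.2, 0.5, 0.3]
for scheme in ("CS", "ES", "OS"):
    counts = {
        owner: len(actions_to_explore(scheme, owner, 0, acts, probs, rng))
        for owner in (CHANCE, 0, 1)
    }
    print(scheme, counts)
```

With player 0 as the traverser, CS enumerates all three actions at both players' nodes, ES enumerates them only at player 0's nodes, and OS explores a single action everywhere.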

Average Strategy Sampling (AS) selects actions for player i according to the cumulative profile and three predefined parameters.
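As a rough sketch of how the cumulative profile and the three parameters could combine, the code below assumes the per-action rule usually attributed to the AS paper: each action is sampled independently with probability max(ε, (β + τ·s(a)) / (β + Σ s(b))), capped at 1, where s is the cumulative strategy at the information set. The parameter names `epsilon`, `beta`, `tau` and the function names are assumptions of this sketch, since the notes above do not name them.

```python
import random

def as_sample_probs(cum_strategy, epsilon, beta, tau):
    """Per-action sampling probabilities under the assumed AS rule.

    cum_strategy : cumulative strategy weights s(a) at one information set
    epsilon      : floor so every action keeps some chance of being sampled
    beta, tau    : assumed to be positive; they control how strongly the
                   cumulative profile drives the sampling
    """
    total = sum(cum_strategy)
    return [
        min(1.0, max(epsilon, (beta + tau * s) / (beta + total)))
        for s in cum_strategy
    ]

def as_sample_actions(actions, cum_strategy, epsilon, beta, tau, rng):
    """Keep each action independently with its AS sampling probability."""
    probs = as_sample_probs(cum_strategy, epsilon, beta, tau)
    return [a for a, p in zip(actions, probs) if rng.random() < p]

# Hypothetical values chosen only to make the three cases visible.
probs = as_sample_probs([0.0, 10.0, 90.0], epsilon=0.05, beta=10.0, tau=2.0)
print(probs)
print(as_sample_actions(["a", "b", "c"], [0.0, 10.0, 90.0], 0.05, 10.0, 2.0,
                        random.Random(0)))
```

Under this rule, actions that carry most of the cumulative strategy are almost always explored, while rarely played actions are still visited at least with probability ε.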