Multi-Agent System

Some approaches:

Convention

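The controlled agent (the player) is named ego. Each opponent is named opp, numbered opp1, opp2, and so on when there are several. In arrays passed to the environment, the ego occupies the first row and the opponents follow.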

Example:

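import numpy as np

# env is assumed to be an already-constructed multi-agent environment.
# reset: one row per agent, ego first, holding that agent's [x, y].
_, _, _, _ = env.reset(np.array([[ego_x, ego_y],
                                 [opp1_x, opp1_y],
                                 [opp2_x, opp2_y]]))
# step: one row per agent, same order, holding that agent's [steer, speed].
_, _, _, _ = env.step(np.array([[ego_steer, ego_speed],
                                [opp1_steer, opp1_speed],
                                [opp2_steer, opp2_speed]]))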
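
The row ordering used in the example can be made explicit with a pair of small helpers. This is a minimal sketch, not part of the environment's API: stack_agents, split_agents and EGO_ROW are hypothetical names, and the numeric values are placeholders. It assumes only what the example shows, namely that each agent contributes one row and the ego comes first.

```python
import numpy as np

EGO_ROW = 0  # assumption: the ego agent occupies the first row, as in the example


def stack_agents(ego, opponents):
    """Stack per-agent vectors into one (num_agents, dim) array, ego first."""
    rows = [np.asarray(ego, dtype=float)]
    rows += [np.asarray(o, dtype=float) for o in opponents]
    return np.stack(rows)


def split_agents(batch):
    """Return (ego_row, opponent_rows) from a (num_agents, dim) array."""
    batch = np.asarray(batch)
    return batch[EGO_ROW], batch[EGO_ROW + 1:]


# Placeholder values for illustration only.
poses = stack_agents([0.0, 0.0], [[2.0, 0.5], [4.0, -0.5]])      # [x, y] per agent
actions = stack_agents([0.1, 3.0], [[0.0, 2.5], [-0.1, 2.0]])    # [steer, speed] per agent
ego_action, opp_actions = split_agents(actions)
```

The resulting poses and actions arrays have the same layout as the arguments passed to env.reset and env.step in the example, so they could be passed in their place. Keeping the ego index in one constant avoids scattering the ordering assumption across the code.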