Parameter Study
You can conduct a parameter study to compare various bandit algorithms. Plot each algorithm's performance as a function of its own parameter, with all parameters shown on a single shared scale on the x-axis. The y-axis is the average reward over 1000 time steps.
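As a rough illustration, the sketch below runs such a study for one algorithm. The notes do not name an algorithm or a testbed, so everything here is an assumption: it uses epsilon-greedy with sample-average estimates on a 10-armed bandit with Gaussian rewards, and the function names `run_epsilon_greedy` and `parameter_study` are hypothetical. It returns one average-reward value per parameter setting, which is what would go on the y-axis.

```python
import numpy as np


def run_epsilon_greedy(epsilon, n_arms=10, n_steps=1000, rng=None):
    """Return the mean reward of one epsilon-greedy run on a random bandit."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumed testbed: true action values and rewards are unit-variance Gaussians.
    true_values = rng.normal(0.0, 1.0, n_arms)
    estimates = np.zeros(n_arms)          # sample-average value estimates
    counts = np.zeros(n_arms, dtype=int)  # times each arm was pulled
    total = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:
            action = int(rng.integers(n_arms))   # explore: uniform random arm
        else:
            action = int(np.argmax(estimates))   # exploit: ties go to lowest index
        reward = rng.normal(true_values[action], 1.0)
        counts[action] += 1
        # Incremental update of the sample average for the chosen arm.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total += reward
    return total / n_steps


def parameter_study(epsilons, n_runs=50, n_steps=1000, seed=0):
    """Map each epsilon to its average reward over n_steps, averaged over n_runs."""
    rng = np.random.default_rng(seed)
    return {
        eps: float(np.mean([run_epsilon_greedy(eps, n_steps=n_steps, rng=rng)
                            for _ in range(n_runs)]))
        for eps in epsilons
    }


if __name__ == "__main__":
    epsilons = [2.0 ** k for k in range(-7, -1)]  # 1/128 up to 1/4
    for eps, avg in parameter_study(epsilons).items():
        print(f"epsilon={eps:.4f}  average reward={avg:.3f}")
```

Each printed pair is one point on the study curve for this algorithm. Repeating the sweep for other algorithms, each over its own parameter, gives the remaining curves; more runs per setting would smooth them at the cost of run time.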
Comparison plot. x-axis: number of steps; y-axis: cumulative reward.