News
Rajeev Agrawal, Sample Mean Based Index Policies with O(log n) Regret for the Multi-Armed Bandit Problem, Advances in Applied Probability, Vol. 27, No. 4 (Dec., 1995), pp. 1054-1078 ...