作者: zohar karnin , Kevin G. Jamieson , Julian Katz-Samuels , Lalit Jain
DOI:
关键词:
摘要: This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in fixed confidence and budget settings. Leveraging ideas from theory of suprema empirical processes, we provide an algorithm whose sample complexity scales with geometry instance avoids explicit union bound over number arms. Unlike previous approaches which based on minimizing a worst-case variance (e.g. G-optimal design), define experimental design objective Gaussian-width underlying arm set. We novel lower terms this that highlights its fundamental role complexity. The our matches bound, addition is computationally efficient combinatorial classes, e.g. shortest-path, matchings matroids, where sets can be exponentially large dimension. Finally, propose first bandits setting. Its guarantee up to logarithmic factors.