DOI:
Keywords:
Abstract: Consider a learner who wants to dynamically collect observations so as to improve their information about an underlying phenomenon of interest. The learner needs to collect the observations quickly while also accounting for the penalty of a wrong declaration. This is a sequential learning problem in which the learner relies on their current information state to adaptively select the "most informative" action from the available action set. In this thesis we build a connection to the sequential design of experiments originally proposed by Chernoff (1959) and develop new approaches in active regression. Let us first establish some basic notation and a problem statement. Consider a sequential learning problem in which the learner may select one of n possible measurement actions at each step. A sample resulting from measurement action i is a realization of a sub-Gaussian random variable with mean µ_i(θ∗), that is, an observation of the form µ_i(θ∗) + ϵ, where the mean µ_i(θ) is a known function parameterized by θ ∈ Θ and ϵ is a zero-mean random variable. The specific θ∗ ∈ Θ that governs the observations is not known. Each measurement action may be performed multiple times, resulting in i.i.d. observations of this form, and all measurements (from different actions) are also statistically independent. This thesis considers the problem of sequentially and adaptively choosing measurement actions to accomplish one or more of the following goals. Active Testing: Θ is finite and the goal is to correctly determine the true hypothesis θ∗. Active Regression: Θ is a compact (uncountable) space and the goal is to accurately estimate θ∗. Structured Bandit: The goal is to identify measurement action(s) that …
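As an illustration of the observation model stated in the abstract, the following is a minimal sketch and not the thesis's own code. It assumes a scalar parameter, three hypothetical linear mean functions, and Gaussian noise (one particular sub-Gaussian choice); the names make_sampler, mean_fns, and noise_std are introduced here for illustration only.

```python
import random


def make_sampler(mean_fns, theta_star, noise_std=1.0, seed=0):
    """Return sample(i), which draws one observation from measurement action i.

    mean_fns[i] plays the role of the known mean function mu_i(theta);
    theta_star is the parameter that is unknown to the learner.
    """
    rng = random.Random(seed)

    def sample(i):
        # Observation = mu_i(theta_star) + zero-mean noise.
        # Gaussian noise is an assumption; it is one sub-Gaussian example.
        return mean_fns[i](theta_star) + rng.gauss(0.0, noise_std)

    return sample


# Hypothetical setup: n = 3 actions with linear mean functions of a scalar theta.
mean_fns = [lambda t: t, lambda t: 2.0 * t, lambda t: -t]
sample = make_sampler(mean_fns, theta_star=0.5)

# Repeating one action yields i.i.d. draws whose average approaches mu_1(theta_star).
draws = [sample(1) for _ in range(2000)]
empirical_mean = sum(draws) / len(draws)
```

Repeating action 1 gives an empirical mean close to µ_1(θ∗) = 1.0 in this setup. An adaptive learner would choose which index to pass to sample at each step based on the observations collected so far.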