Authors: Sanghack Lee, Elias Bareinboim
DOI: 10.1609/AAAI.V33I01.33014164
Keywords:
Abstract: Causal knowledge is sought after throughout data-driven fields due to its explanatory power and its potential value to inform decision-making. If the targeted system is well understood in terms of its causal components, one is able to design more precise and surgical interventions so as to bring about certain desired outcomes. The idea of leveraging a causal understanding of a system to improve decision-making has been studied in the literature under the rubric of structural causal bandits (Lee and Bareinboim, 2018). In this setting, (1) pulling an arm corresponds to performing a causal intervention on a set of variables, while (2) the associated rewards are governed by the underlying causal mechanisms. One key assumption of this work is that any observed variable (X) in the system is manipulable, which means that intervening and making do(X = x) is always realizable. In many real-world scenarios, however, this is too stringent a requirement. For instance, while scientific evidence may support that obesity shortens life, it is not feasible to manipulate obesity directly, but only indirectly, for example, by decreasing the amount of soda consumption (Pearl). In this paper, we study a relaxed version of the structural causal bandit problem in which not all variables are manipulable. Specifically, we develop a procedure that takes as argument partially specified causal knowledge and identifies the possibly-optimal arms in structural causal bandits with non-manipulable variables. We further introduce an algorithm that uncovers non-trivial dependence structure among the possibly-optimal arms. Finally, we corroborate our findings with simulations, which show that MAB solvers enhanced with causal knowledge and the newly discovered dependence structure among arms consistently outperform causal-insensitive solvers.
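
To make the setting concrete, the following is a minimal sketch of how the arm set shrinks once some variables are declared non-manipulable. It only enumerates arms, where each arm is an intervention do(X = x) on a subset of variables and the empty intervention counts as an arm; it is not the paper's procedure for identifying possibly-optimal arms. The function name `enumerate_arms`, the variable names, and the binary domains are assumptions made for illustration.

```python
from itertools import combinations, product


def enumerate_arms(domains, non_manipulable=frozenset()):
    """List every arm as a dict {variable: value}; {} means no intervention."""
    # Only manipulable variables may appear inside do(...).
    manipulable = sorted(v for v in domains if v not in non_manipulable)
    arms = []
    for size in range(len(manipulable) + 1):
        for subset in combinations(manipulable, size):
            # One arm per joint assignment of values to the chosen subset.
            for values in product(*(domains[v] for v in subset)):
                arms.append(dict(zip(subset, values)))
    return arms


# Hypothetical toy system: two binary variables besides the reward.
domains = {"soda": (0, 1), "obesity": (0, 1)}

all_arms = enumerate_arms(domains)
feasible_arms = enumerate_arms(domains, non_manipulable={"obesity"})

print(len(all_arms))   # 9
print(feasible_arms)   # [{}, {'soda': 0}, {'soda': 1}]
```

In this toy case, treating obesity as non-manipulable reduces nine candidate arms to three, which is the space a bandit solver would then explore.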