Authors: Chaeryon Kang, Holly Janes, Ying Huang
DOI: 10.1111/BIOM.12192
Keywords: Statistics, Inverse probability, Regression analysis, Mathematical optimization, Mathematics, AdaBoost, Estimator, Weighting, Maximization, BrownBoost, Decision boundary
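Abstract: We thank co-editor Jeremy M. G. Taylor for organizing this discussion and the discussants for their insightful comments and suggestions. In this rejoinder, we will address broad points made by the individual discussants and draw connections between them.

We agree with Laber, Tsiatis, Davidian, and Holloway (hereafter LTDH) that a taxonomy of the methodology for deriving treatment rules is useful for discussing the relative merits of the various approaches. LTDH classified statistical approaches to find marker-based treatment rules into two classes based on the estimation method: “regression-based methods” obtain a treatment rule by first modeling the outcome using a regression model; “policy search methods” directly maximize a criterion of interest, for example, the expected outcome under marker-based treatment, in order to derive a treatment rule. Our boosting approach was characterized as a regression-based approach, whereas outcome weighted learning (OWL, Zhao et al. (2012)), direct maximization of the inverse probability weighted estimator (IPWE) or augmented inverse probability weighted estimator (AIPWE) (Zhang et al. (2012a, b)), and modeling marker-by-treatment interactions through Q- and A-learning (for example, Murphy (2003); (2009)) were characterized as policy search methods.

We prefer to group the methods using somewhat different labels. We distinguish methods that yield only a treatment rule from, in contrast, “outcome prediction methods” that yield a model for the outcome given marker and treatment, which can then be used to derive a treatment rule. Using this terminology, our boosting approach, OWL, and A-learning are all examples of the first kind of method: they yield only treatment rules and do not produce a model for the outcome. These methods differ in whether they are “direct,” maximizing the criterion of interest such as the expected outcome under marker-based treatment; or “indirect,” maximizing a criterion different from, but presumably related to, the criterion of interest. Our boosting method is an indirect approach. The approach proposed by Tian, which minimizes the rate at which subjects are misclassified according to treatment benefit (using a surrogate variable for the unobserved outcome), is also an indirect method. This grouping is helpful, we believe, in that it makes plain the fact that the methods mentioned in the article and discussion, except that suggested by Yu and Li (hereafter YL), yield only treatment rules and are designed to be robust to model misspecification. They are therefore limited in that they are suitable only for addressing the problem of identifying a treatment rule, not the more difficult task of predicting the outcome value given marker and treatment assignment.

Several discussants proposed novel approaches that use boosting ideas. Several rely on the problem reformulated as a classification problem with weights that are functions of the outcome (Zhao et al. (2012); Zhang et al. (2012a, b)).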
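Using this formulation, Zhao and Kosorok (hereafter ZK) and Tian proposed solving an approximation to the problem and applied AdaBoost to improve weak classifiers, while “value boosting” is more general and works from the AIPWE. All of these approaches have appeal and deserve in-depth investigation.

Both YL and other discussants raised questions about our strategy of upweighting subjects with small estimated treatment effects, that is, those near the decision boundary, who are more likely to be incorrectly classified with respect to treatment benefit. This raises an interesting and fundamental question: should subjects who lie close to the decision boundary have more influence on the classifier? Or should subjects far from the boundary, for whom incorrect treatment recommendations have greater impact, have more influence? Many traditional classification methods have focused on subjects who are difficult to classify, such as support vector machines and AdaBoost. On the other hand, more recently developed methods such as BrownBoost (Freund, 2001) focus on subjects whose class labels are consistently correct across iterations, and give up on “noisy subjects” whose labels are consistently incorrect. We agree that, in the treatment selection context, upweighting subjects with large treatment effects is worth further investigation. We suspect that the optimal weighting will depend on the particular setting, affected by factors such as the distribution of the markers and their associations with the treatment effect.

We agree with the point made by ZK that the performance of our boosting approach depends on the choice of working model. In practice, prior biological knowledge and cross-validation techniques can be useful for guiding this choice. There appears to be similar sensitivity of OWL to the choice of “kernel” parameterizing the decision boundary. Comparing the finite sample performance of the methods to one another in simulations is challenging, particularly under model misspecification, as each method requires specification of a different set of inputs.

One simple question is how to extend our boosting approach, described for binary outcomes in the article, to other types of outcomes such as continuous or count outcomes. The approach extends naturally, as illustrated in Table 1. Let D ∈ ℝ¹ be an outcome for which smaller values are preferable, T the treatment assignment (T = 0/1, where T = 1 is the default), and Y ∈ ℝᵖ the markers. Denote by Δ(Y) = E(D | T = 0, Y) − E(D | T = 1, Y) the marker-specific treatment effect, by ϕ(Y) = 1{Δ(Y) ≤ 0} the marker-based treatment rule, and by

θ{ϕ(Y)} = E(D | T = 1) − [E{D | T = 1, ϕ(Y) = 0} P{ϕ(Y) = 0} + E{D | T = 0, ϕ(Y) = 1} P{ϕ(Y) = 1}]

the primary measure of its performance. The boosting approach has slightly higher θ than classical estimation, and the two also differ in misclassification. We note, however, that further investigation is needed to specify reasonable ranges for the tuning parameters in this setting.

Table 1. Results of the simulation study in this setting. Marker combinations were obtained by linear maximum likelihood (Linear MLE) described ...

We conclude with two observations that emerge from this discussion.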
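First, there is much to be gained by bringing researchers from different areas of statistics and biostatistics together around a single topic. This discussion highlights connections between the fields of adaptive treatment regimes and risk prediction and biomarker evaluation. Undoubtedly, it has brought relevant work in one field to the attention of another. With this, the science will surely improve. Second, there is tremendous value in reproducible research. We applaud the journal for encouraging us to publish our code along with the article. With the code, the discussants were able to efficiently compare alternative approaches to ours under the same scenarios, thus expediting the scientific process.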
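
The performance measure θ{ϕ(Y)} defined earlier for a general outcome D can be estimated by replacing each expectation and probability with its sample analogue. The following is a minimal sketch of that plug-in calculation, not the code published with the article. It assumes a randomized trial, so that subgroup means estimate the conditional expectations, and the function name `estimate_theta` and its arguments are hypothetical.

```python
import numpy as np


def estimate_theta(D, T, phi):
    """Plug-in estimate of theta{phi(Y)} from trial data (illustrative sketch).

    D   : outcomes, smaller values preferable
    T   : treatment assignment, 1 = default treatment, 0 = alternative
    phi : rule indicator 1{Delta(Y) <= 0}; phi = 1 means the rule assigns T = 0
    """
    D = np.asarray(D, dtype=float)
    T = np.asarray(T, dtype=int)
    phi = np.asarray(phi, dtype=int)

    def cell_term(t, rule):
        # E{D | T = t, phi = rule} * P{phi = rule}, using sample means.
        in_rule = phi == rule
        p_rule = in_rule.mean()
        if p_rule == 0.0:
            return 0.0  # nobody receives this recommendation
        cell = in_rule & (T == t)
        if not cell.any():
            raise ValueError("no subjects with T=%d and phi=%d" % (t, rule))
        return D[cell].mean() * p_rule

    # Expected outcome when everyone receives the default treatment.
    default_mean = D[T == 1].mean()
    # Expected outcome when treatment follows the marker-based rule.
    rule_mean = cell_term(1, 0) + cell_term(0, 1)
    return default_mean - rule_mean
```

A positive value indicates that the marker-based rule lowers the expected outcome relative to treating everyone with the default. In practice ϕ would come from an estimated Δ(Y), and evaluating θ on the same data used to fit the rule would be optimistic, so a held-out sample or cross-validation would be a reasonable choice.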