Abstract: Many machine learning systems deployed for real-world applications, such as recommender systems, image captioning, and object detection, are ensembles of multiple models. The top-ranked systems in many data-mining and computer-vision competitions are also ensembles. Although ensembles are popular, they are opaque and hard to interpret. Explanations make AI systems more transparent and justify their predictions; however, there has been little work on generating explanations for ensembles. In this chapter, we propose two new methods for ensembling visual explanations for VQA using the localization maps of the component systems. Our approach is scalable in the number of component models in the ensemble. Evaluating explanations is itself a challenging research problem. We introduce two new approaches to evaluate explanations—the comparison metric and the uncovering …