Authors: Shuo Chang, Peng Dai, Jilin Chen, Ed H. Chi
Keywords:
Abstract: Online search and item recommendation systems are often based on being able to correctly label items with topical keywords. Typically, topic labelers analyze the main text associated with the item, but social media posts are often multimedia in nature and contain contents beyond the main text. Topic labeling for social media posts is therefore an important open problem for supporting effective search and recommendation. In this work, we present a novel solution to this problem for Google+ posts, in which we integrated a number of different entity extractors and annotators, each responsible for a part of the post (e.g. text body, embedded picture, video, or web link). To account for the varying quality of the annotator outputs, we first utilized crowdsourcing to measure the accuracy of the individual annotators, and then used supervised machine learning to combine the annotators based on their relative accuracy. Evaluating using a ground truth data set, we found that our approach substantially outperforms topic labels obtained from the main text, as well as naive combinations of the individual annotators. By accurately applying topic labels according to their relevance to social media posts, the results enable better search and item recommendation.
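The two-stage idea in the abstract (measure each annotator's accuracy from crowd judgments, then combine annotators according to that accuracy) can be illustrated with a minimal Python sketch. This is not the authors' model: the abstract does not say which supervised learner was used, so a simple accuracy-weighted log-odds vote stands in for it here. The function names, the annotator names (`text`, `image`, `link`), the smoothing, and all data below are hypothetical.

```python
import math
from collections import defaultdict


def estimate_accuracy(judgments):
    """Estimate per-annotator accuracy from crowd judgments.

    judgments: iterable of (annotator_name, is_correct) pairs, where
    is_correct is the crowd's verdict on one label from that annotator.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for annotator, is_correct in judgments:
        total[annotator] += 1
        if is_correct:
            correct[annotator] += 1
    # Laplace smoothing keeps every estimate strictly between 0 and 1,
    # so the log-odds weight below is always finite.
    return {a: (correct[a] + 1) / (total[a] + 2) for a in total}


def combine_labels(annotations, accuracy, threshold=0.0):
    """Score each candidate label by the summed log-odds of its proposers.

    annotations: dict mapping annotator_name -> set of labels for one post.
    Returns labels whose score exceeds the threshold, best first.
    """
    scores = defaultdict(float)
    for annotator, labels in annotations.items():
        p = accuracy[annotator]
        weight = math.log(p / (1 - p))  # negative for annotators below 50%
        for label in labels:
            scores[label] += weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, score in ranked if score > threshold]


# Hypothetical crowd judgments: text 8/8 correct, image 2/8, link 6/8.
demo_judgments = (
    [("text", True)] * 8
    + [("image", True)] * 2 + [("image", False)] * 6
    + [("link", True)] * 6 + [("link", False)] * 2
)
# Hypothetical annotator outputs for a single post.
demo_post = {
    "text": {"cooking"},
    "image": {"cooking", "cats"},
    "link": {"recipes"},
}

acc = estimate_accuracy(demo_judgments)
labels = combine_labels(demo_post, acc)
print(labels)  # ['cooking', 'recipes']
```

In this toy run the unreliable image annotator lowers the score of "cooking" slightly and cannot carry "cats" on its own, which is the behavior a naive union of annotator outputs would miss. A learned combiner, as described in the abstract, would fit such weights from the ground truth data rather than deriving them from a fixed formula.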