Authors: Carolyn Penstein Rosé, Elijah Mayfield, David Adamson
DOI:
Keywords:
Abstract: Automated annotation of social behavior in conversation is necessary for large-scale analysis of real-world conversational data. Important behavioral categories, though, are often sparse and appear only in specific subsections of a conversation. This makes supervised machine learning difficult, through a combination of noisy features and unbalanced class distributions. We propose within-instance content selection, using cues to selectively suppress sections of text, biasing the remaining representation towards minority classes. We show the effectiveness of this technique for automated annotation of empowerment language in online support group chatrooms. Our technique is significantly more accurate than multiple baselines, especially when prioritizing high precision.