Authors: Cassio V.S. Prazeres, Maria da Graca C. Pimentel, Cesar A.C. Teixeira
DOI: 10.1109/ICSC.2007.54
Keywords:
Abstract: One of the ultimate goals of natural language processing (NLP) systems is understanding the meaning of what is being transmitted, irrespective of the medium (e.g., written versus spoken) or the form (e.g., static documents versus dynamic dialogues). Although much work has been done in traditional domains such as speech and text, little has yet been done in the newer communication domains enabled by the Internet, e.g., online chat and instant messaging. This is in part due to the fact that there are no annotated corpora available to the broader research community. The purpose of this work is to build a corpus, tagged with lexical (token part-of-speech labels), syntactic (post parse tree), and discourse (classification) information. Such a corpus can then be used to develop more complex, statistical-based NLP applications that perform tasks such as author profiling, entity identification, and social network analysis.