Comparing Corpus-based to Web-based Lookup Techniques for Automatic English Inclusion Detection

Author: Beatrice Alex

Keywords: Web application; Information retrieval; Corpus based; Data set; Computer science; Inclusion (education); German

Abstract: The influence of English as a global language continues to grow to an extent that its words and expressions permeate the original forms of other languages. This paper evaluates a modular Web-based sub-component of an existing English inclusion classifier and compares it to a corpus-based lookup technique. Both approaches are evaluated on a German gold standard data set. It is demonstrated to what extent the Web-based approach benefits from the amount of data available online and the fact that this data is constantly updated.
