Abstract: We present a novel method to detect parallel fragments within noisy corpora. Isolating these fragments from the noisy data in which they are contained frees us from noisy alignments and stray links that can severely constrain translation-rule extraction. We do this with existing machinery, making use of an existing word alignment model for this task. We evaluate the quality and utility of the extracted data on large-scale Chinese-English and Arabic-English translation tasks and show significant improvements over a state-of-the-art baseline.