Publication - An automatic method for revising ill-formed sentences based on N-grams

An automatic method for revising ill-formed sentences based on N-grams

Research area:

Type: Article in proceedings

 

Year: 2006
Authors: Theologos Athanaselis; Stylianos Bakamidis; Ioannis Dologlou
Book title: Proceedings of the 3rd International Conference on Speech Prosody
Pages: 370-373
Address: Dresden, Germany
Abstract:
A good indicator of whether a person really knows the context of a language is the ability to use the appropriate words in the correct order in a sentence. "Scrambled" words produce meaningless and ill-formed sentences. Since the language model is extracted from a large text corpus, it encodes the local dependencies of words. Word order errors usually violate syntactic rules locally, and therefore N-grams can be used to fix ill-formed sentences. This paper presents an approach for repairing word order errors in text by reordering the words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of this method lies in the use of an efficient confusion matrix technique for reordering the words. The comparative advantage of this method is that it works with a large set of words and avoids the laborious and costly process of collecting word order errors to create error patterns.
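
The selection criterion described in the abstract (choose the reordering with the most trigram hits) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it enumerates all permutations by brute force instead of using the paper's confusion matrix technique, and it stands in for the language model with a set of trigrams collected from a toy corpus. The function names (trigrams, build_trigram_set, count_hits, repair_word_order) are hypothetical and chosen only for this illustration.

```python
from itertools import permutations


def trigrams(words):
    """Return the consecutive word triples of a word sequence."""
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def build_trigram_set(corpus_sentences):
    """Collect every trigram seen in a toy corpus (assumed stand-in
    for the trigram language model mentioned in the abstract)."""
    seen = set()
    for sentence in corpus_sentences:
        seen.update(trigrams(sentence.lower().split()))
    return seen


def count_hits(words, known_trigrams):
    """Count how many trigrams of the sequence occur in the model."""
    return sum(1 for t in trigrams(words) if t in known_trigrams)


def repair_word_order(sentence, known_trigrams):
    """Return the permutation of the words with the most trigram hits.

    Brute force over all permutations: feasible only for short
    sentences, and an assumption of this sketch rather than the
    reordering technique used in the paper.
    """
    words = sentence.lower().split()
    # permutations() yields the original order first and max() keeps
    # the first maximal candidate, so ties favour the input order.
    best = max(permutations(words),
               key=lambda p: count_hits(p, known_trigrams))
    return " ".join(best)
```

With a trigram set built from the sentences "the cat sat on the mat" and "the dog sat on the rug", calling repair_word_order on "cat the sat on mat the" returns "the cat sat on the mat". The number of permutations grows factorially with sentence length, which is why a practical system needs a more efficient reordering strategy than this one.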