Applying a Sectioned Genetic Algorithm to Word Segmentation

Research area: Other Computer Science topics
Type: Journal article

Year: 2010
Authors: Z. Detorakis; George Tambouratzis
Journal: Pattern Analysis and Applications
Volume: 13
Issue: 1
Pages: 93-104
DOI: 10.1007/s10044-008-0140-z
Abstract:
This article presents a novel approach to morphological analysis based on genetic algorithms (GAs). Morphological analysis is of critical importance in data mining and information retrieval systems because it leads to a more homogeneous representation of words. The system presented here makes minimal use of language-specific information and is therefore more general than the rule-based techniques that have been proposed in the literature. A number of heuristics are created and tested as evaluation functions, both general-purpose ones and heuristics specifically designed for the task, and the most suitable models for the genetic operators are selected for this implementation. Finally, the system addresses the problem of processing a large number of words simultaneously without excessively increasing the execution time or degrading the segmentation quality of the final results. This is accomplished by applying a group of masks to divide the individuals into sections and running the GA on these smaller sections instead of on the entire individual.
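The sectioning idea in the abstract can be illustrated with a rough Python sketch. It is not the authors' implementation: the abstract does not specify the encoding, the masks, the heuristics, or the operators, so everything below is an assumption made for illustration. Here an individual holds one stem/suffix cut position per word, fixed-size contiguous chunks stand in for the mask-based sections, `fitness` is a hypothetical heuristic that rewards recurring stems and suffixes, and the GA uses truncation selection, one-point crossover, and single-gene mutation.

```python
import random
from collections import Counter


def fitness(words, cuts):
    """Hypothetical heuristic: reward stems and non-empty suffixes that recur."""
    stems = Counter(w[:c] for w, c in zip(words, cuts))
    suffixes = Counter(w[c:] for w, c in zip(words, cuts))
    score = 0
    for w, c in zip(words, cuts):
        score += stems[w[:c]] - 1          # other words sharing this stem
        if w[c:]:
            score += suffixes[w[c:]] - 1   # other words sharing this suffix
    return score


def evolve_section(words, cuts, rng, generations=50, pop_size=20):
    """Run a small elitist GA on one section and return its best cut list."""
    def random_cuts():
        return [rng.randint(1, len(w)) for w in words]

    # Seed the population with the current cuts so the result is never worse.
    population = [list(cuts)] + [random_cuts() for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(words, ind), reverse=True)
        parents = population[: pop_size // 2]        # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randint(0, len(words))       # one-point crossover
            child = a[:point] + b[point:]
            i = rng.randrange(len(words))            # mutate one gene
            child[i] = rng.randint(1, len(words[i]))
            children.append(child)
        population = parents + children
    return max(population, key=lambda ind: fitness(words, ind))


def sectioned_ga(words, section_size=4, seed=0):
    """Optimise each section of the individual separately, then reassemble."""
    rng = random.Random(seed)
    cuts = [len(w) for w in words]  # start with every word unsegmented
    for start in range(0, len(words), section_size):
        section = slice(start, start + section_size)
        cuts[section] = evolve_section(words[section], cuts[section], rng)
    return cuts


if __name__ == "__main__":
    vocabulary = ["walking", "walked", "walks", "talking",
                  "talked", "talks", "jumping", "jumped"]
    for word, cut in zip(vocabulary, sectioned_ga(vocabulary)):
        print(word[:cut], "+", word[cut:])
```

Because each call to `evolve_section` only searches the cut positions of a few words, the search space per run stays small even when the word list grows, which is the motivation the abstract gives for working on sections rather than on the whole individual. The price in this simplified version is that each section is scored in isolation, so regularities shared across sections are not seen by the heuristic.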