Applying a Sectioned Genetic Algorithm to Word Segmentation
Research Area:  
Other topics in Computer Science
Type:  
Journal article
Year: | 2010
---|---
Authors: | Z. Detorakis; George Tambouratzis
Journal: | Pattern Analysis and Applications
Volume: | 13
Number: | 1
Pages: | 93-104
DOI: | 10.1007/s10044-008-0140-z
Abstract: | This article presents a novel approach to morphological analysis based on genetic algorithms (GAs). Morphological analysis is of critical importance in data mining and information retrieval systems because it leads to a more homogeneous representation of words. The system presented here makes minimal use of language-specific information and is therefore more general than the rule-based techniques that have been proposed in the literature. Several heuristics, both general-purpose ones and ones specifically designed for the task, are created and tested as evaluation functions, and the most suitable models for the genetic operators are selected for this implementation. Finally, the system addresses the problem of processing a large number of words simultaneously without excessively increasing the execution time or degrading the segmentation quality of the final results. This is accomplished by dividing the individuals into sections, following the application of a group of masks, and running the GA on these smaller sections instead of on the entire individual.
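The sectioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the masks, the evaluation heuristics, or the genetic operators, so everything below is an assumption made for illustration. Here an individual holds one stem/suffix cut position per word, the masks are replaced by plain fixed-size slicing of the word list, the hypothetical `fitness` function rewards segmentations that reuse stems and suffixes, and the operators are truncation selection, uniform crossover, and random-reset mutation. The names `fitness`, `evolve_section`, `segment`, and `section_size` are all invented for this example.

```python
import random


def fitness(words, cuts):
    """Hypothetical heuristic (not from the paper): a shorter inventory of
    distinct stems and suffixes, measured in characters, scores higher."""
    stems = {w[:c] for w, c in zip(words, cuts)}
    suffixes = {w[c:] for w, c in zip(words, cuts)}
    return -(sum(map(len, stems)) + sum(map(len, suffixes)))


def random_cuts(words, rng):
    # One cut per word; a cut equal to len(word) means "no suffix".
    # Assumes every word is a non-empty string.
    return [rng.randint(1, len(w)) for w in words]


def evolve_section(words, rng, pop_size=30, generations=60, p_mut=0.2):
    """Run a small GA on one section (a sub-list of the full word list)."""
    pop = [random_cuts(words, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda cuts: fitness(words, cuts), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Random-reset mutation of individual cut positions.
            for i, w in enumerate(words):
                if rng.random() < p_mut:
                    child[i] = rng.randint(1, len(w))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda cuts: fitness(words, cuts))


def segment(words, section_size=4, seed=0):
    """Split the word list into sections and evolve each one separately,
    instead of evolving one long individual covering every word."""
    rng = random.Random(seed)
    result = []
    for start in range(0, len(words), section_size):
        section = words[start:start + section_size]
        cuts = evolve_section(section, rng)
        result.extend((w[:c], w[c:]) for w, c in zip(section, cuts))
    return result


if __name__ == "__main__":
    sample = ["walked", "walking", "walks", "talked", "talking", "talks"]
    for stem, suffix in segment(sample, section_size=3):
        print(stem, "+", suffix)
```

Because each section is evolved independently, the search space per GA run grows with the section size rather than with the total number of words, which is the trade-off the abstract describes. The price in this simplified version is that stems and suffixes shared across sections are not rewarded, something a real mask scheme might handle differently.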