Publication - Applying a Sectioned Genetic Algorithm to Word Segmentation

Research Area: Other topics in Computer Science
Type: Journal article

Year: 2010
Authors: Z. Detorakis; George Tambouratzis
Journal: Pattern Analysis and Applications
Volume: 13
Number: 1
Pages: 93-104
DOI: 10.1007/s10044-008-0140-z
Abstract:
This article presents a novel approach to morphological analysis based on genetic algorithms (GAs). Morphological analysis is of critical importance in data mining and information retrieval systems because it leads to a more homogeneous representation of words. The system presented here makes minimal use of language-specific information and is therefore more general than the rule-based techniques that have been proposed in the literature. A number of heuristics are created and tested as evaluation functions, including both general-purpose ones and heuristics specifically designed for the task, and the most suitable models for the genetic operators are selected for this implementation. Finally, the system addresses the problem of processing a large number of words simultaneously without excessively increasing the execution time or degrading the segmentation quality of the final results. This is accomplished by applying a group of masks to divide the individuals into sections and running the GA on these smaller sections instead of on the entire individual.
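
The sectioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the encoding (one stem/suffix cut point per word), the fitness heuristic (fewer distinct stems and suffixes is better), the contiguous sections standing in for the masks, and all function names and parameters are assumptions made purely for illustration.

```python
import random

def fitness(words, cuts):
    """Hypothetical heuristic: fewer distinct stems and suffixes is better."""
    stems = {w[:c] for w, c in zip(words, cuts)}
    suffixes = {w[c:] for w, c in zip(words, cuts)}
    return -(len(stems) + len(suffixes))

def make_sections(n_words, section_size):
    """Contiguous index ranges, an assumed stand-in for the paper's masks."""
    return [range(i, min(i + section_size, n_words))
            for i in range(0, n_words, section_size)]

def evolve_section(words, rng, pop_size=20, generations=40, mutation_rate=0.1):
    """Run a small GA over the cut points of one section (non-empty words)."""
    def random_cut(w):
        return rng.randint(1, len(w))  # cut == len(w) means "no suffix"

    population = [[random_cut(w) for w in words] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half unchanged (elitist).
        population.sort(key=lambda ind: fitness(words, ind), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by per-gene mutation.
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [random_cut(w) if rng.random() < mutation_rate else c
                     for w, c in zip(words, child)]
            children.append(child)
        population = parents + children
    return max(population, key=lambda ind: fitness(words, ind))

def segment(words, section_size=4, seed=0):
    """Segment all words by evolving each section independently."""
    rng = random.Random(seed)
    cuts = [0] * len(words)
    for section in make_sections(len(words), section_size):
        section_words = [words[i] for i in section]
        best = evolve_section(section_words, rng)
        for i, c in zip(section, best):
            cuts[i] = c
    return [(w[:c], w[c:]) for w, c in zip(words, cuts)]

if __name__ == "__main__":
    sample = ["walked", "walking", "talked", "talking", "jumps", "jumped"]
    for stem, suffix in segment(sample):
        print(stem, "+", suffix)
```

Because each section is evolved separately, the search space per GA run depends on the section size rather than on the total number of words, which is the scaling property the abstract attributes to sectioning. The toy fitness here only sees the words inside one section; how the published system scores sections is not stated in the abstract.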