Classification of texts according to register/author style. First, a method was developed to classify texts according to the register they belong to, e.g. political speech, fiction, etc. A classification rate of 90% was achieved. In a second step, a method was developed to classify texts within the same register according to the style of their author. To this end, corpora from the Minutes of the Greek Parliament were used*. For this second task, a classification rate of 85% was achieved. The overall enterprise was supported by a set of language engineering tools, all developed by ILSP / R.C. "Athena", as well as a set of statistical methods.
*ILSP / R.C. "Athena" wishes to acknowledge the assistance of the Secretariat of the Hellenic Parliament in obtaining the session transcripts.