Evaluation of question classification systems using differing features
Abstract
Most question answering systems are organized around three research themes: question classification and analysis, document retrieval, and answer extraction. The performance of each stage affects the final result. Question classification is an important task because it determines the type of the expected answer. A method for improving the performance of question classification is presented, based on linguistic analysis (semantic, syntactic and morphological) as well as statistical approaches, guided by a layered semantic hierarchy of fine-grained question types. Methods of question expansion are also studied; these add a higher-level representation for each word. Various question features, diverse term weightings and several machine learning algorithms are compared. Experiments conducted on real data are presented. They demonstrate an improvement in precision for question classification.
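
As an illustration of the question expansion idea mentioned in the abstract, the following minimal Python sketch appends a more general label after each word for which one is known. The lookup table `HIGHER_LABEL`, the `HYPER_` prefix and the function name `expand_question` are assumptions made for this example only; the abstract does not specify which lexical resource or feature encoding is used.

```python
# Toy mapping from a word to a more general ("higher") label.
# These entries are invented for illustration; a real system would
# likely draw them from a lexical resource or a named-entity tagger.
HIGHER_LABEL = {
    "paris": "city",
    "france": "country",
    "einstein": "person",
    "capital": "city",
}


def expand_question(question):
    """Return the question tokens, each followed by its higher-level label if known."""
    # Lowercase, drop a trailing question mark, and split on whitespace.
    tokens = question.lower().rstrip("?").split()
    expanded = []
    for token in tokens:
        expanded.append(token)
        label = HIGHER_LABEL.get(token)
        if label is not None:
            # The prefix keeps added labels distinct from ordinary words.
            expanded.append("HYPER_" + label)
    return expanded


if __name__ == "__main__":
    print(expand_question("What is the capital of France?"))
```

The expanded token list could then be passed to any term weighting scheme and classifier; which combination works best is what the comparison described in the abstract sets out to measure.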