Semantically Expanding Questions for Supervised Automatic Classification
Abstract
Answering a question correctly from a large collection of textual data is not an easy task. The question must be understood at a level that allows the constraints it imposes on possible answers to be detected. Question classification, a component of Question Answering systems, deduces the type of the expected answer; that is, it performs a semantic classification of the target answer. Its purpose is to provide additional information that narrows the gap between question and answer so that they can be matched. We present an approach to improve the effectiveness of classifiers that combines linguistic analysis (semantic, syntactic and morphological) with statistical approaches, guided by a layered semantic hierarchy of fine-grained question types. This work also proposes two methods of question expansion. The first finds, for each word, synonyms matching its contextual sense. The second adds a higher-level representation, a "hypernym", for nouns. Various document representation features, term weighting schemes and machine learning algorithms are studied. Experiments conducted on real data are presented. Of particular interest is the improvement in the precision of question classification.
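The two expansion methods can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `LEXICON`, the `Sense` structure, the function names, and in particular the way the contextual sense is chosen (a simple overlap count between the question's words and hand-written cue words). The abstract does not specify which lexical resource or disambiguation procedure is actually used.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Sense:
    """One sense of a word in the hypothetical toy lexicon."""
    cues: frozenset            # context words that hint at this sense
    synonyms: tuple            # synonyms valid for this sense
    hypernym: Optional[str]    # more general term (nouns only), else None


# Hypothetical hand-written lexicon; a real system would query a lexical resource.
LEXICON: Dict[str, List[Sense]] = {
    "capital": [
        Sense(frozenset({"country", "city", "france"}), ("seat",), "city"),
        Sense(frozenset({"bank", "invest", "money"}), ("funds",), "asset"),
    ],
}


def pick_sense(word: str, context: Set[str]) -> Optional[Sense]:
    """Assumed disambiguation: keep the sense whose cues overlap the context most."""
    senses = LEXICON.get(word)
    if not senses:
        return None
    return max(senses, key=lambda s: len(s.cues & context))


def expand_with_synonyms(tokens: List[str]) -> List[str]:
    """Method 1: append synonyms that match each word's contextual sense."""
    expanded = list(tokens)
    for word in tokens:
        sense = pick_sense(word, set(tokens) - {word})
        if sense is not None:
            expanded.extend(sense.synonyms)
    return expanded


def expand_with_hypernyms(tokens: List[str]) -> List[str]:
    """Method 2: append the hypernym of each noun found in the lexicon."""
    expanded = list(tokens)
    for word in tokens:
        sense = pick_sense(word, set(tokens) - {word})
        if sense is not None and sense.hypernym is not None:
            expanded.append(sense.hypernym)
    return expanded


question = "what is the capital of france".split()
print(expand_with_synonyms(question))   # original tokens plus "seat"
print(expand_with_hypernyms(question))  # original tokens plus "city"
```

In this sketch the expanded token list, rather than the original question, would be passed on to feature extraction and term weighting, so that questions phrased with different but related words share more features.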