UJM at CLEF in Author Verification based on optimized classification trees
Abstract
This article describes our proposal for the Author Identification task in the PAN CLEF Challenge 2014. We adopted a machine learning approach based on several representations of the texts and on optimized decision trees, which take various attributes as input and are learned separately for each training corpus for this classification task. Our method ranked 2nd overall, with an AUC of 70.7% and a C@1 of 68.4%, and placed between 1st and 6th on the six individual corpora.