Conference Papers, Year: 2024

Feature-augmented model for multilingual discourse relation classification

Abstract

Discourse relation classification in a multilingual, cross-framework setting is a challenging task, and the best-performing systems so far have relied on monolingual, single-framework approaches. In this paper, we introduce transformer-based multilingual models trained jointly over all datasets, thus covering different languages and discourse frameworks. We demonstrate their ability to outperform single-corpus models and to overcome (to some extent) the disparity among corpora by relying on linguistic features and generic information about the nature of the datasets. We also compare the performance of different multilingual pretrained models, as well as the encoding of the relation direction, a key component of the task. Our results on the 16 datasets of the DISRPT 2021 benchmark show accuracy improvements on (almost) all datasets compared to the monolingual models, with at best 65.91% average accuracy, a 4% improvement over the state of the art.
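The abstract mentions two kinds of input augmentation: generic information about each dataset and an encoding of the relation direction. The sketch below is one possible way to build such an input string before tokenization, and it is not taken from the paper. The `RelationInstance` class, the `encode_instance` function, the marker strings (`[LANG=...]`, `[FW=...]`, `<src>`, `[SEP]`) and the direction labels `1>2` / `1<2` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RelationInstance:
    """One relation instance (hypothetical container for this sketch)."""
    unit1: str       # first discourse unit, in text order
    unit2: str       # second discourse unit, in text order
    direction: str   # "1>2" or "1<2" (assumed label scheme)
    language: str    # e.g. "eng"
    framework: str   # e.g. "rst", "pdtb", "sdrt"


def encode_instance(inst: RelationInstance) -> str:
    """Build a single input string with metadata and direction markers.

    Every marker string here is invented for illustration; a real system
    would register such markers with its tokenizer.
    """
    if inst.direction not in ("1>2", "1<2"):
        raise ValueError(f"unknown direction: {inst.direction!r}")

    # Dataset-level information is prepended as plain marker tokens.
    prefix = f"[LANG={inst.language}] [FW={inst.framework}]"

    # The unit the relation starts from is wrapped in <src> ... </src>.
    if inst.direction == "1>2":
        body = f"<src> {inst.unit1} </src> [SEP] {inst.unit2}"
    else:
        body = f"{inst.unit1} [SEP] <src> {inst.unit2} </src>"

    return f"{prefix} {body}"


example = RelationInstance(
    unit1="It rained",
    unit2="so we stayed home",
    direction="1>2",
    language="eng",
    framework="rst",
)
encoded = encode_instance(example)
```

Keeping the units in text order and marking only the source unit means the same function covers both directions without swapping the arguments, which is one of several plausible design choices.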
Main file
2024.codi-1.9.pdf (203.18 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04598276, version 1 (03-06-2024)

Identifiers

  • HAL Id: hal-04598276, version 1

Cite

Eleni Metheniti, Chloé Braud, Philippe Muller. Feature-augmented model for multilingual discourse relation classification. 5th Workshop on Computational Approaches to Discourse (CODI 2024), Association for Computational Linguistics, Mar 2024, St Julians, Malta. pp. 91-104. ⟨hal-04598276⟩