Total Effects with Constrained Features
Abstract
Recent studies have emphasized the connection between machine learning feature importance measures and total order sensitivity indices (total effects, henceforth). Feature correlations and the need to avoid unrestricted permutations make the estimation of these indices challenging. Additionally, there is no established theory or approach for non-Cartesian domains. We propose four alternative strategies for computing total effects that account for both dependent and constrained features. Our first approach involves a generalized winding stairs design combined with the Knothe-Rosenblatt transformation. This approach, while applicable to a wide family of input dependencies, becomes impractical when inputs are physically constrained. Our second approach is a U-statistic that combines the Jansen estimator with a weighting factor. The U-statistic framework allows the derivation of a central limit theorem for this estimator. However, this design is computationally intensive. Our third approach uses derangements to significantly reduce the computational burden. We prove consistency and central limit theorems for these estimators as well. Our fourth approach is based on a nearest-neighbour intuition and further reduces the computational burden. We test these estimators through a series of increasingly complex computational experiments with features constrained on compact and connected domains (circle, simplex) and on non-compact and non-connected domains (Sierpinski gaskets). We also provide comparisons with machine learning approaches and conclude with an application to a realistic simulator.
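For orientation, the classical Jansen estimator mentioned in the abstract can be sketched for the simplest setting: independent features, uniform on the unit hypercube. This is not one of the four strategies proposed in the paper, which target dependent and constrained features. The function names, the toy model and the sample size below are illustrative assumptions.

```python
import numpy as np


def jansen_total_effects(model, n_samples, n_features, rng):
    """Classical Jansen estimate of total effects.

    Assumes independent features, uniform on [0, 1]^d (a simplifying
    assumption for illustration; the paper addresses the harder
    dependent and constrained cases).
    """
    a = rng.random((n_samples, n_features))
    b = rng.random((n_samples, n_features))
    y_a = model(a)
    var_y = np.var(y_a, ddof=1)
    totals = np.empty(n_features)
    for i in range(n_features):
        ab = a.copy()
        ab[:, i] = b[:, i]  # resample feature i only, keep the others fixed
        # Half the mean squared output change, normalized by the output variance
        totals[i] = 0.5 * np.mean((y_a - model(ab)) ** 2) / var_y
    return totals


def toy_model(x):
    # Hypothetical additive test function; the third feature is inert
    return x[:, 0] + 2.0 * x[:, 1]


rng = np.random.default_rng(0)
totals = jansen_total_effects(toy_model, 20000, 3, rng)
print(totals)
```

For this additive toy model the exact total effects are 0.2, 0.8 and 0, so the printed estimates should fall close to those values. Replacing column `i` of one sample with the same column of an independent sample is exactly the unrestricted resampling that becomes invalid once features are dependent or restricted to a non-Cartesian domain.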
Domains
Statistics [math.ST]
Main file
TotalEffectsConstrainedFeatures_BorognovoPlischkePrieur_hal_V2.pdf (2.95 MB)
Origin: Files produced by the author(s)