Variance Based Samples Weighting for Supervised Deep Learning

Paul Novello; Gaël Poëtte; David Lugato; Pietro Marco Congedo

Preprint, Working Paper. Year: 2020

Paul Novello

Role: Author
PersonId: 1214819
IdHAL: paul-novello
ORCID: 0000-0002-1053-8694

Centre de Mathématiques Appliquées - Ecole Polytechnique

Commissariat à l'énergie atomique et aux énergies alternatives

Shape reconstruction and identification

Gaël Poëtte

Role: Author

Commissariat à l'énergie atomique et aux énergies alternatives

David Lugato

Role: Author
PersonId: 933510

Centre d'études scientifiques et techniques d'Aquitaine

Pietro Marco Congedo

Role: Author

Centre de Mathématiques Appliquées - Ecole Polytechnique

Shape reconstruction and identification

Abstract

Machine Learning (ML) aims at approximating functions defined on a measured space with a model. A relevant choice of distribution for the training data set can improve the performance of a given ML model. We claim and empirically justify that an ML model yields better results when the data set focuses on regions where the function to learn is steeper. We first translate this assumption into a mathematically workable form. Theoretical derivations then allow us to construct a methodology that we call Variance Based Samples Weighting (VBSW). VBSW uses the local variance of the labels to weight the training points. This methodology is general, scalable and cost effective. It is validated on Deep Learning models such as BERT [10] and ResNet [14], whose performance it significantly improves on various Natural Language Processing and image classification tasks.
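
The weighting idea stated in the abstract can be illustrated with a minimal sketch. The abstract does not say how the local variance is estimated, so everything specific here is an assumption made for illustration: the hypothetical helper `local_variance_weights`, the use of the k nearest neighbours in input space, the small `floor` term, and the rescaling of the weights to mean 1. It is not the authors' implementation.

```python
from statistics import pvariance


def local_variance_weights(xs, ys, k=5, floor=1e-3):
    """Weight each sample by the variance of the labels of its k nearest neighbours.

    Assumptions (not taken from the abstract): a neighbourhood is the k
    nearest points in input space (squared Euclidean distance, the point
    itself included), `floor` keeps flat regions from getting zero weight,
    and the weights are rescaled so that their mean is 1.
    """
    n = len(xs)
    raw = []
    for i in range(n):
        # Brute-force neighbour search: O(n^2 log n) overall, fine for a sketch.
        by_distance = sorted(
            range(n),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(xs[i], xs[j])),
        )
        neighbours = by_distance[:k]
        raw.append(pvariance([ys[j] for j in neighbours]) + floor)
    mean_raw = sum(raw) / n
    return [w / mean_raw for w in raw]


# Toy 1-D regression target: flat on [0, 0.5), linear and steep on [0.5, 1].
xs = [(i / 99,) for i in range(100)]
ys = [0.0 if x[0] < 0.5 else 10.0 * (x[0] - 0.5) for x in xs]
weights = local_variance_weights(xs, ys, k=5)
```

In this toy setting, points in the steep half of the domain receive larger weights than points in the flat half. Such weights would typically be used to scale each sample's contribution to the training loss.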

Domains

Mathematics [math] Statistics [math.ST] Statistics [stat] Machine Learning [stat.ML] Theory [stat.TH]

Main file

full_paper.pdf (721.35 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-02885827

Submitted on: Wednesday, 1 July 2020, 09:25:08

Last modified on: Wednesday, 3 April 2024, 11:24:09

Long-term archiving on: Wednesday, 23 September 2020, 14:25:28

Dates and versions

hal-02885827, version 1 (01-07-2020)

hal-02885827, version 2 (19-01-2021)

hal-02885827, version 3 (28-01-2021)

hal-02885827, version 4 (27-09-2022)

Identifiers

HAL Id: hal-02885827, version 1

Cite

Paul Novello, Gaël Poëtte, David Lugato, Pietro Marco Congedo. Variance Based Samples Weighting for Supervised Deep Learning. 2020. ⟨hal-02885827v1⟩
