SSP-Net: Scalable sequential pyramid networks for real-Time 3D human pose regression

Diogo Carbonera Luvizon; Hedi Tabia; David Picard

doi:10.1016/j.patcog.2023.109714

Journal article, Pattern Recognition, Year: 2023


Diogo Carbonera Luvizon

Role: Author

MPI Informatics

Hedi Tabia

Role: Author
PersonId : 11431
IdHAL : hedi-tabia
ORCID : 0000-0002-1827-7150
IdRef : 159010373

MPI Informatics

Informatique, BioInformatique, Systèmes Complexes

David Picard

Role: Author
PersonId : 741
IdHAL : david-picard
ORCID : 0000-0002-6296-4222
IdRef : 133005216

MPI Informatics

Laboratoire d'Informatique Gaspard-Monge

Abstract

In this paper, we propose a highly scalable, end-to-end trainable convolutional neural network for real-time 3D human pose regression from still RGB images. We call this approach Scalable Sequential Pyramid Networks (SSP-Net), as it is trained with refined supervision at multiple scales in a sequential manner. Our network requires a single training procedure and can produce its best predictions at 120 frames per second (FPS), or acceptable predictions at more than 200 FPS when cut at test time. We show that the proposed regression approach is invariant to the size of the feature maps, which allows our method to perform multi-resolution intermediate supervision and to reach results comparable to the state of the art with very low-resolution feature maps. We demonstrate the accuracy and effectiveness of our method through extensive experiments on two of the most important publicly available datasets for 3D pose estimation, Human3.6M and MPI-INF-3DHP. Additionally, we provide insights into our network architecture decisions and show its flexibility in meeting the best precision-speed compromise.

Keywords

3D human pose estimation; Computer vision; Neural nets

Domains

Computer Vision and Pattern Recognition [cs.CV]; Artificial Intelligence [cs.AI]

Main file

2009.01998.pdf (1.44 MB)

Origin: Files produced by the author(s)


https://hal.science/hal-04124371

Submitted on: Tuesday, March 5, 2024, 22:28:22

Last modified on: Tuesday, November 19, 2024, 20:10:06

Long-term archiving on: Thursday, June 6, 2024, 19:49:14

Dates and versions

hal-04124371 , version 1 (05-03-2024)

Identifiers

HAL Id : hal-04124371 , version 1
ARXIV : 2009.01998
DOI : 10.1016/j.patcog.2023.109714

Cite

Diogo Carbonera Luvizon, Hedi Tabia, David Picard. SSP-Net: Scalable sequential pyramid networks for real-Time 3D human pose regression. Pattern Recognition, 2023, 142, pp.109714. ⟨10.1016/j.patcog.2023.109714⟩. ⟨hal-04124371⟩


Collections

ENPC CNRS UNIV-CERGY UNIV-EVRY LIGM_A3SI IBISC PARISTECH LIGM ETIS ETIS-MIDI IBISC-IRA2 UNIV-PARIS-SACLAY GS-COMPUTER-SCIENCE GS-LIFE-SCIENCES-HEALTH GS-SPORT-HUMAN-MOVEMENT UNIV-EIFFEL

204 views

42 downloads
