Semi-automatic training of an object recognition system in scene camera data using gaze tracking and accelerometers

Cognolato, Matteo (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis) ; Rehabilitation Engineering Laboratory, ETH Zurich) ; Graziani, Mara (University of Rome “La Sapienza”, Italy) ; Giordaniello, Francesca (University of Rome “La Sapienza”, Italy) ; Saetta, Gianluca (Department of Neurology, University Hospital of Zurich, Switzerland) ; Bassetto, Franco (Clinic of Plastic Surgery, Padova University Hospital, Italy) ; Brugger, Peter (Department of Neurology, University Hospital of Zurich, Switzerland) ; Caputo, Barbara (University of Rome “La Sapienza”, Italy) ; Müller, Henning (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)) ; Atzori, Manfredo (University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis))

Object detection and recognition algorithms usually require large, annotated training sets, and creating such datasets requires expensive manual annotation. Eye tracking can help in the annotation procedure: humans use vision constantly to explore the environment and plan motor actions, such as grasping an object. In this paper we investigate the possibility of semi-automatically training an object recognition system with eye tracking and accelerometers in scene camera data, learning from the natural hand-eye coordination of humans. Our approach involves three steps. First, sensor data are recorded using eye tracking glasses in combination with accelerometers and surface electromyography, which are usually applied when controlling prosthetic hands. Second, a set of patches is extracted automatically from the scene camera data while an object is being grasped. Third, a convolutional neural network is trained and tested using the extracted patches. Results show that the parameters of eye-hand coordination can be used to train an object recognition system semi-automatically. With proper sensors, these parameters can be exploited to fine-tune a convolutional neural network for object detection and recognition. This approach opens interesting options for training computer vision and multi-modal data integration systems and lays the foundations for future applications in robotics. In particular, this work targets the improvement of prosthetic hands by recognizing the objects that a person may wish to use. However, the approach can easily be generalized.
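
The second step of the abstract (automatic patch extraction during grasping) can be illustrated with a minimal sketch. It is not the authors' implementation: the Sample record, the accelerometer threshold rule, the patch size and every function name below are assumptions made only for illustration, and the surface electromyography signal is left out. The sketch keeps the frames in which a hypothetical accelerometer magnitude suggests a hand movement and crops a square patch centered on the gaze point.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical grayscale scene-camera frame: a list of rows of pixel values.
Frame = List[List[int]]


@dataclass
class Sample:
    """One synchronized reading (assumed layout, not taken from the paper)."""
    frame: Frame
    gaze_xy: Tuple[int, int]   # gaze position in pixel coordinates (x, y)
    accel_magnitude: float     # assumed movement measure from an accelerometer


def crop_patch(frame: Frame, gaze_xy: Tuple[int, int], half_size: int) -> Frame:
    """Crop a square patch centered on the gaze point.

    The window is shifted back inside the frame when the gaze point is
    close to a border, so the patch keeps its full size whenever the
    frame is large enough.
    """
    height, width = len(frame), len(frame[0])
    side = 2 * half_size + 1
    x, y = gaze_xy
    left = min(max(x - half_size, 0), max(width - side, 0))
    top = min(max(y - half_size, 0), max(height - side, 0))
    return [row[left:left + side] for row in frame[top:top + side]]


def extract_grasp_patches(
    samples: Sequence[Sample], accel_threshold: float, half_size: int
) -> List[Frame]:
    """Return gaze-centered patches for samples flagged as hand movement.

    Assumption: a sample belongs to a grasp when its accelerometer
    magnitude reaches accel_threshold. The actual selection criterion
    used in the paper is not given in this record.
    """
    return [
        crop_patch(s.frame, s.gaze_xy, half_size)
        for s in samples
        if s.accel_magnitude >= accel_threshold
    ]
```

Patches collected this way would then serve as training images for the third step (fine-tuning a convolutional neural network), with the label supplied by the object known to be grasped in each recording.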


Keywords:
Faculty:
Economics and Services
School:
HEG VS HES-SO Valais-Wallis - Haute Ecole de Gestion & Tourisme
Institute:
Institut Informatique de gestion
Classification:
Economics/Management
Bibliographic address:
Cham, Springer
Date:
2017
Pagination:
pp. 175-184
Host document title:
Computer Vision Systems: 11th International Conference, ICVS 2017, Shenzhen, China, July 10-13, 2017
DOI:
ISBN:
978-3-319-68344-7

Note: The status of this file is: restricted


Record created 2017-11-03, modified 2017-11-10
