Sample-efficient imitation learning via generative adversarial nets

Blondé, Lionel (University of Geneva, Switzerland) ; Kalousis, Alexandros (Haute école de gestion de Genève, HES-SO // Haute Ecole Spécialisée de Suisse Occidentale)

GAIL is a recent successful imitation learning architecture that exploits the adversarial training procedure introduced in GANs. Albeit successful at generating behaviours similar to those demonstrated to the agent, GAIL suffers from a high sample complexity in the number of interactions it has to carry out in the environment in order to achieve satisfactory performance. We dramatically shrink the number of interactions with the environment necessary to learn well-behaved imitation policies, by up to several orders of magnitude. Our framework, operating in the model-free regime, exhibits a significant increase in sample-efficiency over previous methods by simultaneously a) learning a self-tuned adversarially-trained surrogate reward and b) leveraging an off-policy actor-critic architecture. We show that our approach is simple to implement and that the learned agents remain remarkably stable in experiments that span a variety of continuous control tasks. Video visualisations are available at: https://youtu.be/-nCsqUJnRKU.


Conference type:
full paper
Faculty:
Economie et Services
School:
HEG - Genève
Institute:
CRAG - Centre de Recherche Appliquée en Gestion
Classification:
Computer science
Bibliographic address:
Okinawa, Japan, 16-18 April 2019
Date:
2019-04
Pagination:
11 p.
Published in:
Proceedings of Machine Learning Research
Numbering (vol. no.):
2019, vol. 89, pp. 3138-3148
ISSN:
2640-3498
Record created 2019-10-16, last modified 2020-03-25
