Abstract

Target-based assisted orchestration can be thought of as the process of searching for optimal combinations of sounds to match a target sound, given a database of samples, a similarity metric, and a set of constraints. A typical solution to this problem is a proposed orchestral score in which candidates are ranked by similarity, in some feature space, between the target sound and the mixture of database samples corresponding to the notes in the score; in the orchestral setting, valid scores may contain dozens of instruments sounding simultaneously. Generally, target-based assisted orchestration systems consist of a combinatorial optimization algorithm and a constraint solver that are jointly optimized to find valid solutions. A key step in the optimization involves generating a large number of combinations of sounds from the database and then comparing the features of each mixture with those of the target sound. Because of the high computational cost of synthesizing a new audio file and computing features for every combination of sounds, existing systems instead estimate the features of each new mixture from the precomputed features of the individual source files making up the combination. Currently, state-of-the-art systems use a simple linear combination to make these predictions, even when the features in use are not themselves linear. In this work, we explore neural models for estimating the features of a mixture of sounds from the features of the component sounds, finding that standard features can be estimated with accuracy significantly better than that of the methods currently used in assisted orchestration systems. We present quantitative comparisons and discuss the implications of our findings for target-based orchestration problems.
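To illustrate why the linear-combination heuristic is only an approximation, the toy sketch below (an illustration, not code from the paper) compares a nonlinear feature computed on a true mixture of two signals against the same feature estimated by summing the per-signal features. The feature choice (log-magnitude spectrum) and the test signals are assumptions for demonstration purposes only.

```python
import numpy as np

def log_mag_spectrum(signal, n_fft=512):
    """A simple nonlinear feature: log-magnitude spectrum of one frame."""
    return np.log1p(np.abs(np.fft.rfft(signal, n=n_fft)))

t = np.arange(512)
# Two toy "instrument" samples: sinusoids with non-integer cycle counts,
# so spectral leakage makes their spectra overlap at every bin.
sample_a = np.sin(2 * np.pi * 20.3 * t / 512)
sample_b = np.sin(2 * np.pi * 33.7 * t / 512)

# True features: synthesize the mixture first, then compute the feature.
true_features = log_mag_spectrum(sample_a + sample_b)

# Linear estimate: combine the precomputed per-sample features directly.
estimated_features = log_mag_spectrum(sample_a) + log_mag_spectrum(sample_b)

# Because the log-magnitude is nonlinear (and phases can cancel),
# the linear estimate deviates from the feature of the true mixture.
error = np.mean(np.abs(true_features - estimated_features))
print(f"mean absolute estimation error: {error:.3f}")
```

The gap measured here is exactly what a learned (e.g., neural) feature-combination model could close without paying the cost of synthesizing every candidate mixture.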
