505_research
Training Audio-to-Audio Mapping Mechanism for Extending Acoustic Instruments and Guiding Improvisations
(From application document to DXARTS)
Applying state-of-the-art knowledge of audio description and machine learning techniques for artistic purposes
Many people work on audio description and machine learning techniques applied to music. However, most of this research aims to solve commercial and industrial problems, such as building audio recommendation systems, audio identifiers for radio stations, and query-by-humming music search, among many others. Consequently, most of the work is done on pop music, which has the majority of consumers. Few people have applied this knowledge to the study of other types of sound expression and, where they have, the work takes a traditional musicological approach. As a result, there are few artistic explorations of such cutting-edge technologies. There are some exceptions, and several musicians are incorporating new techniques into their artistic production; however, these contributions are mainly in the field of electronic music. One of the main contributions of the proposed work is therefore to bring current knowledge in the fields of audio description and machine learning into the domain of improvisatory music. These tools are the technical methods that will be employed to create an artistic proposal that tries to advance the current state of improvisatory music systems by using recent knowledge from the technical domain.
Analyzing the acoustic piano with DSP techniques
With the signal processing techniques available nowadays, it is possible to obtain in real time a huge amount of information from an acoustic signal represented as digital data. These low-level audio descriptors can be categorized into various groups, including temporal, energy, spectral, and harmonic descriptors. By combining some of these descriptors it is possible to obtain more musically meaningful information such as onset detection, beat and tempo, pitch and harmony, etc.1 Even complex problems such as real-time polyphonic pitch detection can be solved in certain cases.2 Solving signal processing problems is outside the scope of this proposal. This means, for example, that instead of trying to solve the polyphonic pitch detection problem, our goal would be to take artistic advantage not only of what has been done so far, but also of what has not been done.
As mentioned before, many of the available systems use special hardware devices to extract information from the piano. One of the proposals of this project is, in contrast, to build this extension using only the available signal processing procedures, which seems to be a more natural approach. Some detailed information that can be extracted by mechanical devices such as MIDI keyboards will be lost if we use only audio detection. However, the hypothesis is that the lack of short-term resolution may be compensated for by the use of adaptable mappings,3 and also by a better understanding of the high-level musical structure.
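As a rough illustration of how low-level descriptors can be combined into something more musical, the sketch below computes one energy descriptor (RMS) and one spectral descriptor (spectral centroid) per frame, then flags onsets where the energy jumps. It is a minimal sketch, not the system proposed here: the function names, the naive DFT, and the `ratio` and `floor` thresholds are all assumptions chosen for readability, and a real-time implementation would use an FFT library and a more robust detection function.

```python
import cmath
import math

def rms(frame):
    """Energy descriptor: root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def spectral_centroid(frame, sample_rate):
    """Spectral descriptor: magnitude-weighted mean frequency (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

def detect_onsets(frames, ratio=2.0, floor=1e-3):
    """Flag frame i as an onset when its RMS exceeds `ratio` times the
    previous frame's RMS (both thresholds are illustrative assumptions)."""
    energies = [rms(f) for f in frames]
    return [
        i for i in range(1, len(energies))
        if energies[i] > floor and energies[i] > ratio * max(energies[i - 1], floor)
    ]
```

Given a list of short frames cut from the piano signal, `detect_onsets(frames)` returns the indices of frames where a note attack is likely, while `spectral_centroid` gives a crude brightness measure for each frame.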
Modeling structure
Several works have been developed in recent years to extract patterns from digital data. The results are then used to recognize low-level and high-level musical structures. Most of this work has been developed in the context of popular music, which usually has tonal harmony, a steady rhythm, and recognizable sections in the form of verses and choruses. In order to obtain some musical understanding of the patterns, several models have been proposed. Most of them generate hierarchical trees in one way or another. The most widely adopted model is probably that of F. Lerdahl and R. Jackendoff [LER83], which has the disadvantage of working only with a particular style of music in which the sections and phrases are well delimited. An alternative that could be studied is the mathematical model of Guerino Mazzola, which seems to be a suitable option for modeling more diffuse patterns. [MAZ02]
Thus, another branch of the research will be to study and implement the model or models that best fulfill our requirement: a tool that makes it possible to keep track of the material performed on the instrument and to obtain information about the overall shape, the leading, and the direction of the musical ideas. This model should be able to work dynamically and update constantly in real time.
Implementing a learning algorithm
Rehearsal is essential for a good concert! Whether it is a solo improvisation or a group performance, the process of learning and tuning details before the actual concert is fundamental in the live music paradigm. As mentioned before, many music systems do interact with live musicians. However, few of them are conceived as systems in which actual learning algorithms are core elements of the implementation. The author proposes the development of a system that requires rehearsal. Thus, the musician or musicians should interact several times with the system before the final concert.
Based on the model described in the last section and using standard machine learning techniques, the system will analyze the activity of the player or players. It will also try to obtain as much musical knowledge as possible: the types of musical materials employed; the types of transformations and developments of such materials; the global shape of the performance; the types of interactions between musicians, etc. Once the system has obtained some information about the music, it would be ready to become part of the ensemble, and live musicians would be able to interact and improvise with it. Ideally, the system should generate motives and musical phrases that are musically meaningful for the performance. The sounds that the system generates must be in accordance with the material, and tentatively they should be derived from the sound material of the acoustic instruments.
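The proposal does not commit to a particular learning algorithm, so the following is only a minimal sketch of the rehearse-then-perform idea, using a first-order Markov chain as a deliberately simple stand-in. It assumes that each rehearsal take has already been reduced to a sequence of symbolic events (the labels here are hypothetical); `learn_transitions` and `generate_phrase` are illustrative names, not part of any existing system.

```python
import random
from collections import Counter, defaultdict

def learn_transitions(rehearsals):
    """Count how often each event follows another across all rehearsal takes."""
    table = defaultdict(Counter)
    for take in rehearsals:
        for current, following in zip(take, take[1:]):
            table[current][following] += 1
    return table

def generate_phrase(table, start, length, rng=None):
    """Walk the learned transitions to produce a phrase of up to `length` events."""
    rng = rng or random.Random()
    phrase = [start]
    while len(phrase) < length:
        options = table.get(phrase[-1])
        if not options:  # nothing was ever heard after this event
            break
        events = list(options)
        weights = [options[e] for e in events]
        phrase.append(rng.choices(events, weights=weights, k=1)[0])
    return phrase
```

Each additional rehearsal take passed to `learn_transitions` reshapes the transition counts, so the phrases produced by `generate_phrase` drift toward the material the musicians actually play. A model that also captures larger-scale shape would need the structural analysis discussed in the previous section.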
Measuring timbre similarity and transforming sounds
The initial extension of the instrument will be acoustic; therefore, the system will have to generate sounds automatically. Several approaches may be taken to generate the sounds. Basic options such as triggering prerecorded samples or using available effects, for example in the form of commercial VST plug-ins, are ruled out because they usually offer no direct control over the timbre, and many of their ranges and parameters are predetermined.
However, in recent years there has been great progress in the field of describing and classifying sound.4 One goal of the project would also be to use this knowledge to implement sound transformations that are coherent with the musical context. The intention would be to create organic and subtle transformations instead of proposing surprising or outstanding timbres. In other words, having control over the relationship between the timbre of the generated sound and the rest of the musical variables would be more important than proposing timbral novelty.
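One simple way to reason about "subtle" versus "surprising" timbres is to measure the distance between descriptor vectors. The sketch below assumes each sound has been summarized as a tuple of descriptors (here, hypothetically, spectral centroid in Hz and RMS energy) and that `scales` supplies a rough typical range for each so that no single descriptor dominates. The function names and the choice of Euclidean distance are assumptions for illustration; perceptual timbre similarity generally needs richer features.

```python
import math

def timbre_distance(a, b, scales):
    """Euclidean distance between two descriptor vectors after per-descriptor scaling."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))

def closest_candidate(reference, candidates, scales):
    """Return the name of the candidate whose descriptors lie nearest the reference."""
    return min(
        candidates,
        key=lambda name: timbre_distance(reference, candidates[name], scales),
    )

def subtle_candidates(reference, candidates, scales, max_distance):
    """Keep only candidates within `max_distance` of the reference timbre."""
    return sorted(
        name for name, vec in candidates.items()
        if timbre_distance(reference, vec, scales) <= max_distance
    )
```

With the piano's current descriptors as `reference`, `subtle_candidates` would restrict the generated sounds to those that stay close to the instrument's timbre, in line with the preference for organic transformations over novelty.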
Extrapolating the musical model into other media
Once the model is robust enough to gain an understanding of the discourse, it becomes possible to extend the system and implement routines that create not only acoustic material but also other types of media, such as dynamic digital images, stage lighting, and choreographic information. As an example we can mention the well-known field of audiovisual creation, where sound and image are treated as a single unit. [EVA05] In this type of multimedia approach, the motives and structures of the discourse move indistinguishably between media, and both domains share the same structure.
Hacking the model
The extrapolation described above may be expanded into more exploratory, experimental, hacking-style paradigms in which the information recovered from the acoustic piano is used for controlling mechanical devices such as motors and relays, accessing databases, or sending information to mobile devices such as cell phones and PDAs.
Initial Bibliography
Please download the attached files: hugosgBib.txt for the BibTeX version, and hugosgBib.htm for the HTML version formatted in MLA 6th edition style.
| Attachment | Size |
|---|---|
| hugosgBib.txt | 12.38 KB |
| hugosgBib.htm | 13.25 KB |