You implemented an interesting project some time ago: you recreated the unique sound of the cult band "The Ramones" using artificial intelligence ... or how would you describe what you created with "THE RaiMONES"?
The Ramones were a cool band, but unfortunately all the original members have since passed away. With THE RaiMONES, I wanted to bring their music back to life using artificial intelligence, or machine learning. I was interested in two aspects in particular: first, the implementation - how, and with which data, does this learning process work best - and second, what comes out of it in terms of creativity.
And what came out of it?
THE RaiMONES was a small, private project. I didn't create complete pieces of music, but "sound snippets". I then sent these to a friend from Japan who had already played with the Ramones. When he received the generated notes for guitar and bass plus lyrics, he was able to make a song out of them within hours.
You mentioned the data situation. Without it, of course, nothing works. What is necessary for a project like yours?
There are two main ways to approach this: either you teach the neural network to play music like the Ramones right from the start, or you first teach it to play music from scratch - i.e. all the chords, etc. - and only then to play like the Ramones. In my project, I worked directly with the Ramones' own data, so the whole thing is somewhat limited. I could probably have achieved a better outcome if I had first trained the network on a whole music corpus using transfer learning and only then on the specific Ramones style. However, I simply didn't have the time or the computing power.
There is no general answer to the question of how much data is needed to train the network in this way. It also depends on the network. The one I used was relatively simple because the data was also limited (editor's note: 130 songs in MIDI format and lyrics of all their 178 songs).
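The two training strategies described above can be sketched with a deliberately tiny stand-in: a first-order Markov chain over chord symbols instead of a neural network. All chord sequences below are invented placeholders, not actual Ramones data, and the weighting scheme is just one simple way to let the style-specific data dominate.

```python
from collections import Counter, defaultdict

def train(model, sequences, weight=1):
    """Accumulate (weighted) transition counts: chord -> next chord."""
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            model[a][b] += weight
    return model

def most_likely_next(model, chord):
    """Return the most frequently observed successor of `chord`."""
    return model[chord].most_common(1)[0][0]

# Strategy 2: first "learn music from scratch" on a broad corpus ...
general_corpus = [["C", "F", "G", "C"], ["A", "D", "E", "A"]]
model = train(defaultdict(Counter), general_corpus)

# ... then fine-tune on the style-specific data, counted with a higher
# weight so the specific style dominates the learned transitions.
ramones_like = [["A", "D", "E", "E"], ["A", "D", "E", "E"]]
model = train(model, ramones_like, weight=5)

print(most_likely_next(model, "E"))  # the style-specific continuation wins
```

Strategy 1 (training on the Ramones data alone) would simply skip the first `train` call; the point of pretraining is that transitions never seen in the small style corpus are still covered by the general one.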
And what kind of data are we talking about?
Basically, there are two options for music: either you use the MIDI data, i.e. the notes, or you use the audio files. Audio files, with 44,100 samples per second, naturally provide much more data, but are also much more complex. For example, Magenta by Google or the Dadabots work with audio files; OpenAI's MuseNet uses MIDI data. I trained on MIDI files, so what came out were written, or rather calculated, notes - not tones. What the Dadabots do, for example, goes a bit further: they string together thousands of hard rock and death metal audio pieces, train their network on them and thereby arrive at completely new tones. So in their case, the timbre, instruments and so on are included. In my case, it was an abstract tone that I calculated.
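The size difference between the two representations is easy to estimate. Only the 44,100 samples per second comes from the interview; the MIDI figures below (event count, bytes per event) are rough assumptions for illustration.

```python
# Back-of-the-envelope comparison of the two data representations:
# raw audio versus MIDI-style note events, for one 2.5-minute song.

SAMPLE_RATE = 44_100                 # samples per second, as cited above
song_seconds = 150                   # a typical 2.5-minute punk song
channels, bytes_per_sample = 2, 2    # 16-bit stereo PCM

audio_bytes = SAMPLE_RATE * song_seconds * channels * bytes_per_sample

# Assumption: a dense punk song might hold a few thousand note events,
# and a MIDI note-on/note-off pair takes on the order of 6 bytes.
midi_events, bytes_per_event = 3_000, 6
midi_bytes = midi_events * bytes_per_event

print(f"audio: {audio_bytes / 1e6:.1f} MB, MIDI: {midi_bytes / 1e3:.1f} kB")
print(f"audio is ~{audio_bytes // midi_bytes}x more data")
```

Under these assumptions, the audio representation carries about three orders of magnitude more raw data per song, which is why audio-based models like the Dadabots' need far more training material and compute.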
You mention Magenta and the Dadabots, among others. Where do we stand today in the field of AI-generated music?
Just recently, the AI Song Contest took place again, where the Dadabots were among the participants and you could see various approaches to what can be done with AI. In principle, artificial intelligence and the corresponding tools are mainly used in a supporting role. I would rather speak of augmented intelligence than artificial intelligence - in other words, an interplay between musicians and technology through which they optimise their creative processes. For me personally, this is also the most promising approach for the future.
So you don't expect AI to storm the hit parade soon, with purely computer-generated songs?
There are such approaches - for example, for computer music or partly for meditation music, i.e. repetitive pieces. In my opinion, we are still a good while away from a completely computer-generated song with everything that goes with it. Maybe it will be possible in five years, but I don't want to make a precise prediction. And one question that will certainly arise in this respect is: do we want that?
In the future, will there also be areas where AI will not stand a chance against a human artist?
I don't think so, no. I do believe that it will be difficult to develop an AI that can make coffee, drive a car and write pieces of music at the same time. But artificial intelligence specialised in one area will be able to do much, much more in the future than it can today - and I don't think there is anything that humans can do that an AI won't be able to do.
This also applies to creativity, which can also be simulated. That was something I learned from my THE RaiMONES project: random generators were used to produce sound fragments, which were then used to create something new. A good example is AlphaGo Zero, which was able to make moves in the board game Go that no one had ever thought of before. I expect the same with music: technological possibilities will create something new that people would not have thought of themselves.
Our creativity is also shaped by external influences. With AI, you could think of these as inputs from data, so to speak?
Exactly. When AI is trained on vast amounts of data, it can get to places that we humans can never get to. We can look at this again in the context of music: when we grow up here in the West, we are influenced by Western music, in India by Indian music - of course there are already certain fusions there, too, but it is rather difficult for something completely new to emerge. AI, on the other hand, which learns music in all its breadth and depth from scratch, without preferences and restrictions, has, in my opinion, more potential to invent something completely new. The question is: do we like it? After all, our ears are also trained in a certain way to the music we know.
Let's take another step back: how concretely does AI already support today's musicians?
There are various tools that are being used. These can, for example, generate suggestions for missing interludes in songs, serve as inspiration for musicians or generate the drum part for a song they have written. As I said, they are a help. But there are also other tools. Everything that concerns signal processing goes in the direction of machine learning - for example, "Source Separation" to separate individual musicians from a stereo track. For instance, if you only want to filter out the violin from a track or the vocals for karaoke machines. Nowadays, all this is already done with machine learning or is based on trained data.
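For the karaoke use case just mentioned, there is a classical, pre-machine-learning baseline worth sketching: centre-channel cancellation. Vocals are usually mixed identically into both stereo channels, so subtracting one channel from the other cancels them while side-panned instruments survive. This is not how modern ML separators work, and the signals below are synthetic sine tones, but it illustrates what "separating a stereo track" means in the simplest case.

```python
import numpy as np

rate = 44_100
t = np.arange(rate) / rate                    # one second of audio
vocal = 0.5 * np.sin(2 * np.pi * 440 * t)     # centre-panned "vocal"
guitar = 0.3 * np.sin(2 * np.pi * 196 * t)    # left-panned "guitar"

left = vocal + guitar                         # vocal appears in both channels,
right = vocal                                 # guitar only on the left

karaoke = left - right                        # centre content cancels out
print(np.allclose(karaoke, guitar))           # only the guitar survives
```

Real recordings defeat this trick as soon as reverb or stereo effects are applied to the vocals, which is exactly where trained source-separation models take over.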
Is it mainly about writing music or also the concrete implementation, meaning playing the songs?
Both. For my project, I could basically have fed the data into a MIDI-to-audio generator. But completely computer-generated music does exist, of course; as I mentioned, the Dadabots generate complete death metal songs that way, and there are services like Endel that offer something similar. There, for example, you can specify your mood and a suitable soundtrack is generated by the computer based on that. I find this influence of music on emotions very interesting. I think a lot will happen there in the future.
We notice ourselves that music has an influence on our emotions. I listen to different pieces when I want to concentrate than when I want to push myself, for example when I'm jogging. Technologies are currently being developed to deal with this. For example, the influence of certain music on us is measured via brain waves. That's super exciting, because it allows music to be adjusted to a person in such a way that he or she can optimally reach certain brain states, for example, in order to concentrate in the best possible way. The ETH spin-off IDUN Technologies, for example, is working on using headphones to find out the emotional state of the user.
What is going on in Switzerland in the field of AI & music? Is there an AI music scene or interesting enterprises?
In concrete terms, there is not much happening in the start-up sector in Switzerland. There are researchers at the Zurich University of the Arts (ZHdK) who are researching AI and creativity or music, as well as a department at EPFL. From the music scene, I find Melody Chua very interesting; she plays the "augmented flute", so to speak. Among companies, there is an exciting start-up based in Zurich: Mictic. I would describe what it does as "augmented reality audio". You wear two wristbands and, by moving your arms, you can create all kinds of different sounds. I'm convinced that music will move in this direction in the future, so we won't just consume it, but will be able to shape it interactively.
So, among other things, it's about giving people who know little about music and can't read notes the opportunity to create something themselves?
Yes, to be creative. With the Mictic product, for example, you can experience what it's like to play the cello without ever having held one in your hand. I find that a very interesting aspect: using technology to simplify certain things, but also to promote creativity. That's what it's all about. It doesn't replace people or musicians.
So, that's a relief for the musicians…
Well, an analogy can perhaps be found in photography. In the past, a lot more was needed to take a good photo - big cameras, chemicals, etc. Today, a mobile phone alone is sometimes enough. But which mobile phone photos really have artistic value? I think it will be similar with music. People will be empowered to make music more easily themselves - but will they come close to an Elvis Presley? I don't think so.
So the music world of the future will be simpler, more interactive and individually adapted - and what will our everyday musical life look like in concrete terms?
I'm curious about that too! (laughs)
Matthias Frey has been fascinated by the mix of technology and music since his earliest days. No wonder he switched from the violin to the electric bass guitar at a young age in order to be able to tinker with effects. He studied electrical engineering at ETH Zurich and earned a doctorate in analogue electronics and signal processing. Professionally, Frey works for an electronics company in the areas of signal processing, AI and hardware for AI, and is personally intensively involved with current developments in music & technology.