This interview, written by Robin Wacanno and Mund Vetter, is from Cavia 56 - The Great VIA vacation book. The "Cavia", Dutch for guinea pig, is the magazine of via and is published once a quarter.
The Eurovision Song Contest also fell prey to the coronavirus this year. Fortunately, this loss was compensated for in May by the so-called AI Song Contest, in which teams from different countries were to make a song with the help of AI. Master's student Artificial Intelligence and active via member Janne Spijkervet created a song for this contest together with Willie Wartaal, a famous Dutch rapper. How did such a unique collaboration come into existence, and how did the process go from idea to end product? We had the opportunity to interview Janne about her participation in the contest.
Janne has been making music from an early age. At the age of eight she started taking drum lessons and did so for a couple of years. However, she eventually came to the conclusion that she didn't like the 'loud sounds' that came out of the drums. She was eleven years old when she started taking piano lessons, just like her sister. As time went on, she took over the hours of her sister's piano lessons and noticed during these lessons that she didn't like playing the sheet music literally. When she had to play an etude by Chopin, she started improvising after a few bars, to the great displeasure of her piano teacher, who was eventually replaced by one who did encourage improvisation.
This urge to invent her own music piqued her interest in computers, with which she could create a piece of music for a full orchestra using just the program "Logic" and her keyboard. At the age of fourteen she participated in a composition contest, for which she wrote a piece with no fewer than fifteen instruments. She reached the finals of that contest, which she unfortunately did not win, but the experience made her realize that composing was 'very cool' to do. The competition opened her eyes, and she decided to study music theory and composition alongside high school.
After high school, she started a bachelor's in media composition. There she learned how to write music for movies and commercials, but she was also taught how to program audio applications. For her thesis on artificial intelligence and composition she wrote an algorithm that could compose choral pieces based on Bach's chorales and choral pieces. She wanted to learn more about artificial intelligence, so in 2017 she made the transition to the pre-master Artificial Intelligence at the UvA. The subjects Linear Algebra and Logic were hard, but fortunately the good teaching of Leo Dorst pulled her through. After a year of hard work, she was finally able to start the master Artificial Intelligence, which she has now almost completed.
Janne's involvement in the Song Contest was actually a happy coincidence. Together with her thesis supervisor Ashley Burgoyne, a lecturer and researcher at the UvA, she attended an annual conference organized by ISMIR (International Society for Music Information Retrieval). This conference brings together people from different countries who do research at the interface of music and information science. At the conference, the director of the AI Song Contest was scouting for researchers, mainly those interested in music generation, to take part in the competition. This director asked Janne and her thesis supervisor to join in. From this, a team of UvA researchers formed, with Janne as an important member.
In addition to broadcasting the final of the event, the VPRO also planned to make a series of short YouTube videos that followed Janne's team as they made their song. These videos can be viewed here.
A song usually consists of a number of parts of which the lyrics and instrumental accompaniment are the most important. The goal of the contest was to make a song which would use artificial intelligence as a tool in the making of both parts.
Janne's task within the team was to train a language model that could generate lyrics. The basis was the existing GPT-2 model from OpenAI. Although this model could in principle already generate lyrics as adopted, a lot of work was still needed to get usable lyrics out of it. The model had been pre-trained on a general dataset of English text collected from the internet, which gave it a general 'understanding' of the English language. To use the model for the specific task of lyric generation, it also had to be fine-tuned, that is, trained further on a dataset consisting specifically of lyrics. The model that Janne finally trained and the corresponding front end can be found in this Git repo: https://github.com/Spijkervet/gpt-2-lyrics.
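To give an idea of the data-preparation side of such fine-tuning, here is a minimal sketch in Python. It is not taken from Janne's repository: the function name `build_training_text` and the sample songs are hypothetical. The only GPT-2 specific detail is the `<|endoftext|>` token, which GPT-2 uses to mark the boundary between documents. The sketch joins individual song lyrics into one training text with that separator, so that a fine-tuned model can learn where one song ends and the next begins.

```python
# Hypothetical sketch: pack song lyrics into one text for GPT-2 fine-tuning.
END_OF_TEXT = "<|endoftext|>"  # GPT-2's document boundary token


def build_training_text(songs):
    """Join cleaned song lyrics into one string, one document per song."""
    cleaned = []
    for lyrics in songs:
        # Trim whitespace around every line of the song.
        lines = [line.strip() for line in lyrics.strip().splitlines()]
        text = "\n".join(lines)
        if text:  # skip songs that are empty after cleaning
            cleaned.append(text)
    # Close every song with the separator token on its own line.
    return "".join(text + "\n" + END_OF_TEXT + "\n" for text in cleaned)


# Made-up example input, not real training data.
songs = ["First line \nSecond line", "   ", "Another song"]
training_text = build_training_text(songs)
print(training_text)
```

The resulting text could then be handed to whichever fine-tuning tool is used; that step is left out here because it depends on the chosen library.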
The model eventually produced output that can be read here. The first part could be used as lyrics for the song without modification. A number of interesting things can be seen in this raw output of the algorithm. First of all, the program picked up certain meta instructions from the training data and reproduced them in its output. A nice example of this is the following fragment:
Kill the government
Kill the system
As you can see, the model has learned that verses can start with such instructions. These instructions were eventually adopted in the song. Something else that stood out about the output of the program is the rather anarchistic nature of the text. This is already present in the fragment above, but there are many more examples of this in the text.
Another member of the team, Arran Lyon, also a student at the UvA, was responsible for the instrumental accompaniment of the song. The algorithm he had developed for an earlier project was trained on a dataset of MIDI versions of songs, including songs from the Eurovision Song Contest. Based on this data, the algorithm can generate so-called musical phrases (short series of notes). So it is not the case that a complete song came rolling out of this program at the push of a button. It only provided the building blocks from which an actual instrumental track could later be put together.
From the mountain of generated melodies, suitable candidates of course had to be selected. This also involved artificial intelligence. The aforementioned lecturer and researcher Ashley Burgoyne, who uses algorithms to study whether a song is 'catchy', had developed a program called Catchy for this purpose. By training this program on Song Contest songs that had scored well in the past, the team hoped to be able to select melodies that in turn could become the building blocks for a successful Song Contest song.
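The selection step can be pictured as ranking candidates by a score and keeping the best few. The sketch below assumes that a catchiness score per melody is already available. How Catchy computes such a score is not described here, so the scores, the phrase names and the function `select_top_candidates` are purely hypothetical.

```python
# Hypothetical sketch: keep the k highest-scoring melody candidates.
def select_top_candidates(scores, k):
    """Return the names of the k candidates with the highest score, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up scores for four generated phrases (higher means catchier).
scores = {"phrase_a": 0.42, "phrase_b": 0.91, "phrase_c": 0.67, "phrase_d": 0.15}
best = select_top_candidates(scores, k=2)
print(best)  # ['phrase_b', 'phrase_c']
```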
One thing was still missing at this point: a vocal. The rapper Willie Wartaal was responsible for that. However, his participation, or that of a human singer in general, was not foreseen at the start; the plan was to do the singing with the help of artificial intelligence as well. It turned out that this technique was still too much in its infancy. Another advantage of a human singer is that he could also contribute to the publicity of the project. The choice fell on Willie Wartaal. Janne, with her background in composition, was allowed to work with him in the studio.
Working in the studio with Willie Wartaal was very special. The collaboration started at Science Park. Without anyone noticing it, one of the biggest rappers in the Netherlands walked through Science Park, to the robot lab, where he and Janne had to put together a track from a mixture of melodies, bass lines and chords, while next to them the scientists were working hard.
Willie Wartaal had mixed feelings about the use of an algorithm. On the one hand he saw artificial intelligence as a kind of pet that you feed some data and that then generates ideas for you. On the other hand, he found it difficult that as an artist he was not in control of his work. You give an algorithm some input, but you have no idea what comes out of it.
Janne made two instrumental demo tracks at home from the melodies that were selected. These became two very different songs. Willie Wartaal resolutely did not choose Janne's favorite track. Janne could understand this decision. Her favorite version, which can be listened to here, showed better what an algorithm was capable of, but was simply too difficult for Willie Wartaal to sing over. After Willie Wartaal's contribution, the song was finished and ready to be judged.
After each team had made their song, the songs of course had to be judged. The ranking was made using the average of a jury grade and an audience grade. Janne's team finished in third place. She was very satisfied with this result, even more so because she values the jury grade more. On the basis of the jury grade alone, they would have been in second place together with Australia.
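A small sketch shows how a ranking based on the average of two grades can differ from a ranking based on the jury grade alone. The team names and grades are invented for illustration and are not the actual contest results; the function `rank_teams` is likewise hypothetical.

```python
# Hypothetical sketch: rank teams by the average of jury and audience grade.
def rank_teams(jury, audience):
    """Return team names ordered by the mean of both grades, best first."""
    combined = {team: (jury[team] + audience[team]) / 2 for team in jury}
    return sorted(combined, key=combined.get, reverse=True)


# Invented grades, not the real scores of the AI Song Contest.
jury = {"team_a": 8, "team_b": 7, "team_c": 8}
audience = {"team_a": 6, "team_b": 9, "team_c": 4}

ranking = rank_teams(jury, audience)
print(ranking)  # ['team_b', 'team_a', 'team_c']
```

In this made-up example team_a shares the highest jury grade with team_c, yet ends up second overall because the audience grade counts equally.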
Of the other entries Janne especially liked the song of the German team 'Databots x Portrait XO'. Artificial intelligence was used for all facets of this team's song, and therefore very little human intervention was needed. This made the song sound more experimental and less like a standard Song Contest song. It served as a nice example of what is possible with artificial intelligence in the field of machine music generation.
Looking back on her participation, there are also things that Janne thinks could have been done better. An important example of this is the compilation of the dataset(s) for training the models. For the melody generation model, for example, a dataset of songs that had featured in the Song Contest was used, as well as general pop songs. However, the model was not trained on an anti-corpus of songs that would not work well.
Finally, Janne hopes that the AI Song Contest will continue to be organized. She really enjoyed her participation and sees the event as a useful platform as well. Thanks to the cooperation between the VPRO and research groups, a subject as heavy as the development and application of artificial intelligence can be shown to a wider audience in an interesting way.