Vox is my attempt to program the smallest choir in the world in the form of a computer demo. According to Wikipedia, demos are "self-contained, sometimes extremely small computer programs that produce audio-visual presentations."
This is a long-term project that reaches back to autumn 2013. My goal is to produce a so-called intro, an executable file of 64 KB or less that unfolds into a complete choir when you run it on your PC.
After around three days, I came up with this first attempt to mimic a choir.
For the first draft, I only modeled the vowels A, I and U and interpolated between them. This 3-day draft was very promising and you may ask yourself why the choir piece is still not finished after so many years. The answer is simple:
For a long time, I did not manage to make the choir sing whole words, but above all, I was never satisfied with the artistic expression of the voices. They were not human enough. I tried stochastic and other methods of composing, but finally had to accept that I could not reach my goal by tweaking numbers and graphs on the screen with a mouse. So in principle I had a vocal synthesizer that was able to mimic human singing, but no way of finding out how to set all the parameters and timings to make it sound right.
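As an illustration of the first draft's idea of interpolating between the vowels A, I and U, a minimal Python sketch could look like the following. It is not the actual Vox code: it assumes that each vowel is described by three formant frequencies, it uses approximate textbook values for an adult male voice, and the names `VOWEL_FORMANTS`, `interpolate_vowels` and `glide` are hypothetical.

```python
# Approximate textbook formant frequencies (Hz) for an adult male voice.
# These numbers are illustrative assumptions, not values taken from Vox.
VOWEL_FORMANTS = {
    "A": (730.0, 1090.0, 2440.0),
    "I": (270.0, 2290.0, 3010.0),
    "U": (300.0, 870.0, 2240.0),
}


def interpolate_vowels(start, end, t):
    """Linearly blend the formants of two vowels.

    t = 0.0 returns the formants of `start`, t = 1.0 those of `end`.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    a = VOWEL_FORMANTS[start]
    b = VOWEL_FORMANTS[end]
    return tuple((1.0 - t) * fa + t * fb for fa, fb in zip(a, b))


def glide(start, end, steps):
    """Return a list of formant frames gliding from one vowel to another."""
    if steps < 2:
        raise ValueError("need at least two steps")
    return [interpolate_vowels(start, end, i / (steps - 1)) for i in range(steps)]
```

Calling `glide("A", "I", 50)` would yield 50 formant frames that a synthesizer could step through. A sketch like this only covers the vowel colour; the timing and expression that make a voice sound human are exactly the part that hand-tuning could not deliver.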
This version from 2014 contains some stochastic elements.
One solution to this dilemma was to extract the vocal expression of real humans from audio recordings. For me, the breakthrough in this regard came from machine-learning algorithms. I am currently able to automatically train and adapt the voice synthesizer parameters to mimic prerecorded phrases from different singers. The current synthesizer is based on my own voice, but I can, for example, simulate female voices by modifying the assumed size of the oral cavity accordingly.
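The effect of changing the assumed cavity size can be sketched with a simple acoustic rule of thumb: in a uniform-tube model of the vocal tract, resonance frequencies are inversely proportional to the tube length. The Python sketch below applies that rule. It is an assumption that a single length value stands in for the size of the oral cavity, the tract lengths in the example are rough typical figures, and `scale_formants` is a hypothetical helper, not part of Vox.

```python
def scale_formants(formants_hz, source_tract_cm, target_tract_cm):
    """Rescale formant frequencies for a vocal tract of a different length.

    Uniform-tube assumption: resonances are inversely proportional to the
    tube length, so a shorter tract raises every formant by the same factor.
    """
    if source_tract_cm <= 0 or target_tract_cm <= 0:
        raise ValueError("tract lengths must be positive")
    factor = source_tract_cm / target_tract_cm
    return tuple(f * factor for f in formants_hz)


# Rough, illustrative tract lengths (assumptions, not measured values).
MALE_TRACT_CM = 17.0
FEMALE_TRACT_CM = 14.5

# Formants of an "A"-like vowel for the longer tract, shifted upwards
# to approximate the same vowel produced by the shorter tract.
shifted = scale_formants((730.0, 1090.0, 2440.0), MALE_TRACT_CM, FEMALE_TRACT_CM)
```

A uniform scaling like this is only a first approximation, since real voices also differ in pitch range and in the proportions of the individual cavities.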
The next step will be to multiply and modify these phrases so that the choir finally unfolds its full vocal diversity. For composing purposes, I decoupled intonation (what you sing) from pitch, general amplitude and other parameters (how you sing). This way, voices can use the same words or phrases but sing them with different pitch, gain, etc. The final composition, piecing everything together and storing it as compactly as possible, will take at least until the end of 2020.
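The separation of what is sung from how it is sung could be modelled roughly as in the following Python sketch. All class and function names (`Phrase`, `Delivery`, `VoicePart`, `build_choir`) are hypothetical, and reducing the "how" to pitch, gain and start time is a simplifying assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phrase:
    """What is sung: a named sequence of (sound, duration in seconds) pairs."""
    name: str
    segments: tuple


@dataclass(frozen=True)
class Delivery:
    """How it is sung: pitch, gain and start time of one voice."""
    pitch_hz: float
    gain: float
    start_s: float = 0.0


@dataclass(frozen=True)
class VoicePart:
    """One choir voice: a shared phrase combined with its own delivery."""
    phrase: Phrase
    delivery: Delivery


def build_choir(phrase, base_pitch_hz, semitone_offsets, gain=0.5):
    """Create one voice per semitone offset, all singing the same phrase."""
    return [
        VoicePart(phrase, Delivery(base_pitch_hz * 2.0 ** (s / 12.0), gain))
        for s in semitone_offsets
    ]


# Usage: three voices share one phrase object but differ in pitch.
phrase = Phrase("ah-ee", (("A", 0.5), ("I", 0.5)))
choir = build_choir(phrase, 220.0, [0, 7, 12])
```

Because every `VoicePart` only references the phrase, the phrase data would need to be stored once no matter how many voices sing it, which fits the goal of keeping the final file small.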