[23] L'lasons de Aether ver Beinags #5
- Conlan Walker

- Mar 4, 2022
- 3 min read
This week, I managed to synthesize something akin to modal speech, if only in its most basic form. I had read a few sources noting that voice produces harmonics, without really explaining what that entails. Since I recently coded something that can generate a wav file, I of course wanted to use it for this purpose.

To add a harmonic series into one audio stream, I needed a way to combine multiple streams into one, which is known as mixing. Either there isn't a single right answer to that problem, or I was just reading bad information. Nonetheless, I thought of two different ways to mix audio streams on a sample-by-sample basis: the first simply outputs the sum of the inputs, and the second outputs the sum of the inputs divided by the number of inputs. I found pretty quickly, though, that with the second method each successive harmonic gets divided down heavily depending on the harmonic index, and it produced sounds that aren't particularly useful to me.
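In rough pseudocode terms, the two mixing approaches look something like this (a minimal Python sketch; the function names are mine, not from my actual code):

```python
def mix_sum(streams):
    """Mix by summing: output sample i is the sum of every stream's sample i.
    Louder with each added stream, and can clip past the sample format's range."""
    return [sum(samples) for samples in zip(*streams)]

def mix_average(streams):
    """Mix by summing, then dividing by the number of inputs.
    Can't clip, but every stream gets quieter as more are added."""
    n = len(streams)
    return [sum(samples) / n for samples in zip(*streams)]

# Two tiny "streams" of three samples each:
a = [1.0, 0.5, -0.5]
b = [0.5, 0.5, 0.5]
print(mix_sum([a, b]))      # [1.5, 1.0, 0.0]
print(mix_average([a, b]))  # [0.75, 0.5, 0.0]
```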
Since I've already made the bit of code that automates the creation of a wav, I can simply reuse most of it to serve as a sort of template. This is the code that came out of that:

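In rough outline, that generator does something like the following (a minimal Python sketch with placeholder values, not the exact code from the screenshot):

```python
import math
import struct
import wave

SAMPLE_RATE = 44100   # Hz
F0 = 100.0            # fundamental frequency (Hz)
HARMONICS = 7         # how many harmonics to stack
DURATION = 1.0        # seconds

def amplitude(hi):
    """Volume of harmonic number hi (1 = the fundamental). Placeholder curve."""
    return 1.0 / hi

def sample_at(t):
    """Sum every harmonic's sine wave at time t."""
    return sum(amplitude(hi) * math.sin(2 * math.pi * F0 * hi * t)
               for hi in range(1, HARMONICS + 1))

# Normalize so the mix can't exceed the 16-bit range, then write PCM frames.
peak = sum(amplitude(hi) for hi in range(1, HARMONICS + 1))
frames = bytearray()
for i in range(int(SAMPLE_RATE * DURATION)):
    s = sample_at(i / SAMPLE_RATE) / peak
    frames += struct.pack('<h', int(s * 32767))  # little-endian signed 16-bit

with wave.open('harmonics.wav', 'wb') as w:
    w.setnchannels(1)   # mono
    w.setsampwidth(2)   # 16-bit
    w.setframerate(SAMPLE_RATE)
    w.writeframes(bytes(frames))
```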
My first test involved setting the volume of each harmonic to the inverse of the current harmonic (not shown in the above image, but it would be 1/hi). When I first tried to generate this, the program produced a wav with a waveform resembling that of a reverse sawtooth.
The upper section of this image is what was generated, with the bottom section being an actual reverse sawtooth for reference:

Apparently, adding a bunch of harmonics together like this will produce a sawtooth. I wanted to see how exact I could make it, so instead of using 7 harmonics, I used 2000. And instead of a 44.1kHz sample rate, I used 1MHz. Of course, at that rate I can only render 3/100ths of a second in a reasonable amount of time, but it was enough to create this (with quite a lot of detail):
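The "why" behind the sawtooth is Fourier series: the infinite sum of sine harmonics at 1/n amplitude converges to a descending ramp, which is the reverse sawtooth. A quick numerical check of that convergence (pure sine partials at 1/k amplitude, 2000 harmonics as above):

```python
import math

def partial_sum(x, n_harmonics):
    """Sum sin(k*x)/k for k = 1..n_harmonics."""
    return sum(math.sin(k * x) / k for k in range(1, n_harmonics + 1))

# For 0 < x < 2*pi, the infinite series equals (pi - x)/2:
# a descending ("reverse") sawtooth ramp.
x = 1.0
approx = partial_sum(x, 2000)
exact = (math.pi - x) / 2
print(abs(approx - exact))  # small, and it shrinks as harmonics are added
```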

Now, onto the other sample post-process method... thing.
Last week, I was asked to show exactly how I went about generating these wavs, so I recorded the whole process such that everything important is on-screen:
For the standalone version of the output, here's a vocaroo audio-only embed:
I don't fully understand why this works, but it has acceptable fidelity in the realm of pseudo-modal vocalization. In other words, I'm satisfied enough with the result to focus my attention on something else for the first time in a month.
If you compare the waveform this generates against my actual voice, you'll see their similarity.
For this image, my voice is near the top, and the waveform of what you just heard is below:

Spectrum analysis of the two further shows their similarity.
Spectrum of my voice (some high-end mic noise is present in this too)

Versus my synthesized version:

Note that the fundamental frequency of the synth is exactly 100Hz, whereas mine hovers around 96-98Hz. I like how convenient the round number of 100 is in calculations, so I'll keep it that way for now.
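For example, the convenience of 100Hz is that it divides evenly into the sample rate (hypothetical numbers for illustration):

```python
SAMPLE_RATE = 44100  # Hz
F0 = 100             # synth fundamental (Hz)

# One period of a 100Hz wave is exactly 44100 / 100 = 441 samples:
# whole-number periods, so cycles line up cleanly with the sample grid.
print(SAMPLE_RATE / F0)  # 441.0

# My real voice at ~97Hz would not line up:
print(SAMPLE_RATE / 97)  # roughly 454.6 samples per period
```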
So uhh, to end this, here are a few simple sketches I made in Blender that each took a disproportionate amount of time to make:




That's it for this week.
