Page version: 1.3
Source code version: 1.2

1/f Music

In [1], RF Voss and J Clarke found that if you treat the melodies of songs from the radio as sequences of numbers, the power spectra of these sequences follow a 1/f line. In [2] they produced random melodies whose spectra were either flat ("white noise"), 1/f^2 ("Brownian" or "brown" noise), or slopes in between, including the exact midpoint, 1/f ("pink" or "flicker" noise). Experimental subjects picked the 1/f sequences as the most musical and pleasant sounding, neither too random, like white noise, nor too boring, like Brownian noise (a pre-echo of the "edge of chaos" idea). This added to the semi-mystical literature on 1/f spectra, and started a major branch of fractal computer music composition. The work was publicized in Martin Gardner's Scientific American column [3].

Waveforms vs. Melody Lines

Just to be absolutely clear, the noise we're talking about is not audio noise (although there are white, pink and brown noises in the audio spectrum too). Just as sound can be represented as sequences of (say) 44 thousand numbers per second, with frequencies from 20 thousand cycles per second down to twenty cycles per second, so a melody line can be represented as a sequence of numbers, with each number representing the height of a note on the musical staff. These numbers change as often as a new note is played, and range in frequency from the fastest figures being played (e.g. sixty-fourth notes at one quarter note per second would be 16 numbers, or 8 cycles, per second), down to a frequency of one cycle in the duration of a piece (e.g., if a piece lasts an hour, that's 1/3600 cycles per second).

Random sequences of numbers are called "noise" even though the sequences of note numbers we're talking about are at frequencies below the range of human hearing. Also, we'll still be talking about "power", which varies like the square of amplitude, even though these aren't sequences of voltages.

White, Brown, and Pink Noise

If each number is an independently-generated random number, then you have white noise. Over a long time, the Fourier transform of a sequence of random numbers has about the same amplitude, and thus the same power, at every frequency.

Brownian noise is white noise integrated. That is, each number in the sequence is the previous number plus a random number drawn from a range centered on zero. With Brownian noise, the amplitude at each frequency is proportional to the inverse of the frequency, and since the power is the square of the amplitude, the power is proportional to 1/f^2.
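The difference between the two is easy to see in a few lines of code. The following is a minimal Python sketch, not taken from the original program; the uniform range of -1 to 1 and the function names are assumptions chosen for illustration.

```python
import random

def white_noise(n, rng):
    """Return n independent random numbers, uniform on [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def brown_noise(n, rng):
    """Return the running sum of white noise.

    Each value is the previous value plus a fresh random step,
    which is what "white noise integrated" means.
    """
    total = 0.0
    out = []
    for step in white_noise(n, rng):
        total += step
        out.append(total)
    return out

rng = random.Random(1)        # fixed seed so the run is repeatable
white = white_noise(16, rng)  # jumps around with no memory
brown = brown_noise(16, rng)  # wanders, each value close to the last
```

Consecutive values of `brown` never differ by more than one step, while consecutive values of `white` are unrelated to each other.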

Noise is "pink" when the power at each frequency is proportional to 1/f, which means the amplitude is like 1/sqrt(f). This isn't as easy to arrange as white or brown noise. Pink noise (or any other spectrum shape) can be made with inverse Fourier transforms, or with multi-stage filters applied to white noise, but Voss found an easy way to get pretty-good pink noise.

Voss's Pink Noise Generator

The key is that pink noise contains the same amount of power in each octave. From one octave to the next, there are twice as many frequencies, but the 1/f rolloff means that each has half the average power. That means that a good approximation to pink noise can be had by adding the outputs of a series of random number generators, one for each octave of the noise you want to generate. For instance, a melody with sixteen notes can be generated with four random number generators being updated at every note, every other note, every fourth note, and every eighth note:
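The sixteen-note, four-generator case can be sketched in Python as follows. This is an illustration of the scheme as described here, not code from Voss or from the program on this page; the value range of -3 to 3 and the function names are assumptions.

```python
import random

def voss_rows(num_notes, num_generators, rng):
    """Return one row of values per generator.

    Generator k draws a new random value every 2**k notes and
    holds that value in between.
    """
    rows = []
    for k in range(num_generators):
        period = 2 ** k
        row = []
        for i in range(num_notes):
            if i % period == 0:          # time for this generator to update
                value = rng.randint(-3, 3)
            row.append(value)
        rows.append(row)
    return rows

def voss_melody(num_notes=16, num_generators=4, seed=None):
    """Add the generator rows note by note to get the melody."""
    rng = random.Random(seed)
    rows = voss_rows(num_notes, num_generators, rng)
    return [sum(column) for column in zip(*rows)]

melody = voss_melody(seed=0)
```

The fastest row changes at every note and the slowest changes only once in sixteen notes, so each octave of the spectrum gets its own independent source of variation.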

Alternatively, we can draw this as a binary tree of numbers:

The best information I've seen on generating 1/f noise with computers is [4].

Pyramid Music

This adding of a number to all the numbers in a sequence, taking all the notes in a phrase and moving them all up or down the scale by the same amount, is called transposition, and happens all the time to phrases within songs (think of the melody to "My Baby Does the Hanky Panky"). Phrases can be repeated by having nodes in the tree share subtrees. The idea of "pyramid music" is to give the levels one, two, three, four nodes, etc., instead of one, two, four, eight...

These hierarchically-combined, transposed phrases still have 1/f spectra (although more spiky), but sound more musical than plain 1/f melodies. Repetition and transposition give you themes and melodies, and the two together make it sound as if something's going on here, as if somebody's doing something on purpose. Pink melodies sometimes sound childish or pedantic, but always as if they're intended to be music even when they're really bad. Artificial intelligence programs often act alien, stiff or inscrutable; sometimes they make mistakes that no person would make, but I think consistent artificial childishness or human-like stupidity holds some kind of clue that's rare in AI.

Structure Choices

In Voss's pink noise generator, each node in the tree has two subtrees all its own, and one random number that transposes the combination of the two subphrases. Forcing sharing of subtrees means the nodes have to choose which subphrases to combine. (You can see in the diagram above that the pattern of lines is no longer uniform; I've chosen the connections semi-randomly.) I call these choices "structure choices," and I keep them separate from the choices of transpose amounts. The reason is that when I add parallel sequences of numbers related to the same song, for instance dynamics or the circle-of-fifths sequence explained below, the parallel sequences need to follow the same song structure, yet have different choices corresponding to the transpose amounts in the melody.

It would be easy for the random structure choices to use some phrases a lot, and leave other phrases out altogether. I add the constraint that every phrase (or note) from one layer be used in the layer above. The diagram above follows this rule. The way the program enforces the rule is just to redo the whole set of structure choices for a layer until they satisfy the rule. (I spent a lot of time sweating over inventing a smarter method before seeing this obvious way!)
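The redo-until-valid idea might look like this in Python. It is a sketch under assumptions, not the program's code: it supposes that each node picks two lower-layer phrases uniformly at random, and the function name and the `max_tries` safety limit are made up for the example.

```python
import random

def choose_structure(num_upper, num_lower, rng, max_tries=10000):
    """Pick two lower-layer phrase indices for each upper-layer node.

    The whole set of choices is thrown away and redrawn until every
    lower-layer phrase is used at least once.
    """
    for _ in range(max_tries):
        choices = [(rng.randrange(num_lower), rng.randrange(num_lower))
                   for _ in range(num_upper)]
        used = {index for pair in choices for index in pair}
        if len(used) == num_lower:       # nothing left out: accept
            return choices
    raise RuntimeError("no valid set of structure choices found")

# Four nodes choosing among five phrases in the layer below.
choices = choose_structure(4, 5, random.Random(3))
```

Because the structure choices come back as plain index pairs, the same pairs can be reused for a parallel sequence while the transpose amounts are drawn separately.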

Doing by Copying, Syncopation

Rather than build the melody by walking a tree structure, it's easier to build each layer as a list of notes, one layer at a time. Each new phrase is just some contiguous set of notes from the previous layer, copied into the new layer with a transpose added.
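A minimal Python sketch of the copying step follows. The function name, the example notes and the transpose amounts are all hypothetical; the point is only that a phrase is a slice of the previous layer with a constant added, and that the slice may start at any index.

```python
def copy_phrase(previous_layer, start, length, transpose):
    """Copy `length` notes beginning at `start`, adding `transpose` to each."""
    return [note + transpose for note in previous_layer[start:start + length]]

previous_layer = [0, 2, 4, 5, 7, 5, 4, 2]

# Starting at the beginning of the layer.
on_the_beat = copy_phrase(previous_layer, start=0, length=4, transpose=3)
# → [3, 5, 7, 8]

# Starting one note later.
offset_by_one = copy_phrase(previous_layer, start=1, length=4, transpose=-2)
# → [0, 2, 3, 5]
```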

This allows syncopation: the notes from the lower level don't have to start on a boundary where a whole phrase was assembled, but can be offset by some number of beats. The yellow blocks in the score below show how the "Maple Leaf Rag" uses variations of the same phrase, once starting at the second beat of a measure, then starting at the first beat:

Scale, Key or Mode

1/f and pyramid melodies sound better on diatonic (white note) or pentatonic (black note) scales than on a chromatic scale. Transposed phrases, in particular, aren't usually moved an exact chromatic interval but fit to the scale in use. But having the program force notes into a predetermined scale seems arbitrary and lame, and doesn't allow for accidentals or changes of key in mid-song. I would like the scale, mode or key of the melody to come from something more primitive or organic somehow.

The most natural method I've found so far is to let each note be a compromise between one sequence that goes up and down a chromatic scale, and another sequence that goes around the circle of fifths. I call the current incarnation of this method "PyraQuant5." There are more details about it below.

Example Music

There are two examples of PyraQuant5 music here. One is long-playing, but a relatively small file because it's in MIDI format: PyraQuant5_s1134278015_qp73.mid
This is a 73-minute, 163k MIDI file consisting of about 200 22-second piano pieces. The long name refers to the arguments the program was run with, in particular the random number seed.

The second is a larger file that plays for a much shorter time: PyraQuantTheme.mp3
It's a 45-second, 381k MP3 version of my favorite two pieces from the MIDI file. They sound a bit like Vince Guaraldi's Charlie Brown music.

Everyone asks whether I've tried making the notes different lengths instead of just pounding eighth notes. I would like to. A phrase at any level of the hierarchy could be replaced by a single note, or a rest. But I haven't figured out how to combine longer notes and rests with syncopation and leading and trailing notes.

Meanwhile, an ancestor of PyraQuant called PYRAMUS7 produced long and short notes by what I think of as a cheat: it combined any string of eighth notes at the same pitch into a single longer note. PYRAMUS7 produces pyramid music on a diatonic scale, not using the circle-of-fifths method. Here are 17 minutes of PYRAMUS7 songs that I found interesting and collected years ago, now converted to MIDI: PYRAMUS7 HITS.
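The note-merging trick is simple enough to sketch in a few lines of Python. This is an illustration of the idea only, not a translation of the PYRAMUS7 BASIC source; the function name and the MIDI-style pitch numbers are assumptions.

```python
from itertools import groupby

def merge_repeated_notes(pitches):
    """Collapse each run of equal pitches into one (pitch, length) pair.

    Lengths are counted in eighth notes, so (64, 3) is a dotted quarter.
    """
    return [(pitch, len(list(run))) for pitch, run in groupby(pitches)]

merged = merge_repeated_notes([60, 60, 62, 64, 64, 64, 62])
# → [(60, 2), (62, 1), (64, 3), (62, 1)]
```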

You might find PyraMus7HitsToTxt's 48 lines of BASIC easier to digest than the 2500 or so lines of C that go into PyraQuant (!) There is more about the program in the source tar file below.

PyraQuant Code Description

PyraQuant5 is my nickname for the current version of the program, actually called pyracirc5. It's a pure C program that produces output in MIDI, AIFF, or WAV format (options -m, -a, -w).

The following is a slightly out-of-date description of how the main program works. See the README file in the source code directory for descriptions of the other source files involved.

I generate two streams of pyramid random numbers. The "linear number" goes up and down the chromatic scale, like a 1/f melody. The "circle number" goes around the circle of fifths. Generally it doesn't go more than half way around in either direction. So,

   0 = A, 1 = E,  2 = B,  3 = Gb/F#, 4 = Db/C#, 5 = Ab/G#, 6 = Eb,
         -1 = D, -2 = G, -3 = C,    -4 = F,    -5 = Bb,   -6 = Eb again.
The generated numbers aren't necessarily rounded to ints (see below). The note for the melody is a compromise between the "linear" and "circle" numbers: for each of an octave of (exact) chromatic notes around the linear number, it looks at the distance the note is from the linear number, combined with the distance it is on the circle of fifths from the circle number. The note with the minimum combined distance is picked for the output melody. The combining function has been Euclidean distance and Manhattan distance (sum of the absolute values of the differences) at various times. Right now it's Manhattan distance with varying weights.
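Here is a rough Python sketch of that compromise using a weighted Manhattan distance. It is not the pyracirc5 code. The assumptions are: notes are counted in semitones with 0 = A, matching the table above; the twelve candidates run from five semitones below the floor of the linear number to six above it; and the weights are fixed rather than varying. All names are hypothetical.

```python
import math

def circle_position(note):
    """Circle-of-fifths position (0..11) of a chromatic note, with 0 = A.

    A fifth is 7 semitones, and 7 * 7 = 49 = 1 (mod 12), so multiplying
    by 7 converts semitones to steps around the circle.
    """
    return (7 * note) % 12

def circle_distance(a, b):
    """Shortest distance between two positions on the 12-step circle."""
    d = (a - b) % 12
    return min(d, 12 - d)

def pick_note(linear, circle, linear_weight=1.0, circle_weight=1.0):
    """Return the chromatic note with the smallest combined distance."""
    base = math.floor(linear)
    candidates = range(base - 5, base + 7)   # one octave of notes

    def cost(note):
        return (linear_weight * abs(note - linear)
                + circle_weight * circle_distance(circle_position(note), circle))

    return min(candidates, key=cost)

pick_note(7.2, 1.0)   # → 7, the E nearest the linear number
pick_note(1.0, 0.0)   # → 0, pulled from A#/Bb down to A by the circle number
```

With `circle_weight=0.0` the circle number is ignored and the result is just the nearest chromatic note, which shows how the weights trade one sequence off against the other.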

The "pyramid" sequences are built out of phrases arranged in layers. The bottom layer of a 2^(n-1)-note song is n independent random notes (one-note "phrases").

In the higher levels, each phrase is built by concatenating two adjacent phrases from the layer below, then transposing the resulting phrase up or down by a small random amount. Each phrase may also be inverted or time-reversed, and there are additional tweaks having to do with syncopation and with making sure that each phrase in a level is used at least once in the next.

Bottom layer: n phrases of 1 note each.
Second layer: n-1 phrases of 2 notes each.
Third layer: n-2 phrases of 4 notes each.
...
Top layer: one phrase of 2^(n-1) notes--the song.
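The layer counts above can be checked with a stripped-down Python sketch. It implements only the concatenate-and-transpose step; inversion, time reversal, syncopation and the every-phrase-is-used rule are left out. The note and transpose ranges and the function name are assumptions for the example, not values from the program.

```python
import random

def build_pyramid(n, rng, note_range=3, transpose_range=2):
    """Return all layers, from n one-note phrases up to the single song."""
    # Bottom layer: n independent random notes, each a one-note phrase.
    layer = [[rng.randint(-note_range, note_range)] for _ in range(n)]
    layers = [layer]
    while len(layer) > 1:
        next_layer = []
        # Each new phrase joins two adjacent phrases from the layer below,
        # then shifts the result by a small random amount.
        for left, right in zip(layer, layer[1:]):
            shift = rng.randint(-transpose_range, transpose_range)
            next_layer.append([note + shift for note in left + right])
        layer = next_layer
        layers.append(layer)
    return layers

layers = build_pyramid(5, random.Random(2))
song = layers[-1][0]   # the top layer's only phrase, 2**(5-1) = 16 notes
```

Because adjacent phrases overlap in the notes they were built from, the same material shows up repeatedly in the song at different transpositions.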

Some variations or improvements as the program developed:

Source Code

The source is in pyracirc5.tgz. "Pyramus7HitsToTxt.txt" is a BASIC program, but the rest of the code is in C. If you're used to building C programs in a Unix-style environment it will seem familiar; if not, you'll probably be lost, sorry. README explains the jobs of the various source files.

Thanks to

Jeff Glatt for his collection of MIDI, Standard Midi (file format) and General Midi (instrument set) information. Sean M. Burke for his Daktari MIDI midi-decoding perl script.

mfiles.co.uk for the Scott Joplin excerpt.

References

[1] RF Voss and J Clarke (1975), "1/f noise in music and speech", Nature, 258:317-318.

[2] RF Voss and J Clarke (1978), "1/f noise in music: music from 1/f noise", Journal of the Acoustical Society of America, 63(1):258-263.

[3] M. Gardner, Sci. Am. 238 (1978) 16.

[4] Robin Whittle, DSP generation of Pink (1/f) Noise, http://www.firstpr.com.au/dsp/pink-noise/.



--Steve Witham