An analysis of Rob Miles's 2016 Bitshift Variations in C Minor.
- Introduction
- Laboratory
- Sawtooth Sound
- Notes and Tuning
- Voices
- Voice Variation
- The Period
- Harmony
- Closing Remarks
- Appendix
Introduction
In 2016 Rob Miles published a very short program called Bitshift Variations in C Minor and presents it in a YouTube movie. Here is a copy if his program:
echo "g(i,x,t,o){return((3&x&(i*((3&i>>16?\"BY}6YB6%\":\"Qj}6jQ6%\")[
t%8]+51)>>o))<<4);};main(i,n,s){for(i=0;;i++)putchar(g(i,1,n=i>>14,12
)+g(i,s=i>>17,n^i>>13,10)+g(i,s/3,n+((i>>11)%3),10)+g(i,s/5,8+n-((i>>
10)%3),9));}"|gcc -xc -&&./a.out|aplay
Paste this code into a terminal on your Linux system and listen. Or listen to the recording at SoundCloud. It starts with one slow-moving voice. A second, faster voice enters. Then a third and even a fourth voice! It has the mechanical charms of chiptune music of the 1980ies, sure, but be aware that it is only 210 bytes of C code that generate this infinite stream of PCM sound!
It is magical. Absolutely fascinating.
And it prompts investigation. The author, Rob Miles, explains his feat partially in a YouTube movie. This article goes into the details and provides some background. A one page overview may accompany the present analysis.
Laboratory
Create a shell script, say bitfiddle.sh, and paste the original code. Refactor and reformat so that the code looks like this:cat << EOT | gcc -xc - && ./a.out | aplay
g(i,x,t,o){...}
main(i){for(i=0;;i++){putchar(...);}}
EOT
For better readability, use cat with a here document instead of echo with a long argument. The two middle lines are the C code we want to investigate. The first line compiles and runs this code, sending its output to the speaker: `-xc` tells the compiler gcc that the language is C (since it reads from stdin, it cannot deduce the language from the file extension); a.out is the compiled program file; and aplay sends unsigned 8-bit sample data from stdin to the speaker at a rate of 8000 Hz (the default; command line arguments could be used to change sample format and sample rate).
Stretch out the terse C code a bit, so it looks like this:
cat << EOT | gcc -xc - && ./a.out | aplay
g(i,x,t,o){
return (3&x&(i*((3&i>>16?"BY}6YB6%":"Qj}6jQ6%")[t%8]+51)>>o))<<4;
}
main(i,n,s){
for(i=0;;i++)
putchar(
g(i, 1, n=i>>14, 12)
+ g(i, s=i>>17, n^i>>13, 10)
+ g(i, s/3, n+((i>>11)%3), 10)
+ g(i, s/5, 8+n-((i>>10)%3), 9)
);
}
EOT
Now start experimenting:
- Run the code and listen to the music.
- Comment all but one of the
g(...)
lines and listen. - Set the second parameter to
0
for one or several of theg(...)
lines. - Modify the last parameter to
g(...)
and listen.
We find that there are four voices playing at different speeds
and octaves. The g()
function is the PCM generator,
adding voices is mixing them, and i
is the clock.
Sawtooth Sound
This aspect of the program is partially explained in the movie. Sawtooth sound is easy to generate:
cat << EOT | gcc -xc - && ./a.out | aplay -r 8000 -f U8
main(i) { for (i=0;;i++) putchar(i); } /* 0..255 */
EOT
While i
constantly increases,
putchar
outputs only the low 8 bits of i
or i&255
(technically, it casts i
to an unsigned char
),
such that after 255 follows 0 (not 256). The result, when played
at 8000 Hz, is a very low 8000/256=31 Hz sound.
We need a way to control frequency and amplitude. The next few snippets are alternatives for the middle line in the script above.
Here masking (bitwise and, &
) and bit shifting
(<<
and >>
) enter the scene.
Shifting left is multiplying by powers of two, and shifting right
is dividing by powers of two.
Note that masking has lower precedence than bit shifting, so that
63&i<<2
means 63 & (i<<2)
.
(1) Two octaves higher (i<<2
or 4 times the frequency):
main(i){ for(i=0;;i++) putchar(i<<2); } /* 0 to 252 in steps of 4 */
(2) Same frequency, lower amplitude:
main(i){ for(i=0;;i++) putchar(63&i); } /* 0..63 in steps of 1 */
(3) Amplify to original amplitude:
main(i){ for(i=0;;i++) putchar((63&i)<<2); } /* 0 to 252 in steps of 4 */
(4) Make the sawtooth curve more staircase shaped:
i>>1
grows at half the speed of i
,
so it takes two steps of i
until anything can change.
The result is the sequence 0 0 8 8 16 16 .. 248 248:
main(i){ for(i=0;;i++) putchar((31&i>>1)<<3); } /* 0 0 8 8 16 16 .. 248 248 */
(5) Taking the staircase to the extreme: with
i>>4
(i
div 16) it takes 16 steps
of i
until the curve jumps to the next flight; masking
with 3 wraps around at multiples of 4, so to keep amplitude (volume)
we multiply with 64 (or left shifting by 6):
main(i){ for(i=0;;i++) putchar((3&i>>4)<<6); } /* 0:16 64:16 128:16 192:16 */
Two key things to note from the examples above:
- Masking increases frequency but decreases volume
(63=001111112, 31=000111112, 3=000000112). - Bitshifting before masking changes the octave (frequency);
bitshifting after masking changes the volume (amplitude).
The last two examples have the same frequency and amplitude as the third example, but different shapes. In the last example, after masking with 3 there are only four possible values: 0, 1, 2, 3. Left shifting by 6 yields: 0, 64, 128, 192. Because of the right shift by 4 (div 16) the actual sequence of values is 16 times 0, 16 times 64, 16 times 128, 16 times 192, 16 times 0 etc. This is more a staircase than a sawtooth, and indeed the last two examples differ in timbre, though not in pitch and not in volume.
To achieve intervals other than octaves we have to multiply the base frequency by values other than powers of two. For example, the perfect fifth is at 3/2 of the base frequency. Because here we deal with integers only, we cannot multiply by 1.5 but instead multiply by 3 and divide by 2 (by right shifting).
(6) Let's say this plays a C:
for(i=0;;i++) putchar((3&(i>>3))<<6);
(7) Then this plays a G because it is at 3/2 of the frequence of example (6); we divide by two by shifting right one more:
for(i=0;;i++) putchar((3&(i*3>>4))<<6);
(8) We arrive at this overall structure of a simple PCM sound generator, where n is the note (pitch), o the octave, and v the volume:
for(i=0;;i++) putchar( (3 & (i*n >> o)) << v );
After some refactoring and letting v = 4, we can find this very structure in Rob Miles's code:
g(i,x,t,o){
char *m = 3&(i>>16)?"BY}6YB6%":"Qj}6jQ6%"; // melodic material
int n = m[t%8] + 51; // note (pitch indicator)
return (3 & x & (i*n >> o)) << 4; // amplitude at time i
}
The actual values passed in for o are 9, 10, and 12, which is large enough to compensate for the small mask of 3 (two leftmost bits) and the large values for n (pitch indicator, see next section).
The maximum value returned by this generator is 48
(3<<4
), so that the maximum value
for the four voices added is 192 (4×48) and indeed
putchar
will never have to wrap around.
Notes and Tuning
The magic strings "BY}6YB6%"
and "Qj}6jQ6%"
relate to the notes (pitches) being played:
Char | '%' | '6' | 'B' | 'Q' | 'Y' | 'j' | '}' | |
---|---|---|---|---|---|---|---|---|
ASCII | 37 | 54 | 66 | 81 | 89 | 106 | 125 | |
+51 | 88 | 105 | 117 | 132 | 140 | 157 | 176 | |
Ratio | 1 | 1.19 | 1.32 | 1.50 | 1.59 | 1.78 | 2 | |
1/1 | 6/5 | 4/3 | 3/2 | 8/5 | 16/9 | 2/1 | just intonation | |
Note | C | E♭ | F | G | A♭ | B♭ | C' | C minor |
We find that, when 51 is added to the ASCII values of the characters
in the strings, the ratios approximate the frequency ratios of
just intonation
(reine Stimmung).
If we take '%'
for a C, then we get the notes of a C minor
scale (missing is D at 9/8 = 1.12 or '0'
).
Let's turn from sound generation to the creation of music.
Voices
We now look at the four voices generated by the code. From here on, all references to absolute time duration assume playback at 8192 Hz, which is very close to the actual playback at 8000 Hz but yields nice figures.
Recall the PCM generator with its four parameters i = time, x = on/off, t = pitch index, o = octave:
g(i,x,t,o){
char *m = 3&(i>>16)?"BY}6YB6%":"Qj}6jQ6%"; // melodic material
int n = m[t%8] + 51; // note (pitch indicator)
return (3 & x & (i*n >> o)) << 4; // deflection at time i
}
The “magic strings” are two sets of notes:
"BY}6YB6%"
corresponds to F A♭ C' E♭ A♭ F E♭ C"Qj}6jQ6%"
corresponds to G B♭ C' E♭ B♭ G E♭ C
The expression 3&(i>>16)
in the
g
function chooses between the two sets: it
yields the sequence 0 1 2 3 0 1 2 3 etc and progresses
every 216=65536 increments of i
or about every 8 seconds.
Zero (false) maps to the second set, whereas 1 2 3 (all true)
map to the first set. Both sets have 8 notes and are indexed
by t%8
so that values t≥8 wrap around.
Therefore, we get notes from the second set for 8 seconds, then notes from the first set for 24 seconds.
The main function, after further refactoring, looks like this:
main(i,n,s){
for(i=0;;i++){
n = i>>14; // increments every 2 secs
s = i>>17; // increments every 16 secs
putchar(
g(i, 1, n, 12) // 1st voice in 1/1
+ g(i, s, n^(i>>13), 10) // 2nd voice in 1/2 (after 16")
+ g(i, s/3, n+((i>>11)%3), 10) // 3rd voice in 1/8 (after 48")
+ g(i, s/5, 8+n-((i>>10)%3), 9) // 4th voice in 1/16 (after 1'20")
);
}
}
Now it is simple to see that n
increments
every 214=16384 steps of i
or about
every 2 seconds. Similarly, s
increments every
217=131072 steps of i
or about every
16 seconds.
First Voice
The first voice is controlled by n
, which
increments every 2 seconds. Within the g
function,
the current set of notes is indexed by n%8
, so that
we get the 1st half of the 2nd set, the 2nd half of the 1st set,
then the entire 1st set, before the melody repeats:
seq%8: | 0 1 2 3 : 4 5 6 7 : 0 1 2 3 : 4 5 6 7 |
set: | ----2nd---- : ----1st---- : ----1st---- : ----1st---- |
notes: | G B♭ C' E♭ : A♭ F E♭ C : F A♭ C' E♭ : A♭ F E♭ C |
It is the base line and repeats every 32 seconds. The heavy bar line marks the change between the two note sets.
Second Voice
The second voice is controlled by n^(i>>13)
,
which changes once every second (213 = 8192). It is in
half notes relative to the first voice and it starts only after
16 seconds, when s
first changes from 0 to 1.
What melody does it produce?
The values that index the note sets are
(i>>14)^(i>>13)
, that is, the xor of
a value and twice that value, essentially a^b
where
b=2a (or a=b/2) for increasing values of
a. Writing the binary representations for small values
of a, b, and a^b we find:
a=b/2 b=2a a^b dec. mod 8
0000 0000 0000 0 0
0000 0001 0001 1 1
0001 0010 0011 3 3
0001 0011 0010 2 2
0010 0100 0110 6 6
0010 0101 0111 7 7
0011 0110 0101 5 5
0011 0111 0100 4 4
0100 1000 1100 12 4
0100 1001 1101 13 5
0101 1010 1111 15 7
0101 1011 1110 14 6
0110 1100 1010 10 2
0110 1101 1011 11 3
0111 1110 1001 9 1
0111 1111 1000 8 0
(It turns out that this is a Gray code, which have the property that exactly one bit changes between successive numbers in binary representation.)
Remember that the “magic strings” are indexed by
[t%8]
, so only the least significant 3 bits
of the Gray code are used, and this bit sequence is an
endless repetition of the “mod 8” column above.
Since each note lasts one second, the first eight notes are pulled from the 2nd set, the following 24 notes are pulled from the 1st set. The generated melody, in halves (relative to the first voice) is as shown below and has a period of 32 seconds. Again, the heavy bar line marks the change between the note sets.
seq%8: | 0 1 3 2 6 7 5 4 : 4 5 7 6 2 3 1 0 |
notes: | G B♭ E♭ C' E♭ C G B♭ : A♭ F C E♭ C' E♭ A♭ F :
F A♭ E♭ C' E♭ C F A♭ : A♭ F C E♭ C' E♭ A♭ F |
Third Voice
The third voice is controlled by n+((i>>11)%3)
,
which changes about 4 times a second (i>>11
≡
i/2048
), so it is in eighth relative to the first voice.
It starts after about 48 seconds, when s/3
first changes
from 0 to 1.
The expression (i>>11)%3
generates the repeating
sequence 0, 1, 2. To this the value of n
(which increments
every 2 seconds) is added. The period of n+((i>>11)%3)
(mod 8) is the least common multiple of 3 (the repeating 0 1 2) and 64
(the length of n%8
in eighth), which is 192 eighth or
24 bars or 48 seconds. However, because the note set
selection sequence (2nd 1st 1st 1st) repeats every 32 seconds or
128 eighths, the third voice has a period of lcm(192,128)
= 384 eights or 48 bars or 96 seconds. Here is the
beginning of the sequence (M is the note set, i'
is (i>>11)%3 and + is addition modulo 8):
M 2-----------------------------------------------------------------
n 0 . . . . . . . 1 . . . . . . . 2 . . . . . . . 3 . . . . . . .
i' 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1
+ 0 1 2 0 1 2 0 1 3 1 2 3 1 2 3 1 3 4 2 3 4 2 3 4 3 4 5 3 4 5 3 4
G B♭C'G B♭C'G B♭ E♭B♭C'E♭B♭C'E♭B♭ E♭B♭C'E♭B♭C'E♭B♭ E♭B♭G E♭B♭G E♭B♭
M 1-----------------------------------------------------------------
n 4 . . . . . . . 5 . . . . . . . 6 . . . . . . . 7 . . . . . . .
i' 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0
+ 6 4 5 6 4 5 6 4 6 7 5 6 7 5 6 7 6 7 0 6 7 0 6 7 1 7 0 1 7 0 1 7
E♭A♭F E♭A♭F E♭A♭ E♭C F E♭C F E♭C E♭C F E♭C F E♭C A♭C F A♭C F A♭C
M 1-----------------------------------------------------------------
n 8 . . . . . . . 9 . . . . . . . 10. . . . . . . 11. . . . . . .
i' 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
+ 1 2 0 1 2 0 1 2 1 2 3 1 2 3 1 2 4 2 3 4 2 3 4 2 4 5 3 4 5 3 4 5
M 1-----------------------------------------------------------------
n 12. . . . . . . 13. . . . . . . 14. . . . . . . 15. . . . . . .
i' 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1
+ 4 5 6 4 5 6 4 5 7 5 6 7 5 6 7 5 7 0 6 7 0 6 7 0 7 0 1 7 0 1 7 0
And here is the complete 3rd voice, with its tripel rhythm appeal
generated by %3
that overlays the even beats of the two
lower voices. The beaming shows one possible rhythmic interpretation.
Fourth Voice
The fourth voice is controlled by 8+n-((i>>10)%3)
,
which changes about 8 times a second, so it is in sixteenth relative
to the first voice. It starts after about 1'20" when s/5
first changes from 0 to 1.
(i>>10)%3)
generates the repeating sequence
0 1 2, as in the third voice, but twice as fast, and 16 times as
fast as does n
.
i>>10
increments 4 times as fast as does
n
, so n-((i>>10)%3)
yields 5 times
0 -1 -2, plus a single 0 (16 notes), before n
increases
and we get 5 times 0 -1 1 plus a single 0, etc. Adding 8 avoids
negative numbers. And after mod 8 we find a sequence that starts
as shown below. Again, after 8 seconds or 64 sixteenth the note set
changes from "Qj..."
to "BY..."
and
after another 24 seconds again back to "Qj..."
. This
fourth voice is a lively, though somewhat nervous, descant.
0 7 6 0 7 6 0 7 6 0 7 6 0 7 6 0
G C E♭ G C E♭ G C E♭ G C E♭ G C E♭ G
0 7 1 0 7 1 0 7 1 0 7 1 0 7 1 0
G C B♭ G C B♭ G C B♭ G C B♭ G C B♭ G
The period is the least common multiple of the duration
of the 0 -1 -2
sequence (3 sixteenth) and the
n%8
sequence (128 sixteenth), or 384 sixteenth,
as in the third voice. However, as the fourth voice plays at
twice the speed of the third, the cycle through the note sets
(2nd 1st 1st 1st) takes 256 sixteenth (32 seconds), the period
is lcm(384,256) or 768 sixteenth 48 bars or 96 seconds
(same as 3rd voice).
Voice Variation
The sound generator function g
takes a parameter
x
that controls when a voice starts, but not only:
3&x
serves as the “sawtooth mask”. Because of
the bitwise and with 3
, there are only four possible
mask values: 00
, 01
, 10
,
11
(binary). This is what we hear:
00
– voice off01
– softer, higher octave10
– medium, lower octave11
– louder, lower octave
Assuming n=1
and o=0
in the
refactored g
function (and remembering the
<<4
(times 16) amplification), it
generates the following PCM signals:
00
– generates0 0 0 0
(repeat)01
– generates0 16 0 16
(repeat)10
– generates0 0 32 32
(repeat)11
– generates0 16 32 48
(repeat)
The x
parameter is used by the four voices as follows:
Voice 1: Always 1, so the base line is soft (and an octave higher than we'd assumed).
Voice 2: Uses x = s
, so we have (a) 16 seconds silence,
before we hear (b) the 2nd half of the melody (bar 9 starting
with an F), then (c) the 1st half of the melody, louder and
in the lower octave, then (d) again the 2nd half with full body.
Then the cycle repeats.
Voice 3: Uses x = s/3
(integer division) so that the
3rd voice goes through the same off/soft/medium/full cycle
as the 2nd voice, but at one third of the speed: each phase
lasts 48 seconds. The melodic period of the 3rd voice is
96 seconds, so we “miss” the first half, hear softly the
2nd half, followed by a loud 1st half, then more gently
again the 2nd half.
Voice 4: Uses x = s/5
so the 4th voice, like the 2nd
and the 3rd, goes through the off/soft/medium/full cycle,
but each phase now lasts 80 seconds.
The Period
Periods of the melodies have been indicated before. In combination with voice variation we get these results:
Voice 1: melodic period is 32 seconds, voice does not vary, so we get an overall period of 32 seconds.
Voice 2: melodic period is 32 seconds, voice variation period is 4×16 seconds, overall period thus 64 seconds.
Voice 3: melodic period is 96 seconds, voice variation period is 4×48 or 192 seconds, the combined period is thus 192 seconds.
Voice 4: melodic period is 96 seconds, voice variation is 4×80 or 320 seconds. The least common multiple yields the combined period of 960 seconds.
The overall period of the piece is 960 seconds
or 16 minutes or 30 times the baseline (voice 1) or
7,864,320 increments of i
. This can be checked
by ear by changing for(i=0;;i++)
to
for(i=7864320;;i++)
in the code.
Harmony
Looking at the two sets of notes, we find that each contains the notes of a minor seventh chord:
"BY..."
consists of F A♭ C E♭ or Fm7"Qj..."
consists of C E♭ G B♭ or Cm7
Shared between the two sets are C and E♭, the D is missing from both.
In this view, the Bitshift Variations boil down to 8 seconds of Cm7 jingling followed by 24 seconds of Fm7 jingling. Dissonant intervals can and do occur, but will be softened by other voices that tend to “fill” the minor seventh chord.
Closing Remarks
Remember, this was all integer arithmetics, no floating point. Much of the terseness derives from exploiting C's operator precedence rules by omitting needless parentheses. Indeed, two more parens and one semicolon could be omitted, sparing another two (sic) bytes. The arguments to main are really just local variables declared using as little code as possible.
The music generated is not super complex, but quite effective. Knowing how little code it takes to generate both the music and the sound—this makes it so astonishing. Valentino Braitenberg coined the “law of downhill synthesis and uphill analysis” in his 1986 book Vehicles. Does it always hold? I found this article to be a fun analysis of a fascinating synthesis: Bitshift Variations in C Minor by Rob Miles (2016).
Appendix
Score Exceprt
This is how Bitshift Variations in C Minor “look like” (excerpt):
Times
The tune is played at 8000 samples per second, the default sampling rate of the aplay tool. Since this is about 213 = 8192, we get the following clock divisors and corresponding frequencies and note values:
Divisor | Frequency | Note Value |
---|---|---|
i>>10 ≡ i/1024 | 8 times per second | sixteenth |
i>>11 ≡ i/2048 | 4 times per second | eighth |
i>>12 ≡ i/4096 | 2 times per second | |
i>>13 ≡ i/8192 | once per second | half |
i>>14 ≡ i/16384 | two seconds | whole |
i>>15 ≡ i/32768 | ||
i>>16 ≡ i/65536 | ||
i>>17 ≡ i/131072 | 16 seconds |
Notes on C
The default type is int
, both for parameters
and function return values. Therefore, main(i)
is short for int main(int i)
, and similarly for
g(i,x,t,o)
, which has four parameters of type
int
and returns an int
.
Bitshift variations in C minor uses the following C operators, which appear here ordered form highest precedence (tightest binding) to least:
[] | indexing |
---|---|
* / % | multiplicative |
+ - | additive |
<< >> | bitshift |
& | bitwise AND |
^ | bitwise XOR |
| | bitwise OR |
= | assignment |
From this table we deduce that:
n^i>>13 ≡ n^(i>>13) ≡ n ^ (i/8192)
3&i>>16 ≡ 3&(i>>16) ≡ 11b & (i/65536)