As Casey noted, this is just a placeholder sound, and premature optimisation is the root of all kinds of evil. Nonetheless, we need something to do while waiting for the next instalment.
So let's try the well-known fast but stable algorithm for generating sine waves based on a second-order equation:
cos(x + 2d) = 2 cos(x + d) cos(d) - cos(x)
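For anyone wondering where that identity comes from, it follows from the standard sum-to-product formula. The derivation below is a sketch; the shorthand c_n = cos(x + n d) is notation introduced here for illustration, and it corresponds to the c0, c1, c2 variables in the code.

```latex
\begin{align*}
\cos(A + B) + \cos(A - B) &= 2 \cos A \cos B \\
\cos(x + 2d) + \cos x &= 2 \cos(x + d) \cos d && (A = x + d,\ B = d) \\
\cos(x + 2d) &= 2 \cos(x + d) \cos d - \cos x \\
c_{n+2} &= 2 \cos(d)\, c_{n+1} - c_n && (c_n = \cos(x + n d))
\end{align*}
```

Since cos(d) is a constant, each new sample costs one multiply and one subtract once the first two samples are known.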
Here's the test program:
    #include <stdio.h>
    #include <stdint.h>
    #include <math.h>

    int
    main()
    {
        float x = 0;
        float xmod = 0;

        // Easy way to detect frequency drift: make
        // the period of the wave a round number.
        float d = (2 * M_PI) / 100;

        float cosd = cosf(d);
        float c0 = cosf(x);
        float c1 = cosf(x+d);

        // 1u keeps the shift unsigned; 1 << 31 overflows a signed int.
        for (uint32_t i = 0; i < (1u << 31); ++i) {
            // Columns: i, cosf(x), cosf(xmod), second-order value.
            printf("%u\t%f\t%f\t%f\n",
                   i, cosf(x), cosf(xmod), c0);
            x += d;
            xmod = fmodf(xmod + d, 2 * M_PI);
            float c2 = 2 * c1 * cosd - c0;
            c0 = c1;
            c1 = c2;
        }
        return 0;
    }
I compiled this with clang -O3 -msse2 on a recent MacBook Air. Here's the output around the 10 million cycle mark (the columns are i, cosf(x), cosf(xmod) and the second-order value c0):
    9999989     0.519647    0.877383    0.770501
    9999990     0.571997    0.905778    0.809006
    9999991     0.622113    0.930598    0.844318
    9999992     0.669799    0.951746    0.876299
    9999993     0.714870    0.969138    0.904821
    9999994     0.757149    0.982704    0.929772
    9999995     0.796472    0.992393    0.951053
    9999996     0.832685    0.998165    0.968581
    9999997     0.865645    0.999997    0.982287
    9999998     0.895226    0.997884    0.992116
    9999999     0.921311    0.991832    0.998030
    10000000    0.943798    0.981865    1.000005
    10000001    0.962599    0.968024    0.998033
    10000002    0.977642    0.950362    0.992122
    10000003    0.988867    0.928950    0.982296
    10000004    0.996230    0.903871    0.968594
    10000005    0.999703    0.875226    0.951069
    10000006    0.999273    0.843126    0.929790
    10000007    0.994940    0.807699    0.904842
    10000008    0.986722    0.769084    0.876323
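As you can see, the second-order equation overshoots slightly (1.000005 at cycle 10000000), so we might need to make sure the samples are scaled down a little. However, unlike cos and cos-fmod, the second-order equation is still in phase! Cycle 10000000 is a whole number of 100-sample periods, so the true value there is exactly 1, and only the second-order column peaks at that point.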
This is what we refer to as "low harmonic distortion", and it's why you often see this algorithm in digital synthesisers.
Finally, testing the performance:
    #include <stdio.h>
    #include <stdint.h>
    #include <math.h>
    #include <boost/timer/timer.hpp>

    #define CYCLES 100000000
    #define DELTA ((2 * M_PI) / 100)

    float
    sine_wave_by_cosf() {
        float x = 0;
        // Compute the sum to ensure that this doesn't get aggressively
        // optimised away.
        float sum = 0;
        for (uint32_t i = 0; i < CYCLES; ++i) {
            sum += cosf(x);
            x += DELTA;
        }
        return sum;
    }

    float
    sine_wave_by_cosf_mod() {
        float x = 0;
        float sum = 0;
        for (uint32_t i = 0; i < CYCLES; ++i) {
            sum += cosf(x);
            x = fmodf(x + DELTA, 2 * M_PI);
        }
        return sum;
    }

    float
    sine_wave_by_secondorder() {
        float x = 0;
        float cosd = cosf(DELTA);
        float c0 = cosf(x);
        float c1 = cosf(x+DELTA);
        float sum = 0;
        for (uint32_t i = 0; i < CYCLES; ++i) {
            sum += c0;
            float c2 = 2 * c1 * cosd - c0;
            c0 = c1;
            c1 = c2;
        }
        return sum;
    }

    // Making this variable global ensures that assignments to
    // it aren't optimised away.
    float result;

    int
    main()
    {
        printf("sine_wave_by_cosf:\n");
        {
            boost::timer::auto_cpu_timer t;
            result = sine_wave_by_cosf();
        }
        printf("sine_wave_by_cosf_mod:\n");
        {
            boost::timer::auto_cpu_timer t;
            result = sine_wave_by_cosf_mod();
        }
        printf("sine_wave_by_secondorder:\n");
        {
            boost::timer::auto_cpu_timer t;
            result = sine_wave_by_secondorder();
        }
        return 0;
    }
Compiled with clang++ -O3 -msse2, I get:
    sine_wave_by_cosf:
     1.353193s wall, 1.340000s user + 0.000000s system = 1.340000s CPU (99.0%)
    sine_wave_by_cosf_mod:
     2.127797s wall, 2.110000s user + 0.020000s system = 2.130000s CPU (100.1%)
    sine_wave_by_secondorder:
     0.384126s wall, 0.380000s user + 0.000000s system = 0.380000s CPU (98.9%)
Doing the fmod operation adds nearly 60% to the runtime (2.13s against 1.35s), but the second-order equation takes under 30% of the original (0.38s), which makes it roughly 3.5 times faster. So if we need to generate lots of sine wave samples over the long term, this method seems like a winner.
Of course, if we only need one sample per frame (e.g. we're not generating audio, but the game features a flashing warning light or something), then this is probably overkill.