So I am playing around with hash tables and looking at how it effects performance. I wrote three different ways. My test consisted of 1000 integers and 1000 strings and I use the integers, hash them, then use them for looking up the strings. I used a performance counter to measure the time in seconds it takes for it to add 1000 to the table then again how long it takes to look them all up.
First one was the "naive approach", the idea was that I would write it as if I needed some way to look up strings given an integer value and did not know the concept of hash tables. Not surprisingly this one had the most loop iterations done. Given the simplicity of adding to the table I expected adding to be the fastest and the look up to be the slowest out of all the functions.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38 | D:\Development\HashTablesTest>build\HashTablesNaive.exe
TimeBegin 0.000005 TimeEnd 0.000043
Time taken to add 1000 values to table 0.000038 Iterations 1000
TimeBegin 0.000350 TimeEnd 0.000377
Time taken to look up 1000 values 0.000027 Iterations 1000
D:\Development\HashTablesTest>build\HashTablesNaive.exe
TimeBegin 0.000009 TimeEnd 0.000084
Time taken to add 1000 values to table 0.000074 Iterations 1000
TimeBegin 0.000785 TimeEnd 0.000836
Time taken to look up 1000 values 0.000051 Iterations 1000
D:\Development\HashTablesTest>build\HashTablesNaive.exe
TimeBegin 0.000009 TimeEnd 0.000055
Time taken to add 1000 values to table 0.000046 Iterations 1000
TimeBegin 0.000547 TimeEnd 0.000586
Time taken to look up 1000 values 0.000039 Iterations 1000
D:\Development\HashTablesTest>build\HashTablesNaive.exe
TimeBegin 0.000009 TimeEnd 0.000055
Time taken to add 1000 values to table 0.000045 Iterations 1000
TimeBegin 0.000558 TimeEnd 0.000597
Time taken to look up 1000 values 0.000038 Iterations 1000
D:\Development\HashTablesTest>build\HashTablesNaive.exe
TimeBegin 0.000006 TimeEnd 0.000036
Time taken to add 1000 values to table 0.000030 Iterations 1000
TimeBegin 0.000324 TimeEnd 0.000352
Time taken to look up 1000 values 0.000028 Iterations 1000
|
Second one was internal chaining. This one had less loop iterations but the time it took to look up was slower.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38 | D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe
TimeBegin 0.000009 TimeEnd 0.000134
Time taken to add 1000 values to table 0.000125 Iterations 703
TimeBegin 0.000598 TimeEnd 0.000711
Time taken to look up 1000 values 0.000113 Iterations 702
D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe
TimeBegin 0.000009 TimeEnd 0.000109
Time taken to add 1000 values to table 0.000101 Iterations 703
TimeBegin 0.000817 TimeEnd 0.000902
Time taken to look up 1000 values 0.000085 Iterations 702
D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe
TimeBegin 0.000008 TimeEnd 0.000098
Time taken to add 1000 values to table 0.000090 Iterations 703
TimeBegin 0.000570 TimeEnd 0.000667
Time taken to look up 1000 values 0.000097 Iterations 702
D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe
TimeBegin 0.000008 TimeEnd 0.000099
Time taken to add 1000 values to table 0.000091 Iterations 703
TimeBegin 0.000499 TimeEnd 0.000594
Time taken to look up 1000 values 0.000094 Iterations 702
D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe
TimeBegin 0.000004 TimeEnd 0.000038
Time taken to add 1000 values to table 0.000035 Iterations 703
TimeBegin 0.000215 TimeEnd 0.000247
Time taken to look up 1000 values 0.000032 Iterations 702
|
Third one was external chaining. This one had a surprisingly low amount of loop iterations. I chose to allocate memory for each new node so I expected adding to be the slowest of the three. Look up was still slower than the first despite even less loop iterations than the previous.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39 | D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe
TimeBegin 0.000007 TimeEnd 0.000135
Time taken to add 1000 values to table 0.000128 Iterations 232
TimeBegin 0.001469 TimeEnd 0.001539
Time taken to look up 1000 values 0.000070 Iterations 431
D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe
TimeBegin 0.000007 TimeEnd 0.000438
Time taken to add 1000 values to table 0.000432 Iterations 232
TimeBegin 0.000776 TimeEnd 0.000824
Time taken to look up 1000 values 0.000048 Iterations 431
D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe
TimeBegin 0.000004 TimeEnd 0.000049
Time taken to add 1000 values to table 0.000045 Iterations 232
TimeBegin 0.000266 TimeEnd 0.000310
Time taken to look up 1000 values 0.000044 Iterations 431
D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe
TimeBegin 0.000008 TimeEnd 0.000128
Time taken to add 1000 values to table 0.000120 Iterations 232
TimeBegin 0.001622 TimeEnd 0.001693
Time taken to look up 1000 values 0.000070 Iterations 431
D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe
TimeBegin 0.000006 TimeEnd 0.000096
Time taken to add 1000 values to table 0.000090 Iterations 232
TimeBegin 0.001861 TimeEnd 0.001929
Time taken to look up 1000 values 0.000068 Iterations 431
|
I guess more importantly is my question:
Why is the naive approach working faster than the others?
I have attached my source code, it is for windows.
(Also gotta thank Casey for showing how easy cl is to use, breaking my dependency of Visual Studio and allowing me to use Sublime in a much more efficient way)