27 posts
HashTables and Some Unexpected Code Performance
So I am playing around with hash tables and looking at how it effects performance. I wrote three different ways. My test consisted of 1000 integers and 1000 strings and I use the integers, hash them, then use them for looking up the strings. I used a performance counter to measure the time in seconds it takes for it to add 1000 to the table then again how long it takes to look them all up.

First one was the "naive approach", the idea was that I would write it as if I needed some way to look up strings given an integer value and did not know the concept of hash tables. Not surprisingly this one had the most loop iterations done. Given the simplicity of adding to the table I expected adding to be the fastest and the look up to be the slowest out of all the functions.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000005 TimeEnd 0.000043 Time taken to add 1000 values to table 0.000038 Iterations 1000 TimeBegin 0.000350 TimeEnd 0.000377 Time taken to look up 1000 values 0.000027 Iterations 1000 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000009 TimeEnd 0.000084 Time taken to add 1000 values to table 0.000074 Iterations 1000 TimeBegin 0.000785 TimeEnd 0.000836 Time taken to look up 1000 values 0.000051 Iterations 1000 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000009 TimeEnd 0.000055 Time taken to add 1000 values to table 0.000046 Iterations 1000 TimeBegin 0.000547 TimeEnd 0.000586 Time taken to look up 1000 values 0.000039 Iterations 1000 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000009 TimeEnd 0.000055 Time taken to add 1000 values to table 0.000045 Iterations 1000 TimeBegin 0.000558 TimeEnd 0.000597 Time taken to look up 1000 values 0.000038 Iterations 1000 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000006 TimeEnd 0.000036 Time taken to add 1000 values to table 0.000030 Iterations 1000 TimeBegin 0.000324 TimeEnd 0.000352 Time taken to look up 1000 values 0.000028 Iterations 1000 

Second one was internal chaining. This one had less loop iterations but the time it took to look up was slower.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe TimeBegin 0.000009 TimeEnd 0.000134 Time taken to add 1000 values to table 0.000125 Iterations 703 TimeBegin 0.000598 TimeEnd 0.000711 Time taken to look up 1000 values 0.000113 Iterations 702 D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe TimeBegin 0.000009 TimeEnd 0.000109 Time taken to add 1000 values to table 0.000101 Iterations 703 TimeBegin 0.000817 TimeEnd 0.000902 Time taken to look up 1000 values 0.000085 Iterations 702 D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe TimeBegin 0.000008 TimeEnd 0.000098 Time taken to add 1000 values to table 0.000090 Iterations 703 TimeBegin 0.000570 TimeEnd 0.000667 Time taken to look up 1000 values 0.000097 Iterations 702 D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe TimeBegin 0.000008 TimeEnd 0.000099 Time taken to add 1000 values to table 0.000091 Iterations 703 TimeBegin 0.000499 TimeEnd 0.000594 Time taken to look up 1000 values 0.000094 Iterations 702 D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe TimeBegin 0.000004 TimeEnd 0.000038 Time taken to add 1000 values to table 0.000035 Iterations 703 TimeBegin 0.000215 TimeEnd 0.000247 Time taken to look up 1000 values 0.000032 Iterations 702 

Third one was external chaining. This one had a surprisingly low amount of loop iterations. I chose to allocate memory for each new node so I expected adding to be the slowest of the three. Look up was still slower than the first despite even less loop iterations than the previous.
  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe TimeBegin 0.000007 TimeEnd 0.000135 Time taken to add 1000 values to table 0.000128 Iterations 232 TimeBegin 0.001469 TimeEnd 0.001539 Time taken to look up 1000 values 0.000070 Iterations 431 D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe TimeBegin 0.000007 TimeEnd 0.000438 Time taken to add 1000 values to table 0.000432 Iterations 232 TimeBegin 0.000776 TimeEnd 0.000824 Time taken to look up 1000 values 0.000048 Iterations 431 D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe TimeBegin 0.000004 TimeEnd 0.000049 Time taken to add 1000 values to table 0.000045 Iterations 232 TimeBegin 0.000266 TimeEnd 0.000310 Time taken to look up 1000 values 0.000044 Iterations 431 D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe TimeBegin 0.000008 TimeEnd 0.000128 Time taken to add 1000 values to table 0.000120 Iterations 232 TimeBegin 0.001622 TimeEnd 0.001693 Time taken to look up 1000 values 0.000070 Iterations 431 D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe TimeBegin 0.000006 TimeEnd 0.000096 Time taken to add 1000 values to table 0.000090 Iterations 232 TimeBegin 0.001861 TimeEnd 0.001929 Time taken to look up 1000 values 0.000068 Iterations 431 

I guess more importantly is my question:
Why is the naive approach working faster than the others?

I have attached my source code, it is for windows.
(Also gotta thank Casey for showing how easy cl is to use, breaking my dependency of Visual Studio and allowing me to use Sublime in a much more efficient way)
27 posts
HashTables and Some Unexpected Code Performance
Oh man, should have waited 10 more minutes and I would have had my answer. After careful evaluation of HashTablesNaive.cc I realized I had one of the worst typos that you never get notified about in c/c++.

I noticed the iteration count for looking up was not right so I checked the loop and.....
 1 Line 82: if (Table[i].Lookup = Lookup) 

Sigh......

Well this is a bit more expected.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000007 TimeEnd 0.000036 Time taken to add 1000 values to table 0.000030 Iterations 1000 TimeBegin 0.000321 TimeEnd 0.003097 Time taken to look up 1000 values 0.002776 Iterations 500433 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000004 TimeEnd 0.000019 Time taken to add 1000 values to table 0.000016 Iterations 1000 TimeBegin 0.000209 TimeEnd 0.001614 Time taken to look up 1000 values 0.001405 Iterations 500433 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000004 TimeEnd 0.000023 Time taken to add 1000 values to table 0.000019 Iterations 1000 TimeBegin 0.000243 TimeEnd 0.002273 Time taken to look up 1000 values 0.002030 Iterations 500433 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000009 TimeEnd 0.000047 Time taken to add 1000 values to table 0.000038 Iterations 1000 TimeBegin 0.000468 TimeEnd 0.004544 Time taken to look up 1000 values 0.004076 Iterations 500433 D:\Development\HashTablesTest>build\HashTablesNaive.exe TimeBegin 0.000015 TimeEnd 0.000063 Time taken to add 1000 values to table 0.000047 Iterations 1000 TimeBegin 0.000610 TimeEnd 0.004775 Time taken to look up 1000 values 0.004166 Iterations 500433 

Well, now that that's resolved, enjoy the code maybe it will help someone. Or maybe someone can take a look at it and see if there are things I can do better.
David Owens II
69 posts
A software engineer that enjoys living in enemy territory.
HashTables and Some Unexpected Code Performance
If you enable warnings the compiler will tell you. It's warning C6282 for MSVC.

https://msdn.microsoft.com/en-us/vstudio/esa6csd7(v=vs.111).aspx
27 posts
HashTables and Some Unexpected Code Performance
Thank you for that. I forgot about checking warning levels, that is something I need to do more often.
Dale Kim
22 posts
HashTables and Some Unexpected Code Performance
I would highly recommend using more items than 1000 if you want to check performance. 1000 really is a trivial number to the computer for all but the most horrible algorithms (or NP-hard problems).
27 posts
HashTables and Some Unexpected Code Performance
Edited by mojobojo on
DaleKim
I would highly recommend using more items than 1000 if you want to check performance. 1000 really is a trivial number to the computer for all but the most horrible algorithms (or NP-hard problems).

So I modified my code a bit to do what you suggested and boy did I see the differences. The naive approach actually took too long and after ten minutes I hit ctrl+c so time taken for the naive approach is > 10 minutes.

But wow, increasing the size of the hash table, barely had to wait for the other two to finish. There is also a cheap string generation I threw in there.

  1 2 3 4 5 6 7 8 9 10 D:\Development\HashTablesTest>build\HashTablesNaive.exe Time taken to add 1000000 values to table 0.180649 Iterations 1000000 ^C D:\Development\HashTablesTest>build\HashTablesInternalChaining.exe Time taken to add 1000000 values to table 0.300012 Iterations 10216664 Time taken to look up 1000000 values 0.113545 Iterations 10216664 D:\Development\HashTablesTest>build\HashTablesExternalChaining.exe Time taken to add 1000000 values to table 0.296565 Iterations 482547 Time taken to look up 1000000 values 0.066534 Iterations 842183 

EDIT: Went ahead and let the first one run and did something while it ran.

 1 2 3 D:\Development\HashTablesTest>build\HashTablesNaive.exe Time taken to add 1000000 values to table 0.171783 Iterations 1000000 Time taken to look up 1000000 values 1619.253833 Iterations 1784293664 

I am measuring in seconds so that is a full 26.987 minutes!