Adding Hashing Capabilities to EEPROM

Thanks – I appreciate the example implementation.
Although it’s still 2 bytes written to EEPROM rather than a RAM-based seed value passed to the hash function. Not sure there’s any way around that.

Out of curiosity, do you have a dislike for overloaded functions? Perhaps there’s a good reason to avoid them?

Edit: Never mind – the overloaded code gets ugly, especially with the extensive Doxygen comments… although it might be manageable if the prototypes were split out and we had foo.h and foo.cpp.

Well, I have satisfied my curiosity and now totally agree. Keep It Simple!

Yeah, in fairness you can’t really do that without basically reimplementing the hash function simply because of how the hash function works.

No, I have nothing against overloads.

What I do have is a dislike for adding too many functions, because every extra overload has to be documented and maintained; I’m the one who has to write that documentation, and either I or MLXXXp (or both of us) will be the one to maintain it.

Ultimately the use of a feature must outweigh the cost of adding the feature.

The cost of adding a feature includes writing, testing, documenting, and maintaining it.
When you add that up, even ten lines of code can become quite expensive.


Hrm, that’s not quite what I thought you meant.

I thought you meant trying to insert an extra value into the sequence:

static HashType hash(const unsigned char * data, size_t size, uint8_t extra)
{
	HashType value = size;
	
	value = (((value << 5) ^ (value >> 27)) ^ extra);
	
	for(size_t index = 0; index < size; ++index)
		value = (((value << 5) ^ (value >> 27)) ^ data[index]);
	
	return value;
}

I’m not sure if replacing the initial value with a user-defined one would invalidate any of the properties of the hash. Sometimes it’s only a convenient way of having a non-zero initial value, other times it’s important. I don’t really like messing around with hash algorithms because even the slightest thing can change their characteristics.

By the way, your version would violate one of the guarantees I made for the seedless version:

If size is 0, the returned hash code will also be 0.

For your version the guarantee would have to change to:

If size is 0, the returned hash code will be the value of the seed (the extra argument).
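
A quick way to see that in code (a sketch, assuming the seeded hash from the snippet above is in scope and that HashType is an unsigned 32-bit type):

#include <cassert>
#include <cstddef>

int main()
{
	// With size == 0, value starts at 0, the rotation leaves it at 0,
	// and the XOR with the seed makes the result the seed itself.
	assert(hash(nullptr, 0, 42) == 42);
}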

I get the impression you didn’t realise how many functions would have to be overloaded.

Technically there’s a way to avoid needing the extra overloads for the templated functions, but it would be detrimental to the clarity of the documentation. It’s much better to have individually documented overloads.

It’s technically possible to shunt the Doxygen comments into a separate file, but I’m not entirely sure how, and someone still has to write the documentation.

You can only do that for non-template functions.

Template function definitions must remain accessible from the header because the compiler needs to know the full definition to be able to instantiate the function if requested to do so.

(For more detail see: “Why can’t I separate the definition of my template class from its declaration and put it inside a .cpp file?”.)

You can technically declare the template functions and then define them later on, so you could shunt the definitions into another header, but I try to avoid doing that unless there’s a tangible benefit.
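
For illustration, the kind of split that makes possible (hypothetical file and function names):

// foo.h - declares the template, then pulls the definitions in at the end.
template<typename Type>
Type doubled(Type value);

#include "fooDefinitions.h"

// fooDefinitions.h - still a header, because the compiler must be able
// to see the full definition wherever the template is instantiated.
template<typename Type>
Type doubled(Type value)
{
	return (value * 2);
}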


– My instinct here is that it’s a cheap way to seed the hash. I should probably read up on the actual function… here. Chapter 6.4, p.513. (That was the cited source… but I haven’t found it yet!?)

Update: From my skim reading, I haven’t found the algorithm in the cited source. Perhaps it’s not by DEK? Or it’s only in the 1st edition? However I did find this ~
Here, under the heading ‘A common weakness’.
Here, under ‘Hashing Sequences of Characters’…
This is a nice explanation of how the shift parameters were tuned for ASCII text. I’m not sure if this reduces its merit as a general hash…

It was cited as “under the topic of sorting and search chapter 6.4.”, no page given.

I’m beginning to suspect that it might not be.

Though if it were, it might be expressed in mathematical terms.
E.g. XOR might actually be expressed as addition modulo 2, and shifting might be expressed as multiplication or division by a power of 2.
That would make it harder to recognise.

That C code looks older than I am! It’s using pre-ANSI parameter syntax.
That makes it both invalid C++ and invalid C99 code.

I didn’t read the whole thing, but I did test its assertion about hash("EXXXXXB") == hash("AXXXXXC"), and it’s true:

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static uint hash(string key)
        {
            uint result = (uint)key.Length;

            for (int index = 0; index < key.Length; ++index)
                result = ((result << 5) ^ (result >> 27)) ^ (byte)key[index];

            return result;
        }

        static void Main(string[] args)
        {
            var keys = new string[] { "EXXXXXB", "AXXXXXC" };

            foreach (var key in keys)
                Console.WriteLine("{0} = {1:X}", key, hash(key));

            Console.ReadKey();
        }
    }
}

However, those Xs aren’t accidental - those intermediate values are likely to affect the hash.

Yes, it claims it’s a CRC, which could be correct.
(I’ve never found a satisfactory answer to the question of how CRCs work.)

I’ve got plenty of other hashes to choose from if this one is duff.

Though as you can see, doing it ‘properly’ isn’t easy.
Really it’s a job for a mathematician, which I’m certainly not.
(My highest maths qualification is a B at GCSE level.)

Potentially.

ASCII characters will always have a 0 in bit 7 by definition, and the frequency of values is likely to gravitate towards the letters, space, and digits, with some symbols and many unprintable characters never being encountered.

Arduboy games would not have those restrictions on their save data.
Without an analysis, it would be hard to say exactly what byte values might be expected from Arduboy save data, so I would just presume ‘anything and everything’.


By the way, now that @Mr.Blinky is back I’m going to try to give him some time to catch up and see what his thoughts are on what’s been said in regards to whether the API is suitable for the FX, so I may not respond to future replies for a little while.


One small thing:
I’ve run the only test I really know how to do with a hash, and resolved to replace the supposed ‘Knuth’ hash with Jon Skeet’s hash because it appears to exhibit superior properties.

I can provide a demo program on request, but you’ll need over 4GB spare to run it because it allocates a 4GB array. (Tricking the compiler into allowing me to allocate an array of that size took a bit of fiddling. I’m still not completely sure why some ways work and others don’t.)


I’ve not had time to try, but I found there is a popular hash test suite ‘smhasher’ (like ‘diehard’). Also some more functions here and here.

Is it Jon Skeet’s post on Stack Overflow (using magic primes and multiply)?
If so, its source is ‘Effective Java: Second Edition’, 2008, p.47 onwards, by Joshua Bloch.

I dread to think how long that will take to run if my tests were anything to go by. Though perhaps they’re less brute force.

Either way, I’m not sure it’s worth it, and I’m already convinced that the Skeet hash is the right hash for the job.

Yes that one.

Not quite. You’re only looking at the first implementation, which uses some pretty minuscule primes.

Later on in the answer Skeet provides a second one, implied to be of his own devising (hence I named it ‘Skeet hash’, not ‘Bloch hash’) with some much larger numbers, which is what I’ve been testing.


Basically my test consists of (a C++ sketch of the whole test follows the list):

  • Allocate a 4GB array of bytes, all set to zero
    • Effectively uint8_t array[4294967296], but I have to dynamically allocate it and jump through some hoops to get the compiler to not reject it.
  • For x = 0 to 4294967295:
    • value = hash(x);
    • if(array[value] < 255)
      • ++array[value];
  • Declare uint64_t count[256];
  • Declare uint64_t total = 0;
  • For x = 0 to 4294967295:
    • ++count[array[x]];
    • total += array[x];
    • if(total >= 4294967296)
      • break; - exit early once every input is accounted for, since all the remaining entries must be 0
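
For reference, here’s roughly what that looks like as actual code (a C++ sketch, assuming a uint32_t hash(const unsigned char *, size_t) function and a 64-bit host; std::vector stands in for the hoop-jumping, and the early exit is left out for simplicity):

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

void testHash(uint32_t (*hash)(const unsigned char *, size_t))
{
	// One saturating counter per possible 32-bit output value.
	std::vector<uint8_t> array(4294967296ull, 0);

	// Hash every possible 4-byte input and tally the outputs.
	for(uint64_t x = 0; x < 4294967296ull; ++x)
	{
		const uint32_t input = static_cast<uint32_t>(x);

		unsigned char data[sizeof(input)];
		std::memcpy(data, &input, sizeof(input));

		const uint32_t value = hash(data, sizeof(input));

		if(array[value] < 255)
			++array[value];
	}

	// count[n] is how many output values occurred exactly n times,
	// with everything at 255 or above lumped together.
	uint64_t count[256] {};

	for(uint64_t x = 0; x < 4294967296ull; ++x)
		++count[array[x]];

	for(std::size_t index = 0; index < 256; ++index)
		std::printf("%zu = %llu\n", index, static_cast<unsigned long long>(count[index]));
}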

I ran that on both the Knuth hash and the Skeet hash.

The Knuth hash spat out this:
Knuth:
0 = 4286578688
1 through 254 = 0
255 = 8388608

Which basically means the hash only ever produces 8388608 of the 4294967296 possible values, and every value it does produce clashes at least 255 times (512 times on average).

(I don’t have the hard drive space to spare to give you the exact distribution, a text representation would be on the order of tens of gigabytes, and it would likely take a few hours to churn that out.)

The Skeet hash on the other hand:

Skeet:
0 = 0
1 = 4294967296

Literal perfection. This is what no collisions looks like: every one of the 4294967296 possible outputs is produced exactly once.
For 4-byte inputs, the Skeet hash is bijective, whereas the Knuth hash is neither injective nor surjective.

The reason I thought to do this test is because it’s what I did many moons ago to the xorshift hash I nicknamed ‘SkullHash’ (which also got a perfect result) - the one used in Minesweeper.

(I actually tried to do it in Haskell, did a terrible job of it and ended up sending my computer into a paging panic. On the bright side, I’ll never again forget to check whether I’m using the lazy or strict foldl.)

At any rate, if you want to run the smhasher tests and look for alternatives feel free to, but I am 100% sold on the benefits of the Skeet hash, and intend to use it to replace the Knuth hash.

// The body of the Skeet hash, where pointer and size
// describe the block of data being hashed:
HashType hash { 2166136261 };

for(size_t index = 0; index < size; ++index)
	hash = ((hash * 16777619) ^ pointer[index]);

return hash;

I can move the starting address and repack the game if needed. Where is it safe to save data?

Nowhere really … behind the couch is safest.

I quickly read the thread but I didn’t understand the problem. What could I do to avoid the issue?

Looks good. It makes sense to replace the original hash.

The new hash is the Fowler/Noll/Vo or ‘FNV-1’ hash (32-bit variant). The original authors later tweaked it slightly into the improved ‘FNV-1a’ hash: doing the XOR first was found to achieve better avalanche for small input data (<5 bytes) - which is relevant on the Arduboy (my version here). FNV-1a has been studied and widely implemented, so it seems a great choice. This analysis visually shows the good ‘random’ distribution achieved by the hash. It is possible that the ‘Murmur’ series may theoretically be better, but the added complexity would likely increase compiled size.
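
For anyone following along, the only difference between the two variants is the order of the multiply and the XOR inside the loop (a minimal sketch using the standard 32-bit FNV constants):

#include <cstddef>
#include <cstdint>

uint32_t fnv1(const unsigned char * data, size_t size)
{
	uint32_t value = 2166136261u; // FNV offset basis

	for(size_t index = 0; index < size; ++index)
		value = ((value * 16777619u) ^ data[index]); // FNV-1: multiply, then XOR

	return value;
}

uint32_t fnv1a(const unsigned char * data, size_t size)
{
	uint32_t value = 2166136261u;

	for(size_t index = 0; index < size; ++index)
		value = ((value ^ data[index]) * 16777619u); // FNV-1a: XOR, then multiply

	return value;
}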


Compiled size overhead

I’ve been curious how much these hash functions add to a game’s compiled size.
Previously I’ve investigated PRNGs and found any maths using uint32 significantly increased size on the 8-bit processor.
For testing, I used ‘OBS’ by PPOT, replacing the existing hash function. Size comparisons made to ‘vanilla’ version.

Results

  1. ‘Vanilla’ – 22640 bytes PROGMEM, 1627 bytes RAM (usage didn’t change for tests).
  2. Change checkSum from int16 to uint32. Switch to minimal hash function (hash = 0xFF). +82 bytes (22722).
  3. Use ‘DEK’ hash. +150 bytes (22790).
  4. Use ‘FNV-1’ hash (as @Pharap proposed). +170 bytes (22810).
  5. Use ‘FNV-1a’ hash. +170 bytes (22810) - smart compiler, no change.
  6. Use ‘FNV-1a’ hash using shifts and not multiply*. +170 bytes (22810) - smart compiler, no change.

* The original authors suggest the following:

// hash = hash * 16777619; // Can be replaced by ~
hash += (hash<<1) + (hash<<4) + (hash<<7) + (hash<<8) + (hash<<24);
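
(This identity holds because 16777619 = 2^24 + 2^8 + 2^7 + 2^4 + 2^1 + 2^0; the shift amounts are exactly the set bits of the FNV prime, so adding the shifted copies is the same as multiplying.)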

Using the standard Arduino IDE (AVR gcc with -Os by default), both compiled to the same code. Experimenting with https://godbolt.org/ (AVR gcc, no optimisation), the shift version produces lots of extra code. Please verify, but I think the simple multiply is clearer and just as efficient.

Looking through the compiled code for (4), many instructions are just dealing with the 4 bytes of the hash as expected: each byte separately, plus the associated overheads. It’s possible some steps could use 16-bit instructions (as per the assembler version of the sprite drawPlusMask).

A quick test with a uint16 hash version approximately halved the compiled code. However, I think it’s been agreed that a 4-byte / 32-bit hash is the minimum requirement, so it’s a dead end.

Conclusion – allow 150–200 bytes for using the new hash function in your code.
(For comparison, in a minimal program, adding the current EEPROM.read() to iterate over each byte adds ~24 bytes; EEPROM.update() adds ~28 bytes.)
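
For context, the kind of minimal loop being measured there might look something like this (a sketch using the stock Arduino EEPROM library; the function name and range are arbitrary):

#include <EEPROM.h>

// Sum every byte in a region of EEPROM.
// The EEPROM.read() call is what the ~24 byte figure refers to;
// a corresponding EEPROM.update() loop costs slightly more.
uint8_t sumRegion(int start, int size)
{
	uint8_t total = 0;

	for(int index = 0; index < size; ++index)
		total += EEPROM.read(start + index);

	return total;
}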

For a quick fix, it’s best to move your start address further along. startAddress = EEPROM_STORAGE_SPACE_START + 999

The best resource is now @filmote’s CartBuilder. Or the original data source is here. Find an address which is least occupied.

You may want to wait and use @Pharap’s improved EEPROM functions, when they are integrated into the standard ‘Arduboy2’ library…

So, I went through and fixed the EEPROM ranges of my games, as the spreadsheet had a few mistakes. I didn’t change the other source. As we are having enough trouble getting people to maintain them at all, I wonder if having multiple sources of truth is smart?


Hrm, thinking about it, after the modifications to make it per-byte, it is technically FNV, which raises some questions about some of the figures I ended up with in my original hash tests.

I tested FNV1 and FNV1a as separate entities from the Skeet hash and got different results, but theoretically they should have produced the same code.

Testing again, it seems the Skeet hash is actually producing the same code, which means the recorded size must be wrong; perhaps the figure I originally recorded was for a different hash function. My other figures still seem to be correct though.

Either way, I’ll switch to calling it FNV32, which is what I called it in my original tests. That means it’s the largest of the ones I tested.

I’m not sure that avalanche characteristics are necessarily going to be useful. The avalanche effect is desirable for cryptography where the goal is to make it impossible to guess the input from the output, but that’s not what the objective is here. The hash function only has to be good enough to detect data corruption.

That said, it doesn’t produce any more code, so I see no issue in swapping the operations.

Not worth it, way too big. I ruled that one out right at the start.

FNV32 and FNV32a are the same size not because the compiler’s smart, but because there’s nothing about swapping the order of operations that would make one operation larger than another.

As for shifts, for a number that large the compiler is going to be producing a multiplication algorithm anyway, because AVR’s mul only operates on 8-bit quantities, so it’ll likely produce either the same as what you’re doing or something better (and potentially optimise whatever you wrote).

It would do; the Arduboy doesn’t have a barrel shifter.
There’s a machine code instruction to shift left by precisely 1 bit, no more.
So when you shift by multiple bits, the compiler often generates a (possibly unrolled) loop of shift instructions.
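
For example, a 5-bit shift on an 8-bit value effectively has to become something like the following (an illustrative C++ rendering of the emitted instruction sequence, not actual compiler output):

#include <stdint.h>

uint8_t shiftLeft5(uint8_t value)
{
	// AVR's lsl instruction shifts left by exactly 1 bit,
	// so a shift by 5 becomes five single-bit shifts.
	for(uint8_t step = 0; step < 5; ++step)
		value = (value << 1);

	return value;
}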

It may be possible to write some better assembly code by hand.

Halving the size of the data resulted in half the number of instructions?
It’s almost as if this is an 8-bit CPU that operates on single bytes at a time. :P

Unfortunately that often is the case. I don’t know if the compiler just isn’t geared towards 8-bit CPUs or if there genuinely isn’t any way to do better (which is entirely possible, the few instructions that can deal with 16 bit quantities have some limitations).


I was hoping to just switch to the Skeet hash and be done, but as it’s producing 50 more bytes of progmem than originally thought, I’m probably going to have to go back and examine some of the other rejects. I really didn’t want to do that, because I was hoping to just pick the smallest seemingly decent hash, say ‘good enough’, and publish the damn library.

I’ll go away and hopefully come back with a tidied up version of my original test, and possibly my new 4GB test. But once this is over I’m never changing the hash again.

After this though, no more side treks or distractions or talk of adding two characters to the data or anything else. I’ve got too many other things to do.

When the hash is set in stone I’m going to finish writing the remaining documents, possibly create some ‘safe’ and ‘unsafe’ versions of the API (though safe/debug versions could be published later as separate libraries), and then have the damn thing published so I can get back to my other projects. It’s been nearly a month now and my patience is wearing thin.


Similar ball park to the numbers I found.
It adds 80–84 bytes simply going from int16 to uint32 for the hash. (So half the overhead is just handling the longer hash value, regardless of the algorithm.)
Your numbers show the difference between a terrible hash (summation) and a good hash (FNV32a) is 102 bytes. I found the difference between an OK hash (DEK) and a good hash (FNV32a) to be just 20 bytes.
For comparison, the stock random() function adds ~500+ bytes… and no one has complained.

– I think you have already found the smallest decent hash. I don’t think the ‘budget’ for progmem bytes (or even EEPROM bytes) was ever discussed; the objective was finding a minimal(ish) hash that was effective. Halving the size, but having something fail 10% of the time would be pretty worthless.

My 2¢ – Unless you are willing to go for a compromised 16-bit hash, stick with what you have. Most games can accommodate an extra 150–200 bytes.

IMHO, I think it’s very good and ready to be published. I’ve tried to be as critical as possible(!) and I’m sure other parties have examined it too. I don’t feel the safe/unsafe versions are necessary. The documentation is very thorough already.

You have produced something that was needed for a long time, but no one else was willing to undertake. Thank you!


Re. 16-bit vs 32-bit hash ~
Looking at existing EEPROM usage, the mean is ~98 bytes per game (skewed by some whales). The median is ~22 bytes.
Assuming a typical save will be 32 bytes, for a perfectly distributed hash of 4 bytes -
32 ‘pigeons’ / 4 pigeon holes = 8 pigeons per hole.

That’s why I feel a 32bit hash is about right. :dove:


First off, here is the revised version of my hash tests, including all the source code for the hash functions, mostly unaltered from the first time I did the test:

A few notes:

  • My original implementation of the ‘one at a time’ hash left out some processing that was supposed to occur at the end. This time I’ve added that in (the omitted step is shown after this list) and created a second class that leaves it out, to see what the difference is between the ‘proper’ version and the trimmed-down version.
  • The testing code only tests the 32-bit hashes, though I did include the code for some smaller hash functions such as Fletcher-16 and one’s complement, since those were in the original tests.
  • I’ve added some links to the sources I used for the code of each hash.
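
For reference, the end-of-input processing in question is the standard finalisation step of Bob Jenkins’s ‘one at a time’ hash:

#include <stdint.h>

uint32_t finalise(uint32_t value)
{
	// The final avalanche mix, applied once after all input bytes
	// have been consumed.
	value += (value << 3);
	value ^= (value >> 11);
	value += (value << 15);

	return value;
}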

I may run some collision tests later when I’ve got the RAM and time to spare. Though I may start by running a different test that exits on the first collision, just to partition the hashes into the ‘perfect’ and ‘imperfect’ hashes, rather than doing the full test which establishes just how imperfect the imperfect hashes are.

(I probably should have done a run with [[gnu::noinline]] applied to all the functions, just to see how things look without inlining and optimisation.)

As I joked before, you probably can’t expect any better from an 8-bit processor.

Its registers are 8 bits in size and most of the functional instructions (e.g. adding, bitwise operations) only operate on 8-bit quantities. Only a small number of instructions operate on 16-bit values, and most of those are just load/store instructions from what I remember.

My current values (which match my original statistics; it’s only the Skeet hash that was somehow wrong) indicate the following compiled sizes, in bytes:

  • 32-bit summation: 6706
  • Knuth hash: 6742
  • FNV32a: 6808

For the full set of statistics, see here:

The FNV duet is the second largest.

True.

The stock random implementation is probably more complicated than anyone really needs, to be honest. I’m never quite sure what algorithm it actually is, but I’ve looked at the source before and it’s quite substantial.

However, not every game needs random numbers.
EEPROM is probably more widely used than random numbers.

I have found a really small, reasonable-but-flawed hash, and a larger but better-quality hash.

The actual characteristics of the others are still unknown, and one of the smaller ones could be just as good.

FNV32a shall be the fallback if I don’t get a chance to run range tests on the others, though as I’m having to change anyway I may well test the rest when I have time.

(How many I can test at once depends on how many of my other programs I’m willing to close down, with the minimum being 0 and maximum being 3.)

I’m not going to be doing speed tests at all, because speed isn’t a priority.

There wasn’t really a ‘budget’; I was just aiming to go as small as possible to appease the people who care about how much memory this will use.

Having collisions doesn’t necessarily equate to false positives; it depends more on how much of the data would have to change to produce the same hash value.

It’s always going to be possible for two blocks of data to produce the same hash, it’s just a question of how likely a collision is.

Except the ones written by Filmote. :P

Only Roman I think.
Mr.Blinky has been away and hasn’t had a chance to get caught up yet.

MLXXXp has said he’s not really interested in the actual implementation and is just going to trust me to do a good job, though he did review the documentation.

Hrm, I may end up publishing the ‘safe’ version later as a separate library.

Ideally it would be something only used in debug builds anyway, so people could eliminate the bugs before publishing, and then switch to the ‘unsafe’ one when they’re confident that there aren’t any bugs (or, if they can afford to, just leave the debug version in, in case something was missed during testing).

I’d still like at least a guide for implementers, because half the point of this library isn’t just to act as an EEPROM library for the Arduboy; it’s supposed to be a general interface that other implementations could provide.

In essence, the API should follow the Liskov substitution principle - that any implementation should be able to be substituted for the original and provide the same service. The implementation guide would outline the rules that implementations must follow and provide some example implementations for comparison.
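
As a purely hypothetical illustration of that idea (this is the shape of the contract, not the actual API):

#include <stddef.h>
#include <stdint.h>

// Any implementation that provides these operations with the documented
// semantics should be substitutable for any other implementation.
class EepromInterface
{
public:
	virtual ~EepromInterface() = default;

	// Read size bytes, starting at address, into destination.
	virtual void read(uintptr_t address, unsigned char * destination, size_t size) = 0;

	// Write size bytes from source, starting at address.
	virtual void write(uintptr_t address, const unsigned char * source, size_t size) = 0;
};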

Filmote attempted, at least. If I hadn’t barged in, I’m sure he would have made one.


This is a trick I play on @vampirics all the time. I draw my best graphics - they look crap - and his inner OCD gets the better of him and he fixes them. :slight_smile:


I was hoping to have this all done by the end of last month, but I’ve been particularly busy lately, and I’m still partly waiting on one more bit of feedback from one person.

If I get time to actually sit down and test it to prove that the implementation works then I may just drop the other line of inquiry and run with it as it is.

(If someone else wants to offer to write a test or adapt a game to use the new API as a means of proving it works, that would save me a bit of work.)

1 Like