Even then, I’m not sure it’s worth doing. It would take quite a bit of extra work to cover all the relevant information:
- How to use the technique
- Why it works
- When it doesn’t work
- What the alternatives are (e.g. using a different hash function)
- When/why the alternatives are better
Also, it’s potentially better to handle the issue when it actually happens, since it’ll draw attention to which games have the incompatibility.
Although I’m not intending to add the information to the document, I’ll discuss the situation briefly here.
(It’s much easier to explain here in terms that people familiar with the scenario will understand than trying to explain it in a more accessible way.)
Consider this…
Presume game A and game B each have save data consisting of 10 bytes, both store that data at address 40, and both use the built-in hash.
Now, game B inserts 2 characters in front of their data, bringing the data size to 12 bytes. Note that the values of the 2 chars don’t actually matter at this point because suddenly games A and B have different sizes, which is (theoretically) enough to change the hash value.
The author of game A also decides to add 2 characters, bringing the sizes of game A’s data and game B’s data to the same value: 12 bytes. Let’s presume they both chose different values for these identifying char values. All is good.
Game C, however, uses 12 bytes of data and stores that data at address 40. (Can you see where this is going?)
Previously it didn’t clash with anything, but now it’s suddenly clashing with both game A and game B.
Suddenly games are chasing each other’s tails.
(Granted it’s statistically very unlikely, but statistically unlikely scenarios must be considered.)
Now, rewind back to the start a moment. What would have happened if game B had decided to switch to another hash function?
It would no longer be clashing with A, and as long as A didn’t use the same hash function as B, A would also no longer clash with B. (The odds of two hash functions producing the same hash for the same data are probably quite tiny.)
If both games had switched to the same hash function, one of them would have to change again, but they don’t run the risk of getting game C involved because the size of the data didn’t change.
While I think of it, I should point out that the very act of a game’s data increasing in size will actually invalidate existing save data and will run the risk of clashing with other games that were previously using the same address but a different size of data.
Incidentally, I actually future-proofed Minesweeper by building in a mechanism that would allow save data to grow and shrink as necessary without invalidating the hash value.
(I get the feeling you’re going to ask if that could be added to Arduboy2EEPROM and then I’m going to regret telling you… ( -_-') )
Hopefully I’ve just explained why inserting characters is not a complete solution.
Granted, neither is using a different hash really.
I can think of half a dozen other approaches. My favourite is probably to exclusive-or the hash value with a randomly chosen per-game or per-user identifier, since that seems to combine the best of both worlds. However, pretty much everything is going to fall prey to some statistical anomaly somewhere down the line, and I’m not really a good enough mathematician to statistically prove which occurrences are more likely; I can only prove by logic and example.
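As a rough sketch of the exclusive-or idea (not code from any library): the following assumes a 32-bit `HashType` and uses FNV-1a purely as a stand-in for whatever hash a game already uses. The names `fnv1a`, `identifiedHash` and `isOwnSave` are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: a 32-bit hash type. The real width depends on the hash in use.
using HashType = std::uint32_t;

// Stand-in hash (32-bit FNV-1a); any existing hash function could take its place.
inline HashType fnv1a(const std::uint8_t * data, std::size_t size)
{
    HashType hash = 2166136261u;
    for (std::size_t index = 0; index < size; ++index)
    {
        hash ^= data[index];
        hash *= 16777619u;
    }
    return hash;
}

// Combines the data's hash with a randomly chosen per-game (or per-user) identifier.
inline HashType identifiedHash(const std::uint8_t * data, std::size_t size, HashType identifier)
{
    return fnv1a(data, size) ^ identifier;
}

// A save is accepted only if the stored hash matches both the data and this game's identifier.
inline bool isOwnSave(HashType storedHash, const std::uint8_t * data, std::size_t size, HashType identifier)
{
    return storedHash == identifiedHash(data, size, identifier);
}
```

With this arrangement, two games with identical data, size and address still store different hash values as long as their identifiers differ, and the size of the save data never has to change. Two games picking the same identifier would of course be back to square one.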
More accurately, Filmote started doing it for his games and then taught several other people to do it.
Granted it was better than not doing it because it at least detects whether the game has actually been run before, but in the grand scheme of things it’s not a massive improvement.
It’s also a great way to encourage people to copy and paste code without attempting to understand what the code is doing, which is frankly just setting them up for a fall later on, and it allows the no-effort script kiddies to prosper.
The code I wrote intentionally requires more than just the code presented to actually do anything useful. It showcases the important parts such that someone who understands the features used could work out what’s going on and adapt it, but it won’t work through copy and pasting alone, it needs to be modified and integrated.
Well that was easier than I was expecting.
To clarify, if the size of `HashType` or the `hash` implementation ever changed, that would be a breaking change, hence I had to give an example of a change that would technically not be a ‘breaking’ change.
(The more I think about it, the more I think perhaps I should go ahead with the `struct` approach just so there’s no ambiguity as to what the interface/API is supposed to be…)
This is true, but it’s the price that must be paid.
You could do some crazy attempt at wear levelling by moving the data around each time, but that’ll just end up occupying more of the already scarce EEPROM.
Besides which, some games will have save patterns like that anyway. A simple scoreboard might not, but an RPG is potentially going to be overwriting the player coordinates every time. Anything attempting some kind of time stamp will definitely be overwriting that every time.
It could possibly be mitigated by using 4×8-bit hashes or 2×16-bit hashes, and treating the data as 2 to 4 streams of parallel bytes, which would theoretically make certain bytes of the hash change less frequently, but I would expect that to weaken the hash (though I have no clue by what factor).
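To make the 4×8-bit variant a bit more concrete, here is a minimal sketch in which byte `index` of the data only ever feeds lane `index % 4`. Everything in it is an assumption made for illustration: `LaneHash` and `hashLanes` are hypothetical names, and the seed values and the multiplier are arbitrary rather than a vetted 8-bit hash.

```cpp
#include <cstddef>
#include <cstdint>

// Four independent 8-bit hashes, stored as four separate bytes.
struct LaneHash
{
    std::uint8_t lanes[4];
};

// Treats the data as four interleaved byte streams, one stream per lane.
inline LaneHash hashLanes(const std::uint8_t * data, std::size_t size)
{
    // Arbitrary non-zero starting values (an assumption, not a recommendation).
    LaneHash result = { { 0x9E, 0x37, 0x79, 0xB9 } };
    for (std::size_t index = 0; index < size; ++index)
    {
        std::uint8_t & lane = result.lanes[index % 4];
        // Mix the byte in, then multiply by an arbitrary odd constant, keeping the low 8 bits.
        lane = static_cast<std::uint8_t>((lane ^ data[index]) * 0x1Du);
    }
    return result;
}
```

Changing only the first byte of the data alters only `lanes[0]`, so, assuming the write routine skips bytes whose value hasn’t changed, the other three hash bytes wouldn’t be rewritten. The sketch says nothing about how much weaker this is than a single 32-bit hash.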