Shared EEPROM storage management across multiple apps


(Scott) #21

I would like to see the API defined well enough, and in such a way, that any conceivable (within limits, of course) back-end library could be used without any changes to the sketch, except minor changes to a user EEPROM specific header file included with the sketch.

The contents of this header file may differ depending on the back-end used. This header file would have a specific standard name, such as localEEPROM.h. It would contain information needed to locate or create the sketch’s reserved area in EEPROM. In order to have different library names for each back-end, so they could co-exist under the Arduino libraries folder, a #define, which I’ll call EEPROM_LIBRARY_HEADER, giving the name of the library header file, would be included. The localEEPROM.h file would be based on a standard template for a given back-end and be changed as little as necessary by the sketch developer or user or possibly an external management program, to uniquely identify the sketch per the back-end requirements.

localEEPROM.h would also include a #define for the number of EEPROM bytes required by the sketch, which I’ll call SKETCH_EEPROM_SIZE. This define would be available for use by the sketch. It’s included in the header, instead of the sketch, to allow the possibility of an external management program to easily use it.

A single class will be used and will have the same name for all back-ends. I’ll call it ArduboyEeprom.

The API will require a function to allocate or locate the sketch’s reserved space. I’ll call it open(). The parameter types passed to this function could vary between libraries. For instance, @Dreamer3 has proposed an 8 byte character string which is searched for, and a length, whereas I’d use a fixed EEPROM location, with a 16 bit user ID and an 8 bit sketch ID for verification. I would possibly also require a length but only for additional verification. To avoid having to alter the function arguments within the sketch when changing libraries, the function call would be defined by a #define in the header file. I’ll call this define EEPROM_OPEN.

An example for localEEPROM.h for a back-end library called eepromMLXXXp that uses my proposed technique, in which the user, or an offline management program, must manually set the start of each sketch’s EEPROM area such that they don’t overlap:

// Number of EEPROM bytes required by this sketch
#define SKETCH_EEPROM_SIZE 25

// This is the start address for the block of EEPROM space used by this sketch.
// This must be set so that areas used by all desired sketches don't overlap.
#define MLXXXP_EEPROM_START 0x0042

// The combination of user ID and sketch ID must be unique amongst all sketches using EEPROM.
// Change these as necessary.
#define ARDUBOY_USER_ID 1234
#define USER_SKETCH_ID 1

// ========== DO NOT CHANGE ANYTHING BEYOND THIS POINT ==========
#define EEPROM_LIBRARY_HEADER eepromMLXXXp.h
#define EEPROM_OPEN open(MLXXXP_EEPROM_START, ARDUBOY_USER_ID, USER_SKETCH_ID, SKETCH_EEPROM_SIZE)

The sketch, using any EEPROM library, would include:

#include "localEEPROM.h"
#include <EEPROM_LIBRARY_HEADER>

ArduboyEeprom savedData; 

void setup() {
  if (savedData.EEPROM_OPEN != 0) {
    // (handle error conditions here)
  }
  // (rest of setup here)
}

Note that I propose that the open() function should return a status code as follows:

  • 0 to indicate that the area had been previously allocated and was OK.
  • 1 to indicate that a new area was successfully allocated but may need to be set to initial values by the sketch. A #define, such as EEPROM_ALLOCATED, should be added to the library header file for this, so the sketch doesn’t use a hard coded value, in case we want to change it in the future.
  • Negative numbers indicate an error, such as not enough space available. We’ll have to decide what possible error conditions could occur and create a #define for each of them.
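As a rough illustration of how these status codes could fit together with the fixed-location technique, here is a minimal sketch that runs against a plain array standing in for EEPROM. The names EEPROM_OK, EEPROM_ERR_NO_SPACE, fakeEeprom and openArea, the 1024 byte size, and the 3 byte ID header layout are all assumptions made for this example. They are not taken from any existing library.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical status codes. Only EEPROM_ALLOCATED is named in the proposal.
#define EEPROM_OK            0   // area existed already and the IDs matched
#define EEPROM_ALLOCATED     1   // area newly claimed; sketch should initialise it
#define EEPROM_ERR_NO_SPACE -1   // requested area does not fit

// Stand-in for the real EEPROM so the example can run anywhere (size assumed).
static const size_t FAKE_EEPROM_SIZE = 1024;
static uint8_t fakeEeprom[FAKE_EEPROM_SIZE];

// Assumed layout at 'start': 2 bytes user ID, 1 byte sketch ID, then the data.
static const size_t ID_HEADER_SIZE = 3;

int openArea(unsigned int start, uint16_t userId, uint8_t sketchId, size_t size) {
  // Reject areas that would run past the end of the device.
  if (start >= FAKE_EEPROM_SIZE) return EEPROM_ERR_NO_SPACE;
  const size_t room = FAKE_EEPROM_SIZE - start;
  if (room < ID_HEADER_SIZE || size > room - ID_HEADER_SIZE) {
    return EEPROM_ERR_NO_SPACE;
  }

  uint8_t *header = fakeEeprom + start;
  const uint8_t idLow = static_cast<uint8_t>(userId & 0xFF);
  const uint8_t idHigh = static_cast<uint8_t>(userId >> 8);

  // IDs already present: the area was claimed by this sketch earlier.
  if (header[0] == idLow && header[1] == idHigh && header[2] == sketchId) {
    return EEPROM_OK;
  }

  // Otherwise claim the area by writing the IDs.
  header[0] = idLow;
  header[1] = idHigh;
  header[2] = sketchId;
  return EEPROM_ALLOCATED;
}
```

With this layout, a sketch would treat a return of EEPROM_ALLOCATED as a cue to write its initial values, and any negative value as a reason to disable saving.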

If I’ve got things above correct, then only the localEEPROM.h file would need to be changed in all sketch folders to use a different library. The template for this file would be included with the library. You wouldn’t have to touch the .ino or other files.

For instance, localEEPROM.h for a manager similar to what @Dreamer3 proposed would look something like:

// Number of EEPROM bytes required by this sketch
#define SKETCH_EEPROM_SIZE 25

// Name of EEPROM block. Maximum 8 characters not including quotes.
// Must be different for all sketches using EEPROM.
#define DREAMER3_EEPROM_NAME "MyGame01"

// ========== DO NOT CHANGE ANYTHING BEYOND THIS POINT ==========
#define EEPROM_LIBRARY_HEADER eepromDreamer3.h
#define EEPROM_OPEN open(DREAMER3_EEPROM_NAME, SKETCH_EEPROM_SIZE)

As already discussed, there should be read byte, write byte and probably read block and write block functions. Except for read byte, I would have them all return a good or bad status. This could be used by back-ends that were capable of bounds checking. It would be nice to have the read byte function return a status as well but I think this complicates it too much. If a sketch wanted checking for a byte read, it could use a block read with a length of 1. I’ve used unsigned int for the address (actually an offset) for better portability.

uint8_t read(unsigned int address); //< read a single byte
boolean write(unsigned int address, uint8_t data); //< write a single byte
boolean read(unsigned int address, uint8_t *buffer, size_t size); //< read a block of bytes
boolean write(unsigned int address, uint8_t *buffer, size_t size); //< write a block of bytes
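
To make the bounds checking idea concrete, here is a minimal sketch of these four functions for a back-end whose reserved area is just an array. The class name ArrayBackEnd and the fixed 25 byte area are assumptions for the example. It also uses bool in place of Arduino's boolean, and a const buffer for the block write, so that it compiles outside the Arduino IDE.

```cpp
#include <stdint.h>
#include <stddef.h>
#include <string.h>

class ArrayBackEnd {
 public:
  static const size_t AREA_SIZE = 25;  // assumed SKETCH_EEPROM_SIZE

  // Single byte read with no status. Out of range reads return 0 here,
  // which is an arbitrary choice for this example.
  uint8_t read(unsigned int address) const {
    return (address < AREA_SIZE) ? area[address] : 0;
  }

  // Single byte write. Returns false if the offset is outside the area.
  bool write(unsigned int address, uint8_t data) {
    if (address >= AREA_SIZE) return false;
    area[address] = data;
    return true;
  }

  // Block read. Returns false if any part of the block is outside the area.
  bool read(unsigned int address, uint8_t *buffer, size_t size) const {
    if (!inRange(address, size)) return false;
    memcpy(buffer, area + address, size);
    return true;
  }

  // Block write. Returns false if any part of the block is outside the area.
  bool write(unsigned int address, const uint8_t *buffer, size_t size) {
    if (!inRange(address, size)) return false;
    memcpy(area + address, buffer, size);
    return true;
  }

 private:
  // True when [address, address + size) lies inside the area.
  // Written as a subtraction so that it cannot overflow.
  static bool inRange(unsigned int address, size_t size) {
    return address <= AREA_SIZE && size <= AREA_SIZE - address;
  }

  uint8_t area[AREA_SIZE] = {};
};
```

A sketch that wants a checked byte read can call the block read with a size of 1, as suggested above.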

We need to discuss whether a function is required to free up allocated space. It’s kind of a waste to include code to free up EEPROM in the sketch that uses it. Depending on the specific back-end implementation, it may be possible to have the library include its own specific sketch which would be temporarily loaded to free up EEPROM, using a menu system or some other means.

Upon further thought, I don’t think I’d include a verify function. If the saved data gets corrupted, in most cases I think it would just show up as weird high scores or strange game play. And, as I’ve previously mentioned, a properly written sketch should include a means for the user to reset saved data. The Arduboy is just a simple gaming system with not much more than disappointment at stake if data is lost.

Finally, although I’ve used EEPROM in many ways in this proposal, we may want to use more generic terms to refer to non-volatile storage. It’s possible that, on future hardware, data could be stored somewhere else, such as on an SD card, but still be able to use this API with a different back-end. In such a case, having the term EEPROM as part of defines, etc., could be confusing. Even the title of this thread could be broadened to “Shared non-volatile storage management across multiple apps”.


(Josh Goebel) #22

Pretty sure you can’t do that type of magic with headers (use defines in them)… and even if so I’m pretty sure the Arduino IDE isn’t going to like multiple libraries with exactly the same name. I need to read this all again, though; these notes are just off the top of my head.


(Josh Goebel) #23

Without checksums I’d not know what purpose this would serve… EEPROM library is just a small wrapper around the CPU instructions… open() (or whatever) should verify the space requirements and fail or succeed… after that there really is no such thing as a “failed” read or write… they will “just work”. I guess if you did dynamic allocation you might have issues, but in that case I think the allocate (or open) method would again return an error, not the actual read or write… with this type of storage you want to know right upfront if there are any issues, not in the middle of a save.

I’m not interested in thinking about anything other than EEPROM because other things are not similar enough IMHO. SD is likely going to have a FAT file system etc, not per-byte allocation, etc. And larger EEPROM stores are just that, larger EEPROM stores.

The way you’re wanting to go about things the entire idea of “Freeing” space doesn’t make any sense since you want to reserve space per-app instead of an allocation table as I proposed.


(Scott) #24

True, the voodoo magic that the IDE does to pre-process a sketch can prevent this kind of thing from working. However, I’ve realised that the better way is just to put the #include for the library header file directly in the localEEPROM.h file.

I’ve created a fully functional “proof of concept” library, and included an example test program based on my proposals. I’ve put it on GitHub at:
https://github.com/MLXXXp/eepromMLXXXp

I decided that begin would be a better name than open for the initialisation function. It seems to be more common in the Arduino world, such as Serial.begin(). Otherwise everything else pretty much follows my API proposal.

For now, I have an ARDUBOY_EEPROM_RESERVED define in the library header file to define space allocated at the beginning of EEPROM for system use. Hopefully in the future there will be some way of obtaining this information from the Arduboy library or elsewhere.

In responding to your remarks below, keep in mind that my proposed API is intended to allow any number of back-end management techniques to be used without needing to modify the sketch, except for the single back-end dependent localEEPROM.h file. I’ve tried to accommodate your linked block method and my static mapping method, as well as other conceivable techniques and possibly different storage media.

Having a pass/fail return value for reads and writes could indicate out of range address values or block lengths passed to the functions. My library does this. It may help when debugging a sketch.

I think it would be a good idea for all back-ends to do bounds checking, at least on writes. You don’t want a misbehaving sketch to accidentally clobber the data areas of other sketches or the system save area. Also, who’s to say someone won’t want to write a back-end that does implement checksums?

There’s no requirement for sketches to actually use the return values. Even though my library provides return values, my example sketch currently doesn’t test them for reads or writes.

I see the API as providing a generic method of saving and retrieving data, that is retained across power cycles and the unloading/loading of a sketch. This data would be of fairly limited size, intended for saving high scores, game states, inventories, etc., not entire game level maps or the like. Whether the underlying media is EEPROM or something else in the future shouldn’t concern the sketch developer. The goal is that just by adding a new library and using a different localEEPROM.h file, the sketch will continue to work as before with the new technique/media (provided the program still remains small enough to fit when re-compiled).

For example, even if an SD card uses a FAT file system, a back-end could be written that stores each sketch’s non-volatile data in a separate file. The localEEPROM.h file for that back-end would have a define for the name of the file. The same abstracted begin function would open the file and the read and write functions would index to bytes within the file. Again, this is just for saving scores, etc. Other APIs and libraries could be developed for game levels or other uses of an SD card.

True, my technique doesn’t need a free function but your proposed technique could use one and I want the API to be universal. If we have a free, my library might do nothing except return success, but it could also clear the user ID and sketch ID, so that the next begin would return an allocated status.

My question involving a free function in the API is whether a sketch itself would ever use it. If a sketch is loaded, it would always want storage available to it. I’ve already said a sketch should have its own function to reset its storage but would it ever want to free it up? The only reason would be to release space for use by other sketches because you no longer want to use this sketch, but why waste program space in the sketch itself to do this?

If we include a free function it would probably be best if it was used by some kind of global EEPROM management utility sketch that you would load when freeing up space was desired. Because such a sketch would need to somehow be told what area to free, and because how that was specified would be back-end dependent, there would have to be a separate utility written for each back-end and included along with the library. Since this utility is back-end specific, it could access EEPROM directly to perform the free, thus eliminating the need for such a function in the API.

Then again, having a free function available could allow a sketch with sufficient free program space to provide its own data release capability, thus preventing the need to load the utility before loading the sketch that needs the storage.


(Josh Goebel) #25

Your implementation looks nice for a statically allocated EEPROM segment. I think we’ll need more games on the platform to see how this whole issue shakes out long-term.

To me the biggest argument against static address allocation is how to deal with variable sized data allocation. If one wants to install a ROM that uses 500 bytes of storage that is going to rule out installing tons of other programs just because it’s likely to use “their” storage areas. I suppose you could say someone could manually tune those settings for every program they install, but that’s just math - and math is why we invented computers. :smile:

And you still need a defragger for when someone gets that 500 byte chunk “stuck” in the middle then wants to play another 300 byte game.

Fun, fun.


(Scott) #26

@Dreamer3,
Your observations and criticisms regarding my EEPROM management method are acknowledged and I fully agree with you. It was an attempt to handle multiple sketch EEPROM areas in a simple way, with a small program footprint. But, it requires manual management and suffers from the fragmentation and other problems you mentioned. If it fails to become accepted or popular due to its limitations, I wouldn’t be disappointed.

Your method, using searchable chained blocks, is elegant and would work quite well. Its drawbacks are that it has a larger storage overhead, reducing the usable EEPROM space, and it will also use more program space. If the “powers that be” decide that these resource impacts are acceptable, then I say “go for it”.

Again, the reason for adopting an API with universal methods of isolating the sketch code from the back-end is so that either one could be adopted, and switched back and forth quite quickly, with little or no changes to the sketch.

I’ll note that one reason that my API proposal tries to contain all EEPROM management related variables in a single local file (localEEPROM.h), is so that an external script or program could be written to make management of back-ends like mine easier. I envision that a user would place all of the sketches that they like to use under a single folder (such as MyArduboySketches). A utility, that runs on the computer containing the sketchbook, could walk through and parse all the localEEPROM.h files found under that folder, and adjust the start addresses in the files, based on the SKETCH_EEPROM_SIZE values it found. This utility could be made interactive, as necessary. It could also be possible that a companion Arduino sketch could be written that, based on data provided by output from this utility, would perform defragmentation and deallocation functions.
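
The address assignment step of such a utility could be quite simple. Below is a minimal sketch of that step only. It assumes the sizes have already been parsed out of each localEEPROM.h, that areas are packed back to back after a reserved system area, and that each size is the full footprint of the area. The function name assignStartAddresses, the 1024 byte total and the 16 byte reserved area are made up for this example. Parsing and rewriting the header files is left out.

```cpp
#include <stddef.h>
#include <vector>

static const size_t TOTAL_EEPROM_SIZE = 1024;  // assumed device capacity
static const size_t RESERVED_FOR_SYSTEM = 16;  // assumed stand-in for ARDUBOY_EEPROM_RESERVED

// Packs the areas one after another, starting after the reserved system area.
// On success, starts[i] is the start address for the sketch needing sizes[i]
// bytes. Returns false, leaving 'starts' empty, if the areas do not all fit.
bool assignStartAddresses(const std::vector<size_t> &sizes,
                          std::vector<size_t> &starts) {
  starts.clear();
  size_t next = RESERVED_FOR_SYSTEM;
  for (size_t size : sizes) {
    if (size > TOTAL_EEPROM_SIZE - next) {
      starts.clear();
      return false;
    }
    starts.push_back(next);
    next += size;
  }
  return true;
}
```

Because every area is reassigned from scratch on each run, removing a sketch from the folder and rerunning the utility closes any gap it left. Data already stored on the device would then no longer line up with the new addresses, which is where the companion defragmentation sketch would come in.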

Anyway, what do you and others think about my API design? Is it workable and something we should try to make “official”? Does it need more or less functionality? Should naming be changed to be more generic and not imply being strictly EEPROM targeted?

To further prove the concept, I’ve written a second back-end. This one allocates space in RAM using malloc, to simulate EEPROM. Obviously, the data won’t be non-volatile and will be lost after a power cycle or sketch change.

I wrote it just to test the portability of the API. However, it could be used during sketch development to prevent excessive writes to actual EEPROM, which has a limit to the number of writes before it starts to fail (specified as 100000). A sketch that accidentally got into a tight loop, writing changing values to the same EEPROM location, could cause it to fail in around 8 minutes.

With my arduboyEepromTest sketch (and presumably any other sketch) all you have to do is install the new library, copy in the localEEPROM.h file from it and, if necessary, change the value of the SKETCH_EEPROM_SIZE define. I’ve put this new EEPROM library on GitHub here:
https://github.com/MLXXXp/eepromInRAM

Switching the sketch back to using my eepromMLXXXp library is again just a matter of switching to the matching localEEPROM.h file.


(Josh Goebel) #27

Cute. Yeah I think we’ve identified the core methods for the API, as you’re showing by writing more implementations against other data stores (like RAM). So far what we have is generic enough to work with any randomly addressable data-source… as you’ve pointed out… I just don’t know that that means we need to hurry up and rename it to be something generic. :smile:


(Josh Goebel) #28

Thoughts:

Add a free() as suggested earlier to “back out” allocations, whether it be by 00 or FFing the static location or by rewriting the allocation table (in something like my case).

Add an is_allocated() that returns true/false if the EEPROM is allocated.

I think we need to consider making begin NOT do allocation. Technically (for many apps) I don’t need to allocate until I actually have data to save. That might be never if I never use the save game functionality… but begin() sounds like something you’d call in your app’s setup() block.

So I guess I’m thinking:

begin()
allocate()
is_allocated()
free()


(Scott) #29

Separating allocate from begin is fine with me. I was just going by your earlier example functions that combined them: eeprom_data_find_or_create or getSaveData.

I think we should use isAllocated instead of is_allocated, as per the Arduino Style Guide for Writing Libraries.

Including a free() function is fine with me as well.

To make the API less complicated, it would be best if a begin was required before using allocate(), isAllocated() or free(). Only begin would require parameters passed to it. Information required by the others would be set up in private class variables by begin. This way, only begin would require its parameters being abstracted with a #define EEPROM_BEGIN … in localEEPROM.h.

I think begin() would now no longer be required to return anything. It would just prepare for all the other functions to be used.

So the prototypes would be:

void begin(???); // parameters are dependent on the library implementation
// The sketch does not call begin() directly. It uses EEPROM_BEGIN.

// information needed by the following functions must be set by begin()
int allocate();
boolean isAllocated();
boolean free();

It would probably be OK to allow allocate() to be called without first calling isAllocated(). allocate() could do its own check and return 0 for “success”, 1 for “already allocated” or negative values for errors (as begin() previously did).

Is a boolean pass/fail return good enough for free() or does it need the capability to return multiple error codes? I think a free() should be allowed without first doing isAllocated() or allocate(), by doing its own find if necessary. Therefore, you could have a “not found” status as well as indicating “begin() hasn’t been called” and maybe more, but would a sketch care what the problem was? It might help with debugging.
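
The call ordering described in this post can be sketched with a small stand-in class. This is only an illustration under stated assumptions: the NV_ status names, the 64 byte pool and the single length parameter to begin() are invented for the example, and the allocation state lives in RAM rather than in EEPROM, so nothing here survives a power cycle.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical status values following the convention discussed above.
#define NV_OK                 0
#define NV_ALREADY_ALLOCATED  1
#define NV_ERR_NO_BEGIN      -1
#define NV_ERR_NO_SPACE      -2

class ArduboyEeprom {
 public:
  // Only records what the other functions need. Returns nothing.
  void begin(size_t size) {
    wanted = size;
    begun = true;
  }

  // Does its own "already allocated" check, so isAllocated() is optional.
  int allocate() {
    if (!begun) return NV_ERR_NO_BEGIN;
    if (allocated) return NV_ALREADY_ALLOCATED;
    if (wanted > POOL_SIZE) return NV_ERR_NO_SPACE;
    allocated = true;
    return NV_OK;
  }

  bool isAllocated() const { return begun && allocated; }

  // Simple pass/fail version: false means there was nothing to free.
  bool free() {
    if (!isAllocated()) return false;
    allocated = false;
    return true;
  }

  // Writes are refused until space has been allocated.
  bool write(unsigned int address, uint8_t data) {
    if (!isAllocated() || address >= wanted) return false;
    pool[address] = data;
    return true;
  }

 private:
  static const size_t POOL_SIZE = 64;  // assumed space available to this sketch
  uint8_t pool[POOL_SIZE] = {};
  size_t wanted = 0;
  bool begun = false;
  bool allocated = false;
};
```

A sketch would call begin() in setup() and could leave allocate() until the first time it has something to save, treating both 0 and 1 as "space is available".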


(Scott) #30

The term non-volatile is often used for this type of storage, so replacing “EEPROM” with “NVdata” (as in NVRAM) may be a way to make it more generic but still get the meaning across.


(Josh Goebel) #31

For most cases I don’t see what error would come up for free (unless the area never was allocated in the first place)… so maybe it could return NOT_ALLOCATED if unallocated, otherwise it returns success.


(Josh Goebel) #32

Right, isAllocated is really just a convenience method. It shouldn’t really be necessary. It’s for edge cases like:

if (savedGames == 0 && isAllocated()) free();

Makes more sense in a dynamic allocation environment of course but also useful just to “clear” space.

Saving would always be:

allocate()
write()


(Scott) #33

A “bulletproof” back-end might also detect and want to report “begin() not called”. (This could apply to allocate() as well.) As another possibility, a future back-end for saving to an SD card might want to report “no card installed”. Again, the specific error might mainly be useful for sketch debugging. In most cases you could probably just blindly call free() and not test the return code.

So maybe to be safe we should have free() return an int the same as allocate(), with 0 indicating “success”, 1 indicating “not allocated” and negative values indicating errors.

But if we decided that free() does its own search, when necessary, then you could just use:

if (savedGames == 0) free()

However, I could imagine isAllocated() also being used before prompting a player to save the game state or a high score, so they could decide if they actually wanted to consume EEPROM for that particular game:

Currently, space hasn’t been reserved to save your high scores. Do you want to allocate space and save it?

or

This game requires xxx bytes of space to save your game state, which could be used by other sketches. Do you want to go ahead and save it?


(Josh Goebel) #34

This is probably just how I’d write it - a little more expressively. You’re right free would do the check itself anyways so your code would do the exact same thing, but mine makes it a little clearer what is going on. You could also rename the method but I’m not sure that’s necessary.

freeIfAllocated()

So maybe isAllocated() plays better with read… I suppose you could call read and get a NOT_ALLOCATED error back, but now we’re just getting into the nuances of library design. I’d probably rather do the check first and then code around the knowledge rather than cross my fingers and code around the error.

Having isAllocated() allows for both approaches.


(Scott) #35

If begin() really doesn’t have to return anything and always has to be called before any other functions, then I think we can eliminate it and move its functionality to the (parameterized) constructor. Any required bounds checking, error reporting, etc., would be the responsibility of the other functions, although if it helps, the constructor could do some processing and set private flags for the benefit of other functions.

Can anyone see a need for having a return code or for not using the constructor?

In localEEPROM.h we would replace

#define EEPROM_BEGIN ...

with

#define EEPROM_PARMS ...

For my eepromMLXXXp library, localEEPROM.h would contain:

#define EEPROM_PARMS MLXXXP_EEPROM_START, ARDUBOY_USER_ID, USER_SKETCH_ID, SKETCH_EEPROM_SIZE

To instantiate the class, the user sketch would contain:

ArduboyEeprom savedData(EEPROM_PARMS);

The name of the object (savedData) could be changed to anything desired.
There would be no begin() function.


(Josh Goebel) #36

I don’t see the advantage of hiding the params here… makes it harder to understand what is happening… if there are 5 params they should all be listed so someone doesn’t have to go looking in a header file to find out what EEPROM_PARMS is.


(Scott) #37

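The reason is due to my original goal, as stated in #21: any back-end library could be used without any changes to the sketch, except minor changes to the localEEPROM.h header file included with the sketch.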

It’s the same reason begin() was abstracted by a #define EEPROM_BEGIN … in localEEPROM.h, and to do a begin the sketch did:

ArduboyEeprom savedData;

savedData.EEPROM_BEGIN;

Each different back-end may need a different number and type of parameters passed to the constructor. My eepromMLXXXp back end needs a start address, numerical user ID, numerical sketch ID and a length. My eepromInRAM back-end needs just a length. Your dynamic block chain method would probably need a game name string and a length. An SD card implementation would likely use a file path/name string and a length.

Using a #define in localEEPROM.h for the constructor parameters means that the sketch’s .ino file doesn’t have to be changed to switch to a different back-end library.

The sketch developers don’t need to understand what’s happening. They just always use the same EEPROM_PARMS when instantiating and the #define in the back-end specific localEEPROM.h file will take care of the actual parameter requirements.
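
To show how the macro expands, here is a single-file sketch in which a compile-time switch stands in for swapping localEEPROM.h. Both class bodies are bare stand-ins written for this example, not the real libraries. USE_RAM_BACKEND and the size() accessor are also made up, and the parameter lists follow the descriptions given above.

```cpp
#include <stdint.h>
#include <stddef.h>

// ---- what a back-end specific localEEPROM.h (plus library) might provide ----
#define SKETCH_EEPROM_SIZE 25

#ifdef USE_RAM_BACKEND
// Stand-in for a RAM back-end: the constructor needs just a length.
class ArduboyEeprom {
 public:
  explicit ArduboyEeprom(size_t size) : length(size) {}
  size_t size() const { return length; }
 private:
  size_t length;
};
#define EEPROM_PARMS SKETCH_EEPROM_SIZE

#else
// Stand-in for a static mapping back-end:
// start address, user ID, sketch ID and length.
class ArduboyEeprom {
 public:
  ArduboyEeprom(unsigned int startAddress, uint16_t user, uint8_t sketch,
                size_t size)
      : start(startAddress), userId(user), sketchId(sketch), length(size) {}
  size_t size() const { return length; }
 private:
  unsigned int start;
  uint16_t userId;
  uint8_t sketchId;
  size_t length;
};
#define MLXXXP_EEPROM_START 0x0042
#define ARDUBOY_USER_ID 1234
#define USER_SKETCH_ID 1
#define EEPROM_PARMS MLXXXP_EEPROM_START, ARDUBOY_USER_ID, USER_SKETCH_ID, SKETCH_EEPROM_SIZE
#endif

// ---- the sketch itself: this line is identical for both back-ends ----
ArduboyEeprom savedData(EEPROM_PARMS);
```

Building with or without USE_RAM_BACKEND defined changes which constructor is called, while the line in the sketch stays the same. The trade-off is that the number of arguments is no longer visible at the call site.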


(Josh Goebel) #38

Yeah I get the goals, but too obscured IMHO. I wouldn’t use it like that. When I look at a function call I want to know what is being passed, etc.

It’s not too much to ask developers to understand the constructor for the EEPROM management library they are wanting to use IMHO.


(Scott) #39

In the case of an EEPROM management library constructor as part of a standard API, why do you want to know this? Other than for satisfying your curiosity, of what benefit is knowing what parameters are passed, with respect to using the API functions or for any other aspect of your sketch?


(Josh Goebel) #40

We just shouldn’t assume people are stupid. Revealing the call parameters is just good programming. I’m a strong believer in not hiding things. If you want to do that hiding the entire constructor is probably better than making it look like a method that takes 5 params takes only 1.