Bootloader and File Format [Wiki]

I think I am a bit lost on how it should work. Can you help explain?

So you don’t want to use file names / string identifiers for when a game wants to find its data section because of potential name clashes? What are the default pages? During development should we just hard code the data / save addresses? (If so, this isn’t very user friendly for developers.)

If we aren’t using file names / strings to identify game slots because of potential clashes then why does the naming matter? What is the significance of the EEPROM start / end addresses in the file name?

Sounds great! Perhaps we should also include an option to disable this behaviour per game, i.e. if the game doesn’t actually access EEPROM? Also, what will happen if a user flashes a new game via the Arduino IDE in between playing games? Could the bootloader get confused and accidentally overwrite the save of the previously loaded game with some bad EEPROM data when a new game is loaded via the loader?

Do you think we should also keep the game data 4K aligned and padded too as I mentioned in my previous post?

Here is how it will work for now:

You want to show a simple video made of 1K images. These images are stored in a binary file.
You write this file to the flash chip using a command line tool (or GUI version):
dev-writer --datafile=your_video.bin
If there is a save data file you want to add:
dev-writer --datafile=your_video.bin --savefile=initial_savedata.bin
Or if you just want to add a save data block:
dev-writer --datafile=your_video.bin --savesize=4096

The tool will put all data at the end of flash, i.e. flash_start = flash_end - savesize - datasize, and will print out the start page number of the data and the start page number of the save data (if any).
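The placement rule above can be sketched as a small calculation. This is only an illustration: the 256-byte page size, the 16 MiB flash size used in the usage note, and the names `FlashLayout`, `roundUpToPage` and `placeAtEndOfFlash` are assumptions, not part of the actual tool.

```cpp
#include <cstdint>

// Assumed for illustration: 256-byte pages and 16-bit page numbers.
constexpr std::uint32_t kPageSize = 256;

struct FlashLayout {
    std::uint16_t dataPage;  // first page of the program data
    std::uint16_t savePage;  // first page of the save data
};

// Round a byte count up to a whole number of pages.
constexpr std::uint32_t roundUpToPage(std::uint32_t bytes) {
    return (bytes + kPageSize - 1) / kPageSize * kPageSize;
}

// Put the save block at the very end of flash and the data block just before it.
inline FlashLayout placeAtEndOfFlash(std::uint32_t flashSize,
                                     std::uint32_t dataSize,
                                     std::uint32_t saveSize) {
    const std::uint32_t saveStart = flashSize - roundUpToPage(saveSize);
    const std::uint32_t dataStart = saveStart - roundUpToPage(dataSize);
    return { static_cast<std::uint16_t>(dataStart / kPageSize),
             static_cast<std::uint16_t>(saveStart / kPageSize) };
}
```

For example, with a 16 MiB chip, 8 KiB of data and a 4096-byte save block, this yields data page 0xFFD0 and save page 0xFFF0.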

In your Arduino program’s setup you’d add:
flashInit(progDataPage, saveDataPage);

When somebody wants to try out your program in the Arduino IDE they would do the same, but wouldn’t need to change progDataPage and saveDataPage in the sketch because they are the same (unless the person has a (homemade) Arduboy with smaller flash, of course).

When the program is added to the flash chip ‘filesystem’ with the PC manager and burned with the loader, the flashInit() function will ignore the progDataPage and saveDataPage values and fetch those stored in the interrupt vectors.

As mentioned in the header description, the EEPROM backup is optional. The EEPROM page in the slot header would be 0 for games that do not use EEPROM or do not need EEPROM backup.

When a program is uploaded through the Arduino IDE the interrupt vectors are overwritten and the EEPROM magic number/page are no longer available and so is the backup feature.

If it’s possible I’d also like to back up the EEPROM before a regular upload, so the game slot EEPROM is always up to date for the next time you play the game.

I plan to make the auto EEPROM backup feature optional. But as I mentioned earlier I’m still working on implementing the whole EEPROM thing.

Hi. I got my devkit and now I want to check what software/tools we have to write content to the flash. Is there a git repo somewhere that I can look at?

I moved the info for how to use the Dev Kit to the head of the post:

But this is what you are looking for I think: GitHub - MrBlinky/Arduboy-Python-Utilities: Python script to upload .hex, .zip or .arduboy files to Arduboy and Homemade versions

Thanks. For the image generation that is great. Is there a demo of how to access the flash from the sketch (@Mr.Blinky)? Or at least some ideas on how to do that? I remember there was something in some threads but I’m having a hard time finding it. I think something simple like R/W examples would be good. Maybe even some initial idea of how the sketch finds its data section.

@Mr.Blinky is still looking at it but I don’t know if you can find the example that is based on this:

As that will be reading the frames at least.

Ah, ok. Found the demo code. Thanks for the hint.

Yeah, we are of course trying to wrap all of it into a library to make grabbing the assets as easy as possible. But while lying awake the other night I realised that some kind of paging system may be the most efficient way of handling it; otherwise any kind of dynamic memory manager would probably be annoying and code-space heavy to implement, I would imagine. I’m very bad at coding.

So it may be possible that you assign all your sprites, your strings and music whatever, and those then are swapped out for same sized replacements. I guess a little similar to a paging system used in the old consoles.

Here are useful links:

I’ve just updated the WIP library. As @Pharap suggested, I’ve changed the set of functions into a static class.

A paging system maps memory into existing memory space. Flash memory will always be external.

I’m working on a drawBitmap bitmap function that looks like this:

void drawBitmap(
  int16_t x,
  int16_t y,
  __int24 pageAddress,
  uint8_t frame,
  uint8_t mode
);

pageAddress is a 24-bit address in flash memory. To be more precise, it’s actually a 24-bit relative offset added to the start address of the program’s data section in flash.
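As a rough illustration of "relative offset added to the start address of the data section", the conversion could look something like the sketch below. The 256-byte page size, the helper name `toAbsoluteAddress`, and the use of `uint32_t` in place of the 24-bit type (which only exists as an avr-gcc extension) are all assumptions made so the snippet compiles anywhere.

```cpp
#include <cstdint>

// Assumed for illustration: 256-byte pages, and uint32_t standing in for
// the 24-bit type that avr-gcc provides as an extension.
constexpr std::uint32_t kPageSize = 256;
constexpr std::uint32_t kAddressMask = 0xFFFFFF;  // keep the result within 24 bits

// Convert an offset relative to the program's data section into an
// absolute flash address.
constexpr std::uint32_t toAbsoluteAddress(std::uint16_t dataStartPage,
                                          std::uint32_t offset) {
    return (static_cast<std::uint32_t>(dataStartPage) * kPageSize + offset) & kAddressMask;
}
```

With a data section starting at page 0xDCF0, offset 0 maps to 0xDCF000 and offset 1024 maps to 0xDCF400.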

Hmm, thinking of it now, I need to call pageAddress something like flashOffset instead.


Thanks for the update. For development, is there some special mode of operation or shall I just create an image with just my game on it?

Or can I program the game into PROGMEM and use hardcoded flash addresses during development?

You can just upload your game from the Arduino IDE. If your program requires data stored in the flash chip, you can write your binary data file to flash using the flash-writer script with the -d option. When the upload is finished you will be given a page number, which you need to pass to the Cart::init(pageNumber) function.

Let’s imagine your program is the factory animation example.

  • First you need to upload your animation data with the script, so you’d type the following at the command prompt (if both files are in the same directory): -d factory-frames.bin

When upload is completed it would say something like:

Please use the following line in your program setup function:


or use defines at the beginning of your program:


and use the following in your program setup function:


You need to remember the value 0xDCF0.

  • Now, with the data stored on the flash cart, you can ‘develop’ your program in the Arduino IDE. In the setup() function you will need to add the init function with the value from the flash-writer script:

Now when uploading from the IDE, your program will know where to find its data. When using read and write flash functions your data starts at address 0.

Your data is made up of 1K images; each image is a frame of the animation. To show the animation you can just multiply a frame counter by the size of an image (1024 bytes). The code that loads your images could look like this:

constexpr size_t IMAGE_SIZE = 1024;
__uint24 frame = 0; // frame counter

Cart::readDataBlock(arduboy.sBuffer, IMAGE_SIZE, frame * IMAGE_SIZE);
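To make the frame arithmetic concrete, here is a desktop-only sketch that models the data section as a byte array. `FakeCart` and `loadFrame` are hypothetical stand-ins; only the `frame * IMAGE_SIZE` offset calculation is taken from the snippet above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t IMAGE_SIZE = 1024;

// Hypothetical stand-in for the flash chip's data section.
struct FakeCart {
    std::vector<std::uint8_t> data;

    // Copy `size` bytes starting at `offset` (relative to the data section).
    void readDataBlock(std::uint8_t* buffer, std::size_t size, std::uint32_t offset) const {
        std::memcpy(buffer, data.data() + offset, size);
    }
};

// Load one 1K animation frame into a screen-sized buffer.
inline void loadFrame(const FakeCart& cart, std::uint8_t* screen, std::uint32_t frame) {
    cart.readDataBlock(screen, IMAGE_SIZE, static_cast<std::uint32_t>(frame * IMAGE_SIZE));
}
```

Because every frame has the same size, frame N always starts at byte N * 1024 of the data file, so no index table is needed.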

Note that once the program reaches a finished state, the hex file and data file can just be added to the flash image .csv file. The flash-builder script will make sure your program knows where to find its data file.


Do you think you’ll actually use a uint32_t and just mask off the excess or store a uint8_t [3] and write some assembly for all the various operations?

Or is __uint24 a compiler extension?

Does that mean that you’d need to recompile the code if you reflashed the chip and the data ended up in a different area?

I have a few ideas/suggestions for some of the API names,
but I won’t suggest unless you’re interested in API name changes at the moment.

I’d recommend using constexpr variables instead of defines for constants and an enum class for chip commands, but I suspect there would be some resistance to those suggestions from people who are more used to macros.

I think uint24_t should be a type alias, though there’s no way to check whether a type named uint24_t already exists or not.

And lastly I can think of a handful of improvements that could make the API easier to use.
If you’re open to PRs then I could demonstrate a few.

Head post is a wiki, how about throw those links on top?

It’s an extension as of version 4.7 (but I’ll also do assembly optimisations later on).

Only when:

  • your program data size changes during development
  • your program save data size changes during development
  • your data is overwritten by an uploaded flashimage that is large enough to overwrite your program data and program save data at the end of flash memory

The flash-builder script will make sure a program can locate its data (the 24-bit addresses in a program are relative offsets).

I’m open to suggestions at any moment; changing them early on may save work later.

Isn’t there a contradiction in a constant variable? I think there will always be two camps when it comes to constexpr and defines.

I’ve considered adding a #define for uint24_t to cart.h but wasn’t sure I should do that.

I’m always open to PRs.

Thanks for all this info. I’d like to add some thoughts about the new library. I think one of the main use cases for this kind of memory is reading. So maybe we can put focus on that. Another main use case is to reduce PROGMEM usage for data so we can use it for more code. So what is used for PROGMEM?

First, all data that is explicitly put in PROGMEM. Here I think we can modify the Arduboy2 library to load all this data from external flash (e.g. display init data).
Second, all variables that are in the data section. E.g.

uint8_t state = 1;

That would cause the value 1 to be stored in PROGMEM and during boot the variables in the .data section would be initialized with values from PROGMEM. I think also most games will have some kind of constructor that is doing something like:

Blah::Blah()
{
  instance_var = 1;
  instance_var2 = 5;
  memset(some_instance_struct, value, ...);
}

And so on. So if the interface we are defining wrapped this nicely for less experienced users, that would be a plus. E.g. the class could use a struct with all its state variables and then just load all the init values with a sequential read:

struct state { ...} state;
<now in the constructor>
flash_read(&state, sizeof(state))

And the whole instance variables would be initialized with very little code/data in PROGMEM. Just thoughts…

Now concerning my game the most benefit would be reading textures and the like from external flash. Reading each byte, waiting for it and then using it would be very slow. So my thought here would be we can just use the SPI hardware better. Say I need a byte from the texture. Usually I read it and do something with it. While I am doing something with this byte I could read the next one already. So maybe we can do something like a stream of bytes…

stream_start(... blah blah address page stuff)
(above we do an addressing of the flash)
(above we write(0) to read one byte but do not wait for finish)
<now do something else in game logic>
stream_get() <1>
(above we actually get the byte from the SPI register, assuming the bytes is already received, maybe we need a safe/unsafe version, safe version would check if byte is already received, function can request next byte already)
<use byte in game logic and repeat <1>>

Something like this. I’m not very into C++, but maybe we can do something like a stream or something similar. The basic idea is to trigger the SPI to clock in a byte while we do other things. That way, ignoring the initial overhead, it could be even faster than a PROGMEM read (which always takes 3 cycles, whereas the SPI register read+write could be faster, maybe).
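The "request the next byte while working on the current one" idea can be modelled on a desktop machine as below. This is a hypothetical sketch, not library code: `PrefetchStream` is an invented name, a plain memory buffer stands in for the flash chip, and the `pending_` member plays the role of the SPI data register.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of a prefetching byte stream. On real hardware the
// "pending" byte would be sitting in the SPI data register.
class PrefetchStream {
public:
    PrefetchStream(const std::uint8_t* flash, std::size_t offset)
        : flash_(flash), next_(offset) {
        startTransfer();  // request the first byte right away
    }

    // Return the byte requested earlier, then immediately request the next
    // one so it can arrive while the caller is busy with game logic.
    // Note: this always reads one byte ahead of what has been consumed.
    std::uint8_t get() {
        const std::uint8_t value = pending_;
        startTransfer();
        return value;
    }

private:
    // Stand-in for starting an SPI transfer without waiting for it to finish.
    void startTransfer() { pending_ = flash_[next_++]; }

    const std::uint8_t* flash_;
    std::size_t next_;
    std::uint8_t pending_ = 0;
};
```

A real implementation would additionally have to guarantee that the transfer has completed before the register is read, which is the safe/unsafe distinction mentioned above.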

As I said, just some thoughts…

Also we might need to think about how to nicely create a data file.

My gut instinct is to be worried about that because compiler extensions cause various issues (mainly related to portability, backwards compatibility and compiler bugs), but I think creating a 24-bit type manually would be quite a bit of effort so it’s probably the easiest option for now.

It’s late here so I’ll spend more time thinking about it tomorrow.

My immediate thoughts are mainly:

  • init should be initialise
    • (And possibly include initialize as a synonym to appease Oxfordians)
  • readWord should be something like read16 or readUint16.
    • The size of a processor word varies between CPU models, but a uint16_t is universally and unambiguously an unsigned 16 bit integer.
  • read should perhaps be either read8, readUint8 or readByte to avoid any ambiguity

There’s probably a contradiction of their etymologies,
but not a contradiction of their definition within the scope of the language.
(Technically the variable can still vary in value, just not at runtime. :P)

Perhaps, but I can’t think of a single good reason to prefer a macro over constexpr for constants,
and I trust the words of the Standard C++ Foundation about what’s good and what’s bad.

Currently it’s still in the source.

I’m not sure whether it’s worthwhile or not, I’d have to think about it,
but I know for definite that a type alias would be better than a macro.

Theoretically it’s possible that uint24_t could be added to stdint.h as part of the standard,
in which case #ifndef uint24_t wouldn’t actually do anything because uint24_t wouldn’t exist during the preprocessor stage (because it would be a type, not a macro).
Worse still, if other code was using uint24_t and it was expecting the type to behave differently to __uint24 at some point then the code could still compile but introduce some silent bugs.

With a type alias, at least you’d get an immediate error because the type uint24_t would already exist and thus cause a name clash.
(Fail fast, fail early, fail at compile time rather than runtime, blah blah blah.)
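The point about `#ifndef` not seeing types can be checked with a few lines. In this sketch `uint32_t` stands in for `__uint24` (an assumption, so that it compiles outside avr-gcc) and `guardNoticedType` is a made-up name.

```cpp
#include <cstdint>

// Stand-in: on avr-gcc this alias would name the __uint24 extension instead.
using uint24_t = std::uint32_t;

// A preprocessor guard only knows about macros, so it cannot tell that a
// type called uint24_t already exists.
#ifndef uint24_t
constexpr bool guardNoticedType = false;
#else
constexpr bool guardNoticedType = true;
#endif
```

`guardNoticedType` ends up `false`, whereas a second `using uint24_t = ...;` naming a different type would be rejected by the compiler as a conflicting declaration.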

Good to know.

Mainly I’m thinking of adding template functions so that instead of:

uint8_t someBuffer[16];
Cart::readBytes(someBuffer, sizeof(someBuffer));

You can just write:

uint8_t someBuffer[16];
Cart::readBytes(someBuffer);

(And the same for readDataBlock and readSaveBlock.)
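One way such a template might be written is to take the array by reference so that its length is deduced. The free functions below are a sketch with a dummy read that just fills the buffer; they are not the actual `Cart` implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the pointer-and-size read: fills the buffer
// with dummy values and reports how many bytes were read.
inline std::size_t readBytes(std::uint8_t* buffer, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        buffer[i] = static_cast<std::uint8_t>(i);
    }
    return size;
}

// The array is taken by reference, so the compiler deduces its length
// and the caller no longer has to pass sizeof(...).
template <std::size_t size>
std::size_t readBytes(std::uint8_t (&buffer)[size]) {
    return readBytes(buffer, size);
}
```

Since the wrapper only forwards to the two-argument version, a compiler will usually inline it away.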

I’d also like to add a template version of writeSavePage that checks that the array you feed to it is large enough to prevent buffer overruns.

// Large enough, runs fine
uint8_t bufferA[256];
Cart::writeSavePage(page, bufferA);

// Buffer not large enough, get a 'static assertion failed' error
uint8_t bufferB[128];
Cart::writeSavePage(page, bufferB);
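A sketch of how that compile-time check could work is shown below. It assumes a 256-byte save page; `writeSavePageRaw` and `fakeSavePage` are invented stand-ins, and the unchecked version is given a different name here purely to keep overload resolution simple.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kPageSize = 256;  // assumed save page size

// Hypothetical stand-in for one save page on the flash chip.
static std::uint8_t fakeSavePage[kPageSize];

// Unchecked version: it cannot know how large the caller's buffer really is.
inline void writeSavePageRaw(std::uint16_t /*page*/, const std::uint8_t* buffer) {
    std::memcpy(fakeSavePage, buffer, kPageSize);
}

// Checked version: the array length is part of the type, so a buffer that
// is too small is rejected at compile time.
template <std::size_t size>
void writeSavePage(std::uint16_t page, const std::uint8_t (&buffer)[size]) {
    static_assert(size >= kPageSize, "buffer is smaller than one save page");
    writeSavePageRaw(page, buffer);
}
```

Passing a 128-byte array to `writeSavePage` would then stop the build with the static assertion message instead of reading past the end of the buffer at runtime.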

Also maybe some templates for reading classes/structures like EEPROM’s get.

// Read an array of 16 points
// and the compiler takes care of all the hassle for you
Point points[16];
readData(points, pageAddress);

(Now I see that, I’m thinking it makes more sense to have the page address first.)

And also making sure to mark the buffer on writeSavePage as const,
I.e. static void writeSavePage(uint16_t page, const uint8_t* buffer);
(const effectively means “I promise not to modify this”.)

Only if it has static storage duration (i.e. it’s a global).

There’s not much that can be done about that,
the compiler handles that sort of thing.

I’m not sure what this is supposed to be demonstrating.
(Also, do people actually use memset?)

Wrap what exactly?

See my suggestion about templates earlier in the comments.
readData(state, pageAddress); is certainly doable.

However, it would technically only work for certain objects.
I’m not sure what the full set of requirements would be,
but I’m pretty sure the class would have to be TriviallyCopyable or PODType.
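A desktop sketch of such a `readData` template, with the trivially-copyable requirement enforced by a `static_assert`, might look like this. `fakeFlash` and this `readDataBlock` are stand-ins, and `<type_traits>` is assumed to be available, which is not a given on the stock AVR toolchain.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the flash chip's data section.
static std::uint8_t fakeFlash[64];

inline void readDataBlock(std::uint8_t* buffer, std::size_t size, std::uint32_t address) {
    std::memcpy(buffer, fakeFlash + address, size);
}

// Read any trivially copyable object (or array of them) straight from flash.
template <typename Type>
void readData(std::uint32_t address, Type& object) {
    static_assert(std::is_trivially_copyable<Type>::value,
                  "Type must be trivially copyable");
    readDataBlock(reinterpret_cast<std::uint8_t*>(&object), sizeof(object), address);
}

struct Point {
    std::int16_t x;
    std::int16_t y;
};
```

The bytes in flash must of course match the compiler's in-memory layout of the type (field order, padding and endianness), which the template cannot verify.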

It might be possible.

I don’t know enough about the flash chip and SPI to make any guesses as to what the performance increase would be like (or if there would even be a performance increase).

One problem is that there’s only one flash chip and potentially multiple streams.
Sharing resources can be very hard without threads, locking, mutexes, semaphores et cetera.

Either way it definitely needs to be more OOP:

FXStream stream { pageAddress };
uint8_t value = stream.read8Unsafe();

That’s exactly why I omitted the full word :stuck_out_tongue: Nay, I don’t like the long word :slight_smile: I’m just thinking maybe call it boot or begin, as the Arduboy(2) library uses those?

I know it can be confusing to use Word. But when a word is 32-bit it is usually called a long(word).

Yeah when going for read16 we would also need to go for read8. Which kind of reads odd. The number 16 is more magical :slight_smile: taking the Arduboy(2) library in mind maybe we should go for Short? as there’s a delayShort() function in there.

HaHa so true. Unless the code is running on a very unstable system :stuck_out_tongue:

Ah it is? wasn’t sure about that.

Yes that would be better. I’m just used to pulling defines out of my hat.

Your examples make me understand templates better every time. I haven’t added the earlier templates back after the class change.

Your other ideas sound good.

Yes, but a few bytes of data may not be worthwhile. It’s mainly more useful to store larger data there. PROGMEM can also be freed by not using certain expensive functions. For example, you could omit the print functions by drawing text as bitmaps.

That would be nice if we could do that. But that would require a rewrite of avr-libc.

Yeah I know. Each SPI read/write cycle requires 18 cycles. I’ll be writing optimized versions for the functions that handle multiple bytes. The one that will benefit the most from this will be drawBitmap (or drawSprite).

It’s possible to read a sequence of bytes like a stream using Seek once and then multiple readByte, ReadWord, ReadBytes calls until the flash chip is deselected.
You could also use something like myvar = SPDR; which would be the fastest. But you’re at the compiler’s mercy to ensure 17 cycles have passed (unless you add additional SPSR tests, losing some performance again).

Your thoughts are appreciated.

Ok I see.

I wanted to show that many games have code that initializes a whole list of variables to some values before running the game logic. Usually this ends up using quite some code and data from PROGMEM. If we could show how to use a kind of state object (either a whole class instance or a struct), then the user could load that data in one shot from external flash. As you wrote, there might be some constraints, such as needing to use TriviallyCopyable or PODType classes.

listOfPeopleUsingMemset += veritazz

Maybe “wrap” was the wrong word. I meant to find a good interface in the library that makes it easy for the user to see it is good practice to use the provided library interface to init its game states or not set global variables right away but use the library functions. I think it needs to be kind of self explanatory. Maybe we need to provide application note alongside.

Yes for sure. Just thought we might also rework parts of the Arduboy2 library to reduce PROGMEM usage here and there. It can also serve as application example on how to efficiently use the external memory. I guess the library is the first source of examples for beginners.

Yeah something like that. There will be no mercy I guess but for full performance gain something like that might be required. I agree that for many things the standard functions are sufficient (e.g. huge animations, start/help/… screens, savegames …) but if you have data that is repeatedly read throughout the game logic (e.g. lookup tables) then some nice way to read a byte without actually waiting for it and instead do some other calculations would be great. Maybe I am just thinking about it because my game would benefit here.

Either of those would be better than init.

It depends on the platform.

On some platforms a ‘word’ is 32 bits and a 16 bit value is called a ‘halfword’.

Part of the reason for confusion is that the Windows API defines WORD as 16 bits, DWORD as 32 bits and QWORD as 64 bits because when it was originally written x86 was still primarily 16-bit and the 32-bit changeover was just beginning.
From then on that terminology was kept as backwards compatibility.
There’s a similar story with Intel’s x86/64 assembly language.

readUint8 and readUint16 don’t look that odd.
They probably make more sense than read1 or read2,
and they allow the creation of readInt8 and readInt16.

As an example from a ‘real world’ API,
C#'s BinaryReader class uses ReadByte and ReadUInt16.

Another alternative is readU8 and readU16 (and readS8 and readS16).

Specifying the actual size might not be the ‘prettiest’ way,
but it’s completely unambiguous, which is more important.

The problem with that is that short is another type that varies between platform.
short is allowed to be 32 bits or 64 bits.
Legally char, short, int and long could all be 64 bits and thus sizeof(long) could legally be 1.
That’s precisely why the fixed width types (uint8_t et cetera) were introduced in the first place.
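The difference between "at least this wide" and "exactly this wide" can be written down as compile-time checks. The sketch below uses only standard headers; `bitsInShort` is just an illustrative name.

```cpp
#include <climits>
#include <cstdint>

// The built-in types only come with minimum widths...
static_assert(sizeof(short) * CHAR_BIT >= 16, "short has at least 16 bits");
static_assert(sizeof(long) * CHAR_BIT >= 32, "long has at least 32 bits");

// ...whereas the fixed-width aliases, where they exist, are exact.
static_assert(sizeof(std::uint8_t) * CHAR_BIT == 8, "uint8_t has exactly 8 bits");
static_assert(sizeof(std::uint16_t) * CHAR_BIT == 16, "uint16_t has exactly 16 bits");

// The actual width of short on whatever platform compiles this.
constexpr unsigned bitsInShort = static_cast<unsigned>(sizeof(short) * CHAR_BIT);
```

A name such as readUint16 therefore promises the same thing on every platform, while readShort or readWord would not.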

I’m pretty sure it’s called delayShort because it can only handle short delays (65.535 seconds max).

It depends what happens to the constexpr variable.

A lot of the time if it’s possible to encode the value as an immediate in an assembly instruction, the compiler will choose to do that, so despite being a ‘variable’ the actual object might not exist in memory.

A bit like how a local variable may never be stored on the stack if the compiler decided to keep it in a register for its entire lifetime.

If the variable did end up in memory then yes, it could change value,
but the same could happen with a macro, or possibly even the actual machine code.

Unless I’m looking at the wrong part:

I’m assuming that’s the ‘canonical’ version of cart.h?

If you want I could sit down and explain them properly.

The basics of templates aren’t really that hard to understand.
The only hard parts are how the compiler can infer the template parameters from the arguments and SFINAE.
The only scary part about templates is some of the complex wizardry that they can be used for.
(*hides shrine to Hermaeus Mora*)

The simplest version is that they’re basically blueprints for classes and functions,
and the compiler uses those blueprints to generate classes and functions.
Sometimes it can infer the template parameters,
sometimes they must be explicitly specified.
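As a small illustration of both cases, here is one template whose parameters are inferred from the argument and one whose first parameter has to be spelled out. The names `arrayLength` and `convert` are made up for this example.

```cpp
#include <cstddef>

// Both template parameters are inferred from the array that is passed in.
template <typename Type, std::size_t size>
constexpr std::size_t arrayLength(const Type (&)[size]) {
    return size;
}

// Target appears only in the return type, so it cannot be inferred and
// must be given explicitly; Source is still inferred from the argument.
template <typename Target, typename Source>
constexpr Target convert(Source value) {
    return static_cast<Target>(value);
}
```

Calling `arrayLength(values)` on an `int values[5]` makes the compiler generate a function for `Type = int, size = 5`, while `convert<int>(2.9)` generates one for `Target = int, Source = double`.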

It depends on the usage and how the compiler optimises it.
Overall I don’t think it will be much of an issue.

With most games it’s the graphics and level data that eat the PROGMEM.

What do you mean by ‘whole class instance’?

class and struct are effectively the same thing.

It’s probably cheaper (or at least no more expensive) to just use a for loop.
Or (in a constructor) you can just use “in-class member initialisers” as demonstrated here.
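For readers who have not met in-class member initialisers, a minimal sketch follows. The `Player` class and its fields are hypothetical.

```cpp
#include <cstdint>

// Hypothetical game object; the fields are placeholders.
class Player {
public:
    std::uint8_t lives = 3;      // in-class member initialiser
    std::int16_t x = 64;
    std::int16_t y = 32;
    std::uint8_t tiles[4] = {};  // all elements zeroed, no memset needed
};
```

Every `Player` that is created starts with these values, without a hand-written constructor body.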

I think it’s a bit early to conclude that.
We need to actually trial different approaches and see how they behave.
There’s a lot of rules about how things are initialised,
so what you expect to happen might not be the case.

We’ll definitely need examples, documentation and possibly a tutorial.

Beginners don’t tend to read the library source code though, so offloading (for example) the Arduboy logo onto the flash chip wouldn’t really be of any benefit to beginners.

I’m pretty sure most beginners look at existing games rather than the library,
so really we’d need to write some simple example games or just make sure the games we’re developing set a good example.

What you’re essentially asking for is a set of asynchronous read functions.
(To be technical.)

To get back on that. You could put all your global (initialized) variables in a structure and then initialize it in setup() function:

struct GameState
{
  // all your globals here
};

GameState gameState;

Cart::readBytes((uint8_t*)&gameState, sizeof(gameState), gameStateFlashAddress);

It would be prone to bugs though, as you’d have to carefully keep your gameState structure and its data in the data (script) file in sync.
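One cheap guard against the struct and the data file drifting apart is a compile-time size check. The sketch below is hypothetical throughout: the fields, `kGameStateRecordSize` and `loadGameState` are placeholders, and a plain byte array stands in for the flash data section.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical game state; the fields are placeholders.
struct GameState {
    std::uint8_t level;
    std::uint8_t lives;
    std::uint16_t score;
};

// Size of the matching record in the data file, assumed to be kept in
// sync with whatever script builds that file.
constexpr std::size_t kGameStateRecordSize = 4;

// Fail at compile time if the struct and the data file disagree on size.
static_assert(sizeof(GameState) == kGameStateRecordSize,
              "GameState no longer matches the data file record");

// Hypothetical stand-in for reading the record from the flash data section.
inline void loadGameState(GameState& state, const std::uint8_t* flashImage, std::size_t offset) {
    std::memcpy(&state, flashImage + offset, sizeof(state));
}
```

This only catches changes in size; reordering fields of the same total size would still go unnoticed.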

As for saving PROGMEM, it would be better to use the Arduboy2Base class instead of the Arduboy2 class. When creating an Arduboy2 object the virtual functions bootLogoExtra and write will always be included and may take up roughly 2K.

I’d probably go for begin then. It’s more in line with Arduino and easier for beginners.

That’s why I try to omit numbers in function names.

Yes, and it just made me realise it’s actually not a fully correct name, as a short represents a signed 16-bit number.

I’ll keep your suggestions in mind.

Sorry, it was a rhetorical question. I’ll change it to an alias in the next version. Like this, right?

using uint24_t = __uint24;

Thanks. I’ll let you know when I’m ready for that (Just trying to keep my focus on coding the library functions)