Portability of the assembly language in Arduboy2 library?

…are the assembly language portions of the Arduboy2 library portable to / compatible with, for example, an ATmega2560 or ATmega328? Or are those sections specific to the ATmega32U4, meaning the library is locked to that microcontroller only? :face_with_raised_eyebrow:

Good question. What are you planning to build?

I do not think there is any specific asm code, besides the routines for speeding up screen/LCD redraw and communication.

Currently there is inline assembly code in the SPItransfer() (just a NOP for timing), drawPixel(), fillScreen() (called by clear()) and Sprites drawBitmap() (used by drawPlusMask()) functions.

The ATmega2560, ATmega328 and most other recent processors using the Atmel AVR instruction set should be compatible with this assembly code.

Firstly there’s the matter of pins.
The library relies on certain pins doing certain things.
Pretty much all of the pins are #defines so they’re easy enough to remap if they aren’t already remapped when choosing a different board.
So before you could expect anything to work you’d have to double check that all the pins map to what they should do.

As for the assembly code itself, the AVR instruction set has a small number of instructions that aren’t present on all boards.

Just now I went through the Arduboy2 code, made a list of all the assembly instructions that are used and compared them with Atmel’s instruction set listing. The instructions used by Arduboy2 that are not found on all AVR devices are, in alphabetical order:

  • ADIW
  • LD
  • LPM
  • MOVW
  • MUL
  • POP
  • PUSH
  • ST

So to directly answer your question:

No, it’s not locked to just the 32U4, but there are probably ATmega processors that it won’t work on.

I could possibly help more if I knew why you wanted to know or if you had some specific chips in mind. Pointing at a specific chip and knowing if it supports the instructions is just a matter of checking its device-specific instruction set summary, but getting a yes/no for all of them would take a long time (hence why such a thing isn’t already available).
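
For a rough check at compile time rather than from a datasheet: avr-gcc predefines feature macros for the selected target, including `__AVR_HAVE_MUL__`, `__AVR_HAVE_MOVW__` and `__AVR_HAVE_LPMX__`. The following is only a sketch, and the names `hasMul`, `hasMovw`, `hasLpmx` and `targetLooksCompatible` are made up for illustration. It covers only MUL, MOVW and the enhanced forms of LPM, not every instruction in the list above, so the device's instruction set summary remains the authority.

```cpp
#include <stdint.h>

// Feature macros predefined by avr-gcc for the selected -mmcu target.
// On a non-AVR compiler none of them exist, so every flag below is false.
#if defined(__AVR_HAVE_MUL__)
constexpr bool hasMul = true;
#else
constexpr bool hasMul = false;
#endif

#if defined(__AVR_HAVE_MOVW__)
constexpr bool hasMovw = true;
#else
constexpr bool hasMovw = false;
#endif

#if defined(__AVR_HAVE_LPMX__)
constexpr bool hasLpmx = true;
#else
constexpr bool hasLpmx = false;
#endif

// Hypothetical helper: true when the target reports MUL, MOVW and
// the enhanced LPM forms. It says nothing about ADIW, PUSH/POP or LD/ST.
constexpr bool targetLooksCompatible()
{
  return hasMul && hasMovw && hasLpmx;
}

#if defined(__AVR__)
// Only enforce the check when actually building for an AVR target.
static_assert(targetLooksCompatible(),
              "Target lacks MUL, MOVW or enhanced LPM");
#endif
```

Building this for an ATmega2560 should pass the `static_assert`, while a part without a hardware multiplier would stop the build with the message above.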

Thanks for the replies!

Pin mapping I can get my head around, changing those defines and searching for any PORT / PIN / DDR commands used is workable… I just haven’t dived into any AVR assembly, so wanted to test the waters before I did if there may be no need to adapt those sections!

If it helps to point to a specific chip then let’s say the ATmega2560?

According to the datasheet the ATmega2560 appears to support all those instructions.

The datasheet can be found here.
The instruction set summary is section 34 on pages 404 to 406.

So for the ATmega2560, yes it supports all the required instructions.
(I don’t know if the pins will need remapping though.)

Could we consider adding a compile flag like TARGET_32U for #ifdef'ing the custom ASM, or #elif swapping with the more general C(++) code?

Having portability with the Thumby (RP2040 chip) and other platforms, like home made systems, would be great! I appreciate more work would be required, but having a TARGET type flag would be a simple change and a good start.

I don’t want to have to deal with maintaining the Arduboy2 library for anything other than the official Arduboy.

What happens when assembly code is added to optimise for a different architecture? Then we need another “TARGET” define for it. Then another and another… Then, all those architectures have to be individually tested and verified for each new library release.

In the case of the Thumby, it’s even more complicated because the display dimensions are different, amongst other hardware differences, so just compiling assembly or C++ equivalent code won’t be enough.

It’s easy enough to fork the library then modify, rename and maintain it separately for a new target device.

I totally understand, no one wants extra maintenance. But I guess I’m just asking about a small convenience, for people to fork any custom builds. Just like for screen size, the WIDTH and HEIGHT macro has existed for some time…

Would you consider just having a simple macro (‘flag’) to switch between ASM and C++ code then? This is already partly implemented in places. I believed this was always your intention. I could probably put together a PR, but only with your blessing :slight_smile:

I don’t see the point. Anyone porting is going to have to make changes, otherwise no port would be necessary. It’s easy enough to search for asm in the code and you’ll certainly find, from reported errors, any missed ones the first time you compile. In fact, it’s probably better to manually search and change, since it will make you aware of what areas might be suitable for architecture dependent optimisation.

Using #if 0 is just a convenient way of commenting out large sections of code, with the benefit that some editors will still syntax highlight code commented out this way. My request to include the C++ equivalent is for documentation purposes. I like to document to assist others with porting but not so much that it implies that the equivalent C++ code is tested and maintained. (That’s not to say that I wouldn’t at least try to maintain it to match any assembly changes.)

Just to truly hammer home why this would be a huge maintenance headache…

Consider that at the moment there’s:

  • Mr Blinky’s homemade package, which covers
  • At least two different ESP-board ports (one for ESPBoy, one made by Hartmann1301, and probably others I’m not aware of)
  • My port for the Pokitto (which I never completely finished for various reasons)
  • Simon Merrett’s SAMD port

That’s before factoring in Thumby and other boards that I may be unaware of.

Trying to reconcile all of those differences into a single library is difficult. It’s certainly not impossible, but it’s difficult. Especially when some targets may need assembly to achieve decent speeds, and especially when you have to factor in targets that don’t support Arduino and don’t have the non-standard features provided by avr-libc.

A real world example of the latter problem: when developing FixedPoints, a large number of early complaints were because I was using random, which is actually provided by avr-libc rather than Arduino, so some Arduino targets don’t support it. The complaints came from people using non-AVR Arduino targets like SAMD.

Arduboy2 uses srandom, so a similar conundrum would arise here. You could conditionally use srand instead, but that’s not going to account for all those Arduboy games that are calling random, so you would actually have to reimplement a fair chunk of avr-libc on top of Arduboy2 (which is actually what I chose to do for my unfinished Pokitto port, as well as a fair chunk of Arduino, minus serial printing).
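
A minimal sketch of the "conditionally use srand instead" idea, assuming a target with only the standard C library available. The names `portableSeed` and `portableRandom` are hypothetical and not part of Arduboy2 or avr-libc. As noted above, this does nothing for games that call random directly, and the two branches do not promise the same range or the same sequence.

```cpp
#include <stdlib.h>

// Hypothetical wrapper: seeds whichever generator the target provides.
inline void portableSeed(unsigned int seed)
{
#if defined(__AVR__)
  srandom(seed);  // avr-libc extension
#else
  srand(seed);    // standard C fallback
#endif
}

// Hypothetical wrapper: returns the next pseudo-random value.
// The upper bound differs per branch (RANDOM_MAX on avr-libc, RAND_MAX otherwise).
inline long portableRandom()
{
#if defined(__AVR__)
  return random();  // avr-libc extension
#else
  return rand();    // standard C fallback
#endif
}
```

Seeding twice with the same value should give the same first result on either branch, which is about all that calling code can rely on across targets.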

Honestly though, aside from the assembly in Sprites.h, the rest is all very easy to pick through when porting. (The difficulty of assembly is greatly exaggerated anyway. It’s not really difficult to understand if you know about registers and addressing, it’s mostly just tedious.)

Since the introduction of SpritesB you don’t have to pick through the Sprites code either, you can just reimplement Sprites.h as:

#include <SpritesB.h>

using Sprites = SpritesB;

(If you really wanted a literal translation though, I have an almost complete one. My local copy may even be fully complete, wherever it is. Edit: I checked, I never got around to implementing drawPlusMask.)

Though I question how many people use WIDTH and HEIGHT or Arduboy2::width() and Arduboy2::height() versus how many use hardcoded 128 and 64, and how many assume sBuffer is guaranteed to be 1024 bytes.

I also wonder how many games break when the width and height exceed the range of a byte - i.e. when they are greater than 255.

One can create a suitably portable API, but one can’t stop people writing their code in a way that hampers porting.
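
To make the hardcoded-size and byte-range concerns concrete, here is a small sketch. The helper names are hypothetical, and the layout (one bit per pixel, eight vertically stacked pixels per byte) is an assumption about the screen buffer rather than something stated in this thread.

```cpp
#include <stdint.h>
#include <stddef.h>

// Buffer size derived from the dimensions instead of a hardcoded 1024.
constexpr size_t bufferSizeFor(uint16_t width, uint16_t height)
{
  return (static_cast<size_t>(width) * height) / 8;
}

// Index of the byte holding pixel (x, y), assuming rows of 8-pixel-tall pages.
constexpr size_t bufferIndex(uint16_t x, uint16_t y, uint16_t width)
{
  return static_cast<size_t>(y / 8) * width + x;
}

// What happens to a coordinate stored in a byte-sized variable.
constexpr uint8_t truncateToByte(uint16_t coordinate)
{
  return static_cast<uint8_t>(coordinate);
}

static_assert(bufferSizeFor(128, 64) == 1024, "128x64 needs 1024 bytes");
static_assert(truncateToByte(300) == 44, "300 does not fit in a byte");
```

A game that stores x in a uint8_t silently wraps 300 to 44 on a wider display, and one that assumes 1024 bytes will read or write the wrong amount of memory as soon as the dimensions change.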


I totally get no one wants extra maintenance. :+1:
So let’s forget ‘portability’ and any other systems, and allow me to rephrase:

Can I tidy up the existing code, by either…
(1) wrapping the ASM with an #ifdef and a suitably named macro, e.g. #define ASM_OPTIMISED, or
(2) conversely wrap the substitute C++ with #define DISABLE_ASM_OPTIMISATION, etc.

I don’t feel this would create any more maintenance than is currently done… especially as the library is quite stable. Point (2) essentially just replaces the #if 0. If there’s a 50:50 chance it might get accepted, I’ll do the PR to show how minimal this request is… :slight_smile:

PS- Although obvious, I do appreciate both (1) and (2) require ASM and C++ blocks being touched.

I don’t think this change would actually achieve anything useful.

  1. Nobody using a (genuine) AVR-based Arduboy would ever want to use anything other than the optimised assembly because the pure C++ version would be less efficient (otherwise the assembly version wouldn’t exist anyway).
  2. There’s a chance the C++ versions don’t actually work properly because they potentially won’t be maintained and tested. At present they only exist as a reference for porters and there’s no guarantee that they actually work or that they’ll be maintained.
  3. Having the conditional compilation doesn’t significantly help porters because they’re going to have to change all the pin setup code anyway (and potentially quite a few other things, like the screen rendering code). At most you’d be shaving about 5-10 minutes off their efforts (or however long it takes to delete the assembly and the corresponding preprocessor directives - someone with decent regex skill could probably do it in 2 minutes).

Though if it were to be done, the macro would probably have to be called something more like ARDUBOY2_NO_ASSEMBLY to avoid potential clashes with other macros. (You’d certainly want the assembly version to be the default and the C++ as the opt-in because of point 1 above.)
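
Purely to illustrate the shape such a switch might take (assembly by default, C++ as the opt-in), here is a sketch. ARDUBOY2_NO_ASSEMBLY is only the name suggested above and is not defined by Arduboy2, and `swapNibbles` is a hypothetical function chosen because AVR has a single SWAP instruction for it; it is not taken from the library.

```cpp
#include <stdint.h>

// Uncomment, or pass -DARDUBOY2_NO_ASSEMBLY, to force the C++ path.
// #define ARDUBOY2_NO_ASSEMBLY

// Hypothetical example: exchange the high and low nibbles of a byte.
inline uint8_t swapNibbles(uint8_t value)
{
#if defined(__AVR__) && !defined(ARDUBOY2_NO_ASSEMBLY)
  // Default on AVR: one SWAP instruction operating on the register in place.
  asm volatile("swap %0" : "+r" (value));
  return value;
#else
  // Opt-in (or non-AVR) path: equivalent plain C++.
  return static_cast<uint8_t>((value << 4) | (value >> 4));
#endif
}
```

Note that both branches still have to be kept in step and tested, which is exactly the maintenance concern raised in point 2.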


@Pharap, as always - thanks for sharing your thoughts. I really appreciate your different perspective on this. I was just browsing the forks and the different versions of drawPlusMask. To me it seems a shame to not capture these developments by the wider community, even just for the sake of documentation.

You could always create a stand-alone repo that contains and documents the differences if you wanted. Though really I think SpritesB is enough for most people.

As I mentioned before it’s easier to just reuse SpritesB than to attempt to decipher Sprites, and anyone for whom SpritesB isn’t fast enough would probably still be better off using SpritesB as a base and tailoring it to their own platform rather than trying to decipher Sprites.

One of the main reasons Sprites is faster is that it’s juggling registers, which is a platform-specific benefit that’s only going to be relevant to AVR. That in turn means it’s not particularly useful for porters.

If you want to understand what it’s doing for purely academic reasons, you’re best off just learning to read AVR assembly, because the most interesting aspects of it can’t be expressed as C++. What can be expressed as C++ is expressed in SpritesB.

Lastly, bear in mind that resorting to assembly is a last-ditch attempt for when something can’t be expressed in C++ (as is the case with pgm_read_byte et al) or in the rare case where the human can produce code that’s more efficient than what the compiler produces (which usually happens by doing something the compiler isn’t allowed to do or by exploiting platform-specific quirks). 95% of the time, a modern compiler will do a good enough job.
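
On the pgm_read_byte point, a port to a target with a single flat address space can usually get away with a shim along these lines. This is a sketch under that assumption; `exampleSprite` and `readSpriteByte` are hypothetical names, and some non-AVR Arduino cores already provide their own compatible definitions, hence the #ifndef guards.

```cpp
#include <stdint.h>

#if defined(__AVR__)
  // Real implementation: reads from program memory.
  #include <avr/pgmspace.h>
#else
  // Assumed flat address space: constants are readable like any other data.
  #ifndef PROGMEM
    #define PROGMEM
  #endif
  #ifndef pgm_read_byte
    #define pgm_read_byte(address) (*reinterpret_cast<const uint8_t *>(address))
  #endif
#endif

// Hypothetical image data stored the way Arduboy sketches usually store it.
const uint8_t exampleSprite[] PROGMEM = { 0x3C, 0x42, 0x81, 0x81, 0x42, 0x3C };

// Reads one byte of the sprite through whichever pgm_read_byte is in effect.
inline uint8_t readSpriteByte(uint8_t index)
{
  return pgm_read_byte(&exampleSprite[index]);
}
```

Calling code stays identical on both kinds of target; only the definition behind the macro changes.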
