In the call of .drawExternalmask it doesn’t to my knowledge use the assembly code in drawbitmap
204 case SPRITE_MASKED:
205 uint8_t *mask_ofs;
206 mask_ofs = (uint8_t *)mask + (start_h * w) + xOffset;
207 for (uint8_t a = 0; a < loop_h; a++) {
208 for (uint8_t iCol = 0; iCol < rendered_width; iCol++) {
209 // NOTE: you might think in the yOffset==0 case that this results
210 // in more effort, but in all my testing the compiler was forcing
211 // 16-bit math to happen here anyways, so this isn't actually
212 // compiling to more code than it otherwise would. If the offset
213 // is 0 the high part of the word will just never be used.
214
215 // load data and bit shift
216 // mask needs to be bit flipped
217 mask_data = ~(pgm_read_byte(mask_ofs) * mul_amt);
218 bitmap_data = pgm_read_byte(bofs) * mul_amt;
219
220 if (sRow >= 0) {
221 data = Arduboy2Base::sBuffer[ofs];
222 data &= (uint8_t)(mask_data);
223 data |= (uint8_t)(bitmap_data);
224 Arduboy2Base::sBuffer[ofs] = data;
225 }
226 if (yOffset != 0 && sRow < 7) {
227 data = Arduboy2Base::sBuffer[ofs + WIDTH];
228 data &= (*((unsigned char *) (&mask_data) + 1));
229 data |= (*((unsigned char *) (&bitmap_data) + 1));
230 Arduboy2Base::sBuffer[ofs + WIDTH] = data;
231 }
232 ofs++;
233 mask_ofs++;
234 bofs++;
235 }
236 sRow++;
237 bofs += w - rendered_width;
238 mask_ofs += w - rendered_width;
239 ofs += WIDTH - rendered_width;
240 }
241 break;
242
That said, from the looks of the commit history v4.1.0 might not have the fix.
Try getting a copy of the Arduboy2 library straight from the master branch (using download zip) and then dumping the contents of the zip in your Arduino/libraries folder (replacing whatever copy of Arduboy2 you had installed).
Unfortunately it’s all part of the same function, but there’s some odd rules about inlining that mean the assembly code might not be used depending on the circumstances.
The problem is that the r28 and r29 registers (which make up the Y register) are not declared as clobbers and the Y register is not declared either. so the compiler thinks it can use r28 and r29 in the other C code of that function. So it decides to put sprite_ofs_jump in r28 (see line bb4:). But the assembly code at the beginning overwrites it by putting buffer_ofs_page_2 into r28,r29 (see line b72:)
Fixing takes a bit more then just clobbering R28,r29 or declaring Y as pointer as the compiler generates an error when doing so. probably because it runs out of registers pointer registers.
But that’s only in the master branch, it hasn’t made it into a release yet.
The easiest thing would be to just swap to a C++ code version, but I know for a fact that increases the code size substantially so it would break games that are desperately near the progmem limit, so that’s a bad idea too.
I’ve got a feeling that __attribute__ ((noinline)) might fix it since that worked before the assembly rewrite.
Otherwise there’s going to have to be some more register juggling put in place.
That’s strange. I tried your latest code on GitHub (commit number 6739e84563823fb08675b7a88d2325e562404329).
With the latest release, 4.1.0, of the Arduboy2 library, I get the artifacts as you described. Using the latest Arduboy2 library from GitHub, there are no artifacts.
Here are the details of my testing:
Tested with both a real Arduboy and my breadboarded system based on an Arduino Micro.
Sketch program and global variable sizes given below are with Board: Arduboy selected.
Arduino IDE version 1.8.4
Arduino AVR Boards version 1.6.20
(as displayed using Tools > Board: > Boards Manager...)
To be safe, flashlight mode was used for all uploads.
With the 4.1.0 Arduboy2 release (installed using the Arduino Library Manager): Sketch uses 11776 bytes (41%) of program storage space. Global variables use 1888 bytes (73%) of dynamic memory
With the latest Arduboy2 code from the GitHub master branch
(commit number 6360eb2b37502538ebde4e58e6ccb840b0a5f463): Sketch uses 11792 bytes (41%) of program storage space. Global variables use 1888 bytes (73%) of dynamic memory
And the differences are in the sprite draw mask assembly bit. buffer_page2_ofs is not passed on through registers but calculated in assembly by pasing on WIDTH as an immediate value, saving register(s). also the register declaration is different. But the bug is still there. The compiler may decide to use r28, r29 to pass on values to the assembly in other situations.
I haven’t had much to do with the assembly code for drawPlusMask(). I’m not sure if @Dreamer3 (yyyc514 on GitHub) did the original code but he’s the one who worked out the latest changes.
@Mr.Blinky, If you think there’s still problems and have any fixes, please let us know.
AFAIK my fix completely resolves the r28/r29 issue as it forces the use of lower registers by all the incoming/outgoing variables… then we steal r28/r29 for ourselves by pushing/popping it… but always possible there is some OTHER problem. If someone is using the PATCHED library and still seeing an issue it’d be helpful to know for sure.