Wow, thx. I see now. So maybe that code can be changed to not use int8_t and just use uint8_t. Then all the casting can be removed. I think the only reason for it is because -1 is used a marker for “not yet initialized”. This can easily be 0xff instead with a little change in the logic. Then the code size should also get less at this point.
I’ll have a look when I get round to it.
To be honest it’s probably not a major deal because the compiler is probably smart enough to figure out it can turn all that into just a regular address take.
The type system only actually exists before compilation, so a lot of casts are effectively nonexistant by the time the code is compiled. And references are usually just compiled into pointers depending on inlining and the context.
I’ve started to have a read through of the code to see if I can work some voodoo on it. So far I’ve made one change and got it down from 28444 bytes (99%) of progmem to 28276 bytes (98%).
I’ll do a bit more now and again to see how far I can get it down.
Yes. For the size I think it will change because handling unsigned vars usually produces less code than working with signed vars. Hope that is correct, my memories are weak here.
Voodo like is the like that I shared. Tried it sometime ago on one of my projects and it freed up quite some space.
The whole stateVars thing is a bit of an ugly wart in the code because I was in a hurry. Most of the code was written after the first deadline was over!
What it does:
States need to store things while they’re active. It can’t be put on the stack and I didn’t want to put it in the heap, so I made the stateVars array.
While I didn’t actually check for this in the disassembly, the casting shouldn’t take any space at all. I’m just telling the compiler “this variable exists over there, don’t care if the type is wrong!”
What I probably should have done, but didn’t think of it at 2am:
union {
struct state1{
int8_t offsetX, offsetY;
};
struct state2{
...
};
};
Thanks for the patience and sorry about the cryptic variable names. 2AM.
Since this is my first Arduboy game, there’s a lot of weird experimenting going on to see how things behave.
“ox” should be offsetX. And instead of x/y, maybe a 1d offset could be used since it’s a 32x16 matrix.
Using one variable instead of two should free up some space.
I think I’m going to do a major cleanup after the voting is over. For now I’ll just do some more testing and taking note of whatever voodoo you guys come up with.
I really should use PROGMEM arrays more.
Consequently, gameState shouldn’t be a function pointer, it should be an uint8_t index into a table.
Wow if this is your first Arduboy game it is even more impressive. Not sure if I could manage that and even till 2am. So no need to excuse here for any coding style. I am also one of those guys from time to time
Voodoo is the link that I shared that rewrites the digialXXX macros to resolve at compile time so it boils down to 1 or 2 instructions. It won’t get you kb’s of memory but at least a little. The big change I think will come when you use compression for your graphics.
Unless they got inlined without a trace, I don’t think any of the digitalXXX functions are in there.
Which makes sense, the library accesses the registers directly.
I really need to look into graphics compression.
Kb’s of memory would become available if I removed the USB code. It’s huge.
How do people here feel about this?
I think there was already a thread about this. Not sure how it ended but I think they do not want to have a custom bootloader that is too different from the original Arduino one.
Depending on which operations are being used, that’s generally correct.
Unions are good, a tagged union is even better.
If they ever do another C++11-scale overhaul I hope they add syntax to make tagged unions easier.
It would save some space, but I’d be surprised if it saves a large amount.
When you factor in null checks and array indexing, I think the difference between a table index and a function pointer is probably minimal.
I think one is probably better for saving RAM and the other is better for saving progmem.
(I haven’t checked that because function pointers are quite rare in Arduboy code. I think I’ve seen used about 2-3 times before.)
Macros are a bit of an odd demon.
One of the things I did on Dark & Under to save us memory was to replace the Arduino macro abs
with a template function absT
. I can’t remember how much it saved, probably less than 10 bytes, but that added up to a fair sum because of how and where we were using it.
You’d think that the two approaches would result in the same code, but actually gcc seems to get on better with inlining/optimising functions than it does macros.
I think part of the issue is that because macros are just pasted in, they end up complicating the expression to the point where the compiler can’t do much with the result, or doesn’t notice the repetition so it ends up doing something twice.
(And rarely it actually has to do something twice because of the language rules about sequencing operations.)
On the other hand, if the compiler is looking at a function call instead of the result of a macro expansion, it has an easier time reasoning about things because it sees a function as an input-output situation rather than a chain of operations.
That’s probably what happened. The point at which functions stop being inlined varies with how often they’re used, but calling a function is surprisingly expensive so a lot of the time functions do end up being inlined.
I’ll look at some more stuff later, possibly after you’ve cleaned up a bit if you’re definitely going to be cleaning up since that would help a lot with knowing what changes can and can’t be made, but for now the one change I have found is in person.cpp
Replacing:
switch( eyes ){
case 0: eyeSprite=eyes1; eyeMask=eyes1_mask; break;
case 1: eyeSprite=eyes2; eyeMask=eyes2_mask; break;
default:
case 2: eyeSprite=eyes3; eyeMask=eyes3_mask; break;
case 3: eyeSprite=eyes5; eyeMask=eyes5_mask; break;
}
With:
const uint8_t * eyeLookup[] = { eyes1, eyes2, eyes3, eyes5, };
const uint8_t * eyeMaskLookup[] = { eyes1, eyes2, eyes3, eyes5, };
const uint8_t * eyeSprite = (eyes < 4) ? eyeLookup[eyes] : eyeLookup[2];
const uint8_t * eyeMask = (eyes < 4) ? eyeMaskLookup[eyes] : eyeMaskLookup[2];
Note that if you put eyeLookup
and eyeMaskLookup
outside of the function you get 28276 bytes (98%)/2170 bytes (84%)
, but if you put it inside the function you get 28364 bytes (98%)/2162 bytes (84%)
, so it depends on whether you want more progmem or more RAM. Either way you’ve regained a small chunk of progmem.
I haven’t actually checked if the result is the same, but theoretically it should be.
The thing is, digitalWrite
isn’t a macro. And all the port registers are read/written directly anyway, AFAIK.
That’s going to look even better when I use drawPlusMask. The separate masks aren’t necessary.
Ah, I didn’t know what was being refered to by digitalXXX
.
I assumed they were macros because @veritazz said
Probably a slip of the tongue (or finger).
I might wait for you to finish some of your edits before having a proper go at cutting the size then.
I’ve done some experimental work on the Arduboy2 library to aid in removing the USB code. I’ve continued the discussion here:
Yes sorry. Sure they are functions right now. The code from the link will change them to macros that boil down to just a single instruction when used correctly. The AVR has a special instruction to access the IO registers. Of course if you plan to use these functions a thousand times it is not worth the effort.
I did this for testing quite a while ago and it freed up quite some space and aside from that it is much faster.
Sorry for the confusion
As has already been mentioned, in release 4.0.0 of the Arduboy2 library I replaced all Arduino digitalXXXX pin control functions with direct pin manipulation.
I don’t think Hello, Commander does any pin manipulation.
Also, the ArduboyTones library has always used direct pin manipulation and I modified the ArduboyPlaytune library to do the same.
+1 for the games Main features list!
Now that the voting is over, I made an update. Still not a major overhaul, but I managed to free well over 2kb.
Used some of that space already: Fixed some bugs, added recoil screen shake, new character accessories, made terrains more complex so that it’s easier to find cover. The game feels much less random now.
I’m probably forgetting to list something, but it’s past 2AM again.
2Kb? Does this involve a customer boot-loader or dropping the USB support?
Neither. I did drop random(), that freed a good bit of space.
Also trimmed the transparent area around sprites, replaced the scar sprite (was the size of the entire head) with a cap (looks better, takes up less space), and replaced various setCursor+print calls with a single one.
People often don’t realise that random
is more expensive than rand
because it works with 32-bit numbers instead of 16-bit.
random
is non-standard, rand
follows the C definition of using int
. (I learnt that the hard way.)
In this case I have a hash function for generating terrain, so it made sense to use that instead of random. I haven’t checked the distribution, its probably terrible, but in this game it looks good enough.
I just wrote a simple test to see what the difference between drawPlusMask and drawExternalMask is.
Sprites::drawPlusMask( 0,0, av1_plus_mask, 0 );
main: 1436b
av1_plus_mask: 2050b
19200b free
versus:
Sprites::drawExternalMask( 0, 0, av1, av1_mask, 0, 0 );
main: 1492b
av1+av1_mask: 2050b
19300b free
So, while the main function got smaller with drawPlusMask, overall the drawExternalMask call is 100 bytes smaller? o_O