Completely overhauled Arduboy


(Josh Goebel) #61

You could just mark the stack with dirty values and then measure the “high water mark” after a few iterations of your program’s typical main loop (or x minutes, etc.). I forgot where I saw this done the other day, but it seemed like a great and easy idea.

Good point. :slight_smile: Although interrupts shouldn’t be going crazy deep on the stack IMHO.


(Shawn) #62

This is a common method of determining stack usage. If you have any real-time debug capabilities like ITM/DWT/ETM over a serial programming interface, there are also options to monitor the stack and execution at runtime, but this isn’t often a luxury available to low-end 8-bit chips.


(Pharap) #63

That will only give you an average; it won’t necessarily find the worst-case scenario (i.e. the absolute highest that the stack will reach).
To be sure that you’ve found the worst-case scenario you’d have to test every possible branch.
Testing every branch at runtime would be highly impractical.
For some programs even testing all the most common branches would be quite time consuming.

The ideal way to test would be to do a static analysis of the program, but C++ is a hard language to parse and that wouldn’t take inlining into account, so you’d be looking at a result that’s potentially deeper/worse than the actual result.

There most likely are static analysers that can measure the theoretical stack depth without optimisations, but they probably don’t handle AVR calling conventions.


(Kevin) #64

The ATmega does have debugging support; you just need the Atmel programmer.


(Shawn) #65

Small correction: it isn’t the average per se that high-water-mark testing measures, but rather a sort of local maximum for a given execution branch. In industry a lot of work is done to analyse code using static and dynamic techniques, with agreed-upon standards like MISRA to help nip most problems in the bud at the design phase, before they occur. And in the case where an extremely rare occurrence does get by and causes a stack overflow, that is why watchdog reset hardware exists.


(Josh Goebel) #66

I dunno if I’d call it highly impractical; I guess it depends on the scope of your app. I think for most “simple arcade-like games that would fit on this platform”, playing through the whole game once and winning would have a very high probability of testing almost every possible branch.

You have to remember you’re not necessarily looking for 100% coverage, only equal depth… so if a routine has 3 possible choices and you only ever call one, that’s OK so long as the stack depths of the two alternates are pretty much the same.

No, it’s not perfect, but I think it would get you really, really close for hobby, non-industrial/medical use.


(Shawn) #67

I agree that playing through the game a few times is “good enough” testing for this application. If you’re really worried, you can always get some beta testers trying to break the game for feedback.


(Josh Goebel) #68

I can’t imagine a real situation where your stack would grow huge during your render phase. During your calculation phase you technically have 1 KB of extra RAM you could jiggle around and let the stack expand into. Then after whatever complex calculations are done you do your render stage… and that only needs to allow depth enough for the built-in graphics call stack.

I’d imagine a typical (very broken down) render stack could look like:

  • renderAll() which calls:
  • renderBoard() which calls:
  • renderPlayer(), etc:
  • renderPlayerPortion()
  • drawRect()
  • drawLine()
  • drawPixel()

I’m not saying the core library does this exactly, this is just an imagined example to get an idea of depth. Of course you need to count up room for arguments on the stack, etc. And of course other people have pointed out interrupts etc…

But that is 7 deep there… maybe in real life it’d be 6 or 8… but it’s not like your render calls should randomly jump to 20 or 30 unless you’re doing something very strange.

I’m just pointing out that this stuff should be testable (within a margin) and predictable.


(Pharap) #69

It’s an example of what the average behaviour will be.
(Average behaviour, not mathematical average.)

I’ve seen it happen in development of at least two games,
so it’s not so rare as to be irrelevant.

There are people here who push the boundaries of the RAM in the pursuit of good quality games,
so such occurrences are eventually inevitable.

‘Simple arcade games’ aren’t the games that are going to be needing to check their stack usage.
Though I’d say there’s quite a lot of games for the Arduboy that are more than just ‘simple arcade games’.

Even games that seem like simple arcade games can be more complex than they appear, or can eat more RAM than expected.

But the alternatives aren’t necessarily the same.
Branches can be drastically different, especially when there are state machines within state machines, or complex calculations.
One branch could overflow purely because it has a few more variables than another.

If a significant amount of RAM is in use then having 7 stack frames instead of 6 can be the difference between normal execution and stack overflow.

That difference could be something as simple as a previously inlined function no longer being inlined because it’s hit the usage threshold.
I.e. the usage can change as the program evolves.


But at any rate, this is getting off topic.

The original point was that most people tend not to write games that make heavy use of RAM purely because there isn’t much RAM to work with.


(Josh Goebel) #70

The core assumption was that you’d have a shallow stack at RENDER time. Again, during compute time it’d be possible to dip into the VRAM as long as you dipped back out of it before it was time to render. That doesn’t cover all types of apps, but it covers a lot.

If a significant amount of RAM is in use then having 7 stack frames instead of 6 can be the difference between normal execution and stack overflow.

Well sure if someone insists on using every last byte with no buffer then they are going to have a hard time no matter what.

That difference could be something as simple as a previously inlined function no longer being inlined because it’s hit the usage threshold.

Someone that tight on RAM should really be taking more control of inlining and other such lower-level things if they really want to push the limits that closely. Can’t really have your cake and eat it too here.


(Josh Goebel) #71

That difference could be something as simple as a previously inlined function no longer being inlined because it’s hit the usage threshold.
I.e. the usage can change as the program evolves.

Wouldn’t it be easy enough to build some sort of gauge into the development mode of the library that showed a red/yellow/green LED based on the stack level? (measured by a high-water mark that periodically resets)

I mean it doesn’t magically give anyone more RAM, but it would let you know if you were getting close during development… moving from green into yellow into orange, etc.


(Shawn) #72

I don’t do much AVR development, but I use MPLAB for PICs, which has a handy stack monitor I use in debug (I’m sure there’s something similar in Atmel Studio). It won’t necessarily catch worst cases, but it gives a good idea of what your baseline average stack usage would be at runtime, so you know how likely it is you’re gonna run into problems.


(Josh Goebel) #73

I don’t think most Arduboy devs use Atmel Studio. Does that even work over USB, or does it require special header wires for the debugging interface?


(Shawn) #74

Atmel Studio can program with something like an STK500 ICSP programmer, or you could always use the built-in scripting to automatically invoke avrdude via the CLI after a build, which would allow you to flash over the USB/serial bootloader.


(Josh Goebel) #75

I was referring to how the live debugging part works — pretty sure that doesn’t happen over USB.


(Shawn) #76

Oh yeah, you’d need a dedicated ICSP for that.