Rooftop Rescue

(Bert Veer) #21

Yeah I thought about that kind of optimization but I deemed it too much nitty-gritty for this small project. The aim was to get something finished in a reasonable time, which kind of worked out (I spent about two weeks which is way more than I allocated). If I find the time I might improve it.

Thanks for your elaborate help.

1 Like
(Josh Goebel) #22

The warning is only because the compiler can’t be sure how much stack space you are going to use. (I supposed it could get close for a lot of programs that don’t use recursion, but…) So instead Arduino just yells at you whenever it thinks you’re anywhere near the “danger zone”. If your programs runs just fine however (meaning the stack never overflows) then you’re still good.

You could even add some code to see how much space you really had to spare if you wanted by pre-loading the stack area with a constant and then after your program has run a while checking to see how much area at the bottom of the stack is still “undisturbed”…

(Bert Veer) #23

According to the compiler output I still have some 400+ bytes left for a stack, which should be sufficient. So I think I will leave it to that and try to be even more conservative next time.

(Pharap) #24

Using bitflags isn’t really ‘nitty-gritty’ when you’re writing for such a limited processor.

Really using bitflags/bitvectors is something that should be done even on more powerful systems,
but people these days tend to take the stance that “everyone’s got at least 4GB of RAM to use, it’ll be fine”,
so they end up writing programs that chew up far more RAM than necessary.

If I find time to make some memory savings would you be interested in a PR or should I just let sleeping dogs lie?

I remember one of the problems during the development of @filmote & @Vampirics’s Space Taxi was the stack overflowing.

I briefly had the same problem in Minesweeper because of my EEPROM code,
but I fixed it by checking the available stack before trying to allocate a buffer that might have otherwise tipped the stack.
(Technically it’s still breaking standard C++ rules,
I meant to go back and fix it properly but never got round to it.)

When using new or when replacing new with static allocation?

#25

I had a go on adding the additional A button control to lower/raise the rope besides the up /down buttons and fixed some minor issues. Made a PR for if you want to add them.

1 Like
(Bert Veer) #26

I would agree that memory is unnecessarily wasted nowadays. I started in the early 80’s myself so I’m somewhat familiar with low resources. I guess that’s one of the reasons I bought the Arduboy.

While this is my first project for the platform, I coded it up quickly and initially wasn’t planning to release or maintain the code. But if you feel an urge to improve on it, be my guest :slight_smile: Perhaps it’s not a bad thing for me to do some community-coding, which in itself is new to me.

Let’s see where this is going.

Note about the bitfields: there are about a dozen bools in all the classes, I don’t think I can find a way to replace them with less than 12 bytes of code. Any ideas on that?

(Bert Veer) #27

I merged your changes, thanks for the support.

However the single button action does make the game easier to play, which is not really what I intended. It basically renders the up/down controls cumbersome. Perhaps then using up/down as toggle buttons as well?

1 Like
(Bert Veer) #28

This was the smallest I could come up with, but it’s obviously more than 12 bytes.

enum e_boolvar 
{ 
    rope_stopped_moving, 
    chopper_moving, 
    chopper_exploding, 
    (...),
    BOOLVAR_END 
};

namespace Bool
{
    uint8_t vars[1+BOOLVAR_END/8];
    
    void set(e_boolvar var, bool value) {
        if (value) vars[var/8] |= (1 << (var % 8));
        else vars[var/8] &= ~(1 << (var % 8));
    }
    
    bool get(e_boolvar var) {
        return vars[var/8] & (1 << (var % 8));
    }
};
(Bert Veer) #29

Sorry for incorrectly quoting, I made a mess of this topic :slight_smile:

(Pharap) #30

If there’s only 12 bytes of progmem left then the answer is to find a way to free up progmem.

I’ll have a look if I get chance.

On a quick skim I suspect:

if (++counter % frames_wait == 0) {
	current_frame = (current_frame+1) % frame_count;
}

Would use less memory as something like:

++counter;
if((counter % frames_wait) == 0)
{
	if(current_frame < frame_count)
		++current_frame
	else
		current_frame = 0;
}

And perhaps offset_y = (rand()%10)/8; would be cheaper as offset_y = (((rand() % 10) < 8) ? 0 : 1);.
(Though if the odds don’t have to be 8/10 and 2/10 then it can be made even cheaper by using odds based on powers of two.)

Division and modulus operations are expensive because AVR chips don’t have a hardware divider.

Unscoped enums ought to be avoided in favour of scoped enums (enum class), they’re much safer.

The way I’d normally implement specific bitflags would be something like this:

enum class RopeFlags : uint8_t
{
	None = 0,
	Moving = (1 << 0),
	StoppedMoving = (1 << 1),
	Accessible = (1 << 2),
	Occupied = (1 << 3),
	DudeEnter = (1 << 4),
};

constexpr RopeFlags operator |(RopeFlags left, RopeFlags right)
{
	return static_cast<RopeFlags>(static_cast<uint8_t>(left) | static_cast<uint8_t>(right));
}

RopeFlags & operator |=(RopeFlags & left, RopeFlags right)
{
	left = (left | right);
	return left;
}

constexpr RopeFlags operator &(RopeFlags left, RopeFlags right)
{
	return static_cast<RopeFlags>(static_cast<uint8_t>(left) & static_cast<uint8_t>(right));
}

RopeFlags & operator &=(RopeFlags & left, RopeFlags right)
{
	left = (left & right);
	return left;
}

constexpr RopeFlags operator ^(RopeFlags left, RopeFlags right)
{
	return static_cast<RopeFlags>(static_cast<uint8_t>(left) ^ static_cast<uint8_t>(right));
}

RopeFlags & operator ^=(RopeFlags & left, RopeFlags right)
{
	left = (left ^ right);
	return left;
}

struct Rope
{
	// Blah blah
	RopeFlags flags;
	// Blah blah
	
	bool is_moving() const
	{
		return ((this->flags & RopeFlags::Moving) != RopeFlags::None);
	}
	
	// etc
};

And then of course instead of having long chains of = false you can just this->flags = RopeFlags::None;

(Bert Veer) #31

Although I like your solution (never realized you could type an enum), it will still compile to a considerable amount of code. In the end you need some critical mass of booleans for such a scheme to be profitable.

Also, thanks for reminding me division is not free :slight_smile: Now that we’re at it, how are the trig functions implemented? Anything to gain there?

(Pharap) #32

Are you talking about 12 bytes of progmem or 12 bytes of RAM?

Are where did you pull the ‘12’ from, I compiled the master branch and got:

Sketch uses 24698 bytes (86%) of program storage space. Maximum is 28672 bytes.
Low memory available, stability problems may occur.
Global variables use 2136 bytes (83%) of dynamic memory, leaving 424 bytes for local variables. Maximum is 2560 bytes.

So there’s more than 12 bytes left of both.

It’s a C++11 thing.
C++11 drastically overhauled the language and introduced a large number of useful features.

Are you sure about that?
enum classes are implemented as if they were just plain integers,
functions as small as x |= y are almost always inlined and only use about 2-3 instructions.

Are you sure you want to know? It’s written in assembly.

If you want to try switching to fixed point artihmetic then that’s one option and I know a good library for that,
but it would be a pretty big change.


I started having a go at changing the flags, but as I was doing so I noticed a few things.

I noticed stopped_moving was always equivalent to !moving and tried to get rid of it,
but annoyingly that increased progmem by 30-40 bytes, so I left it.

However I also noticed that accessible is always equivalent to:

(!moving && !occupied && (length > 0))

so replacing the variable with an is_accessible() function saved 26 bytes of progmem and 1 byte of RAM.

(Bert Veer) #33

Well I see now that it isn’t really a tradeoff since the code ends up in progmem anyway. The whole memory model still has to sink in a little with me. So thanks for that insight. I’m not afraid of a little assembly though :slight_smile:

The stopped_moving variable was used as a sound-off flag, iirc. You may find more redundant constructions, as you did, however some were intended to keep the game and render logic separated.

Again, this was coded up quickly, so I’m not sure if it’s entirely worth the effort but in any case it’s a great way for me to learn the platform. So if you have any more questions or suggestions that would be welcome.

Cheers

(Pharap) #34

It’s more than just a little assembly.
Here’s a small sample:

sin.S
/* Copyright (c) 2002  Michael Stumpf  <mistumpf@de.pepperl-fuchs.com>
   Copyright (c) 2006  Dmitry Xmelkov
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   * Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.
   * Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in
     the documentation and/or other materials provided with the
     distribution.
   * Neither the name of the copyright holders nor the names of
     contributors may be used to endorse or promote products derived
     from this software without specific prior written permission.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE. */

/* $Id: sin.S 2473 2015-04-09 08:10:22Z pitchumani $ */

#if !defined(__AVR_TINY__)

#include "fp32def.h"
#include "asmdef.h"

/* float sin (float A);
     The sin() function returns the sine of A, where A is given in radians.
 */

ENTRY sin
	push	rA3
	XCALL	_U(__fp_rempio2)
	pop	r0
	sbrc	r0, 7
	subi	ZL, -2
	XJMP	_U(__fp_sinus)
ENDFUNC

#endif /* !defined(__AVR_TINY__) */
fp_rempio2.S
/* Copyright (c) 2002  Michael Stumpf  <mistumpf@de.pepperl-fuchs.com>
   Copyright (c) 2006  Dmitry Xmelkov
   Copyright (c) 2008  Ruud v Gessel
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   * Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.
   * Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in
     the documentation and/or other materials provided with the
     distribution.
   * Neither the name of the copyright holders nor the names of
     contributors may be used to endorse or promote products derived
     from this software without specific prior written permission.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE. */

/* $Id: fp_rempio2.S 2473 2015-04-09 08:10:22Z pitchumani $ */

#include "fp32def.h"
#include "asmdef.h"

/* <non_standart> __fp_rempio2 (float x);
     The __fp_rempio2() function computes the remainder of dividing
     absolute value of x by Pi/2. The return value is x - n*Pi/2, where
     n is the quotient of abs(x)/(Pi/2), rounded towards zero to an integer.
   Output:
	rA3.rA2.rA1.rA0.rAE	- flt40_t remainder
	ZL			- low byte of n
 */

#define	HI40_PIO2	0x3FC90FDA	/* (flt40_t) Pi/2	*/
#define	LO40_PIO2	0xA2

FUNCTION __fp_rempio2

0:	XJMP	_U(__fp_nan)

ENTRY __fp_rempio2
  ; split and check finite
	XCALL	_U(__fp_splitA)
	brcs	0b		; only finite numbers are valid
	clt			; ignore a sign
  ; init division result
	ldi	ZL, 0
  ; extend A
	clr	rAE
  ; check (and modify) exponent
	subi	rA3, hhi8(HI40_PIO2 << 1)
	brlo	5f		; fabs(A) < 1.0 radian, C is set
  ; prepare loop
	ldi	rB0,  lo8(HI40_PIO2)
	ldi	rB1,  hi8(HI40_PIO2)
	ldi	rB2, hlo8(HI40_PIO2 | 0x800000)		; + hidden bit
	rjmp	1f

.Loop:	lsl	ZL
	lsl	rAE
	rol	rA0
	rol	rA1
	rol	rA2
	brcs	2f
1:	cpi	rAE, lo8(LO40_PIO2)
	cpc	rA0, rB0
	cpc	rA1, rB1
	cpc	rA2, rB2
	brlo	3f
2:	subi	rAE, lo8(LO40_PIO2)
	sbc	rA0, rB0
	sbc	rA1, rB1
	sbc	rA2, rB2
	inc	ZL
3:	dec	rA3
	brpl	.Loop

  ; Normalize, we know that rA2.1.0.E >= 0x0E. You can check this with
  ; a test program below.
	cpi	rA2,0x80
	brcc	5f
4:	dec	rA3
	lsl	rAE
	rol	rA0
	rol	rA1
	rol	rA2			; C := 0
	brpl	4b

5:	sbci	rA3, hhi8((HI40_PIO2<<1) + 0x01000000)	; undo the subi 0x7f
	XJMP	_U(__fp_mpack_finite)
ENDFUNC

#if 0
/* This is a test program to find the smallest value of rA2.1.0.E after
   division. The nonzero value gives a garanty that normalization loop
   is finite.	*/

#include <stdio.h>
#define	MNT32_PIO2	0xC90FDAA2

int main ()
{
    unsigned long rA210;
    unsigned long rA210E;
    int rA3;
    unsigned long c;
    unsigned long amin = 0xffffffff;

    for (rA210 = 0x800000; rA210 <= 0xffffff; rA210 += 1) {
	rA210E = rA210 << 8;
	c = 0;
	rA3 = 127;	/* this is max for finite number	*/
	goto m;
	do {
	    c = rA210E & 0x80000000;
	    rA210E <<= 1;
	  m:
	    if (c || (rA210E >= MNT32_PIO2))
		rA210E -= MNT32_PIO2;
	    if (rA210E < amin) {
		amin = rA210E;
	        printf ("min of rA210E: 0x%08lx\r", amin);
		fflush (stdout);
	    }
	} while (--rA3 >= 0);
    }
    putchar ('\n');
    return 0;
}
#endif
fp_sinus.S
/* Copyright (c) 2002  Michael Stumpf  <mistumpf@de.pepperl-fuchs.com>
   Copyright (c) 2006  Dmitry Xmelkov
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions are met:

   * Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.
   * Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in
     the documentation and/or other materials provided with the
     distribution.
   * Neither the name of the copyright holders nor the names of
     contributors may be used to endorse or promote products derived
     from this software without specific prior written permission.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE. */

/* $Id: fp_sinus.S 2473 2015-04-09 08:10:22Z pitchumani $ */

#if !defined(__AVR_TINY__)

#include "fp32def.h"
#include "asmdef.h"

/* float __fp_sinus (flt40_t A, ZL);
 */

#define	HI40_PIO2	0x3FC90FDA	/* high 4 bytes Pi/2	*/
#define	LO40_PIO2	0xA2

ENTRY __fp_sinus
	push	ZL
	
	sbrs	ZL, 0
	rjmp	1f
	ldi	rBE, LO40_PIO2
	ldi	rB0,  lo8(HI40_PIO2)
	ldi	rB1,  hi8(HI40_PIO2)
	ldi	rB2, hlo8(HI40_PIO2)
	ldi	rB3, hhi8(HI40_PIO2 | 0x80000000)
	XCALL	_U(__addsf3x)

1:	XCALL	_U(__fp_round)

	pop	r0
	inc	r0
	sbrc	r0, 1
	subi	rA3, 0x80

	ldi	ZL, lo8(.L_table)
	ldi	ZH, hi8(.L_table)
	XJMP	_U(__fp_powsodd)
ENDFUNC
	
	PGM_SECTION
.L_table:
	.byte	5
	.byte	     0xa8,0x4c,0xcd,0xb2	; -0.0000000239
	.byte	0xd4,0x4e,0xb9,0x38,0x36	;  0.0000027526
	.byte	0xa9,0x02,0x0c,0x50,0xb9	; -0.0001984090
	.byte	0x91,0x86,0x88,0x08,0x3c	;  0.0083333315
	.byte	0xa6,0xaa,0xaa,0x2a,0xbe	; -0.1666666664
	.byte	0x00,0x00,0x00,0x80,0x3f	;  1.0000000000
	.end				

#endif /* !defined(__AVR_TINY__) */

That’s not all of it, you’d also have to look at fp_powsodd.S, addsf3x.S, fp_split3.S and a bunch of others.

If you want to see the whole code, the source is here:
http://download.savannah.gnu.org/releases/avr-libc/

It’s the avr-libc-2.0.0.tar.bz2 file and all the floating point code is in avr-libc-2.0.0/libm/fplib.
There’s 79 files in all, though admittedly you’d probably only need to read about 10-20.

It’s still possible to have a degree of separation without introducing a second variable in some cases.

Using functions is a good way to have that separation becuase then the interface is separate from the implementation.

Most short functions (especially ones that only set or access a variable) will be inlined,
so you don’t need to worry about the cost most of the time.

If you’re not all that interested then I’ll probably leave it for now since I’ve got other projects on the go.

I haven’t checked if the code still works after making my changes but if you’re interested in them I can make a PR.

(Bert Veer) #35

Hi Pharap, I’m certainly interested in your remarks and will study them carefully. You’ve given me valuable information so far.

But time is precious indeed and I have to move my focus elsewhere, unfortunately. I wouldn’t want to give you the impression I’m wasting your time. Still, Arduboy programming was really fun so I might start another project one day.

If you have any merge requests or tips I will gladly accept them.
Thanks for the elaborate suggestions.

(Pharap) #36

Glad to hear it.

If you have any questions about anything, feel free to ask.

My time is also precious, which is why I was double checking whether there’s interest or not.

I believe it certainly would be possible to reduce the memory usage further,
but if nobody is asking for it then there’s not much point.

I enjoy helping people, but I’d rather spend time helping where the help is wanted or needed than working on something there’s no interest in.

I’ll make a PR for the two changes I found since it would be a shame for them to go to waste:

I don’t think it will cause any bugs, but I haven’t tested because I haven’t had chance to load either version onto my Arduboy, so you might want to double check before accepting it.

(Kevin) #37

I just played this for like a minute, it’s really good!