Large Number of Moving Objects

How many objects would be reasonable for arduboy to push at 30FPS? I have a game that has roughly 100 moving/colliding on screen objects, and with low hanging optimizations, I can get about 20-30 objects moving around colliding with each other with a decent frame rate. Beyond that I feel like I drop to < 15

Is 100 unrealistic? I can redesign the game to be lower number but larger objects, but this feels like a cop out.

Is there a good example out there of large numbers of objects moving/colliding at a good frame rate? Maybe I can look at the code and get some ideas.

To put it in perspective - I’m making an arduboy version of Robotron.
At level 1, you are looking at 20 enemies, 1 player, and 4 bullets.
The most populated level has 104 enemies, 1 player and 4+ bullets.

I have an alpha done, so now it is optimizing and adding enemy AIs in. I have a solution to the lack of a “twin stick”, but we’ll have to see if people like it.

I’m not quiet ready to share code, (it’s a bit messy currently), but just looking for what others have done or if this is an achievable goal, or I should just redesign for a larger sprite and smaller number of enemies. (currently sprites are around 4x4-5x5 in size, but I might just go 8x8 and reduce the number of enemies - honestly, this might be too small but I don’t have my arduboy in yet so I can’t see what it looks like on the device. :frowning: )

Some optimizations that are already in place:

  • only check collisions on N enemies per frame with the player, not all enemies (dependent on movement speed)
  • only check collision on bullets every M frames (dependent on movement speed)
  • no trig functions - limited cos/sin lookup table used only.
  • moving enemies towards the player just uses the ratio of the difference between player/enemy x_delta or y_delta multiplied by cos_table or sin_table. This is one of the few areas of floating point math (I’m trying to do it all in integer to keep it fast)

But honestly, it feels like just drawing 100 enemies is slow (drawPlusMask). I can turn off my collision code and it is still slow.

I get a HUGE FPS increase by not clearing the screen and doing some bitmap splatting (use drawErase to erase the current sprite then use drawPlusMask to draw it in the new location). This allows me to not have to redraw every enemy, just the ones that move. However, without access to a prior framebuffer, the drawErase function ends up blacking out pixels that should be considered “white” underneath (when there is overlap of enemies). Also the bullets become problematic since they are so fast and would potentially leave behind huge streaks. The bullets might just need more TLC - I think I’m trying to do too much without maintaining enough state (the bullets have a length, but it is determined by old pos -> new pos - I only track 1 set of x,y, so I’m doing math to try to figure out what the old pos was last frame (so 2 frames old)

When I see stuff like Catacombs running so fast, I gotta think, robotron should be possible…but I’ve also worked on this for about 8 hours…so I’m guessing there is a lot more I can do.

That will occupy roughly 1mm square of screen space.

Large enough to notice, but if your enemies start overlapping they’ll be hard to differentiate.


I can’t give you any performance guesses because it’s impossible to tell without actually running tests, but I can suggest some possible ways to make the calculations cheaper:

  • Use fixed points instead of floating points
    • There’s a library for that.
    • Note: may or may not be faster than floating points depending on the circumstances. If you can use a UQ8x8 or SQ7x8 you’ll probably get a decent speed boost.
  • Use some kind of spatial partitioning to cut down on the number of collision checks
    • E.g. BSP Tree, Quadtree
    • Be careful - avoid using new/malloc, dynamic memory allocation is just asking for runtime errors (i.e. crashes)

Yes, it probably would be.

Drawing is a very intensive operation because the screen uses an odd 1 bit per pixel format and the CPU has no barrel shifter so bit shifting is expensive.

It is a good technique, it may be worth the effort of trying to go down that route.

You may have to track the overlap.
The afforementioned spatial partitioning data structures may help with this.


I can think of one last alternative but it’s a little crazy.

You could limit yourself to a 64x32 frame buffer, which would allow you to double buffer and give you an extra 512 bytes to work with.

However you’d need a custom display function to present the frame buffer,
as well as custom drawing functions.
I happen to have the former lying around, but not the latter. The latter is the more complex to pull off.

Thanks for the fast reply Pharap.

I think doing a 64x32 framebuffer isn’t really viable. At that point, my tiny sprites get effectively either twice as big to remain comprehensible OR they get crushed down and lose resolution destroying comprehension.

So, my biggest problem for bitmap splatting are bullets. I’ll look at that deeper as it feels like it should be solvable, I just need an additional coord tracked. If that doesn’t work, I’ll go with larger sprites + smaller number of units. And it sounds like from a readability POV it might be necessary anyways. If I’m doubling the size of the sprites (4x4 -> 8x8) and halving the number of units, then I think I’m within striking distance of my perf target (30fps with ~50 units on screen tops, and I think I can hit that with some reasonable optimizations)

By pressing F3 in the emulator, you can change the virtual hardware device. With that, along with changing the size of your browser window, you should be able to get a screen size close to a real arduboy.

There’s 2.5K of RAM. A 128 x 64 screen buffer takes 1K, so there’s room for 2 full size buffers but the remaining 512 bytes probably isn’t enough for what you have in mind.

I use double buffering for my ArduboyLife sketch but don’t need much additional RAM.

Fill the cup controls a large amount of “particles” but each particle is rendered as a single pixel, so doesn’t require the overhead of a sprite drawing function.

A dedicated rendering function tailored specifically to the size and shape of your objects might help.

you can do an arduboy.display(CLEAR_BUFFER); to display the screen buffer and clear the buffer at the sametime at no extra cost.

hmm…not sure the arduboy.display(CLEAR_BUFFER) will help.

The boost in FPS comes from having 100 items on screen, then just taking, say 5 of them, and clearing just those locations on the screen, then updating their new position. Everything else stays as it was last frame, so I actually don’t want to clear the complete buffer at all if I’m using texture splatting. I need those other 95 images to stay right where they are to give the illusion they are still around. :slight_smile:

Hence why I suggested two 64x32 buffers.

64x32 is a quarter of the size of the usual 128x64 buffer,
so even with two 64x32 buffers the RAM usage is half of the usual 1KB,
giving an extra 512B on top of the usual 1.5KB,
bringining the amount of free RAM to 2KB.


Assuming that at the moment your approach is giving each enemy/bullet a ‘dirty’ flag to indicate which enemies need to be redrawn…
(Though if you are doing that, I’m not really sure it can be called ‘texture splatting’ because that’s to do with applying alphamaps.)

Have you instead considered using a list of ‘dirty rects’ and redrawing all enemies that are within the rectangles?
It means more collision tests, but collision checking should be significantly cheaper than drawing.

I wonder if you could change your design to have a bigger screen 256 x 128 or even bigger and it scrolls with the player movement. The nature of the game says that you will only need to render a portion of the enemies at any one time and by the time they crowd in on you it would be game over anyhow.

I cannot see how you would fit 104 enemies at 4x4 pixels onto the little Arduboy screen. Stretch the screen and it seems a bit more reasonable.

Collision detection becomes easier too - if the enemy is off screen then you can simply ignore it.

I am also interested in the trig functions - I am not sure why you would need them? I though you can only shoot in 8 directions?

Is there a good example out there of large numbers of objects moving/colliding at a good frame rate?

Look into sweep&prune, it’s a simple but effective way to handle many moving and (eventually) colliding objects. The simple idea is:

  1. for one axis of your game (e.g. X ) fill an array with the min and max of the object’s bounding volume/rect. e…g //that’s only an example, you can make it more efficient
    array[index++] = {pObject,pObject.min.x, true};
    array[index++] = {pObject,pObject.max.x, false};
  2. sort the array.
  3. now process the array, for every entry, if it’s the beginning, iterate until you hit the entry that is the end for the same object
    e.g.
for(int a=0;a<ElementCount;a++)
{
   if(array[a].begin==false)
     continue;
  for(int b=a+1;b<ElementCount&&array[a].owner!=array[b].owner;b++)
   if(Collision(array[a].owner,array[b].owner)
   {
     ...do your collision handling;
   }
}

(I hope my pseudo code is understandable, again, it does not suppose to be perfect, just an example :slight_smile: )

now, if your objects just move slightly or most don’t even move, you don’t need to re-do the previous points, you can just update the min/max entries.
e.g.

for(int a=0;a<EntryCount;a++)
   array[a].sortValue=array[a].begin?array[a].owner->min.x:array[a].owner->max.x;

some nice features of this:

  1. assuming the array is mostly sorted, you can run bubble-sort (or insert sort) on it, which usually is slow, but on sorted arrays it’s very efficient.
  2. compared to hierarchical approaches, it’s very simple to implement
  3. the memory usage is not related to the field size, you don’t pay xy or some nlog(n) cost. it only depends on the unit count.

some downside is that IF there are many units overlapping on the X axis, you’d still do a lot of checks. (There are more crazy sweep&prune implementation to work around this). But I think it should be fine in your game.

However, without access to a prior framebuffer, the drawErase function ends up blacking out pixels that should be considered “white” underneath (when there is overlap of enemies). Also the bullets become problematic since they are so fast and would potentially leave behind huge streaks.

An old trick is to xor the pixel, that way

  1. you can perfectly erase by xoring again.
  2. on overlapping elements (e.g. your unit walks on a white field), you’d still recognize the sprite, instead of everything being just white).

I hope you’ll share some WIP screenshots of your game :slight_smile:

@rapso,

When you post code in this forum, please enclose it in markdown code tags:

Start with a line containing three backticks followed by cpp
Insert your code starting on the following line.
On the line following your code, put another three backticks.
On a U.S. PC keyboard the backtick character is commonly on the key below the Esc key and to the left of the numeral 1 key, at the top left. If you can’t find it on your keyboard, you can copy and paste from the tags here:

```cpp
The first line of your code
More code
The last line of your code
```

I’ve added the tags to your post above but please do it yourself in the future.

thanks for the fix! next time I’ll use code tags! :slight_smile:

1 Like