Smart Response XE Re-purposed into Arduboy

(Larry Bank) #263

I think I’ve described it well enough. To see what’s going on yourself, start here:

It has a good blow-by-blow description of the data flying back and forth with avrdude. You should get a second XE unit to do some wireless experiments on your own. There’s nothing magical about the wireless system provided by the ATmega128RFA1. You still have packet loss due to collisions and interference. The Zigbee frequencies lie on top of Wi-Fi (and lots of other 2.4 GHz devices). Like I said, the wireless works well, but STK500 is a s**t protocol that isn’t designed to work over wireless. You can speculate all you want about how things should work, but it’s pretty meaningless until you actually try it for yourself.

(Josh Goebel) #264

Ugh. 2.4 GHz is NOT what it used to be. It’s like terrible here. That alone might be a huge part of the problem.


I couldn’t resist testing my interlacing idea over the weekend, so I took apart my donated Smart Response XE (thanks again @uXe) and soldered on a 6-pin ICSP connector so I can keep it hooked up to, and powered by, my ICSP programmer all the time.

One of the first things I noticed was that PaintScreen() didn’t handle the clear option correctly (you can see this in the logo animation). After looking at the disassembly I found that each SPI transfer (inner loop) took twice as long as needed (~36 cycles instead of 18), so it is actually more than 34 times slower (not counting the outer loops) than on the Arduboy. (I was curious why the animations were so slow.)
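
A rough back-of-the-envelope check of that slowdown, as a sketch only. It assumes the Arduboy sends its 1024-byte frame buffer at 18 cycles per byte, and that the XE needs one SPI byte per Arduboy pixel per display row (128 x 64 pixels, each two rows tall) plus 384 bytes of row-address commands. Those byte counts are assumptions pieced together from figures quoted in this thread, not measurements, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Cycles spent on SPI for one frame: bytes sent times cycles per byte.
constexpr uint32_t frameCycles(uint32_t bytes, uint32_t cyclesPerByte)
{
  return bytes * cyclesPerByte;
}

// Assumed byte counts per frame (not measured).
constexpr uint32_t kArduboyBytes = 128 * 64 / 8;  // 1024-byte frame buffer
constexpr uint32_t kXeDataBytes  = 128 * 64 * 2;  // one byte per pixel, two rows per pixel
constexpr uint32_t kXeCmdBytes   = 64 * 6;        // row-address commands
constexpr uint32_t kXeBytes      = kXeDataBytes + kXeCmdBytes;

// Slowdown of the XE at xeCyclesPerByte relative to an Arduboy at 18 cycles per byte.
inline double slowdown(uint32_t xeCyclesPerByte)
{
  assert(xeCyclesPerByte > 0);
  return static_cast<double>(frameCycles(kXeBytes, xeCyclesPerByte)) /
         frameCycles(kArduboyBytes, 18);
}
```

Under these assumptions slowdown(36) comes out at 32.75, in the same ballpark as the 34 times quoted above; the estimate ignores loop overhead.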

I did some minor optimisations that made it roughly twice as fast, and then added interlacing, making it another two times faster. It looks pretty good to me, and games become more playable with the extra speed gained :slight_smile: I’ve also set invert as the default (white on black, same as the Arduboy), as moving white pixels on a black background are easier to notice.

Here’s the modified arduboy2core.cpp test and a hex file of Virus LQP-79 as test game.

I will work on better (assembly) optimized versions for both 3x2 and 2x2 pixel sizes later, but I need to switch back to another project first.

Virus LQP-79 looks and plays pretty cool on the SR XE

(Simon) #266


(Josh Goebel) #267

Did you try the code I posted for rendering 3x2? Also did you make sure you’re using SPI CLK/2? That would explain the 36 vs 18…

(Larry Bank) #268

What do you guys think about my next modification idea for the XE? Adding a CP2102 USB to serial adapter. It can serve to power the unit since it has a built-in 3.3 V regulator and connect to Serial0 or Serial1. It will need to be epoxied to the case since a lot of force is used when inserting/removing the USB cable.

These cost 98 cents each from AliExpress. The LEDs will need to be removed since they interfere with the other operations of those GPIOs, but I use these to connect to all my Arduino projects.

(Larry Bank) #269

Video evidence shortly…

Video demo:

(Josh Goebel) #270

@bitbank Congrats on getting it working!

Here is what I’ve been working on:


I’m afraid this week is too busy for me to cook something up for the jam.

No I missed it. But I scrolled back and had a look.

SPDR = x;
while (!(SPSR & _BV(SPIF)));

Valuable cycles are lost just waiting. You could execute some code in the meantime. (If the code takes more than 16 cycles, you can skip the wait completely.)

FYI, if a NOP is added after setting SPDR, the code executes faster (one wait loop fewer), taking 24 cycles.

I like what you did with the vertical drawing. But is it faster?

  • The original code only sets the row address once per line. That’s 64 x 6 bytes of SPI data
  • Your code sets the column address for every column. That’s 128 x 6 bytes of SPI data

The savings must be the unrolled loops then.
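
The two address-setting costs can be put side by side with a little arithmetic. This is only a sketch: the 16,384 data bytes per frame (one byte per Arduboy pixel, two display rows per pixel) is an assumption, and `overheadPercent` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kDataBytes    = 128 * 64 * 2;  // assumed pixel data per frame
constexpr uint32_t kRowAddrBytes = 64 * 6;        // row address set once per line
constexpr uint32_t kColAddrBytes = 128 * 6;       // column address set once per column

// Share of a frame's SPI traffic spent on address commands, in percent.
inline double overheadPercent(uint32_t addressBytes, uint32_t dataBytes)
{
  assert(addressBytes + dataBytes > 0);
  return 100.0 * addressBytes / (addressBytes + dataBytes);
}
```

With these numbers the address traffic is roughly 2.3% of the frame in the first case and 4.5% in the second, so the extra 384 bytes are a small part of the total.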

I forgot to paste my interlaced version earlier so here it is:

void Arduboy2Core::paintScreen(uint8_t image[], bool clear)
{
  static uint8_t oddEvenRows;  // toggles on every call: selects the even or odd display rows
  uint8_t  y = 4;              // first display row of the even pass
  uint16_t i = 0;              // offset of the current 8-pixel-high page in image[]
  oddEvenRows = ~oddEvenRows;
  if (oddEvenRows) y = 5;      // the odd pass starts one row lower
  while (y < (HEIGHT*2))
  {
    for (uint8_t bit = 0x1; bit != 0; bit <<= 1)
    {
      sendLCDCommand(0x2B); // Set Row Address
      sendLCDData(0x00);    //
      sendLCDData(y);       // first row
      sendLCDData(0x00);    //
      sendLCDData(0x83);    // last row
      for (uint16_t x = 0; x < WIDTH; x++) //used uint16_t as uint8_t doesn't compile properly
      {
        uint8_t byte = 0;
        if (image[i+x] & bit) byte = 0xFF;
        SPDR = byte;
        if (clear && (bit == 0x80)) image[i+x] = 0;
        //while (!(SPSR & _BV(SPIF))); // no need to wait as code to loop takes more cycles than a SPI transfer
      }
      y += 2; // assumed: step to the next row of this pass; the increment is not in the listing as posted
    }
    i += WIDTH;
  }
  while (!(SPSR & _BV(SPIF))) { } // wait for the last byte to be sent
}

It could have been optimized further, but it was the interlacing I was focused on.
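
To see which display rows each pass touches, the row selection can be pulled out and run on a PC. This is a sketch, not the library code: `fieldRows` is a hypothetical helper, the step of two rows per image line is an assumption, and the start rows 4 and 5 and the last row 0x83 are taken from the listing above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint8_t kFirstRow = 4;     // first display row used (from the listing)
constexpr uint8_t kLastRow  = 0x83;  // last display row used (from the listing)
constexpr uint8_t kHeight   = 64;    // Arduboy frame buffer height in pixels

// Display rows written during one pass. Even passes start at row 4, odd
// passes at row 5; each image line then moves two rows down (assumed step),
// so the two passes together cover every row from 4 to 0x83.
inline std::vector<uint8_t> fieldRows(bool oddPass)
{
  std::vector<uint8_t> rows;
  uint8_t y = static_cast<uint8_t>(oddPass ? kFirstRow + 1 : kFirstRow);
  for (uint8_t line = 0; line < kHeight; ++line)
  {
    assert(y <= kLastRow);  // stay inside the window set by the row-address command
    rows.push_back(y);
    y = static_cast<uint8_t>(y + 2);
  }
  return rows;
}
```

Each pass sends 64 rows instead of 128, which is where the halved transfer time comes from.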

I was thinking of adding a Pro Micro as an ICSP and USB serial bridge.

(Josh Goebel) #272

I don’t think 384 is really noticeable out of a total of 17,000 bytes… but I was also assuming the prior code wasn’t saturating the SPI bus - it’s doing like the least performant thing possible (ie, TONS of extra work). It’s possible there is CPU burn happening when no SPI is being sent at all. One would actually have to benchmark both algorithms on the actual hardware to confirm for sure.

If you don’t have to add any wait states for SPI then it’s pretty likely you’re not fully saturating the SPI bus. Either that or your timing is exactly correct, which seems less likely.


Point taken

I don’t see how you can fully saturate the SPI bus by adding wait states as the wait state itself wastes cycles that could have been used for the next SPI transfer.

(Josh Goebel) #274

No. That’s not what I’m saying. I mean that if you don’t have to pad your code with NOPs, etc., you still aren’t likely pushing SPI as fast as technically possible. Your code is using too much CPU and therefore there are clock cycles where SPI is going completely unused.

Optimally you’d have a few NOPs… which means your CPU code is very fast and you’re waiting on the SPI and then pushing another byte immediately… if you have 0 NOPs then either your timing is perfect… or you’re burning CPU cycles while SPI is idle.
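
Put as arithmetic, and assuming the 18 cycles per SPI byte mentioned earlier in the thread, the point can be sketched like this. `nopsNeeded` and `busUtilisation` are hypothetical helpers for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kSpiCyclesPerByte = 18;  // figure quoted earlier in the thread

// NOPs needed so that each loop iteration takes exactly one SPI byte time.
// Returns 0 when the code is already as slow as, or slower than, the bus.
inline uint32_t nopsNeeded(uint32_t codeCycles)
{
  return codeCycles < kSpiCyclesPerByte ? kSpiCyclesPerByte - codeCycles : 0;
}

// Share of time the SPI bus is busy, in percent, once the loop is padded.
inline double busUtilisation(uint32_t codeCycles)
{
  assert(codeCycles > 0);
  const uint32_t period = codeCycles + nopsNeeded(codeCycles);
  return 100.0 * kSpiCyclesPerByte / period;
}
```

A 14-cycle loop needs 4 NOPs and keeps the bus fully busy; a 36-cycle loop needs none, yet the bus sits idle half the time.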


Ah, I see. What I meant by adding the NOP was to get the following code:

SPDR = x;
asm volatile("nop");
while (!(SPSR & _BV(SPIF)));

The NOP is there to ‘align’ the read of SPSR to a multiple of 2 cycles. Without it, the read would just miss the state change and cause an extra wait loop, wasting 4 extra cycles.
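
The alignment effect can be illustrated with a toy model. The numbers used with it are assumptions chosen for illustration (a 4-cycle polling loop and a flag that sets 18 cycles after the write), and `pollExitCycle` is a hypothetical helper; it does not try to reproduce the exact cycle counts quoted in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Cycle at which a polling loop first sees the flag. The loop samples the
// flag at firstSample, firstSample + loopPeriod, and so on, and leaves on
// the first sample taken at or after flagCycle.
inline uint32_t pollExitCycle(uint32_t firstSample, uint32_t loopPeriod, uint32_t flagCycle)
{
  assert(loopPeriod > 0);
  uint32_t sample = firstSample;
  while (sample < flagCycle)
  {
    sample += loopPeriod;
  }
  return sample;
}
```

With the first sample at cycle 1, pollExitCycle(1, 4, 18) sees the flag at cycle 21; delaying the first sample by one cycle, as a NOP would, gives pollExitCycle(2, 4, 18), which sees it at cycle 18.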

Sending continuous SPI data is possible at 18 cycles per SPI transfer (including the OUT SPDR instruction), so that’s 17 cycles to spend on useful code rather than NOPs. Getting this exact timing, though, means using inline assembly.

(Josh Goebel) #276

Sure. I was just pointing out if someone isn’t writing assembly and counting the cycles then they probably aren’t saturating the bus completely. Having to add NOPs is a guarantee you are.

And of course you’d never use that silly “polling” strategy. You’d simply pad properly to begin with and then there would be no need for polling.

(Larry Bank) #277

I had forgotten that the closed XE units are always “on”. I just added a function to my support library to power down the XE and wake it up when the power button is pressed (SRXESleep).


It would be interesting to see what screen it has and who manufactures it.
Install a screen like this on an Arduboy!

(Larry Bank) #279

This screen isn’t really suited for gaming. You can get a bunch from China of various sizes, but all of the monochrome LCDs I’ve tested have very slow refresh rates.


Do you know what LCD would be really cool for the Arduboy? The Sharp Memory LCDs. They come in sizes up to a few inches, have super contrast, a fast refresh rate, ultra-low power consumption, and even have an option where the unlit pixels are super reflective, like a mirror. They are most widely used in smart watches like the Pebble. The problem is I don’t know of a good supplier except the manufacturer, in bulk. In one-off quantities I’ve seen them floating around on eBay or Digi-Key.

(Josh Goebel) #281

Is anyone playing with this in the context of having a USB Zigbee on your computer that talks to the unit? I’d be much more interested in the wireless stack if I could have software on my computer talk directly to a unit without having to use another unit as a bridge.

I haven’t been paying that much attention to the “coordinator” unit for the XE series… it has USB serial, yes? I presume it doesn’t have a bootloader that lets you re-flash it though, right?

Edit: Or perhaps I’d have better luck with buying a Zigbee module for the Uno or something and going that route?

(Holmes) #282

Aren’t there generic USB Zigbee thingies? Flashing a device over Zigbee wireless would be awesome. Someone could write a script that grabs the .HEX and does just that. :grinning: