Arduboy clone on RISC-V/FPGA

Yet another clone (sort of)!

I designed a performant and tiny RISC-V core called Kronos, and had it run on the iCEBreaker FPGA board (kinda popular in the open-source hardware community). I decided that the best way to show off the core was to play games on it. Hence, I chose the minimalist Arduboy to port over!

Learning a lot from this forum and looking at some of the ports out there (espboy, esp8266 port, gamebuino, pokitto, adafruit), it took no longer than a Sunday to have it compile for risc-v and run the examples in the Arduboy2 repo. The tricky parts were dealing with the AVR asm code and Arduino-specific calls. I kinda wish the api was decoupled from the implementation. It’s still a work in progress. Lots to clean up and rework. I just rushed to play ArduBreakout…

I sent this simple prototype board out to fab (3.8x2"). It should slide straight into the icebreaker’s dual side ports. I plan on using the onboard 16MB flash to host a library of games and a loader. I am calling this the krzboy for now.

On a side note, the fpga dev board is pricey (compared to uC boards) – but the fpga itself (lattice ice40up5k) is one of the cheapest and smallest: the WLCSP package is 2x2mm, though the 7x7mm QFN48 is easier to deal with (solder). It also has 128KB of integrated regular sram (aside from the block ram that you traditionally find in an fpga). The krzboy runs entirely off the 128KB internal sram. I just copy the application from the flash over to the sram on boot.

I have read the “Development Ends” thread :frowning:. Aside from the hardware, the Arduboy2 library is a serious gem and should live on forever. The sheer simplicity of the api makes it easy to design games with it. My dream for the library’s evolution was a pure game-engine api that looks like the PICO-8 or TIC-80, completely decoupled from hardware. Hardware could be anything (like adafruit feather, esp32, etc) - just needing an adaptor class for hardware calls.

In fairness, the API wasn’t designed for porting to other systems,
it was designed to target the Arduboy (and nothing else).
That considered, it’s actually quite easy to port most of it.

There’s only about 10-20 functions that have to be rewritten to get the majority of games working.
(Those that don’t rely on hardware details at least.)
A fair chunk of games will probably start working once the lower end of that estimate has been rewritten.

I never completely finished it, but this has C++ implementations of all the Arduboy2 assembly functions:

(Some of which ended up in Adafruit’s Arduboy2 port.)

I did look at your port (among others)! The porting process wouldn’t have been as easy without it.

In fairness, the API wasn’t designed for porting to other systems

True true. It’s just wishful thinking :sweat_smile:.

The Arduboy and the ecosystem around it is seriously unique. A marvel of community effort. Even if better hardware exists (adafruit feathers, esp32, etc) – without an easy-to-use game-engine and people’s willingness to build games on it, all those cpu MHz and KB of ram are just processed sand.

Currently, assembly code is only used in 5 places in the Arduboy2 library:

  • One is just a no-op to provide a delay, so it’s easy to replace or discard as necessary.
  • Three include the equivalent C++ code in a comment block accompanying them.
  • The Sprites class has one large block that is fairly well commented but doesn’t include equivalent C++ code. However, the SpritesB class is equivalent to Sprites, so can be used for both, and contains no assembly.

Therefore, the fact that Arduboy2 contains assembly code shouldn’t be a porting issue. The main difficulties would be its direct interfacing to specific hardware, and (as noted) dealing with the underlying Arduino environment if porting or replacing it is necessary.

Of course, the reason for using assembly code is to gain a speed or size advantage over what the compiler would produce with equivalent C++ code. This fact may have to be considered when porting, depending on the speed and storage resources in the new environment.

Another issue could be code that (perhaps unintentionally) has a reliance on being in an 8/16 bit environment, when porting to a 32 or wider bit environment.

However, many Arduboy sketches use other libraries in addition to Arduboy2. Some of these libraries could present porting difficulties.

Three include the equivalent C++ code in a comment block accompanying them.

I am extremely thankful for this part. The C++ code equivalent was a drop-in replacement. As for the SPRITES_PLUS_MASK routine in Sprites::drawBitmap - I looked at the esp8266_arduboy2 port. However, now I am thinking I should just clean it up to use the SpritesB call – since it’s official.

I hope I didn’t give the impression that it was hard to port the Arduboy2 library :sweat_smile:. It was actually a breeze because of the exact reasons you listed. The code was a pleasure to read. Seriously readable. I simply removed all ASM references for C++. Took half a Sunday, no kidding.

In truth, I designed the SoC with awareness of how the Arduboy2 library was interfacing with the underlying hardware. For example, my SPI call is a trivial memory-mapped write, just like in the original, but with a small hardware twist: to speed things up, I have a 128B TX buffer.

// Write to the SPI bus (MOSI pin)
void Arduboy2Core::SPItransfer(uint8_t data)
{
   // Busy-wait while the TX queue is full; once there is space, the write returns immediately.
   while ((KRZ_SPIM_STATUS & 0x00ff) >= SPIM_TXQ_SIZE);
   MMPTR8(KRZ_SPIM) = data;
}

Another issue could be code that (perhaps unintentionally) has a reliance on being in an 8/16 bit environment, when porting to a 32 or wider bit environment.

This would only be worrisome if there was serious misaligned memory access. But byte access (the most common memory iterator in the lib) can never be misaligned. So, things worked out!

However, many Arduboy sketches use other libraries in addition to Arduboy2. Some of these libraries could present porting difficulties.

This is going to be touch and go, I am afraid. Like Print and WMath for starters. Gonna port relevant parts of the ArduinoCore as I come across them.

The hardest part was the C++ pure virtual functions, which needed retargeting of the system calls (sbrk). I also need to look at calls like srandom() from avr-libc. Right now, I have put workarounds in for all of them in a rush to demo ArduBreakout. I need to clean this part up for reals.

@MLXXXp - thanks for the Arduboy2 library. If you ever build another game engine, I’d like to help in any way I can.

No, it’s more a case of overflow and wrap around. For example, with AVR using 16 bits for an int and another architecture using 32 bits for int. I think @Pharap has encountered this problem, but I haven’t looked into it.

jawdrop hadn’t even thought of that!!! Forgot that int isn’t 32b in avr-libc. Gotta look into this deeply. And map basic types from avr-libc. I just saw that typedef signed int int16_t in https://www.nongnu.org/avr-libc/user-manual/group__avr__stdint.html . Fuark…

Also, if someone has allocated storage with a specific layout on the assumption that an int will occupy 16 bits.

All of these types of issues could occur in sketches as well as libraries.

Might be worth combining your efforts with the maker of this board (also based on the iCE40UP5K):

I painstakingly created the equivalent C++ once (as literally as possible).
It’s here if anyone wants to see it, but I don’t think I’d be prepared to guarantee that it’s bug free.

(On second thought, it was actually this that Adafruit used,
and I never got round to putting this into my other port.)

Indeed. There’s a lot more functions that use hardware-specific code.
Of those, things like generateRandomSeed() are likely to be overlooked.

Interesting… It looks like they’ve used the SpritesB code as a base and modified it.

Those should more or less work out-of-the-box unless long is a different size on your system.

I have ported as much of avr-libc as Arduboy uses,
but there’s probably the odd game that uses something that isn’t covered.

For srandom, if you’re on a 32-bit system (with access to the C or C++ standard libraries) you can cheat by using srand/std::srand.
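A minimal sketch of that shim, assuming the C++ standard library is available on the target. The `compat` namespace is made up here so the names don't clash with a host libc that already declares `srandom`/`random`; a real port would likely drop it. The output range is whatever `std::rand` provides (0 to `RAND_MAX`), which may be narrower than avr-libc's.

```cpp
#include <cassert>
#include <cstdlib>

namespace compat  // hypothetical namespace, only here to avoid name clashes
{
    // Stand-in for avr-libc's srandom(): forwards the seed to std::srand.
    inline void srandom(unsigned long seed)
    {
        std::srand(static_cast<unsigned int>(seed));
    }

    // Stand-in for avr-libc's random(): forwards to std::rand.
    // The result lies in 0..RAND_MAX, which may be narrower than on AVR.
    inline long random()
    {
        return static_cast<long>(std::rand());
    }
}
```

Seeding twice with the same value should reproduce the same sequence, which is all most sketches rely on.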

Indeed I have.

There was (and still is) code in Sprites.cpp that depends on 16-bit integer overflow and it breaks when int is suddenly 32-bit.
(Specifically, ofs + WIDTH causes overflow, and the code relies on that.)
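To make the failure mode concrete, here is a minimal sketch of the same arithmetic at the two widths. It is not the actual Sprites.cpp code: `WIDTH` matches the Arduboy's 128-pixel screen width, but the function names and the sample offset are made up for illustration, and the 16-bit wrap is emulated with an explicit cast so it behaves the same on any host.

```cpp
#include <cstdint>

constexpr int WIDTH = 128;  // Arduboy screen width in pixels

// Hypothetical helper: what `ofs + WIDTH` evaluates to when the arithmetic
// is 16 bits wide (as on AVR). The cast emulates the wrap modulo 65536.
constexpr std::uint16_t advanceRow16(std::uint16_t ofs)
{
    return static_cast<std::uint16_t>(ofs + WIDTH);
}

// Hypothetical helper: the same expression with 32-bit arithmetic.
// Nothing wraps, so code that relied on the wrap sees a different value.
constexpr std::uint32_t advanceRow32(std::uint32_t ofs)
{
    return ofs + WIDTH;
}

// 65500 + 128 = 65628, which wraps to 92 in 16 bits but not in 32 bits.
static_assert(advanceRow16(65500) == 92, "16-bit arithmetic wraps");
static_assert(advanceRow32(65500) == 65628, "32-bit arithmetic does not wrap");
```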

When I finally figured out the problem I wrote a longwinded tirade about solving it because it took me at least a few hours to get to the bottom of the issue.

For future ports I recommend just using the contents of SpritesB instead because it’s less hassle.
(If the CPU is powerful enough that is.)

I’m fairly certain that’s the only case though.
I can’t speak for other libraries of course, but for Arduboy2 that’s all.

I’ve never encountered a case like this, but I wouldn’t be surprised if one exists.

This is one of the reasons I try to encourage people to use the fixed width types instead.

Would you open an issue for this on Github?

running a virtual ATmega32U4 on an FPGA

I am trying to run this natively on risc-v rather than a soft avr core. However, I do see that @lulian has some plans for risc-v on his arduFPGA board too. Couldn’t find a port on the git though. So, perhaps this has not happened, yet?

True, except I was not using avr-libc – and didn’t realize what that meant until @MLXXXp pointed it out. Total noob mistake. Now all those long in the ArduinoCore make sense. It’s 32b. Need to do some serious cleanup.

I finally realized what my problems were: newlib and its crt0. Switching over to picolibc solved my vtable (Print) and C++ constructor (Arduboy) issues, as well as the bloated rand (WMath) problem. Compare newlib’s rand with picolibc’s: picolibc has no default dependency on reent, whereas newlib malloc’d the variable holding the rand state for thread safety. Straight from the author:

PicoLibc is library offering standard C library APIs that targets small embedded systems with limited RAM. PicoLibc was formed by blending code from Newlib and AVR Libc.

Great. Apparently there’s talk of making it the default libc for machine-mode/semi-hosted risc-v systems.

However, I should have looked at this first - https://github.com/Pharap/PokittoArduboy2Prototype - @Pharap, you’ve already done all the work!

generateRandomSeed()

Without an ADC, I just gave it the good old cycles since boot (for now), which is a recipe for RNG hacking (anyone remember Golden Sun?). Alternatively, I could put a simple PRNG on the fpga.

For future ports I recommend just using the contents of SpritesB instead because it’s less hassle.
(If the CPU is powerful enough that is.)

Aye. I’ll just use SpritesB and alias Sprites to it (using Sprites = SpritesB – as you mentioned here).
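For reference, the alias is a one-liner. The sketch below uses a stub `SpritesB` with a made-up `name()` method purely so it compiles on its own; in the real library both classes exist, so the original `Sprites` class would have to be removed or renamed before the alias could take its place.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Stub standing in for the library's SpritesB class.
// name() is hypothetical and only exists to give the stub something to call.
class SpritesB
{
public:
    static const char* name() { return "SpritesB"; }
};

// Every use of Sprites in a sketch now resolves to SpritesB.
using Sprites = SpritesB;

static_assert(std::is_same<Sprites, SpritesB>::value,
              "Sprites should be an alias of SpritesB");
```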

I am a bit concerned about the type width mismatch in games. I’ll have to typedef it or swap out all ambiguous types for fixed types with a script.

Done:

Yeah, the only reason random exists in the first place is because rand is defined by the C standard to return an int, and the authors evidently wanted a 32-bit PRNG.

avr-libc is actually open source, so if you really wanted you could use the actual random implementation, but I doubt anyone’s depending on the implementation details in any meaningful way, so any old PRNG should be a suitable replacement.
Even something as crap as a linear congruential generator would probably be fine.

Yikes. std::rand() isn’t supposed to be thread safe,
anyone expecting it to be is being unreasonable as far as I’m concerned.

They could probably find a decent off-the-peg PRNG without too much looking.
Xorshift is particularly good for something small and cheap.
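As a rough illustration of how small that can be, here is a sketch of the 32-bit xorshift variant using Marsaglia's 13/17/5 shift triple. The struct name and the zero-seed fallback are choices made for this example, not something taken from any of the libraries discussed.

```cpp
#include <cassert>
#include <cstdint>

// Minimal xorshift32 generator (shift triple 13, 17, 5).
struct Xorshift32
{
    std::uint32_t state;

    // The state must never be zero, because zero maps to zero forever.
    // Falling back to 1 for a zero seed is an arbitrary choice for this sketch.
    explicit Xorshift32(std::uint32_t seed) : state(seed != 0 ? seed : 1u) {}

    // Advance the state and return the new value.
    std::uint32_t next()
    {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
};
```

The whole generator is three shifts and three XORs on a single 32-bit word, so it costs next to nothing in code size or RAM.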

More or less.
The EEPROM code would probably have to be changed.

You don’t necessarily need an ADC,
but you do ideally need some source of nondeterminism.
An unconnected pin would probably be suitable if it measurably has a reasonable degree of noise.

It’s a seed generator, not a full RNG so it doesn’t have to be fast or extremely random, just enough to provide some variance between start ups.

I doubt it will be an issue for most games.
If it is, you can probably fix it with a simple text replacement.
unsigned int -> uint16_t, int -> int16_t et cetera.
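A rough sketch of that replacement using `std::regex`; the same two substitutions could be done with any scripting tool. `fixIntWidths` is a made-up name, and the approach is deliberately naive: it will also rewrite `int` inside comments, strings, `long int`, and `int main()`, so the result needs a manual pass.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: rewrite plain `int` / `unsigned int` to 16-bit fixed-width types.
inline std::string fixIntWidths(const std::string& source)
{
    // Handle the longer spelling first so its `int` is not rewritten on its own.
    static const std::regex unsignedInt(R"(\bunsigned\s+int\b)");
    static const std::regex plainInt(R"(\bint\b)");

    const std::string step1 = std::regex_replace(source, unsignedInt, "uint16_t");
    return std::regex_replace(step1, plainInt, "int16_t");
}
```

The word boundaries keep existing fixed-width names such as `uint8_t` or `int16_t` untouched.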

For most cases, the sudden size increase shouldn’t be an issue,
it’s only likely to be a problem if someone’s depending on integer overflow or if they’re doing something daft like using a hardcoded value when they should be using sizeof(Type).
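A small sketch of the `sizeof` point, using a hypothetical save record. Because the fields are fixed-width, the layout can be checked at compile time, and buffer sizes are derived from the type instead of being typed in by hand.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical save record; the field names are made up for this example.
struct SaveData
{
    std::uint16_t highScore;
    std::uint8_t level;
    std::uint8_t lives;
};

// Fails to compile if a port ever changes the layout the sketch assumes.
static_assert(sizeof(SaveData) == 4, "SaveData layout assumption broken");

// Derive the byte count from the type rather than hardcoding `count * 4`.
constexpr std::size_t saveBytes(std::size_t count)
{
    return count * sizeof(SaveData);
}
```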

ikr, when I saw the objdump for the rand with newlibc, I was like O_o?. Rand is not supposed to be thread-safe.

Can’t float pins with this fpga; the pad drivers have a weak pull-up. But I am happy with picolibc’s rand implementation. For the seed I’ll xor some of the internal hardware performance counters (cycles ^ instr retired – using some factor of time as a seed).

I am going to restart the port from scratch, using your https://github.com/Pharap/PokittoArduboy2Prototype as a base. It’s for the best. I should have started with this in the first place. And now that I know things will work out (as evidenced by my current rushed, dirty port on a breadboard), I’ll tackle this with patience. Can’t wait for the gamepad pcb to come in! It’s gonna look good.

If it is, you can probably fix it with a simple text replacement.
unsigned int -> uint16_t , int -> int16_t et cetera.

That’s the plan! Some python script.

I also need to think how to use the flash for storing state (“eeprom”). Some small and simple flat filesystem, because it will be per game. And store all games! Any thoughts on that? You have on-chip eeprom and separate sdcard on the pokitto.

As long as they’re unlikely to be the same value it should be fine.
x ^ x is always 0 and seeding a PRNG with 0 usually breaks things.

Otherwise have a look around for some hash combining algorithms.
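One possible shape for this, sketched with the widely used Boost-style `hash_combine` step. `makeSeed` is a made-up name and the two inputs are plain parameters here; on the real SoC they would presumably be read from the cycle and instructions-retired counters.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mix two counter readings into a non-zero 32-bit seed.
inline std::uint32_t makeSeed(std::uint32_t cycles, std::uint32_t instret)
{
    std::uint32_t seed = cycles;
    // Boost-style hash_combine step. Unlike a bare XOR, equal inputs
    // do not systematically cancel to zero.
    seed ^= instret + 0x9e3779b9u + (seed << 6) + (seed >> 2);
    // Guard against the rare case where the mix still lands on zero.
    return seed != 0 ? seed : 1u;
}
```

With a bare XOR, two equal readings give 0; with the mixing step they give an ordinary-looking non-zero value.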

It’s been quite some time since I was last working on it,
but let me know if you have any questions or issues.

If in doubt, an identifier, offset, size allocation table in a fixed location often works alright.

(Though I suppose size would always be 1024?)
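A sketch of what such a table could look like, with every name invented for the example. It only covers the lookup; how the table and the save regions get written to flash is left open.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table entry: which game owns which region of the save area.
struct SaveSlot
{
    std::uint32_t id;      // game identifier (0 marks an unused entry)
    std::uint32_t offset;  // byte offset of the region within the save area
    std::uint32_t size;    // region size in bytes
};

// The Arduboy's ATmega32U4 has 1024 bytes of EEPROM, so each game gets that much.
constexpr std::uint32_t kEepromSize = 1024;

// Returns the entry for `id`, or nullptr if that game has no save region yet.
template <std::size_t N>
const SaveSlot* findSlot(const std::array<SaveSlot, N>& table, std::uint32_t id)
{
    for (const SaveSlot& slot : table)
    {
        if (slot.id != 0 && slot.id == id)
            return &slot;
    }
    return nullptr;
}
```

If every region really is 1024 bytes, the `size` field is redundant and the offset could be computed from the entry's index instead.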

I just mapped eeprom to eeprom for the sake of simplicity.

There is a so called ‘cookie’ system for Pokitto’s eeprom but there’s no decent documentation for how it works under the hood so I never use it.

As far as I know, the plan is to have the board primarily run on RISC-V, but with the secondary ATmega core for Arduboy compatibility. Running Arduboy games on a RISC-V CPU is fun, but in the end just creates a lot of friction for users having to recompile every game they want to play - that is assuming they can even find the source to recompile from! Having a core with the potential for full compatibility that can just run precompiled Arduboy hex files is ideal…

If you go backwards in the commits to my Arduboy_MiSTer project, at one point before switching to the ATmega core I had also customised the Arduboy2 library to run on RISC-V (FPGArduino’s version), here it is - might not be a lot of use to you though? :sweat_smile:

Everything you said is true. But I am not trying to make a generic FPGA arduboy solution at all. This endeavour is to show off the Kronos RISC-V core. Hence, I am porting the arduboy2 library and compiling arduboy games for the platform (Kronos powered SoC) with the native risc-v toolchain (+picolibc). This port isn’t really intended for folks who simply want Arduboy on an FPGA (the MiSTer project is the obvious choice for that). This is more for risc-v soc builders (custom socs or litex builds) and people who want to mess around with risc-v.

Yesss! Thanks for the porting reference. I see you got rid of EEPROM and sound entirely?

Not quite - I did some hacky business by re-writing the sound functions to just continually pass across a value for the desired pitch (or zero for off) to a verilog square wave module… and for the EEPROM I just left a 512 byte block at the front of the actual compiled hex file and took advantage of MiSTer’s interface for writing files back to the SD card in 512 byte blocks!