I’ve always wanted to make a raycaster; I think they’re super cool. I used the tutorial from ‘Lodev - Raycasting’, but it had a lot of problems (and some bugs!). I got the performance pretty reasonable using @Pharap’s Fixed point library, but it still wasn’t fast enough for me, so I ended up spending days tweaking it (btw @brow1067, Ardens is a masterpiece). I might make a separate blog post talking about all the optimizations; I eventually got it fast enough to add fake shading!
I’ll work on getting an arduboy package + title image soon-ish.
Of course! For one of the maze types, I looked up a constant memory algorithm called Eller’s algorithm, and the other one I just wrote custom. I needed an algorithm which didn’t use a stack, and nearly all of them do (or some other large dataset close in size to the maze itself). Eller’s algorithm only needs enough data for one extra row, which I can definitely do!
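For anyone curious what "only one extra row" looks like in practice, here is a minimal sketch of Eller's algorithm. It is not the code from the game: the names (`Maze`, `Xorshift16`, `generateEller`), the 8x8 size, and the edge-based output (openings to the east and south rather than wall tiles) are all assumptions made for illustration. The only working state is one row of set ids plus one row of flags.

```cpp
#include <cstdint>

constexpr uint8_t kWidth = 8;   // hypothetical maze size, in cells
constexpr uint8_t kHeight = 8;

// Output only: which cells have an opening to the east / south.
struct Maze {
    bool openEast[kHeight][kWidth] = {};
    bool openSouth[kHeight][kWidth] = {};
};

// Tiny 16-bit xorshift generator so the sketch is self-contained.
struct Xorshift16 {
    uint16_t state; // should be non-zero
    bool coin() {
        state ^= static_cast<uint16_t>(state << 7);
        state ^= static_cast<uint16_t>(state >> 9);
        state ^= static_cast<uint16_t>(state << 8);
        return (state & 0x0100) != 0;
    }
};

void generateEller(Maze &maze, Xorshift16 &rng)
{
    // Working state: the set id of every cell in the current row.
    uint8_t setOf[kWidth];
    for (uint8_t x = 0; x < kWidth; ++x) setOf[x] = x;

    for (uint8_t y = 0; y < kHeight; ++y) {
        const bool lastRow = (y + 1 == kHeight);

        // Join neighbours that are in different sets: randomly on normal
        // rows, always on the last row so everything ends up connected.
        for (uint8_t x = 0; x + 1 < kWidth; ++x) {
            if (setOf[x] == setOf[x + 1]) continue;
            if (!lastRow && !rng.coin()) continue;
            maze.openEast[y][x] = true;
            const uint8_t from = setOf[x + 1];
            const uint8_t to = setOf[x];
            for (uint8_t i = 0; i < kWidth; ++i)
                if (setOf[i] == from) setOf[i] = to;
        }
        if (lastRow) break;

        // Random openings downwards, then make sure every set has one.
        bool hasDown[kWidth] = {}; // indexed by set id
        for (uint8_t x = 0; x < kWidth; ++x) {
            if (rng.coin()) {
                maze.openSouth[y][x] = true;
                hasDown[setOf[x]] = true;
            }
        }
        for (uint8_t x = 0; x < kWidth; ++x) {
            if (!hasDown[setOf[x]]) {
                maze.openSouth[y][x] = true;
                hasDown[setOf[x]] = true;
            }
        }

        // Next row: cells reached from above keep their set id, the rest
        // get an id that is not in use. hasDown now doubles as "id in use".
        uint8_t nextFree = 0;
        for (uint8_t x = 0; x < kWidth; ++x) {
            if (maze.openSouth[y][x]) continue;
            while (hasDown[nextFree]) ++nextFree;
            setOf[x] = nextFree;
            hasDown[nextFree] = true;
        }
    }
}
```

Because two cells are only joined when they are in different sets, the result has no loops, and because every set is carried down to the last row, every cell is reachable.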
Nice! The algorithm I made was stack-free and only required 2 bits per maze space (wall, floor, or unvisited). It would then solve the maze, also stack-free, reusing the same 2 bits (wall, floor, solution, unvisited).
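As a rough illustration of the "2 bits per maze space" storage (not the actual code; `Cell`, `PackedMaze`, and the choice of which value means what are assumptions), four cells can share each byte like this:

```cpp
#include <cstddef>
#include <cstdint>

// Four states fit exactly in 2 bits. The numeric values are arbitrary.
enum class Cell : uint8_t { Unvisited = 0, Wall = 1, Floor = 2, Solution = 3 };

template <size_t Width, size_t Height>
class PackedMaze {
public:
    // Four cells per byte, rounded up.
    static constexpr size_t byteCount = (Width * Height + 3) / 4;

    Cell get(size_t x, size_t y) const
    {
        const size_t index = y * Width + x;
        const uint8_t shift = static_cast<uint8_t>((index & 3) * 2);
        return static_cast<Cell>((data_[index >> 2] >> shift) & 0x3);
    }

    void set(size_t x, size_t y, Cell value)
    {
        const size_t index = y * Width + x;
        const uint8_t shift = static_cast<uint8_t>((index & 3) * 2);
        uint8_t &byte = data_[index >> 2];
        // Clear the two bits for this cell, then write the new state.
        byte = static_cast<uint8_t>((byte & ~(0x3 << shift)) |
                                    (static_cast<uint8_t>(value) << shift));
    }

private:
    uint8_t data_[byteCount] = {};
};
```

With this layout a 16x16 maze takes 64 bytes, and the solver can mark `Solution` cells in place without any extra buffer.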
Yeah, I don’t think 3D is doable for larger games, but if you’re willing to halve the framerate, I feel like you can still do a LOT. You can also simply halve the resolution, which along with the halved framerate will give you an absolute deluge of performance. Plus, a lot of processing is tied up with the shading; you may be able to simply trade shading for textures. It’s something I want to try, I just wanted to make a game in the “smooth” version first. I originally had it close to 60fps before I added the floor + better shadows.
Edit: I was mistaken. I made the shading conditionally compiled and it only saved like 3% of the total CPU according to Ardens. You get a bigger saving dropping the frame rate from 45 to 40 lol
Thank you! I read the blog post by the person who got the teapot working on Arduboy, and that one is WAY more impressive: “3D Rendering on an Arduboy” (a1k0n.net). This is just basic raycasting like others have done. I just made some shortcuts, like:
Removing 20% of the screen for the menu (but it was actually to increase fps by 20% lol)
Precalculating as much as possible before casting rays (especially division)
Some funny reductions in precision (and accompanying hacks) to get everything down to 2 byte integers
Fake floor
Very low draw distance masked by a cheap shadow effect
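To give a flavour of the "precalculate before casting" shortcut, here is one possible sketch, not taken from the game. In Lodev's tutorial each column computes `cameraX = 2 * x / w - 1` and then `rayDir = dir + plane * cameraX`. Since that is linear in `x`, the division can be done once per frame and the per-column work becomes an addition. The 8.8 fixed-point format, the 100-column view width, and the names `RayDir` and `buildRayDirections` are assumptions.

```cpp
#include <cstdint>

constexpr int kColumns = 100;   // hypothetical view width in pixels
constexpr int32_t kOne = 256;   // 8.8 fixed point: 1.0 == 256

struct RayDir {
    int16_t x;
    int16_t y;
};

// dir and plane are the camera vectors from Lodev's tutorial, in 8.8.
void buildRayDirections(int16_t dirX, int16_t dirY,
                        int16_t planeX, int16_t planeY,
                        RayDir (&rays)[kColumns])
{
    // The only real divisions in the frame. The step is kept with 16
    // fractional bits so rounding error stays small across the screen.
    const int32_t stepX = (kOne * 2 * planeX) / kColumns;
    const int32_t stepY = (kOne * 2 * planeY) / kColumns;

    // Column 0 corresponds to cameraX == -1, i.e. dir - plane.
    int32_t accX = kOne * (static_cast<int32_t>(dirX) - planeX);
    int32_t accY = kOne * (static_cast<int32_t>(dirY) - planeY);

    for (int column = 0; column < kColumns; ++column) {
        // Division by a power-of-two constant, so no runtime divide.
        rays[column].x = static_cast<int16_t>(accX / kOne);
        rays[column].y = static_cast<int16_t>(accY / kOne);
        accX += stepX;
        accY += stepY;
    }
}
```

The same idea applies to anything that only depends on the column index or the camera: work it out once, then reuse it inside the ray loop.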
I’m working on it! It did indeed require half the framerate. I’m hoping to find ways to improve that a bit (it looks a bit better on device, I think). Textures are 16x16; I don’t think I can go higher than that (it would require 32-bit math, I think?).
It’s kinda hard to tell in gif form, but I was able to optimize it from 20fps in the first one up to about 28fps. There’s absolutely no headroom for any extra logic at that framerate (but still, if you don’t need anything other than the map and checks on button presses… very usable). Maybe you can tell in the video, though: part of it required that I greatly reduce the texture rendering precision. It’s fine far away, but when I get even a little close to the wall the texture gets kinda bad, and it gets worse the closer you get. Maybe you could say it has “PS1 vibes” lol. Anyway, with the more precise textures and the framerate brought down to 20-22, you have quite a bit of headroom, about 14% CPU on average. IDK how much you can fill with that, but I intend to try some things.
Edit: I need to just draw some more textures so it looks cool. Single texture is fine and all, but it does support up to 255 different wall tiles, so…
I must admit I always wanted to try remaking Eternal Labyrinth on here, but you’ve done a really nice job of making the same basic concept. Looking forward to adding this to my Arduboy as I imagine it’ll get a lot of play time.
Speed achieved. 35 fps, can lower to 30 to have some headroom for logic. I think if I make a game it’ll be separate from the original posted, as a lot of compromises had to be made.
Thank you @brow1067 for the division table, I honestly just didn’t want to generate it myself so I copied yours lol. I was able to use a more optimal lookup because the main divisions are x < 1, which is just a direct lookup in the table. I might simply copy your division function and add other divisions, but those left and right shifts by 8 and 16 in your code are scary…
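For readers wondering why dividing by something below 1 can be "just a direct lookup", here is a hedged sketch. It is not the table from ArduGolf and not the game's code: the 8.8 format, the saturation of the extreme entries, and the names `ReciprocalTable` and `reciprocalOfFraction` are assumptions. If the divisor is a pure fraction, its raw value is a single byte, so that byte can index a 256-entry table of reciprocals.

```cpp
#include <cstdint>

// Hypothetical unsigned 8.8 fixed point: real value = raw / 256.
using Fixed8_8 = uint16_t;

struct ReciprocalTable {
    Fixed8_8 entries[256];

    ReciprocalTable()
    {
        // 1 / 0 is undefined, so saturate instead.
        entries[0] = 0xFFFF;
        for (uint16_t i = 1; i < 256; ++i) {
            // 1 / (i / 256) == 256 / i, which is 65536 / i in 8.8.
            const uint32_t r = 65536UL / i;
            entries[i] = (r > 0xFFFF) ? 0xFFFF : static_cast<Fixed8_8>(r);
        }
    }
};

// fraction is the low byte of an 8.8 value that is known to be below 1.0.
inline Fixed8_8 reciprocalOfFraction(uint8_t fraction)
{
    // Built once on first use here; on the device a table like this would
    // normally be a constant array in flash (PROGMEM) instead.
    static const ReciprocalTable table;
    return table.entries[fraction];
}
```

For example, `reciprocalOfFraction(128)` looks up 1 / 0.5 and returns 512, which is 2.0 in 8.8. Dividing by the fraction then becomes one table read followed by a multiply.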
Oh I have a question: so there’s a setting which greatly increases performance at the cost of texture quality, particularly close up. It’s a compiler flag, so I can’t make it a setting in game. I wanted to know which one you think is better: the first one is the “accurate” version and the second is the “inaccurate”. Get closer to a wall to see what happens.
The difference in performance is about 15%, which at this level is quite a lot. Also, definitely try it on actual hardware, it’s way cooler imo and the dither effect is less jarring.
Personally, I prefer the first “accurate” one. I can’t notice any difference in speed between the two but the texture quality is way better in the first one, so I’d definitely say that the first one is better.
They don’t end up being shifts, just moving bytes around.
That being said, ArduGolf predates Ardens, and I’m sure more performance could be squeezed out of parts of it like the division routine, if I were writing it now. In particular, I’ve discovered that the face sorting takes as much (or more) time as the rasterization itself.
Ahh ok, so the compiler is being smart about it. Great, I should’ve just tried it! Thank you
I tend to agree, just wanted to see what others thought. I get the feeling that, unless I find more performance lying around, I’ll have to drop it back to like 24 to get some of these features I want
Now I find myself questioning whether that behaviour has changed between compiler versions.
A few years ago FManga found that doing >> 8 causes the compiler to divide by 256 instead of discarding the lower byte:
But if it’s actually discarding them now, either the compiler has been updated to account for that in the meantime or something else was going on in that original case.
(The ‘jam entry’ mentioned in the linked comment is this one in case anyone is wondering.)
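A small sketch for anyone who wants to check the shift behaviour on their own toolchain. The function names are made up, and the byte-access version assumes a little-endian target (true for AVR and x86). Both high-byte functions return the same value; what differs between compiler versions and flags is only the code generated for them, so the reliable test is to compare the disassembly (for example in Ardens or with avr-objdump).

```cpp
#include <cstdint>

// Portable form: shift an unsigned 16-bit value right by one byte.
inline uint8_t highByteViaShift(uint16_t value)
{
    return static_cast<uint8_t>(value >> 8);
}

// Explicit form: read the upper byte directly.
// Assumes little-endian storage, so byte 1 is the high byte.
inline uint8_t highByteViaBytes(uint16_t value)
{
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(&value);
    return bytes[1];
}

// Typical place such shifts appear: an unsigned 8.8 fixed-point multiply.
// The 32-bit product has 16 fractional bits, so one byte is dropped.
inline uint16_t multiply8_8(uint16_t a, uint16_t b)
{
    const uint32_t wide = static_cast<uint32_t>(a) * b;
    return static_cast<uint16_t>(wide >> 8);
}
```

If the shift version turns out to compile to a loop or a division on a given compiler, the byte-access version is one way to force the cheaper form, at the cost of portability.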