Suguru Generator

I am looking to create a Suguru game for the Arduboy.

As part of this, I am trying to work out the best approach to generate puzzles. I have am assuming a number of things:

  • fixed puzzle sizes, maybe 8x6, 10x7, 12 x8. Actual sizes might be restricted by the screen resolution and graphics I choose!
  • predetermined layouts for the blocks. I am hoping to have many layouts per puzzle size stored in PROGMEM for variety.

So, the question is how to generate these.

I am assuming I will start by populating a complete puzzle then removing pieces to obtain a starting point.

I can do it using a ‘brute force’ approach across the entire board or (more likely) by progressing from one block to another by randomly placing the tiles and then comparing the placement against the previously placed blocks. If a block cannot be populated randomly, then entire process is aborted and the started again.

This might end up in an endless loop! Does anyone have any other ideas on how to populate the board?

Once populated, I can then remove pieces to work back to a starting point … but how to do that logically so that the puzzle has enough detail to allow someone to solve it?

Are you planning to generate the puzzles actually on the Arduboy or generate them by computer and then store them on the Arduboy?

Also, does it matter if a puzzle has more than one solution?

On the Arduboy … and multiple solutions are fine.

I am prototyping it in another language and it looks scary to ‘brute force’ it.

This sounds interesting!
At first I was confused, thinking of Sugru - the neat adhesive!
… But then I found a guide to playing Suguru. Looks challenging :slight_smile:

Re. generation - I wonder if it’s like Sudoku? Algorithmically produced Sudoku isn’t as fun to play as those crafted by a human… You almost need an understanding of how people solve the problem, and what makes the process fun.

1 Like

My first attempt produced a 5x5 grid that cannot be populated!

Time to refine the code.

I wrote this attempt in VBA (in Excel, something installed on my work machine).

The left hand side defines the blocks, the right hand side is showing a partial iteration. I ran it over night and the furtherest it got was to block 12. I didn’t count the iterations but am guessing it would have done tens of thousands of attempts.

The issue with the code (and maybe the problem) is that you cannot just back-track a block and retry as it could be any of the early blocks that have dictated subsequent blocks and ultimately led to an impasse. When the process restarts, it simply tries all the combinations again thus covering old ground.

Any ideas on how to improve this algorithm?

The attached is a Macro enabled Excel spreadsheet. Yes, Excel. (22.3 KB)

That’s going to restrict which techniques you can use…

Upon seeing the right hand side, my instinct is to suggest that you start collecting list of numbers that each cell can’t contain.


That way, you can delay filling more flexible cells until later and fill stricter cells sooner, which should reduce the amount of backtracking.

E.g. In this situation:

  • You’d fill the “can only be 1” cell now
  • You’d mark all its surrounding cells as “can’t be 1”.
  • You’d mark the remaining cells in the yellowish area as “can’t be 1” because you’ve already got the ‘1’ for that area

And then when you start filling the blue area in the bottom left corner, you’ll know only two of the four cells can possibly be 1.

This should also reduce the amount of randomness involved in the process, making it more methodical.

This is actually similar to a technique used for solving sudoku puzzles where the possible values of the cell are written in the margins of the cell and the player crosses numbers out as they are proven to be invalid solutions to the cell.


Assuming you never go above groups of 5, you can use 3 bits to store each impossibility, meaning you’d only need 12 bits (3 bits * 4 possibilities) for a complete list (since 5 impossibilities means the whole thing is impossible), and 3 bits to store the actual value, meaning you can fit both the actual cell value and your list of impossible values in just 2 bytes (and still have a bit left over). For a 12x8 grid, that’s 192 bytes, which is fairly cheap.

Definitely … it may also be a silly requirement!

Yes, this was really my next step. However, I am thinking that I can do that on a per ‘block’ level than across the entire board. In this way, the size of the grid is actually not relevant as its only composing 5 cells or less at a time.

Thanks for the input … let me play with it more!


I’m sure you’ve already Googled this… but…

(Quite maths dense. More code examples at the bottom).

This is an excellent example of why I hate maths and mathematicians.

Why explain things in plain English when you can litter your explanation with a load of hard-to-decipher symbols and arbitrarily chosen letters? :P

That probably seems a bit hypocritical coming from a programmer, but personally I think pseudocode and most programming language are far easier to decipher than most maths. (A for loop or sum function are certainly easier to understand than a large capital sigma.)

My one concern with this is how you would pick which cell to tackle next.

If you gather contraints first, you can use that to decide which cell to tackle next. I.e. the cell with the fewest possibilities should be filled first, so the more flexible cells can be reserved for when it starts becoming difficult to assign numbers.

Yes … but this if for solving puzzles not creating them.

My latest cut of code takes between 1 and 5 seconds to create a puzzle. Of course that is on a PC, not on the Arduboy. (30.9 KB)

I am processing the blocks in a given order - starting withe blocks of only one cell then doing the adjacent blocks in order of size. Within the blocks themselves, I solve those cells that have one solution then determine the available options for the remaining cells. I then randomly cycle through until I find a ‘workable solution’.

1 Like

The problem of course is that if you have 3 cells remianing of which

  • the first can accept a 1, 2 and 4
  • the second can accept 2 and 4
  • the third can accept only a 2 and 3

it is easy for a human to say that the second cell must be 2, the third cell must be 3 and the first cell can therefore be 1 or 4. With a limited computer, this logic gets quite hard due to the number of combinations,

These are all classic scenarios in Suduko is well.

Personally I would have assumed that the first is 1 and the last is 3.

If you weight numbers by how frequently they appear in the list of possibilities for a cell, the numbers that appear least frequently are more likely to be appropriate solutions. Or at least that’s what I would assume.

Yes my example was a little off - I imagined it in my mind rather than use a real example!

How random is the random number generator for the Arduboy?

I have taken the code that finishes in a approximately 40 - 100 iterations from my Excel code and translated it to an Arduboy sketch. However, it never completes and I suspect that it is because the sequences are not totally random or it starts cycling through the same sets of random numbers after a period.

1 Like

rand and random actually come from avr-libc, so the best way to inspect the PRNG is to download the source.

You may want to try giving an xorshift variant a go.

Excel’s PRNG won’t be ‘totally random’ either.

At any rate, for this kind of problem you want less randomness, not more.
If there’s too much randomness involved in selecting values, you’ll actually decrease your odds of the generator halting because there are more invalid configurations than valid configurations.

If simply swapping out the PRNG doesn’t work, I’d suggest logging the steps taken by the generator (over serial) in detail (i.e. including the numbers generated) and then trying to analyse it to see if there’s any obvious reason why it isn’t working or if it’s just down to invalid choices being statistically more common than valid ones.

Like I say, the less you leave to chance, the better.

1 Like

Arduino uses the default AVR libc libraries (ref.: Arduino - UsingAVR)

Arduino calls these functions from ArduinoCore-avr inWMath

Source from SVN:

This is based on the well studied Park-Miller PRNG, after their 1988 paper:

I’ve done quite a bit of work on XORSHIFT and don’t think this will offer any improvements in distribution or period (assuming good seed values). AFAIK Excel uses the Mersenne Twister algorithm.

Would suspect the issue is elsewhere. Good luck :slight_smile:

Actually, I can quite confidently say the xorshift has roughly double the period of avr-libc's Lehmer...

The period of an LCG or an xorshift depends on the number of bits used to represent the state and the set of parameters used. Some sets of parameters, for both xorshift and LCG, will create a sub-optimal period, whilst others will reach the optimal period.

For a 32-bit version of either, the optimal period is 232 - 1.
(There might be some PRNGs were the 0 doesn’t ‘break’ them, but the ones we’re talking about here have the flaw where 0 maps to itself and thus ‘breaks’ the PRNG.)

I know for definite that the parameters in the xorshift I linked to attain the full period because they were from a list of full period parameters published in a paper by Marsaglia.

The Lehmer used by avr-libc will have at most a period of 231 - 1 (half that of the xorshift) because the modulo used is 231 - 1 (a mersenne prime), and the modulo limits the number of representable values.

It’s possible for an LCG with 32 bits of state to manage a better period of state, but not a Lehmer because by definition a Lehmer is an LCG where the modulo parameter is prime, and any prime that can fit in 32 bits will make it impossible to reach the optimal 232 - 1 period because all such primes are less than 232 - 1.

Personally I’m quite disdainful of LCGs. They’re very easy to implement, but they suffer from what Marsaglia described as falling “mainly in the planes”, which can be useful if you’re actually trying to create patterns, e.g. in procedural content generation, but otherwise are generally a nuisance.

From what I’ve found, it probably depends on the version.

Hopefully by now it is MT19937. Mersenne Twister is an excellent PRNG.
It’s a shame it’s far too heavy for the Arduboy.

@filmote, do you have any plans to publish what you’ve got so far?

If there’s some code to play around with it might be easier to spot flaws/improvments than it would be by being theoretical about it.

1 Like

Yep, I can publish it. I will do so tonight.

I am guessing @pharap that you are still using LibreOffice so the VBA version will not work for you? It might be handy to compare the VBA to the C++ version as one works, one does not.

Not sure I follow your logic there. I feel that the ‘lack’ of randomness produces the same results again and again thus the process never finishes.

Amen to this.