On Randomness

Check this article out, this graph shows the distribution of values from the analog read:
analogread-entropy

I think it’s better practice to count ms since boot until the first button press from the user as a seed, as this will be different every time unless you are a robot.

1 Like

The current system uses both pin entropy and the start-up time.

Maybe it’s not worth bothering with the pin entroy?

I don’t know if that graph would be representative of the ADC on Arduboy though. I don’t know the details of what the ADC is/does, but I’m hoping it’s better than just an unconnected pin.


Sidenote: I really wish Hackaday would change the glyph they use for their lowercase ‘i’, it looks really clumsy.

1 Like

Looks good to me. I’d like to do some more tests on it to be sure it’s faster and by how much exactly. I haven’t had a chance to look at it more carefully yet (my attention has been split between this, audio/timers/usbstack in ProjectABE, a game, and work).

Looking at the graph made by the Pseudo Theremin, I think there’s enough variation in the ADC for games. I was under the impression that the startup time would be a worse source of entropy.
Unless the bootloader is non-deterministic, why does startup time vary at all?

2 Likes

Likewise, I’ve got my own set of projects that I’m involved in.

There’s no rush of course, a new PRNG is quite a significant addition so it makes sense to take time deliberating and developing to make sure it’s going to be suitable.

Determinism usually only extends to the operations and the order of operations, it doesn’t extend to the speed at which the instructions are executed. There’s enough noise in the electrical signals and other physical aspects that execution time varies.

That said, I wouldn’t have thought it would vary that much so I share your sentiment there. If nothing else, it’s probably in the same order of magnitude each time.

It would be good to do some research into the variances of ADC and micros at start up. (Perhaps by storing the readings in a variable and then writing the readings out via USB, turning the Arduboy on and off several times.)

Depending on how good the sources are, it might be worth considering implementing a hash function to combine the two sources instead of just making the ADC the high word and micros the low word.
Granted that hashing is still a deterministic operation, but part of the point of hash functions is to magnify small variations, so it’s worth considering.

Either that or taking several readings a couple of milliseconds apart - again depending on what the figures say.

1 Like

Which may impact the amount of operations, therefore the overall time.

So we’d use a function that takes a number and transforms it into another, apparently random one? :stuck_out_tongue:

The analog pin is just unconnected, but actually physical interaction with the device seems to have significant impact on the value. Check out @chame and his pseudo thermain:

Theoretically rubbing the polycarbonate induces a small static charge into the device so maybe that is effecting the dialectic properties but I’m pretty surprised the device is able to pick that up.

4 Likes

All of the pins are exposed as test points on the back, and all of them are labeled.

They are also on a 0.1" global grid so you can make an adapter based on perfboard.

2 Likes

I think most of the branching in the booting process is loops waiting on hardware rather than general decision making. Either way the non-cpu hardware behavior is probably going to cause more speed variance than the CPU itself, even with variance introduced by branching.

I’ts more a case of takeing something that might always leave the top few bits as zero and spreading it out across the full bit range.
Granted it’s not really increasing the number of outputs but it might make a difference.

Like I say, it depends on what the data throws up.

For example, take the first 5 elements of the first 5 xorshift32 sequences (excluding 0):

1: 1, 36, 19, 2066, 1026
2: 3, 72, 3, 2147, 3133
3: 2, 108, 16, 3145, 2111
4: 6, 145, 7, 4323, 6249
5: 7, 181, 20, 5321, 7275

Notice that the orders of magnitude are similar because the inputs aren’t very far away from each other. That property seems to hold throughout all the sequences.

For example:
1000: 540, 29223, 667, 13257, 29161
1001: 541, 29187, 648, 14307, 30187
1002: 543, 29295, 664, 15274, 32212
1003: 542, 29259, 651, 16256, 31190
1004: 538, 29366, 668, 9002, 27008
1005: 539, 29330, 655, 9984, 28034

So the point of hashing would be to give the inputs a wider variance. Certain hash functions can take numbers that are close together (e.g. 1 - 5) and spread them out, which might be beneficial here for eliminating this pattern.

(I don’t have one to hand. Non-cryptographic, non-crc hash functions are surprisingly hard to track down online.)


That’s why it pays to check the documentation.
Java and C++ might look similar on the surface, but they have very different philosophies so they do things differently.
Java likes to make things quick and easy for the programmer.
C++ prefers to give the programmer flexibility to do what they want, but in exchange the programmer has to work harder.

Ah yes. That’s because of modulo bias, which is what I was disussing earlier. Scroll up and look for the uniformRandom template function I posted. If you use that, e.g. int value = uniformRandom(rand, 0, 2) - 1; then you should get a better distribution.

@FManga posted that a couple of comments ago :P

It seems plausible. And since people usually hold their device when turning it on, it would follow that doing so could have an impact on the ADC.

We really need a study into this, it would be interesting to know if the RNG’s seed could be influenced by whether a person is holding the Arduboy before turning it on, and perhaps whether putting the Arduboy in an anti-static bag might have an effect.

A hardware RNG is an interesting idea.
It wouldn’t be something everyone could do though, and it would probably be a bit slow.

I saw an RNG Arduino library that takes over one of the interrupts to periodically reseed the RNG. I can’t remember what it was called though.

2 Likes

I get what you mean, it just feels like we’d be using a PRNG to seed another PRNG in order to fix a flaw. A good RNG should not exhibit that pattern you posted, but considering this isn’t for crypto, it might still be good enough.

1 Like

Not true. All of the hardware is on the CPU chip and tied to the same clock, so will always operate at the same rate. Without external influence, after every boot the system time will be the same value at any given point in the code.

(The display is external hardware, and has a different clock, but the processor only talks to it. We never receive any signals or data from the display that could influence system timing.)

If you call initRandomSeed() after begin() system time is always going to be the same [*see note below]. That’s why initRandomSeed() does an analog read on an unconnected pin.

So why does initRandomSeed() also use system time if it’s always going to be the same? Because for better entropy it’s recommended that you wait for a button to be pressed (such as to dismiss a splash or instruction screen that’s only shown once after bootup) before calling initRandomSeed(). The variance in the time the user takes to press the button will alter the system time that initRandomSeed() is called at, making it a source of entropy.

*Note: When the Arduboy is connected to an active USB port, interrupts generated by it would be an external influence and could change program flow, thus affecting system time.

1 Like

An easy way to spread out numbers is to multiply them. Preferably with a large prime number. And just in case the seed is zero, we should add some constant. Sounds familiar? :stuck_out_tongue:

We’d be combining two PRNGs so they’d cover up each other’s flaws.

Marsaglia does this in his KISS generator:

#define znew (z=36969*(z&65535)+(z>>16))
#define wnew (w=18000*(w&65535)+(w>>16))
#define MWC ((znew<<16)+wnew )
#define SHR3 (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG (jcong=69069*jcong+1234567)
#define KISS ((MWC^CONG)+SHR3)
1 Like

It depends how the behaviour translates into actual usage within a game. I suppose if most people are just reducing the value to something in a small range (e.g. < 100) then it probably won’t make much difference.

Fair enough.

I tested to make sure. It is indeed 2286016.

I hadn’t noticed that before.

I’ve seen a lot of people call initRandomSeed in setup, so perhaps that needs to be brought to people’s attention.


@MLXXXp When I said “it would be interesting to know if the RNG’s seed could be influenced by whether a person is holding the Arduboy before turning it on” I meant in a way that could be used to influence the PRNG seed. I know it already causes noise - that’s the third time the link to the theremin has been posted.


Maybe, but if it provides better results then it might be worth it depending on the use cases.

We don’t really have any statistics/data on ‘real-world’ usage, so it’s hard to make judgements. As I said before, if all use cases are reducing the generated value to something negligably small then it might not matter anyway, but if there are any cases using ~10 bits of the result (i.e. in the range of 0-2048) then those might be affected by the pattern.

I did a quick sweep of some games to see how many adhere to this.
Of the ones I checked, nearly all were calling initRanomSeed in setup.

To pick a few:

And so on.

Admittedly there’s going to be some variance because they’re not all immediately after boot so it’s not a total loss, but the point is that it seems very few people are aware of the recommendation to randomise after a splash/loading screen.

1 Like

It’s in the Arduboy2 library documentation for initRandomSeed(). What other documentation do people use as a reference for the Arduboy2 functions?

Other than by generating electrical noise or a static charge, that code uses hardware to read, in what possible way could holding the unit influence the PRNG seed? Using psychokinesis to manipulate electrons in the processor?

I think generally people either only read half the description or just learn by word of mouth and/or looking at examples.

Probably more the latter, I don’t think many people sit down and read the documentation unless they need to look at something specific.

Obviously that’s not ideal, but I’m not really surprised, most people here are hobbyists rather than professionals (or professionals doing this as a hobby) so they just want to get programming without worrying too much about the details.

(That said, it looks like even the professionals missed that one - Modus Create’s Evade also calls initRandomSeed() after their logo is drawn and EEPROM has been read, but without waiting for a button press.)

If the static from a person’s hand influences the pin noise (as demonstrated by the theremin) then theoretically it would be possible to influence the pin that is used to seed the RNG.

Depending on how easy it is to control that influence, it might be possible to influence the RNG seed. E.g. if placing one’s thumb over the CPU alters the ADC input substantially but not touching the Arduboy results in a stead ADC input then that might be usable to strategically influence the PRNG of a game.
(As pointed out already, in most games the placing of initRandomSeed means that micros is going to be somewhat deterministic.)

1 Like

Whilst I have recently been using the library documentation for reference it doesn’t always show a “how to” from there i’ll dig around in other people’s code to see its implementation in action.
At that point it’s easy to fall into the trap of copy paste or follow a trend.

Edit:
I’ve actually since referencing the docs pointed a few things out to “experienced” programmers ???

1 Like

In my case, I developed “Hello, Commander” entirely in the emulator, where the ADC is hooked up directly to javascript’s Math.random() :stuck_out_tongue:

To date there really hasn’t been much in the way of tutorials, discussions or other documents on using random numbers in Arduboy sketches.

Waiting for a button press does generate more entropy but if you’re worried about sketches that call initRandomSeed() after begin(), with no external influence to vary system time, try the following sketch and see how often you get the same sequence of numbers (indicating the same seed was generated). Also see if you are able to influence it in any way by the placement of your hand, etc.

#include <Arduboy2.h>

Arduboy2 arduboy;

void setup() {
  arduboy.begin();
  arduboy.initRandomSeed();
  arduboy.clear();
  for (byte i = 0; i < 8; i++) {
    arduboy.println(random(2000000000L));
  }
  arduboy.display();
}

void loop() {
}

Also keep in mind that in many games, within game play itself, what the next random number affects will be dependent on the timing and decisions for actions that the player makes.

E.g. If a player presses a button before the time the next enemy is generated, the next random number will be applied somehow to the player’s action. If the button is pressed after the enemy is generated, the next random number will be applied to something to do with that enemy.

For games such as this, especially if they’re fast paced, even if the PRNG starts out with the exact same seed every time, the game will become “random” quite quickly.

1 Like

I remember looking at this topic when I looked a little at the subject you may find some related discussion.

1 Like