The Laws of Probability

I've long had a nagging suspicion the universe isn't as random as we are led to believe. I don't attribute it to anything. I don't claim what I see is evidence of a divine plan or the intervention of celestial beings. I just can't shake this feeling the things I see aren't as "random" as they ought to be.

Is that crazy? Maybe. Maybe not. There is no inherent reason the universe must be random, but the idea a person could actually identify non-randomness in it is... difficult to believe. Humans are too prone to biases in how they perceive and remember things to expect them to reliably discern the difference between "random" and "non-random."

I get that. I really do. Still, I can't help but wonder if what I see is really random. For instance, years ago I created a simple program on a graphing calculator to play Rock, Paper, Scissors against a computer opponent that would choose its play randomly. I've long since lost the calculator and program, but my win rate was greater than 35% after over 1,500 matches.

What does that mean? I don't know. The results are fairly surprising, but the odds aren't too extreme. Besides, computers cannot produce true randomness (arguably, nothing can). Computers produce "random" numbers via algorithms and calculations, none of which are truly random. It's always possible the human mind might perceive patterns in the "random" numbers chosen by a computer and thus be able to win more often than the odds might suggest.

With that in mind, I decided to test my ability to win at Rock, Paper, Scissors again. I couldn't find a program I liked online so I decided to write one of my own. I copied some Python code I found online, modified/expanded it a bit and started playing. My first test of the code produced these results:

Wins:		37
Losses:		44
Ties:		48

I was happy to see things were working even though a 28.68% win rate did nothing to support the hypothesis I was testing. I was actually happy when I saw those numbers. I'd like to be able to convince myself my previous results with this test were just a fluke of a bad RNG on that calculator I had. It'd make the world seem more sane. This is why I was happy to see the next batch of results:

		Current		Total
Wins:		113		150
Losses:		104		148	
Ties:		120		168

About this point I was talking to a friend about how this was probably a waste of time, but it was more enjoyable than losing to seemingly bizarre luck in video games like I had been all night. He told me I'm just seeing what I want to see and I don't have unusual luck. I acknowledged that was possible and said I wanted to keep playing this to try to convince myself he was right. This led to:

		Current		Total
Wins:		118		268
Losses:		105		253
Ties:		102		270

With my win rate having increased from 28.68% to 33.88%, it seemed things were going the way they were supposed to. This batch of results was slightly biased toward winning, but in the long run, everything was averaging out. Then this happened:

		Current		Total
Wins:		74		342
Losses:		72		325
Ties:		55		325

I should point out I can't see a tally of my results as I'm playing. I can't even see how many matches I've played. All I can see is the outcome of each match after I make my selection. That's why the numbers in each run are so uneven. It's also why I was disturbed by these results. A 34.48% win rate isn't huge, but it was somewhat unsettling. I played some more and:

		Current		Total
Wins:		63		405
Losses:		58		383
Ties:		58		383

That the results for losses and ties had lined up like that was entertaining, and I was able to tell myself it was just chance that these results seemed to confirm something was off. I wasn't convinced, but I decided it'd be easy to prove this was just a fluke by playing some more. That's when this happened:

		Current		Total
Wins:		120		525
Losses:		112		495
Ties:		73		456

I'll let what I posted on Twitter explain my reaction:

I took a short break at this point to try to decide how I should feel and if I should keep going. I also did a bit of math. Here's a crude graph I made to show the probability of different numbers of wins, with a (black) line showing the average result and a (red) line showing my results:

[Figure: 11_26_rps_1476]

As you can see, it's not entirely unreasonable to get results like these. From a statistical perspective, I believe the results were just barely "significant" at the 95% level. This reassured me so I decided to try some more games:

		Current		Total
Wins:		112		637
Losses:		99		594
Ties:		94		550

With my win rate reaching 35.77% over 1,781 runs, the results could easily be described as "statistically significant." In stunned disbelief, I did one more batch:

		Current		Total
Wins:		85		722
Losses:		74		668
Ties:		93		643

Which brought my record to 35.51% over 2,033 runs. I believe the odds of that happening are less than 2%. I know that's not beyond the realm of possibility or anything, but... still. I don't get it.
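
For anyone who wants to check a figure like this, here is a minimal sketch of an exact calculation. It assumes every match is an independent one-in-three chance of a win, which is what a fair RNG should give. The function name `tail_probability` is made up for illustration and is not part of the program I played against.

```python
from math import comb

def tail_probability(wins, games, outcomes=3):
    """Chance of at least `wins` wins in `games` rounds when each round
    is won with probability 1/outcomes (pure chance, no skill)."""
    # Work in exact integers: a float power like (1/3)**722 underflows to 0.0.
    favourable = sum(
        comb(games, k) * (outcomes - 1) ** (games - k)
        for k in range(wins, games + 1)
    )
    # Python's int / int division copes with very large integers.
    return favourable / outcomes ** games

# One-sided chance of 722 or more wins in 2,033 matches.
print(tail_probability(722, 2033))
```

Passing `outcomes=2` gives the equivalent figure for a fair coin. Note this is a one-sided number: it only asks how often chance alone would do at least this well.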

These results are "statistically significant." Something is probably off. It appears either my luck is unusual or I've learned to beat the default random number generator in Python. And quite quickly at that. It's not like my code is biased. All it uses for this is:

import random
num = random.randrange(1,4)
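
To show how little is going on, here is a rough sketch of how a draw of 1, 2 or 3 can be turned into a result. This is not my actual code: the function names `judge` and `play_round`, and the encoding of 1 as rock, 2 as paper and 3 as scissors, are assumptions for illustration.

```python
import random

def judge(player, computer):
    """Return 'win', 'loss' or 'tie' from the player's point of view.

    Assumed encoding: 1 = rock, 2 = paper, 3 = scissors, so each number
    beats the one just below it, and rock (1) beats scissors (3).
    """
    diff = (player - computer) % 3
    return ("tie", "win", "loss")[diff]

def play_round(player, rng=random):
    """Draw the computer's pick after the player has chosen, then judge it."""
    computer = rng.randrange(1, 4)  # same call as above: 1, 2 or 3
    return computer, judge(player, computer)

print(play_round(1))
```

The only place chance enters is the single `randrange` call, so any bias would have to come from the generator itself or from how the player's picks line up with it.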

So I don't know what to think right now. Is the randrange function in Python just that predictable? Did my mind just manage to pick up on hidden patterns in two different RNG systems and learn to beat them? Or is this just a fluke? Am I making too much out of what is really nothing?

I don't know. I'd love to hear what you think. In the meantime, I'm saving all my test results in timestamped files for documentation purposes. I don't know if I'll do any more though. This is what happened in an attempt I did after getting halfway through writing this post:

		Current		Total
Wins:		76		798
Losses:		52		720
Ties:		75		718

Which puts me up to a 35.69% win rate over 2,236 matches. That can be visualized as:

[Figure: 11_26_rps_2236]

I'm not sure if I'm doing the math right, but I believe my results are now statistically significant at the 99% level. I'm not sure. I'm not really sure of anything at this moment. I think my only two options at this point are to doubt the laws of probability or make a stiff drink.

26 comments

  1. Nah. FPS games have gone downhill since Unreal Tournament. Now, if they'd make a new Twisted Metal game that didn't suck, I'd be all over that. Until then I'll stick with my MOBAs (when my internet connection allows), strategy games and RPGs. And by "strategy" games, I mean strategy games. I'm not talking about those stupid twitch-based "strategy" games like Starcraft where a slight difference in twitch reflex can completely outweigh any difference in strategy. I hate that. It's fine for a genre of games, but don't call those games strategy!

    By the way, it's well over 1,500 games now. I was at ~2,200 when I published this post, and I just did some more:

    Wins: 79
    Losses: 67
    Ties: 66

    I don't get it. I think I'm going to have to spend some time on Python boards to see if I can find out what approach it uses for producing its random numbers. Maybe this function is just highly predictable. I know this is probably a waste of time, but this is just so weird.

  2. One way to test the randomness would be to always play the same (or have a program do that). That should have as perfect a ratio as possible between win/loss/draw, or at the very least the results should trend towards perfect balance. If it doesn't do that over a suitably large number (say 10,000 +) of iterations, then you can be sure that the random function does indeed have biases.

  3. The obvious question to ask is, what strategy or strategies did you use, and did you vary the strategy over time. Simple PRN generators are imperfectly random. More sophisticated ones are less imperfect.

  4. I was talking about this experience with a few people online, and during the discussion they proposed a couple of simple tests. I ran the program 10,000 times with the user input being hard-coded as 1 (rock) each time. I then repeated that for the other two options. The results were as expected. I then had the AI play against itself at a person's request and got the expected results once again. I could rerun the tests with more complex strategies if people have any they'd like me to test.
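
    The fixed-choice check can be sketched roughly as follows. This is a stand-in rather than the code I ran: the function name `fixed_choice_tally` and the encoding of 2 as paper and 3 as scissors are assumptions for illustration.

```python
import random
from collections import Counter

def fixed_choice_tally(choice=1, rounds=10_000, rng=None):
    """Play `rounds` rounds with the same `choice` every time and tally results."""
    if rng is None:
        rng = random.Random()
    tally = Counter()
    for _ in range(rounds):
        computer = rng.randrange(1, 4)
        # Assumed encoding: 1 = rock, 2 = paper, 3 = scissors.
        outcome = ("tie", "win", "loss")[(choice - computer) % 3]
        tally[outcome] += 1
    return tally

# Repeat for each hard-coded choice; each tally should sit near one third.
for choice in (1, 2, 3):
    print(choice, dict(fixed_choice_tally(choice)))
```

    With a fair generator, wins, losses and ties should each land close to a third of the rounds no matter which choice is hard-coded.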

    For additional checks, I've coded the program to log every result rather than just keeping a tally of them. This should make it possible to look for patterns. Before I did that though, I switched the RNG call from:

    num = random.randrange(1,4)

    To:

    num = random.SystemRandom().randint(1,3)

    The functionality should be the same (I have no idea why the indexing on the functions is different) except for the methodology used for creating the "random" numbers. I did some checking and people are warned against using the first function for cryptography due to weaknesses in it. I can't see how those weaknesses could cause results like what I got, but to be safe I switched to the latter function call as it uses os.urandom() for its randomness, a function call that relies on your operating system's RNG. It should be much stronger (though I'm not convinced it is an optimal solution), though it does mean my results may now depend partially on my OS. Switching to a different OS could change things.
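
    On the indexing: as far as I can tell, randrange(1,4) stops before 4 while randint(1,3) includes 3, so both draw from the same three values. A quick sketch to confirm that, with variable names made up for illustration:

```python
import random

# randrange(1, 4) excludes the upper bound; randint(1, 3) includes it.
# Both should therefore only ever produce 1, 2 or 3.
mersenne = random.Random()      # the module's default Mersenne Twister generator
system = random.SystemRandom()  # draws from the operating system via os.urandom()

seen_randrange = {mersenne.randrange(1, 4) for _ in range(1000)}
seen_randint = {system.randint(1, 3) for _ in range(1000)}

print(sorted(seen_randrange))
print(sorted(seen_randint))
```

    This only checks which values can come out, not how well distributed or unpredictable they are.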

    Prior to keeping records of individual runs with this function, my results were 206 wins, 192 losses, 191 ties for an average of 34.97%. Interestingly, my first set of results were again outliers compared to my average (71/65/84), lending some small amount of credence to the idea I am learning as I go. Since then, my results have been 352 wins, 308 losses, 283 ties for a win rate of 37.33% over 943 games.

    I don't know what to think, but I'll be uploading the code and result files once I've done some more runs. Maybe there's something obvious I'm missing. Or who knows? Maybe this is just something people can do in general.

  5. By the way, my strategy has been to listen to music, get into the "groove" where I "feel" right and just hit whatever number comes to me. I then continue going until the song stops (or the video for it lags because my internet connection is slow). Sometimes I do the same with videos or do other things while I'm playing.

    A "proper" test would involve a more rigorous structure with fixed time periods and/or fixed run lengths, but the idea is to keep myself from thinking about what I'm going to pick. I don't know if relying on my subconscious affects my results, but it definitely makes things less tedious. If I had to put thought into each choice I made, I'd never be able to do 4000+ runs.

    Which is where I'm at. Before I switched which PRNG function I used, I had completed 3,351 games. My results were 1,220 wins, 1,079 losses, 1,052 ties. That is a 36.41% win rate. I believe the odds of reaching a win rate like that over so many games are something like 1 in 10,000. It boggles my mind.

  6. Okay, this is just creepy. I decided to do a simple test where I reversed the rules of Rock, Paper, Scissors so that Rock beats Paper, Paper beats Scissors and Scissors beats Rock. My results were:

    Wins: 68
    Losses: 108
    Ties: 84

    I did another run and:

    Wins: 84
    Losses: 102
    Ties: 91

    That's a 39.1% "lose" rate when the rules are reversed. There's less than a 1% chance of hitting that. I swear my results are getting more extreme.

  7. If the hard-coded results are as expected, then the first apparent conclusion would be your subconscious is quite clever at subtle pattern recognition. 😀

  8. That's the best interpretation I can come up with, but in theory, these PRNGs shouldn't have patterns in their results. The improved one I'm using now is used for cryptographic purposes. It's possible something is going wrong in this particular implementation of it that causes patterns to show up, but if that's not the case, what should we make of this? Do we conclude the human mind is capable of picking up patterns in algorithms designed to be suitable for cryptography? The implications of that would be worrying.

    I really want some way to disprove all this, but I have no idea how to at this point. Everything I know about these sorts of algorithms tells me these results should not be possible. I think I'm going to have to find a way to try to formalize my tests a bit more to see if the results hold up (and if not, why). Then maybe I can try rewriting the program in a different language using an entirely different RNG.

    And if none of that works, maybe I'll have myself committed. Or just see if anyone else can replicate my results.

  9. I decided to simplify my test by cutting out the possibility of a tie. That converts Rock, Paper, Scissors into a simple coin flip. This simplification should simplify all analysis, making it easier to discern potential patterns. I've kept track of every guess I've made since converting the code (including a 12 flip run I used to test to see if the code was working). My results so far have been (in order):

    9/3
    62/61
    95/87
    105/88

    For a total of 271 correct calls, 239 incorrect calls. The odds of hitting that level of success are less than 10%. I'm going to upload the code and data for this and my RPS results once I hit 2,000 or so coin flips.

  10. Right after I posted that comment, I started another run. I decided to try to go longer with it so there'd be less room to argue I'm cherry-picking runs (though if I wanted to fraudulently alter my data, I could just manually alter it to say whatever I wanted). The result was:

    Wins: 177
    Losses: 162

    It's not extremely unlikely (~20%), but combined with the other runs, these results have already become "statistically significant" at the 95% level. I think I am going insane.

  11. I went ahead and bundled up all my results so far in a single compressed file and uploaded it. You can find it here:

    http://www.hi-izuru.org/wp_blog/11_28_rng_results/

    The first set of RNG results were from when I wasn't meaning to do anything with this other than do a quick test and see what the results were. As such, I hadn't looked into the PRNG function I was using and didn't log all my individual matches. The second set of results are from after I switched to a PRNG system intended to be suitable for cryptographic purposes (though I haven't verified the implementation's appropriateness given my particular hardware). Shortly after I made the switch, I started recording every match result along with the total tallies. The third set of results is a preliminary set from my current approach in which I've simplified the test to have only two options (making it equivalent to flipping a coin and calling it).

    I've also included a file with code for each set of results. It's fairly simple and straightforward, but it's not clean or neat as I copied much of it from someone else then tweaked it for my purposes. I know there's a ton of room for improvement in it (and have plans to change it). The point of sharing it is just to let people see a functional form of the test I've used for these results. The code should run as is, though you will need to create a folder with the appropriate name to store any results you might produce.

    Oh, and the second set of results has some additional files in it. Several files show the results of tests I performed to look for biases in the code (the names should explain what they are). Another series of files show the results of what happened when I reversed the ruleset. Those results are marked by filenames beginning with "r_" for "reversed." The code provided with those results lets you test either ruleset (or even using a randomized ruleset) by changing one line of code.

    Who knows, maybe after posting this someone will figure out the "trick" behind my results.

  12. Have you thought of trying a random computer vs a random computer? That should, over time converge to the same as a single fixed value vs a random computer.

  13. There's a file in that release with 3,000 runs of the same function call used for the AI being used to play against it. Don't ask me why it was 3,000 runs. Someone I was talking to picked that number so I went with it. It was enough to show the "expected" results. So far the only thing I can do that produces abnormal results is to play the game myself. No matter how many times I try playing this, my results don't revert to the mean.

    I've been wanting to write about other things (and maybe finally get this short eBook finished), but I can't get this out of my head. I'm getting what appear to be impossible results. How do I look at anything else?

  14. Here's something which might be interesting. I've aborted my last two runs of the coin flip test early (while still recording the results, of course) because the results disturbed me too much. Maybe it's all in my head, but I genuinely felt like I knew which answer to guess. Here's the results for the first run, included in the data I posted above:

    Wins: 42
    Losses: 31

    That doesn't seem too bad, but I felt like if I continued on at that point I'd start winning way more than I lost. Here's the results from my next run, done the next day:

    Wins: 55
    Losses: 30

    To make matters worse, I realized what answer I pick is largely unimportant. The computer doesn't make its choice until after I pick my answer. That means I'm not predicting what result it has picked; I'm predicting what result it will pick. The result of each RNG call depends on when the program makes it. That means whether I pick "heads" or "tails" doesn't matter as much as when I hit that Enter key. Being a tenth of a second slower hitting that Enter key can be the difference between winning and losing.

    My next test is going to be to see if I can win a coin flip more often than not while only picking "heads." When the computer does it, that doesn't work. It'll be interesting to see if it does when I do it.

  15. I am happy to report my tests show no evidence I can beat the RNG when I am restricted to picking only heads. I did ~1,400 runs and ended with a ~50% win rate. I don't think my average win rate varied by more than a couple percentage points the entire time (after the first couple hundred flips).

    It is noticeably different when my choices are not restricted. I added up my results so far, and they are:

    911 Wins
    792 Losses

    That's a 53.5% win rate. Given the number of runs involved, that's statistically significant at the 99% level. It's significant at the 99% level even if we account for the potential endpoint bias where my decision of where to end runs isn't entirely random (I might subconsciously decide to end runs on a winning note). The only three explanations I can come up with at this point are: 1) I'm able to beat multiple RNGs; 2) The laws of probability are not working properly; 3) I am committing fraud by manipulating the data.

    I feel like people should think the last of those is the most likely. It seems completely ludicrous I'd put this much work into that sort of fraud (for what point?), but how does any other possibility make any sense?

  16. Had some time to spare and tried your python code (from set 2), but I couldn't reproduce your wins (or your losses with the reverse rules).

    Got 3 files with r_:
    Wins: 262 (33.94%)
    Losses: 261 (33.81%)
    Ties: 249 (32.25%)
    on a total of 772.
    So I won more (slightly) with the reverse rules.

    Got 1 file without r_:
    Wins: 65 (32.18%)
    Losses: 71 (35.15%)
    Ties: 66 (32.67%)
    on a total of 202.
    So I had more losses than wins (but with only very few games played).

    On a Linux machine with Python 3.5.2.

  17. Hey Michel, thanks for giving it a shot. I decided I'd take a day or two off because the strangeness of my results was really getting to me, but I'll go back to the code and try it later today (or tomorrow, depending on time zone). Who knows? Maybe it'll all turn out to be a crazy coincidence.

    Or maybe not. Maybe I'm just better at making random guesses than you 😀

  18. Could it be that your entry of choice is affecting the random number generator? Try having it do 100 calls, then make your 100 picks (without peeking at the computer choices of course).

  19. I'd have to look at the details of the RNG used by my operating system, but I don't think my input should be able to affect the result produced by the RNG. It's worth testing though. It might help determine if the temporal aspect has any effect on the outcome. It would also give a check on the methodology as I could easily save the series the computer would use to be examined/presented to other people. Plus, I do like the idea of having fixed endpoints for my runs. Caveat: If my results are due to an ability to "intuit" patterns, a more rigid testing environment might interfere. Teasing out the different influences could be tricky (assuming the current trends continue to hold at all).
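
    A minimal sketch of that pre-generated approach, for the coin flip version of the test. The function names `pregenerate` and `score` and the encoding of 1 as heads and 2 as tails are assumptions for illustration, not code I have written yet.

```python
import random

def pregenerate(count=100, rng=None):
    """Draw all of the computer's picks before the player chooses anything."""
    if rng is None:
        rng = random.SystemRandom()
    # Assumed encoding: 1 = heads, 2 = tails.
    return [rng.randint(1, 2) for _ in range(count)]

def score(player_picks, computer_picks):
    """Count correct and incorrect calls against the stored sequence."""
    if len(player_picks) != len(computer_picks):
        raise ValueError("both sequences must have the same length")
    wins = sum(p == c for p, c in zip(player_picks, computer_picks))
    return wins, len(computer_picks) - wins

computer_picks = pregenerate(100)  # stored first, so it can be saved and shared
```

    Because the sequence exists before the first pick is made, neither the timing of a keypress nor the pick itself can change what the computer "chose".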

    It will probably be a few days before I try that. A couple online games I play had recent expansions, and that's drawing a lot of my attention. Plus I want to do some analysis for the contest I'm discussing in my last couple posts. And to be honest, I'm kind of afraid to keep testing this stuff. As much as I'm trying to find logical explanations and reasons for my results, nothing I come up with is comforting.

  20. Small, but potentially interesting update. I decided to try my coin flip test again after taking a couple days off to see what'd happen. I was kind of hoping the previous trends would vanish. My initial run gave me hope:

    Wins: 87
    Losses: 95

    But when I was thinking about it, I realized I was doing something differently. During the test where I could only pick heads, I made my pick faster each time since the test was so straightforward. I had continued at the same pace when I did the regular coin flip test. I wondered if that could have any effect and did another run at a (slightly) more thoughtful pace:

    Wins: 94
    Losses: 83

    It could be a coincidence, but across the next three runs, I went:

    Wins: 205
    Losses: 178

    I don't know if the pace had anything to do with this. It could just be confirmation bias. Still, when I went at a pace that was fast but not so fast as to be mindless, it really did seem like I could "feel" something. That could be in my head. It might just be a side-effect of being on a winning streak. Or maybe I am intuiting some sort of pattern. It's something I'll have to look into if my trend in winning continues.

    Oh, and for those who don't want to do the math, that's 386 wins, 356 losses so far today. That's a 52% win rate over 742 games. It's not hugely unlikely, but it is not probable either. It seems I can't break this pattern.

  21. You said you were listening to music. Is the music playing separately or on your computer? Some RNGs take input from the speakers.

  22. I know of RNG systems drawing entropy from hardware, but I had never heard of them using input from the audio systems for it. That's an interesting idea. I use headphones to listen to music off my computer, but I could switch over to an MP3 player to see if it makes any difference. I just need to get a replacement battery for it as it seems the batteries for devices in my house (or at least my bedroom) lose total storage capacity much more quickly than they ought to. I'm wondering if that's another potential coincidence or if some sort of wiring issue is leading to this. The wiring in this place is pretty messed up.

  23. It would be nice to be able to run simple scripts that pick results against the computer player.

    This way it would be possible, and also much quicker, to see whether one can formulate winning strategies at random.
