Have the Laws of Probability Been Broken?

I've been trying to finish my next post involving a case study of the misuse and abuse of statistics to claim to prove global warming skeptics possess certain negative traits (my last post regarding this can be found here). Unfortunately, a number of things are getting in the way. Of special note is it's difficult to talk about statistics as I've largely lost faith in the laws of probability.

I've talked about this before. To make a long story short, I once combated boredom in high school by writing a program for my graphing calculator to let me play Rock, Paper, Scissors. I didn't expect anything to come of it, but it seemed like a decent diversion. I became more interested in the program over time as I consistently won more games than I should have.

A couple months ago, I decided to test those results by writing a new program in a different programming language. My thought was the previous results might have been caused by the calculator having a weak random number generator that led to its "random" choices being predictable.

To test that possibility, I wrote a new version of the program in the Python language. After 3,351 matches, I had 1,220 wins, 1,079 losses and 1,052 ties. That's a win rate of 36.41%, well above the 1 in 3 chance one would expect. Using the binomial distribution, we can determine those results are statistically significant at the 99.99% level. To give you an idea of the odds we're talking about, here is a visualization of my results after 2,236 matches (798 wins, or a 35.7% win rate):

Here is the same visualization for after I completed all 3,351 matches:

The black curve indicates the probability density function for each occasion. The higher any point on the curve is above 0, the more likely it is for that outcome to happen. The area under the curve covers the full range of outcomes and their likelihoods. For our purposes, we want to know how likely it is to get at least the specified number of wins. That is represented by the area under the curve to the right of the red line. The area to the left of the line displays the probability of getting fewer wins than the specified number.
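
As a rough cross-check on those tail areas, here is a minimal Python sketch that adds up the binomial probabilities directly instead of reading them off a chart. It assumes every match is an independent 1 in 3 chance of a win. The function name binom_tail is made up for illustration and is not taken from the program described in this post.

```python
from math import exp, lgamma, log

def binom_tail(wins, games, p=1/3):
    """Probability of at least `wins` wins in `games` independent matches."""
    total = 0.0
    for k in range(wins, games + 1):
        # log of C(games, k) * p**k * (1 - p)**(games - k), kept in log
        # space so the huge binomial coefficients never overflow a float
        log_pmf = (lgamma(games + 1) - lgamma(k + 1) - lgamma(games - k + 1)
                   + k * log(p) + (games - k) * log(1 - p))
        total += exp(log_pmf)
    return total

# The two checkpoints quoted above
print(binom_tail(798, 2236))
print(binom_tail(1220, 3351))
```

A result counts as significant at the 99.99% level in this sense when the printed probability is below 0.0001.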

Technical details aside, there is simply no way I should have won as many times as I did. There are several possibilities. One possibility is the one I considered with the graphing calculator - maybe the RNG of my game was flawed in a way that could allow a person to win more often than they would otherwise be expected to. To test this, I switched the algorithm I was using to generate random numbers for a stronger one, one suited for cryptographic purposes. In theory, there is no way it should produce predictable results, but there may be some flaw in the way this algorithm is being used that affects things.
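
For anyone wondering what swapping the random source looks like in practice, here is a minimal sketch. It assumes Python's standard random module for the ordinary generator and the secrets module for the cryptographic one. This post doesn't name the algorithms the game actually used, so the function names and structure below are illustrative only.

```python
import random
import secrets

MOVES = ("rock", "paper", "scissors")

def computer_move(use_csprng=True):
    """Pick the computer's move from one of two random sources."""
    if use_csprng:
        # secrets draws from the operating system's cryptographic generator
        return secrets.choice(MOVES)
    # random uses the Mersenne Twister, which is fast but not cryptographic
    return random.choice(MOVES)

def outcome(player, computer):
    """Return 'win', 'loss' or 'tie' from the player's point of view."""
    if player == computer:
        return "tie"
    # Each move beats the one just before it in MOVES (wrapping around):
    # paper beats rock, scissors beats paper, rock beats scissors.
    if (MOVES.index(player) - MOVES.index(computer)) % 3 == 1:
        return "win"
    return "loss"

print(outcome("rock", computer_move()))
```

Keeping the random source behind one function means the rest of the game stays identical when the generator changes, so any difference in results can be pinned on the generator.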

Rather than go on with words, I'll just show you the results for each batch of matches I performed:

Wins:		117
Losses:		120
Ties:		99

Wins:		97
Losses:		115
Ties:		110

Wins:		175
Losses:		177
Ties:		174

Wins:		121
Losses:		111
Ties:		80

Wins:		115
Losses:		107
Ties:		88

Wins:		157
Losses:		122
Ties:		147

Wins:		90
Losses:		65
Ties:		81

Wins:		133
Losses:		94
Ties:		101

Wins:		68
Losses:		74
Ties:		63

Wins:		67
Losses:		72
Ties:		92

Wins:		112
Losses:		93
Ties:		78

Wins:		96
Losses:		94
Ties:		98

Wins:		129
Losses:		126
Ties:		131

Wins:		130
Losses:		135
Ties:		139

Wins:		255
Losses:		199
Ties:		226

Wins:		72
Losses:		68
Ties:		50

Wins:		67
Losses:		61
Ties:		66

Wins:		68
Losses:		62
Ties:		63

Wins:		74
Losses:		65
Ties:		68

Wins:		122
Losses:		111
Ties:		100

Wins:		100
Losses:		101
Ties:		112

Wins:		79
Losses:		71
Ties:		60

Wins:		71
Losses:		65
Ties:		84

Wins:		85
Losses:		77
Ties:		67

Wins:		50
Losses:		32
Ties:		40

Wins:		51
Losses:		40
Ties:		37

Wins:		61
Losses:		64
Ties:		53

Wins:		59
Losses:		54
Ties:		50

Wins:		104
Losses:		84
Ties:		78

Wins:		77
Losses:		66
Ties:		65

Wins:		93
Losses:		75
Ties:		84

Wins:		82
Losses:		80
Ties:		70

Wins:		138
Losses:		128
Ties:		113

Wins:		96
Losses:		103
Ties:		96

Wins:		147
Losses:		123
Ties:		148

I apologize for how long that list is, but I felt it was important to show some of the detail which has gone into my testing. These tests were performed by playing Rock, Paper, Scissors an unfixed number of times, with the total results and individual outcomes recorded in a timestamped data file.

Because subconscious biases might influence when I would choose to stop any particular run, the individual sets of results aren't too important. Some sets have fairly even results, some lean toward losses or ties, but overall, my tests leaned significantly toward wins. Here are the totals:

Wins:		3558		35.55%
Losses:		3234		32.31%
Ties:		3216		32.13%

Total:		10008
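
To put a number on those totals, here is a small sketch using the normal approximation to the binomial distribution. It assumes each match is an independent 1 in 3 chance of a win, and the helper names z_score and upper_tail are made up for illustration.

```python
from math import erfc, sqrt

def z_score(wins, games, p=1/3):
    """How many standard deviations the win count sits above expectation."""
    expected = games * p
    std_dev = sqrt(games * p * (1 - p))
    return (wins - expected) / std_dev

def upper_tail(z):
    """One-sided probability of a standard normal value of at least z."""
    return 0.5 * erfc(z / sqrt(2))

z = z_score(3558, 10008)
print(z, upper_tail(z))
```

This is only an approximation, but with about ten thousand matches it should land close to the exact binomial figure.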

To show how unlikely these results are, here is the same visualization as before:

The odds of getting these results by chance are roughly one in a million. It is difficult to explain these results, and no explanation I can come up with is comforting. Here are the explanations I can think of:

1) It's just chance.
2) The laws of probability don't work as expected.
3) I have psychic powers.
4) I am able to beat the RNG of my game.

The first explanation seems silly. It's practically impossible to prove something like this isn't just "luck." At the same time, it's difficult to believe things are just due to "luck" when a person predicts something with a one in a million chance of success and is right. If anyone wishes to pursue this explanation, I can continue playing the game to see if my results hold up.

The second explanation would be world-shattering. If we could prove the universe isn't "random" like it is believed to be, it would have philosophical, theological and metaphysical consequences beyond anything we've ever seen. Every religion and philosophy in the world would have to take these results into account.

The third explanation is perhaps the most worrying for me as I don't believe in psychic powers. Not only would being a psychic change a great deal about how I view myself, it would pose serious problems as I would be (as far as I know) the only psychic in the world who could prove their abilities are real.

The fourth explanation is the one I hope is true. I would like to test it by rewriting this program in other languages with other algorithms used for the "randomness" of the game. Preferably, I would also try it out on different machines with different operating systems. I'd probably want to try other things as well. Perhaps I'd find these results are limited to certain setups.

That wouldn't stop these results from being worrying though. Let's suppose both algorithms I've used in Python and the algorithm I used on my graphing calculator are flawed in a way that makes their results somewhat predictable. Let's suppose I am just subconsciously picking up on the flaws in these algorithms (or their implementations). Think about what that would mean. It would mean I was able to "beat" RNG algorithms after only a few hundred or few thousand attempts.

That shouldn't be possible. Remember, the RNG is selecting only one of three outcomes (rock, paper or scissors) but it does so by producing a large, "random" string/number. This game compresses that long string/number into one of three values. That results in a huge loss in information. If a human subconscious could pick up on patterns after even a few thousand guesses with such a great loss of information, computer analysis should be able to prove the existence of patterns in these RNGs with relatively little trouble.

They can't. We know they can't because this is an active field of study. People try to find weaknesses and flaws in RNG systems all the time. If we could find easily discernible patterns like these results suggest, the RNG systems would lose a ton of value. This would have huge implications on all sorts of things, including any number of security systems.

None of these explanations are comforting. Every single one of them raises serious concerns. I should be freaking out a bit. Maybe more than a bit. The only way these results aren't cause for serious concern is if I've somehow screwed up my program, but no test I've come up with indicates such. People who've tried it for themselves don't get the same results as me, and I can't find any patterns in the outcomes.

If anyone wants to check for themselves, I posted the data (and code) of the earlier batch of tests online a while back. You can find a link to them and a description of what files are included here. You'll find results from a couple other tests I performed as well, tests which show the same ability for me to (seemingly?) predict outcomes.

I don't know what more to say. I try to be calm while discussing this, but it is freaking me out. I don't know what I'll do if these results hold up with a different program using a different approach to generating "randomness." I just want things to make sense.

As a final note, I am happy to share any details or information people might like about these tests, including all code and data files. I am also happy to take any suggestions on how to improve any aspect of my approach or to test my conclusions. And finally, if anyone has a better explanation than the ones I've mentioned in this post, I would love to hear it. I feel like I'm losing my mind right now.

61 comments

  1. Brandon -

    =={ People who've tried it for themselves don't get the same results as me, ... }==

    Have other people tried it on your (computer? calculator?)?

  2. Joshua:

    Have other people tried it on your (computer? calculator?)?

    Nope. Unfortunately, I don't know anyone in my area who would be willing to come over and try playing hundreds of games. I've long since lost the calculator I used in high school so I can't have people try it with that either.

    I would like to try testing this with software that is sure to be consistent between players, but I'm not sure how to do it yet. I'm thinking of making a throwaway web page where anyone can try it and the results will get stored on the server. That'd ensure both consistency of the RNG algorithm for each person and ensure data integrity.

    My worry is latency (how long it takes for traffic to go across the internet) is inconsistent. RNG algorithms generally use the current time as an input parameter when creating "random numbers." Even if a person can recognize patterns in an RNG system, that doesn't mean they can predict patterns in how quickly or slowly internet traffic might travel.

    I have a couple ideas on how to get around that, but I'm still working out the specifics/code.

  3. I hit moderation because of four links...one was a double...so I'll try again:

    I was curious to know what you might find with one of these:

    http://www.ohgizmo.com/2007/06/13/electronic-rock-paper-scissors-mankind-takes-a-step-back/

    Obviously, the random number generator they might use could be faulty (unsophisticated), but at least it might help resolve whether you're a psychic.

    or these:

    https://play.google.com/store/apps/details?id=co.jp.zyankenponfree

    https://play.google.com/store/apps/details?id=jp.co.intri.rock_paper_scissors

  4. Since you re-posted that comment, I deleted the first one rather than release it from moderation. I'd have released it if there had been any real difference between the two.

    For the first link, I can't say I've tried anything like that. It looks like an interesting device, but I can't imagine using it since I'd have to keep track of results manually. When you're doing hundreds or thousands of matches, manually recording results is a huge pain. Even worse, you're likely to make mistakes.

    For the other two links, I've looked for an app that'd work for my purposes, but I haven't found any that log results and can be played at a reasonable pace. I've found a couple sites online I could play at (including one where the AI was a learning program rather than making only random guesses), but they were slow enough I couldn't use them. At its fastest, one site was still three seconds a match. Try playing a thousand games at that pace. Even if you have the patience to do it, you won't be able to maintain your focus.

    I'm thinking I might try to move away from the Rock, Paper, Scissors theme. It might be easier to find a program that has logging and the ability to play quickly if I look for other uses of RNG. Maybe there's some sort of matching game or something like that. I might even be able to find something physical. I know people have built devices to log the results of tests like this before. Even if I can't find one for sale, I might be able to find guidance on how to build one. I could probably do it with just a programmable circuit board, a box, some buttons and some lights.

    But I would really like to find something (physical or digital) made by someone else so there's independence to the results. There are always concerns when the person who made the test is also the person who is taking it.

  5. Hey Joshua, thanks for getting me thinking about options I could look to for testing. When I expanded my searches beyond Rock, Paper, Scissors, I came across this little tool. It has you pick one of five options 25, 50, 100, 200, 500 or 1000 times in a row. You're expected to win an average of 1 in 5 times. It's pretty quick, and keeps track of your results for you so you can just copy down your results when you are finished with a batch.

    I did a couple test runs of 25 picks apiece to get a feel for things and test out the various options, then jumped into a 100 card run. First try, I got 22/100 wins, an unremarkable result. My next try was 26/100 - fairly surprising. My next try was 27/100 - again, fairly surprising. On their own, none of these runs would be enough to indicate anything, but taken together, they're statistically significant at the 95% level. That could be purely by chance, but it'd be an interesting coincidence given my previous results.

    As a test, I've decided to try a different approach. Instead of guessing to win, I'm now going to do some runs where I try not to win. My thought is if I can consistently create biased results in the direction of my choice, that should indicate I have some manner of control over the results. That would suggest I am either perceiving patterns or demonstrating ESP. I've only done one 100 card run so far, with 14 correct choices. I'll see if I can keep that up.

    Oh, and a warning. The statistical significance test that site uses doesn't line up with the results of mine. I'm not sure what's going on with that. I'm going to have to re-visit my calculations to make sure I'm doing them right.

  6. Brandon-

    =={ Maybe there's some sort of matching game or something like that. }==

    =={ As a test, I've decided to try a different approach. Instead of guessing to win, I'm now going to do some runs where I try not to win. }==

    I was just thinking along similar lines when I was away from my computer, only to come back to see you making those comments. In thinking about the "psychic" possibility (which, I should point out, I don't think is a viable explanation), it would seem that you would have to know in advance what the computer's return from the RNG was going to be and then choose accordingly, in sequence, in order to win an improbable # of times. That would be assuming, of course, that psychic phenomena proceed in some way that fits with my linear way of understanding nature - which assumes that the "psychic" mechanism wouldn't be that you chose a response and then influenced the computer's number generation - (hey, if you're going to go the psychic route, who knows?).

    So that's actually two actions that have to take place. I was wondering what might happen in a similar scenario, but where the "winning" or being "correct" element, or even your "influence" was eliminated... where there was some kind of comparison between the computer's random selection and your random selection .... to see after the fact if there were overlapping patterns. Either of your modifications might be somewhat like what I was thinking about.

    =={ I could probably do it with just a programmable circuit board, a box, some buttons and some lights. }==

    Have fun:

    https://www.google.com/search?q=electronic+rock+paper+scissors+game&source=lnms&tbm=isch&sa=X&ved=0ahUKEwiiivjjxfzRAhXH1xQKHb5NC_cQ_AUICygE&biw=1024&bih=688

    Another interesting test would be for you to play the game with other people to see if there are patterns that play out. Of course, as you discussed above, the logistical obstacles would be significant.

    =={ My next try was 26/100 - fairly surprising. My next try was 27/100 - again, fairly surprising. On their own, none of these runs would be enough to indicate anything, but taken together, they're statistically significant at the 95% level. }==

    It would seem to me that if you are a psychic, there is no reason to assume that your ability to predict outcomes at a rate above chance would remain constant. It would be interesting to see if 10 repeated runs of 100 net the same result as one run of 1000. Of course, that's easy for me to say - you'd be the one spending all this time doing the runs.

    That all said, the following seems a bit tricky:

    =={ As a test, I've decided to try a different approach. Instead of guessing to win, I'm now going to do some runs where I try not to win. }==

    At a more unconscious level, it might be kind of hard to determine what you think is "winning." At a more unconscious level, what is it, really, that you're trying to do?

  7. Two things that I think need to be analysed in order to get a better understanding of what is going on.

    1) What the computer plays (% of each choice and any sequencing patterns that may occur).
    I would expect that over an increasing number of games the options should converge to 1/3 each (within some bound) - if that is not happening there is an issue, and whether there are any actual patterns in the play - I would expect there to be none.

    and
    2) What you play - % of each choice and any patterns. I would anticipate a human being more likely to inject subtle patterns in their play than a RNG - so that is worth thinking about. Do your plays also converge to 1/3? If your moves are not evenly balanced then what is the pattern?

    Bearing in mind you could potentially play a 100% balanced game and win 100% of the time - if your choices were the correct ones.

  8. Peter Green:

    >Two things that I think need to be analysed in order to get a better understanding of what is going on.

    1) What the computer plays (% of each choice and any sequencing patterns that may occur).
    I would expect that over an increasing number of games the options should converge to 1/3 each (within some bound) - if that is not happening there is an issue, and whether there are any actual patterns in the play - I would expect there to be none.

    The computer definitely picks each option an even amount of time (given a long enough interval). I tested this some time back by doing 10,000 runs three times, with the program picking all rock, all paper or all scissors. The results were a consistent 33%.

    As for patterns, I haven't been able to find any myself (at least, not consciously), but I can't rule out the possibility some exist. If anyone wants to look into it to see if they can find patterns I can't find, I'm happy to generate the data for them.

    2) What you play - % of each choice and any patterns. I would anticipate a human being more likely to inject subtle patterns in their play than a RNG - so that is worth thinking about. Do your plays also converge to 1/3? If your moves are not evenly balanced then what is the pattern?

    My picks are definitely not random. I am well aware of patterns in my picks. I don't even pick each option evenly. For the 5000+ matches I played yesterday, I probably picked Scissors 1000 times or less. It wasn't a conscious strategy. It was just that with the way my hand was placed on the keyboard, it was easier to hit the 1 or 3 keys instead of the 2 key. My middle finger being longer than the index or ring finger meant it had to bend more to strike the 2 key.

  9. I would be interested in taking a look, just for curiosity's sake. Can't promise more than that. A text file of the results in sequence or similar would be suitable.

  10. Joshua:

    I was just thinking along similar lines when I was away from my computer, only to come back to see you making those comments. In thinking about the "psychic" possibility (which, I should point out, I don't think is a viable explanation), it would seem that you would have to know in advance what the computer's return from the RNG was going to be and then choose accordingly, in sequence, in order to win an improbable # of times. That would be assuming, of course, that psychic phenomena proceed in some way that fits with my linear way of understanding nature - which assumes that the "psychic" mechanism wouldn't be that you chose a response and then influenced the computer's number generation - (hey, if you're going to go the psychic route, who knows?).

    I've only been looking at the idea I could predict outcomes (whether by pattern recognition, psychic powers or whatever), not influence them. Using computer programs to generate the "randomness" means the outcomes should ultimately be deterministic. The "randomness" means we (should) have no way to predict the outcome, but the reality is if we had all the right information, we could predict the outcomes with 100% success. If you have the information, it's just a matter of feeding it into the RNG system.

    At least, that's how it should work given our understanding of the universe. If you accept the existence of psychic powers, it becomes difficult to rule out anything.

    So that's actually two actions that have to take place. I was wondering what might happen in a similar scenario, but where the "winning" or being "correct" element, or even your "influence" was eliminated... where there was some kind of comparison between the computer's random selection and your random selection .... to see after the fact if there were overlapping patterns. Either of your modifications might be somewhat like what I was thinking about.

    I could create a list of choices then have the computer do the same and then compare the lists. That would eliminate any possibility of pattern recognition based upon immediate feedback. If I only allowed myself to see the final tallies, not the individual outcomes, pattern recognition would be practically impossible since I'd never see what the computer actually picked. (In theory, I could still pick up on which patterns in my own guesses led to success, but that's beyond impractical.)

    That might be worth attempting just to see, but I'm confident the results would be completely in line with the expected outcome. Even if I were psychic, predicting what would be on a list created at some indefinite time would likely be an impossible feat.

    Another interesting test would be for you to play the game with other people to see if there are patterns that play out. Of course, as you discussed above, the logistical obstacles would be significant.

    I don't want to play with other people because people have patterns. That's why we can create learning RPS programs that get better over time. It's also why there are RPS competitions. To a certain extent, it is possible to predict what people will do just based on their past behavior. I wouldn't know how you could determine which results are "unusual" given that.

    It would seem to me that if you are a psychic, there is no reason to assume that your ability to predict outcomes at a rate above chance would remain constant. It would be interesting to see if 10 repeated runs of 100 net the same result as one run of 1000. Of course, that's easy for me to say - you'd be the one spending all this time doing the runs.

    That's also true if this is nothing more than subconscious pattern recognition. How well the mind works, whether that be with ESP or data processing, can be influenced by many things. Experience strongly suggests to me things like how comfortable I feel impact my results. As an example, I tried turning off my music and anything else that was running so I would have no distractions. I did 3,355 matches that way (in ten batches). I got 1092 wins, 1166 losses, 1097 ties.

    At a more unconscious level, it might be kind of hard to determine what you think is "winning." At a more unconscious level, what is it, really, that you're trying to do?

    That's definitely a major obstacle. One thing I did to address it was to reverse the rules of RPS (so rock beat paper, paper beat scissors, scissors beat rock) to see how it'd influence my results. After 2,745 matches, I had won 894, lost 978, tied 873. That might suggest that even if I can pick up on patterns (or predict the future), I am not necessarily capable of translating that into desired outcomes.

    Or maybe not. I'm not entirely certain how to interpret any of this. It all seems so weird. One of the weirdest things is the knowledge I've now played RPS over 20,000 times in my life.

  11. Peter Green:

    I would be interested in taking a look, just for curiosity's sake. Can't promise more than that. A text file of the results in sequence or similar would be suitable.

    The results from my runs thus far are spread across numerous files (one per batch of games) and not formatted for easy analysis. Would it be fine if I created a new file specifically for examining the computer's choices? It'd be easy for me to make a file of, say, 10,000 selections by the AI for people to look at. There's just the possibility that somehow the timing of when I make my selections would have an effect on any patterns by the AI. It shouldn't happen, but then, none of this should be happening.

  12. Whatever is easier for you - I can just as easily join a bunch of files (as long as I know what each is) but a single file would also be fine.

    Let's see what you have and then I can ask questions if needed.

  13. Alright. One of the biggest reasons I want to generate a new file is then I can have the output be easier to parse. My current logging uses text, meaning every file would have to be parsed to replace things like "You won" with a number. The computer's choice isn't even stored directly, just the player's, so you'd have to take that choice and compare it to the outcome to determine the computer's choice. It's doable, but it's more work. I probably should change how I log things.

    Anyway, my main internet connection is down right now due to a storm, but when it comes back online, I'll upload the file. It may not be until morning.

  14. I'm glad to hear that, partially because Perl is my favorite language. I don't get to use it often because every group project I'd use it in uses Python instead (which I hate). As another reason, it turns out my internet connectivity issue isn't because of the storm. It has something to do with a cable (though I guess that could have been caused by the storm). Not having my main internet connection is going to interfere with a number of things and waste a fair amount of my time. For now, I'm just going to post up an archive of my results thus far. I'll try to get a test file specifically for this purpose uploaded tomorrow or the day after.

    You can find the archive here. It should be pretty self-explanatory. The files beginning with r_ are files where the rules were reversed. There were also a few (I think three?) results files where I didn't record individual outcomes. They would be the first of the timestamped files. The archive also includes the program I use. Please don't be too harsh about the quality of code in it.

    Now that I think about it, I guess if you wanted more data, it would be easy to modify the program to create it for you. So, yay.

  15. Quick and easy. 😀

    First step:

    Here are the full sequences from the first suitable log file (20161126-215834). This is now a single ordered string of all the ai choices. I included a second string of all the user choices that pair with the ai choices. (123 = RPS as per your original program and no wrap in the string).

    33323113233312321113312223211233312133232221123112112233211132333121332211333232323232223131312112232212333312111333213113123131
    12313321331133233122133333212111323331213321111323321123232313313223132112333122133122133231331333223123132133133213232221333312

    These may even be suitable for loading into R as vectors for better analysis.

    Anyway - I will now concatenate the results of all the suitable log files so there is a single array of all selections for doing some pattern evaluations on.

    Then some pattern analysis ...

  16. Interesting results - this from the first set of logs (where individual games are logged), not the reversed set, not the second RNG and not the 'without music' set.
    I propose to evaluate those sets separately at least at first.

    Number of times the AI played any particular choice:
    Rock = 3121 (33.1%)
    Paper = 3125 (33.1%)
    Scissors = 3186 (33.8%)
    vs the human choices:
    Rock = 3403 (36.1%)
    Paper = 1822 (19.3%)
    Scissors = 4207 (44.6%)
    (both sets total to 9432)

    I then parsed the 9432 games and counted all duplicate sequences (times the AI repeated the same choice, i.e. 11, 22, 33). This would be expected to be about even in the results from a RNG and converge to balanced as the number of samples increases.

    From the AI:
    11 = 789
    22 = 778
    33 = 781
    From the human:
    11 = 533
    22 = 225
    33 = 960

    Then moving on to the number of times the same selection is made 3 times in a row:
    AI:
    111 = 226
    222 = 255
    333 = 255
    Human:
    111 = 140
    222 = 57
    333 = 230

    4 times in a row:
    AI:
    1111 = 66
    2222 = 78
    3333 = 102
    Human:
    1111 = 40
    2222 = 13
    3333 = 61

    5 in a row:
    AI:
    1x5 = 25
    2x5 = 22
    3x5 = 33
    Human:
    1x5 = 9
    2x5 = 5
    3x5 = 23

    6 in a row:
    AI:
    1x6 = 8
    2x6 = 6
    3x6 = 10
    Human:
    1x6 = 4
    2x6 = 2
    3x6 = 8

    7 in a row:
    AI:
    1x7 = 2
    2x7 = 1
    3x7 = 5
    Human:
    1x7 = 1
    2x7 = 2
    3x7 = 0

    8 in a row:
    AI:
    1x8 = 0
    2x8 = 0
    3x8 = 1
    Human:
    1x8 = 0
    2x8 = 1
    3x8 = 0

    There were no 9 in a row sequences.

    There does appear to be a very slight bias in this RNG towards the higher number, but this does converge toward balance as the number of samples rises.
    The human input displays very strong biases toward scissors and against paper which remain fairly consistent.

  17. Reversed results game set:
    Total games = 2745

    AI:
    1 x 1 = 900 (32.79 %)
    2 x 1 = 927 (33.77 %)
    3 x 1 = 918 (33.44 %)
    Human:
    1 x 1 = 821 (29.91 %)
    2 x 1 = 662 (24.12 %)
    3 x 1 = 1262 (45.97 %)
    AI:
    1 x 2 = 205 (7.47 %)
    2 x 2 = 214 (7.80 %)
    3 x 2 = 217 (7.91 %)
    Human:
    1 x 2 = 154 (5.61 %)
    2 x 2 = 106 (3.86 %)
    3 x 2 = 396 (14.43 %)
    AI:
    1 x 3 = 59 (2.15 %)
    2 x 3 = 66 (2.40 %)
    3 x 3 = 65 (2.37 %)
    Human:
    1 x 3 = 47 (1.71 %)
    2 x 3 = 27 (0.98 %)
    3 x 3 = 122 (4.44 %)
    AI:
    1 x 4 = 13 (0.47 %)
    2 x 4 = 18 (0.66 %)
    3 x 4 = 15 (0.55 %)
    Human:
    1 x 4 = 18 (0.66 %)
    2 x 4 = 9 (0.33 %)
    3 x 4 = 46 (1.68 %)
    AI:
    1 x 5 = 4 (0.15 %)
    2 x 5 = 6 (0.22 %)
    3 x 5 = 5 (0.18 %)
    Human:
    1 x 5 = 6 (0.22 %)
    2 x 5 = 3 (0.11 %)
    3 x 5 = 19 (0.69 %)
    AI:
    1 x 6 = 1 (0.04 %)
    2 x 6 = 1 (0.04 %)
    3 x 6 = 1 (0.04 %)
    Human:
    1 x 6 = 2 (0.07 %)
    2 x 6 = 2 (0.07 %)
    3 x 6 = 4 (0.15 %)
    AI:
    1 x 7 = 1 (0.04 %)
    2 x 7 = 0 (0.00 %)
    3 x 7 = 0 (0.00 %)
    Human:
    1 x 7 = 1 (0.04 %)
    2 x 7 = 0 (0.00 %)
    3 x 7 = 2 (0.07 %)

    Similar results.

  18. Without Music:
    Total games = 3355

    AI:
    1 x 1 = 1108 (33.03 %)
    2 x 1 = 1133 (33.77 %)
    3 x 1 = 1114 (33.20 %)
    Human:
    1 x 1 = 1225 (36.51 %)
    2 x 1 = 784 (23.37 %)
    3 x 1 = 1346 (40.12 %)
    AI:
    1 x 2 = 278 (8.29 %)
    2 x 2 = 282 (8.41 %)
    3 x 2 = 281 (8.38 %)
    Human:
    1 x 2 = 368 (10.97 %)
    2 x 2 = 200 (5.96 %)
    3 x 2 = 446 (13.29 %)
    AI:
    1 x 3 = 80 (2.38 %)
    2 x 3 = 84 (2.50 %)
    3 x 3 = 87 (2.59 %)
    Human:
    1 x 3 = 106 (3.16 %)
    2 x 3 = 56 (1.67 %)
    3 x 3 = 125 (3.73 %)
    AI:
    1 x 4 = 22 (0.66 %)
    2 x 4 = 30 (0.89 %)
    3 x 4 = 26 (0.77 %)
    Human:
    1 x 4 = 23 (0.69 %)
    2 x 4 = 14 (0.42 %)
    3 x 4 = 25 (0.75 %)
    AI:
    1 x 5 = 5 (0.15 %)
    2 x 5 = 8 (0.24 %)
    3 x 5 = 12 (0.36 %)
    Human:
    1 x 5 = 6 (0.18 %)
    2 x 5 = 3 (0.09 %)
    3 x 5 = 8 (0.24 %)
    AI:
    1 x 6 = 0 (0.00 %)
    2 x 6 = 3 (0.09 %)
    3 x 6 = 5 (0.15 %)
    Human:
    1 x 6 = 1 (0.03 %)
    2 x 6 = 1 (0.03 %)
    3 x 6 = 1 (0.03 %)
    AI:
    1 x 7 = 0 (0.00 %)
    2 x 7 = 1 (0.03 %)
    3 x 7 = 0 (0.00 %)
    Human:
    1 x 7 = 0 (0.00 %)
    2 x 7 = 1 (0.03 %)
    3 x 7 = 0 (0.00 %)
    AI:
    1 x 8 = 0 (0.00 %)
    2 x 8 = 0 (0.00 %)
    3 x 8 = 0 (0.00 %)
    Human:
    1 x 8 = 0 (0.00 %)
    2 x 8 = 1 (0.03 %)
    3 x 8 = 0 (0.00 %)

  19. MikeN, I haven't tried listening to music that wasn't piped through the computer yet (it turns out the battery on my mp3 player won't hold a charge anymore), but I have tried doing this without any sound at all. I lost an abnormal amount when I did. That was only on one day though, so I should try repeating the experiment to see if the results remain consistent. I also need to see about getting a new mp3 player or something else I can use to listen to music. I want to make sure I use headphones for it so it's consistent with my normal use of music.

    That said, I did dig into the source of entropy used for the RNG calls in my program. I couldn't find out all the sources used as Windows doesn't publish full details on its function for generating the random values, but I was able to confirm it should be impossible for speakers to influence my results. Even if speakers are used as a source for entropy in the process, the other parameters should ensure the results were still "random."

    But I want to stress the word "should." I'm not prepared to rule anything out based upon theory at this point. For now, I'm only willing to trust empirical results. Given that, I'll definitely test that possibility at some point. It's just a matter of convincing myself to spend the money on a new mp3 player.

  20. Peter Green, I'm glad to hear it was easy. Experience with regex makes a huge difference. I know I could extract the data from the files like you did (probably not as quickly or easily), but a lot of people wouldn't know how. As for R, we could definitely load the data into it. It's really easy if one inserts a line break after each outcome. It's only slightly more difficult if not.

    As for your results, they are about what I would expect based on my memory. I wonder if any of the unevenness seen from the AI in this data would hold up over a larger sample. I'm betting it wouldn't. I know the biases on my end would. Sometimes I "feel" like one answer is correct so I go with that, but otherwise, I'm reasonably consistent with how I play.

    One thing which I did find interesting about those results is the slight bias (which isn't large enough to assume means anything) from the AI toward picking Scissors. I pick Scissors the most, by a large margin. That means if this is a real bias from the AI, it should increase the number of ties I experience, not the number of wins.

    I really want to generate more data to see if the patterns I've experienced so far hold up, but I'm kind of afraid to. Everything I know tells me there is no way this RNG system should have patterns that I could pick up on, consciously or subconsciously. At the same time, I don't believe in psychic powers. If I can keep winning more than I should, over and over, what does that mean?

    Even worse, what if fear of the possibilities impacts the results? What if I could keep winning more than I should but was so worried about it I failed to? I don't know. I mean, at this point I don't know anything. I wish I had never goofed off with that stupid graphing calculator.

  21. What I am thinking about for a next step is to assess the human response to the AI game.

    This basically assumes that the AI is as random as it can be, and each game has no effect on the move for the next game (The ai choices would be independent of each other). It is my theory that this would not be the case for a human player, and I am curious if there is any pattern to the human move the next game after a specific ai move.

    I'm just pondering how to code that analysis.

    I can give you the full ai and human choice series in a txt file with each element on a separate line if you want. Do you want to keep the reverse and no_music sets separate or combine them all into one?

  22. Peter Green, there's been a lot of work done on examining human patterns for RPS. I'm not overly familiar with it, but I've followed it a bit due to an interest in machine learning. People have programmed AIs to observe players' patterns and adjust their strategies to win more often. I know there's even a website (or perhaps more than one) where you can play against an AI that does that.

    Generally, what these programs do is compile a database of play sequences X matches long. For instance, they'll examine every 4-match sequence (1111, 1112, 1113, etc.) and see how often it comes up. Then, after three matches, they'll look into their table to see which play is most likely based upon their previous experiences. By doing this, they can beat most humans at a rate significantly higher than 33%.
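
    As a rough illustration of the table-lookup idea described above, here is a minimal Python sketch. The function names (build_table, predict_next) and the sample history are made up for illustration, and real pattern-matching bots are likely more elaborate. It assumes the 1 = Rock, 2 = Paper, 3 = Scissors coding used in this thread.

```python
from collections import Counter, defaultdict

def build_table(history, n=4):
    """Count which move followed each (n-1)-move context in the history."""
    table = defaultdict(Counter)
    for i in range(len(history) - n + 1):
        context = tuple(history[i:i + n - 1])
        table[context][history[i + n - 1]] += 1
    return table

def predict_next(table, recent, n=4):
    """Return the most frequent follower of the last n-1 moves, or None if unseen."""
    context = tuple(recent[-(n - 1):])
    followers = table.get(context)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical player history (1 = Rock, 2 = Paper, 3 = Scissors).
history = [1, 3, 1, 3, 1, 3, 1, 2, 1, 3, 1, 3]
table = build_table(history)
guess = predict_next(table, history)

# The move that beats the predicted one: Paper beats Rock, Scissors beats Paper, Rock beats Scissors.
counter_move = guess % 3 + 1 if guess is not None else None
```

    A program like this only helps against an opponent whose choices depend on earlier ones, which is why it works on people and should not work on a good RNG.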

    I can give you the full ai and human choice series in a txt file with each element on a separate line if you want. Do you want to keep the reverse and no_music sets separate or combine them all into one?

    I'm happy to take anything you might want to make, but don't feel obliged to make anything on my account. I can take whatever you make (or even the files I have) and parse them for analysis if and when I want to do any. I just haven't seen anything that makes me think looking into the patterns will explain anything so I haven't spent much time on it.

    Though now that I think about it, I'd say go with whatever you think would be best for people in general. I'm not the only person who might be interested in what could be found. So to answer your question, I'd say combine the sets to enable a total analysis but also make them available individually so people have the option of examining specific scenarios.

  23. By the way, I mentioned the dread I was feeling about any further testing. I decided I needed to get over it so I'd do a few practice batches. I haven't done one in some time, but the idea is to say, before I start, that the matches will not be included in my test results. I obviously can't make that call after-the-fact, but I think it is important I have the option to practice to try to get comfortable without it skewing my results.

    The point is, I don't offer any of this as part of any rigorous testing. My hope was just to settle my nerves. The dread I felt was palpable, to the point I could actually feel my heart rate was elevated when I did my first two batches. This is what happened:

    Wins: 126
    Losses: 133
    Ties: 131

    Wins: 43
    Losses: 55
    Ties: 50

    It's not hugely skewed and the choice of when to stop might have influenced the results a bit, but I honestly felt like I was hesitating/shying back while I played. I took a minute to calm down and clear my head, did another batch and got:

    Wins: 102
    Losses: 85
    Ties: 76

    I won't claim these results mean anything. The next set was certainly not as dramatic:

    Wins: 78
    Losses: 67
    Ties: 79

    So any perception I might have of these results corresponding to my mental state could be due to things like confirmation bias. Given I explicitly set these out as practice runs, not to be subject to rigorous analysis, I can't really say they mean anything. Still, things like this don't help me shake the feeling individual batches and my mental state have a correlation.

  24. By the way, one thing that's definitely worth checking is if I win more matches with a certain selection, proportionally, than others. As in, does me picking Rock have a higher win rate than Scissors even though Scissors would have more total wins?

  25. Here is a link to a zip archive containing a series of text files (windows based).

    Hopefully the filenames are self explanatory but included are the 9432 game set, the No Music game set and the reverse rules game set, plus a combined 'All' games set.

    The files contain one response per line and are in the matched sequence order. AI and User responses are in separately named files.

    http://www.petergreen.id.au/files/Files.zip

  26. Cool, thanks. If you want to be really cool, you could create files with the AI choice, player choice and outcome. Separate each by a space, and then anyone can load the data into R and start making plots with just 2-3 lines of code.

    I've loaded a few of those data files into R, and I get the same results you get (so far as I tested them). There definitely aren't any obvious patterns/biases in the AI's play while there are obvious biases/patterns in how I play. That's not very interesting. What I think would be interesting would be to look for patterns in the outcome (then compare those to the choices that were made).

    Not to be too demanding, I hope. I can always combine the data myself if need be.

  27. I just noticed the AI_All file is 16,142 entries long while the User_All file is 15,532 entries long. I think something might have gone wrong while making them.

  28. Here is a different archive - this contains 3 txt files - one for normal games, one for reversed games and one for the no music set.

    The counts are at the head of the file (easy to remove if that interferes with loading) then a header line with the column names (AI User Result) followed by the 3 columns separated by spaces.

    I decided not to do this as a single file, because the rules change from one set to the next so a combined evaluation risks being distorted. Also it is a simple cut and paste in a text editor to combine them if desired, but separating them is virtually impossible.

    Enjoy!
    http://www.petergreen.id.au/files/Combined.zip

    I also reviewed the coding in your python script, and then re-coded that in Perl and ran a comparison between my hash lookup for the game result and your 'difference % 3' calculation of the result. You will be happy to know that the results for all sets were identical.

    Here is a sample output from that run: (AI User Result 'Calculated Result') and totals at the end.
    3 1 W CW
    3 3 T CT
    3 1 W CW
    1 2 W CW
    1 3 L CL
    2 1 L CL
    CL 5204
    CT 4990
    CW 5338
    L 5204
    T 4990
    W 5338

    A caveat:
    The only potential trap I can still see is that I have used the result and user choice to derive the AI choice - (I used a hash lookup). If that logic has not successfully reverse engineered the actual AI choice, then an analysis of the result could still be misleading. That was the reason for re-coding your result subroutine.
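
    For anyone who wants to repeat that cross-check, a minimal Python sketch follows. The exact form of the 'difference % 3' expression in the original script isn't shown in this thread, so the formula here is an assumption. It does agree with the sample rows above (for example, AI 3 against User 1 is a win) and uses the 1 = Rock, 2 = Paper, 3 = Scissors coding.

```python
from itertools import product

# Explicit lookup keyed by (ai, user); 1 = Rock, 2 = Paper, 3 = Scissors.
# Results are from the user's point of view.
LOOKUP = {
    (1, 1): "T", (1, 2): "W", (1, 3): "L",
    (2, 1): "L", (2, 2): "T", (2, 3): "W",
    (3, 1): "W", (3, 2): "L", (3, 3): "T",
}

def result_by_difference(ai, user):
    """Outcome for the user from the difference of the two choices, modulo 3."""
    # Assumed formula: 0 -> tie, 1 -> win, 2 -> loss.
    return "TWL"[(user - ai) % 3]

# Compare both methods across all nine possible match-ups.
mismatches = [
    (ai, user)
    for ai, user in product((1, 2, 3), repeat=2)
    if LOOKUP[(ai, user)] != result_by_difference(ai, user)
]
```

    An empty mismatches list means the two methods agree on every possible match-up, which is the same conclusion as the comparison run over the real files.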

  29. Peter Green, thanks for that. The files are very easy to work with. That let me finally find the motivation to do one of the neat things we can do with this data. Look at this plot:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_8_RPS_cumsum.png

    That shows you my net wins over time for my normal testing. We can create this by setting a Win to 1, a Tie to 0 and a Loss to -1 then taking the cumulative sum of the data set. In a random test, we would expect the line to fluctuate around zero. Instead, we can see the line increases with a fair amount of consistency.

    This approach is useful because it lets us quickly identify what results were happening when with just a quick look at a chart. Judging by that chart, I'd say there is a general tendency for me to win more than I lose, but there also appear to be periods where I wind up "on a roll" and far exceed even the generally increased rate of winning. Also, it appears my recent runs may have involved winning more consistently. Consider what happens when we fit a linear model:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_8_RPS_cumsum_fitted.png

    As you can see, the earlier runs varied around the fitted line more than the later ones. In theory, this might imply improved skill on my part at winning. Maybe not though. There's a lot of stuff we can do to examine this data, and I still don't know how much we should make of anything we might see.
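
    The net-wins series behind those charts can be sketched in a few lines of Python. This is only an illustration of the scoring described above (Win = 1, Tie = 0, Loss = -1, then a cumulative sum). The names SCORE and net_wins are hypothetical, and the six sample results are the rows from Peter Green's sample output, not the full data set.

```python
from itertools import accumulate

# Scoring described above: Win = 1, Tie = 0, Loss = -1.
SCORE = {"W": 1, "T": 0, "L": -1}

def net_wins(results):
    """Running total of wins minus losses after each match."""
    return list(accumulate(SCORE[r] for r in results))

# The six sample results quoted earlier in the thread.
sample = ["W", "T", "W", "W", "L", "L"]
series = net_wins(sample)  # [1, 1, 2, 3, 2, 1]
```

    Plotting such a series against match number gives the kind of chart linked above. In a fair game the line should wander around zero rather than climb steadily.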

    By the way, according to this file my wins per value chosen are 1: 1209, 2: 656, 3: 1487. My selections were 1: 3403, 2: 1822, 3: 4207. This gives win percentages of 1: 35.53%, 2: 36.00%, 3: 35.35%. There's a slight difference in win rate based on value chosen, but it is likely just variance.
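
    Those percentages are easy to reproduce. A small Python sketch, assuming the three selection counts are listed in the order 1, 2, 3:

```python
# Wins and total selections per value chosen (1 = Rock, 2 = Paper, 3 = Scissors).
wins = {1: 1209, 2: 656, 3: 1487}
picks = {1: 3403, 2: 1822, 3: 4207}  # assumed to be in 1, 2, 3 order

# Win percentage for each choice.
rates = {choice: 100 * wins[choice] / picks[choice] for choice in picks}

for choice, rate in rates.items():
    print(f"{choice}: {rate:.2f}%")
```

    The selection counts add up to 9,432, the size of the normal game set.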

    I'll try looking at the data in more detail later tonight. Or I might get raging drunk and try to forget about all this. I'd say there are even odds as to which route I go with, but... yeah, even odds doesn't seem to hold much meaning anymore.

  30. MikeN, that's the next test I want to try, but right now I'm finding it difficult to play the game anymore. I'm at the point where I have to count as one of my best explanations, "I'm delusional and secretly modifying my recorded data without realizing it." I'm only partially kidding. I'm familiar with livestreaming videos, and I have software on my computer which would let me record me taking the test. I'm almost to the point where I want to record all my tests, if not broadcast them live.

    Part of me thinks I might be overreacting to this, but another part of me wonders if you can really overreact to basic laws of reality consistently appearing not to work.

  31. I went ahead and made a new version of the program to do as Mike_N suggested. It's set to create a list of 100 random choices then have me play against the AI looping through them. There's no premature ending, so I must play 100 matches, no more, no less. Any faulty guesses (like typing '12' instead of '1') are disregarded.

    I haven't done any tests to see how I can do under these rules, but I did do ten sets while testing the code. The results were 311 wins, 360 losses, 329 ties. For the first few sets, I didn't even look at the screen while guessing as I just wanted to make sure the code worked. The results wouldn't have shown anything anyway as I initially started the program by displaying a list of all the AI's choices as part of debugging (so in theory, I could have memorized the AI's guesses). Interestingly, during this time my results were 141 wins, 190 losses, 169 ties.

    Once I stopped the program from displaying the list of the AI's future guesses, I started paying a bit of attention to my guesses. I wasn't trying very hard, but I also wasn't guessing without even bothering to look at the screen. During that time, my results were 170 wins, 170 losses, 160 ties.

    I can't say that means anything. I just thought I'd throw it out there. I'll start the real testing later tonight.

  32. Similar to the counts I did with the duplicate number series for the AI, I want to run a count for number pairs (how often the AI follows a 1 with a 2 vs with a 3 for example). Then I want to run a count of how often you respond to a particular AI choice in one game with a specific guess in the next game. This is because I expect the computer to be essentially independent in each choice, but the user to be influenced from game to game.

    Those two series I'll probably do just on the normal game results to eliminate any rule change bias.

    I'm also curious about the human nature to change to a different response after a few answers because that should feel more random. Not sure yet how to look at that.

    BTW my site seems to be inaccessible at the moment and the phones at the provider are down too.

  33. First pass at sequencing analysis and I think you may find this interesting:

    Total games = 9432

    AI sequences:
    11 = 789
    12 = 1018
    13 = 1069
    21 = 1023
    22 = 778
    23 = 1052
    31 = 1064
    32 = 1057
    33 = 781

    Human sequences:
    11 = 533
    12 = 622
    13 = 2102
    21 = 634
    22 = 225
    23 = 900
    31 = 2089
    32 = 913
    33 = 960

    You have a distinct preference for following a 1 with a 3 and vice versa.

    Far more interestingly, the ai has a distinct aversion to repeating the same number twice compared to the other two alternatives. I would not be at all averse to believing that a human could subconsciously pick up on such a pattern which could indeed favour your odds in the next game. I'll try to run a comparison of your response to the ai's choice in the previous game.

  34. Peter Green, those results would be interesting, but I don't think they're correct. As a quick check, I added the sequence counts you reported for the AI together, and it only came out to 8,631. That's lower than it ought to be. For 9,432 matches, there are 9,431 sequences of 2. Moreover, you report 8,978 sequences of 2 for the player, which doesn't match the number reported for the AI. Here are the results I get for the AI:

    11 = 1034
    12 = 1018
    13 = 1069
    21 = 1023
    22 = 1050
    23 = 1052
    31 = 1064
    32 = 1057
    33 = 1064

    Here is what I get for the player results:

    11 = 679
    12 = 622
    13 = 2102
    21 = 634
    22 = 287
    23 = 900
    31 = 2089
    32 = 913
    33 = 1205

    As you can see, my results match yours except for the paired sequences.
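
    For reference, here is a minimal Python sketch of the overlapping count I mean, where every pick after the first forms a pair with the one before it. The function name count_pairs and the short sample list are hypothetical. The point is only that N picks always give N - 1 pairs.

```python
from collections import Counter

def count_pairs(picks):
    """Count every overlapping pair of consecutive picks."""
    return Counter(zip(picks, picks[1:]))

# Hypothetical short series containing runs of repeated values.
picks = [1, 1, 1, 2, 3, 3, 3, 3]
pairs = count_pairs(picks)

# Sanity check: 8 picks should give 7 pairs in total.
total_pairs = sum(pairs.values())
```

    Summing the counts and comparing against the number of picks minus one is a quick way to catch an undercount like the one in the repeated pairs.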

  35. This is the result for comparing what the AI played last game, then what the user played this game. Included is the AI move this game and the result, with a count of all instances. This excludes the first game as there is no reference for it.

    AIlast AInow User Result Count
    1 1 1 T 378
    1 1 2 W 211
    1 1 3 L 445
    1 2 1 L 363
    1 2 2 T 175
    1 2 3 W 480
    1 3 1 W 404
    1 3 2 L 205
    1 3 3 T 459
    2 1 1 T 379
    2 1 2 W 210
    2 1 3 L 434
    2 2 1 L 365
    2 2 2 T 198
    2 2 3 W 487
    2 3 1 W 411
    2 3 2 L 184
    2 3 3 T 457
    3 1 1 T 350
    3 1 2 W 235
    3 1 3 L 479
    3 2 1 L 359
    3 2 2 T 178
    3 2 3 W 520
    3 3 1 W 393
    3 3 2 L 225
    3 3 3 T 446
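
    A table like this could be tallied with something along the following lines. This is a Python sketch rather than the Perl actually used, the function name tally_transitions is hypothetical, and the sample lists are the six AI/User rows from the earlier sample output.

```python
from collections import Counter

def tally_transitions(ai, user):
    """Count (previous AI move, current AI move, current user move) triples.

    The first game is skipped because it has no previous AI move.
    """
    return Counter(zip(ai, ai[1:], user[1:]))

# The six sample rows quoted earlier (AI column, then User column).
ai = [3, 3, 3, 1, 1, 2]
user = [1, 3, 1, 2, 3, 1]
counts = tally_transitions(ai, user)

for (ai_last, ai_now, user_now), n in sorted(counts.items()):
    print(ai_last, ai_now, user_now, n)
```

    The result column can be derived afterwards from the current AI and user moves, so it doesn't need to be part of the key.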

  36. Re: the paired sequences - it seems that I have counted 111 as only 1 not two, and 1111 as two as it should be.

  37. Fixed - now matches yours.
    Total games = 9432

    AI sequences:
    11 = 1034
    12 = 1018
    13 = 1069
    21 = 1023
    22 = 1050
    23 = 1052
    31 = 1064
    32 = 1057
    33 = 1064
    Human sequences:
    11 = 679
    12 = 622
    13 = 2102
    21 = 634
    22 = 287
    23 = 900
    31 = 2089
    32 = 913
    33 = 1205

  38. It's good to hear our results reconcile now. It is a bit of a shame though. It would have been nice if the explanation for what's going on could have been something so simple.

    For what it's worth, I spent some time testing the approach Mike_N suggested yesterday and today. After 50 sets of 100 matches, my results are 1674 wins, 1658 losses, 1668 ties. That doesn't do anything to support the idea I can win more than I should be expected to. I don't know if differences in the approach to playing the game caused the difference or some other factor. What I do know is I did 15 of the 50 sets today, and the results for today have been 516 wins, 469 losses, 515 ties. If we considered this an independent sample, that'd rise to the ~80% significance level. That's obviously not the case, but this might suggest playing more today might lead to results more akin to what I had in the previous approaches.

    I'm going to give it a shot and see. I don't know how many more sets I'll do though. If I go far enough with this approach with no notable results, I might have to go back to the previous approach and see if the results disappear for it as well.

  39. I did another five sets under this approach, and I noticed something I had noticed during the previous sets. It seemed to me my general result was that I was winning more than expected but this was being offset by cases where I lost super hard. I had chalked that up to some sort of perception bias, but after losing 45 out of 100 times for the second time (in 55 sets), I decided to check to be sure. Give it a look yourself:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_9_5500_hists.png

    That graph shows four histogram charts: one for wins, one for losses, one for ties, and one for the combined totals. Given enough data, all of these should have the same distribution. 55 data points isn't enough data though. As such, I don't know that these charts tell us anything of value.

    However, I can't help but notice all three outcomes have a distinct pattern right now. The Ties have the closest of any outcome to a normal distribution. The losses are skewed toward the low end but have a long tail on the high end that balances this out. The wins are skewed toward the high end but have a long tail on the low end that balances it out.

    I don't know what, if anything, this means. I mostly wanted to show the remarkable lack of 33s in my results. In 55 sets of 100 matches, I won 33 matches only twice. I never lost 33 matches, but I tied 33 matches in seven sets. Results I hit more often than 33 are 28, 29, 31, 32, 34, 35, 36, 37 and 38. That seemed unusual enough to be worth sharing.

  40. Alright, I'm taking a break for a few hours. I just pushed myself up to 70 sets of 100 matches apiece (7000 matches) under the new approach. I'm now at 2,365 Wins, 2,291 Losses, 2,344 Ties. That doesn't seem to mean a lot, but a more detailed examination might. Here's an update on those histograms:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_9_7000_hists.png

    I wish I knew what the odds of winning, tying or losing 33 times are. I know how to figure out the odds of winning 33 out of 100 times in a fair game (I think ~8%), but winning/losing/tying counts aren't independent of one another, so I don't know exactly how to combine them. It seems weird I've never won 33 times and only lost 33 times twice (I hit 33 ties eight times). I don't know how unusual it is given I've only done 70 runs though.
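
    The single-outcome part of that question can at least be worked out directly. Here is a minimal Python sketch assuming a fair game, where the number of wins in a 100-match set follows a binomial distribution with p = 1/3 and separate sets are independent of each other. The variable names are just for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Chance of exactly 33 wins in one 100-match set of a fair game (roughly 8%).
p33 = binom_pmf(33, 100, 1 / 3)

# Expected number of sets, out of 70, that end with exactly 33 wins.
expected_sets = 70 * p33

# Chance that none of 70 independent sets ends with exactly 33 wins.
p_none = (1 - p33) ** 70

print(f"P(33 wins in 100) = {p33:.4f}")
print(f"Expected sets with 33 wins out of 70 = {expected_sets:.1f}")
print(f"P(no set with 33 wins in 70 sets) = {p_none:.4f}")
```

    The same numbers apply to losses and ties taken one at a time. Because the three counts in a set must add up to 100, the expected counts can simply be added, but joint probabilities across wins, losses and ties would need a multinomial calculation instead.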

    What I do know is I'm freaked out again. As I've mentioned, the results I've gotten with RPS in general have imbued in me a sense of dread. Using Mike_N's idea, those results seemed to go away. That made me happy. I thought I had found a test I couldn't beat. After 5500 matches, I was almost ready to call it a failure except I wanted to test the strange lack of 33s in my outcomes.

    In this happier mood, I did 15 sets of 100 apiece with barely a thought. I was just going through, cruising with nothing on my mind but, "How many 33s will show up?" Let me show you what happened:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_9_7000_wins.png

    I. Hate. Life.

  41. After 100 sets of 100:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_9_10000_wins.png

    My win rate over all 10,000 matches is only 33.67%. If we only use the last 5,000 matches, it is 34.58%. I don't know how comfortable we should be subsetting the data like that, but there's less than a 1 in 20 chance of being able to reach that win rate over 5000 games. The last 5,000 games are definitely not "normal."
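
    For anyone who wants to check that figure, here is a Python sketch of a one-sided binomial tail calculation. It assumes a fair game (p = 1/3) and takes the win count for the last 5,000 matches as 1,729, which is back-calculated from the 34.58% rate rather than read from the data files. It also ignores the fact that the subset was chosen after looking at the results.

```python
from math import exp, lgamma, log

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of exactly k successes in n trials."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid underflow."""
    return sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))

# Assumed win count: 34.58% of 5,000 matches.
wins = round(0.3458 * 5000)
p_value = binom_tail(wins, 5000, 1 / 3)

print(f"P(at least {wins} wins in 5000 fair matches) = {p_value:.4f}")
```

    Working with log-probabilities matters here because terms like (1/3) raised to the 1,729th power are too small to represent directly as floats.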

    And I really need to take a break now. As much as this is driving me crazy, I do have other things I need to get done. Like eating. I just realized I've eaten ~20 peanuts in the last two days, nothing else.

  42. I think the repeated play is skewing your results- any skill you have will be degraded.
    There was a memory game at EPCOT, with sets of 16 binary digits in a 4x4 square. The max score was 255.
    I did well early on and beat the max easily, not knowing it was the max. Trying to get exactly that, I kept failing
    and ended up playing for a long time what should have been a very easy task.

  43. MikeN, that is definitely one concern about interpreting results. Fatigue and boredom can degrade any skill. Combine that with a lack of motivation (or even a motivation to not do well), and any real "skill" can disappear entirely.

    Though if factors are diminishing my skills, they aren't doing so by enough to stop them from being apparent. In the long run, I am still winning more than I lose. The only thing that might be changing is how big that margin is.

    I suspect I would do "better" if I could find a good source of motivation though. I'm curious how well I'd do with some decent motivation at ~2,000 matches a day. I bet I could maintain a 35% win rate.

    And yes, I know that shouldn't be possible. I'm not psychic. I'm also not willing to deny plain evidence that's right in front of my eyes. Right now, the evidence overwhelmingly indicates I can win more than I should.

  44. I just did another 20 sets of 100. I don't know if I'll do any more tonight, but here is how things are progressing:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_10_12000_wins.png

    The current standings are 4,057 Wins, 3,951 Losses, 3,992 Ties. This is only a 33.81% win rate, which is not significant. However, if we cut off the first 50 sets like I mentioned before, the win rate jumps up to 34.56% and is "statistically significant" at the 95% level. Given that subsetting, the standings would be 2,419 Wins, 2,325 Losses, 2,356 Ties. I think it is reasonably safe to say there is a breakpoint of sorts around that 50 set mark.

    Even if we don't use that subsetting though, I think it's difficult to argue these results don't indicate something is unusual.

  45. My guess is that the RNG is leaving some sort of anomaly in the plays generated. A couple of places to look would be after strings of repeats or strings of no shows. It probably wouldn't be too hard to write a program to check. It could run through millions of plays automatically to see if results converge on .3333... There's probably a lot of subtle patterns that could be checked.
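
    A check of that sort could look something like the Python sketch below. It assumes the picks come from random.SystemRandom (the operating system's entropy source), which may or may not match how the actual program draws its values, and the function name repeat_rate_after_run is hypothetical. It measures how often a run of identical picks is followed by the same pick again, which should be about 1 in 3 for a sound RNG.

```python
import random

def repeat_rate_after_run(picks, run_len):
    """Fraction of times `run_len` identical picks in a row are followed by the same pick."""
    hits = total = 0
    for i in range(run_len, len(picks)):
        window = picks[i - run_len:i]
        if len(set(window)) == 1:  # the preceding run_len picks are all the same
            total += 1
            hits += picks[i] == window[0]
    return hits / total if total else float("nan")

# Assumed RNG: the OS entropy source, drawing 1, 2 or 3 with equal odds.
rng = random.SystemRandom()
picks = [rng.randint(1, 3) for _ in range(200_000)]

rate = repeat_rate_after_run(picks, 3)
print(f"Repeat rate after a run of 3: {rate:.4f} (expected about 0.3333)")
```

    The same loop could be pointed at the recorded AI picks instead of freshly generated ones, though with only a few thousand picks the counts for longer runs get small quickly.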

  46. I've tested the RNG a bit, though there are always more tests that could be performed. Right now, I know there are no biases for which option gets picked overall (each option definitely gets picked 1/3rd of the time). And as Peter Green and I were discussing, each two-game series of picks shows up the same as the rest.

    If there are any patterns, it must be in three run or longer series. I can tabulate results for longer series, but I don't expect to find much. With a good RNG, you shouldn't be able to find patterns that easily.

    Of course, even if there are no biases in what gets picked overall, there can still be patterns.

  47. The odd thing with this is that we are not just looking at how random the AI is. The AI could be perfectly random and if the player makes the correct choices, he could _possibly_ win 100%. The plays could still be totally balanced at 1/3 each and the win could be 100%. Conversely for the appropriate choices, either Tie or Lose could also be 100%. What seems to be interesting here is that over longer sequences, Brandon manages to achieve win ratios higher than expected, meaning he manages to make the right choice more often than not. There are some definite patterns to Brandon's play compared to the AI, though whether that alone is sufficient to justify the result is not really clear.

    Even if the human only played 2 out of three options, the potential W/L/T ratios are still the same, so it actually matters what you play and when, not just being balanced.

  48. Peter Green, yup. As a human, I expect there to be patterns in my play. I don't fight that. In fact, I do the opposite. I sometimes "feel" like a particular pattern I'm used to using will work well in a situation so I go with it. Somehow it seems to work. It is possible that's because the AI has some sort of pattern to its picks that I am subconsciously picking up on, but in theory, it could happen without the AI having any patterns. It'd just require something weird like psychic powers. Looking for patterns in the AI potentially lets us rule such things out.

    At this point, I think it may be best if I find a new system to do my tests with. I'm not sure how much repeating these same tests and getting (reasonably) consistent results can tell us. I'd like to keep doing the 100-run batches since the pattern in them hasn't been as strong so far, but after that, I might see about using something else. I just have to figure out what it'll be. I might go with rewriting this in a different programming language, using something like the website I talked to Joshua about or trying a physical test.

    The physical test seems like it could be the most interesting, but it's also the trickiest one. One thing I've worked on in moments of boredom is how to win coin flips. I've gotten to where I can flip a quarter,* catch it in the air then call heads/tails correctly ~75% of the time. The trick is cheating. Once the quarter is in your hand, it's possible to feel which side of the coin is which. In theory, you could even roll the coin over in your hand to ensure you got the outcome you desired. Stuff like that is why physical tests of ESP are so tricky. There are just so many ways to cheat in the physical world.

    Anyway, I've uploaded the data for my RPS testing in batches so far. You can find it here:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/rps_results_batches.zip

    *With the old quarters, at least. Because the US started releasing state quarters with a different design for all 50 states, there can be dramatic differences between quarters. Some of the new quarters make it easier to feel which side is which. Others make it harder.

  49. Brandon -

    =={ Once the quarter is in your hand, it's possible to feel which side of the coin is which. In theory, you could even roll the coin over in your hand to ensure you got the outcome you desired. {==

    I would suggest that short of a paranormal explanation, it might be relevant that we all likely have many powers of observation that work at extremely subtle levels, many of which are very far beyond our direct and conscious levels of awareness/understanding. And of course, some people might have those powers that are not shared by many others.

    I think of this story that was making the rounds a while back:

    http://www.cnn.com/2011/11/09/tech/innovation/daniel-kish-poptech-echolocation/

    Related, I assume that you've read about the study described here?:

    https://www.psychologytoday.com/blog/one-among-many/201010/why-i-dont-believe-in-precognition

    I found it fascinating when I first read about it, but haven't followed up at all since.

    As for your conundrum... I still think it would be interesting to see if you constructed a test for your "psychic" abilities that might transfer such skills (if you did actually have them - which I think I need to reiterate is something I think is not even remotely likely) to another similar, but also different context; the idea being that you would try to eliminate context-specific variables.

    I have been thinking about this periodically since first reading your post. The notion that you could predict, with statistical significance, random events is certainly interesting. But in the end, I don't get terribly disturbed by your results because I assume that there must be some factor that remains as of yet unobserved. One advantage to not being particularly smart is that it is relatively easy for me to accept that there are many aspects of the world that are true even though they lie well beyond my perception and comprehension.

  50. Joshua:

    I would suggest that short of a paranormal explanation, it might be relevant that we all likely have many powers of observation that work at extremely subtle levels, many of which are very far beyond our direct and conscious levels of awareness/understanding. And of course, some people might have those powers that are not shared by many others.

    I've mentioned this possibility above, suggesting that I am somehow perceiving patterns on a subconscious level and using that to win. The problem with that explanation is it shouldn't be possible. The RNG algorithm I am using is designed for cryptographic purposes. Assuming it (and my implementation of it) is working properly, the sample space needed to find patterns in it would be much, much larger than a hundred trillion. Including all my experimental and other test runs, I probably haven't played 100,000 times.

    If we had a record of all the picks the AI has made (we do have a record of ~20,000 picks now), no amount of analysis should let us come up with a strategy to win more than 1-in-3 matches. If we can, the RNG algorithm (or my implementation/use of it) is terribly flawed. I haven't found any evidence to indicate that is true, though I obviously can't prove it isn't right now.

    Related, I assume that you've read about the study described here?:

    https://www.psychologytoday.com/blog/one-among-many/201010/why-i-dont-believe-in-precognition

    I found it fascinating when I first read about it, but haven't followed up at all since.

    I actually hadn't. I've never really looked into "psychic research" and the like as the examples I have seen have always had obvious problems. Combine that with the fact I don't believe in psychic powers, and I've never seen much point. From the bit of reading I've done just now, it seems like those results were... not convincing. Specifically, I think you have to rely a bit on cherry-picking to claim the results were "significant." That said, I'm not overly impressed by that article discussing it either. Consider:

    I remain unconvinced. I am not only bothered by the lack of a positive theory, but also by the contradictions between psi and basic scientific assumptions. The conventional view assumes that events can causally affect other events that have not happened yet. When the arrow of time is depicted as pointing to the right, causes [C] lie to the left of effects [E]. This view also assumes that any number of intermediary causes can be inserted between C and E. A typical JPSP article-hence not Bem's-sports at least one such mediator, although their number is theoretically infinite. Each event is a cause with respect to later events and an effect with respect to earlier ones. What we have then is an infinitesimal chain of causally connected events that run from the past through the present to the future.

    It is true we (basically) assume the future does not affect the present. However, there is nothing which says a person predicting the future must rely upon information from the future. Just like it is conceivable a person could perceive patterns in the output of an RNG algorithm, it is theoretically possible a person could perceive something in the present that lets them predict the future in a way no physical phenomenon could explain. Similarly, the article says:

    By skirting the issue of how the future acts on the present, belief in psi boils down to belief in processless causation. We are not only asked to believe that the future can influence the present (and by extension that the present affects what we think of as the past), but also that it does so without intermediate steps. In other words, the future leaps back across time to affect the present; it does not flow back through a chain of retroactive causes along an inverted temporal arrow.

    But it does nothing to establish this must be true. It isn't. That psychics might only be able to pick up on some "signal" from the future at a specific point when performing a particular test doesn't mean that signal only exists at that particular point. We could even assign a location to that signal by saying it is present in waveform collapses of quantum decoherence. The details of that aren't important. I'm not suggesting it really is an explanation. I'm just making the point quantum mechanics can only describe (aspects of) reality in a probabilistic/statistical sense. We assume variance in quantum mechanics is random, but what if it isn't? What if there are patterns in it?

    I'm going to stop that train of thought there. I don't believe in psychic powers, and I don't really want to spend a lot of time explaining why many of the arguments against the potential existence of psychic powers are wrong. I find people often make bad arguments against psychic powers because they can't take the subject seriously enough to come up with good ones.

  51. Joshua:

    As for your conundrum... I still think it would be interesting to see if you constructed a test for your "psychic" abilities that might transfer such skills (if you did actually have them - which I think I need to reiterate is something I think is not even remotely likely) to another similar, but also different context; the idea being that you would try to eliminate context-specific variables.

    I want to do this. It's just a matter of designing the tests. The anomalies in the tests so far have been small, so to test them, I need to be able to do (and record the results of) thousands and thousands of tests. That creates some obstacles. Then I have to also try to ensure the test is such that I cannot cheat (even accidentally). That's especially important for any tests that have physical components.

    Right now, I'm trying to come up with a way to do testing where I never have any proximity to the "random" component. In the case of my programs, the RNG algorithm is run on my machine. That's true of the website I linked to above for card picking. The ESP test there is on a web page, but your internet browser downloads the code for the test then runs it on your machine. What I would like is to come up with a test where I couldn't possibly have access to the "random" part so there's no possible fear of information leakage.

    An idea was suggested to me which I really like but am not sure how to implement. The idea was rather than creating tests for ESP (or whatever), I should examine my interaction with RNG in other things. As in, I could play a game where RNG is a factor and see if my ability to predict outcomes in it is abnormal. The idea is to see if results hold up in a practical situation which is more "natural."

    I just don't know what game I could do that in. I play a number of card games, but there are so many factors that go into picking the right plays in them that I don't think I could separate out the RNG component.

    I have been thinking about this periodically since first reading your post. The notion that you could predict, with statistical significance, random events is certainly interesting. But in the end, I don't get terribly disturbed by your results because I assume that there must be some factor that remains as of yet unobserved. One advantage to not being particularly smart is that it is relatively easy for me to accept that there are many aspects of the world that are true even though they lie well beyond my perception and comprehension.

    As part of being an IT person, I have always been interested in security issues. That's been combined with my love for math to give me a healthy interest in cryptography. When I consider the sorts of weaknesses people look for in modern RNGs to try to be able to beat encryption and things like that, the gap between that and what I do is... I don't even know how to describe it.

    If I can beat a reasonably strong RNG with this weak an approach, the implications are astounding. That's why I really hope this is due to some flaw in my code or design. Unfortunately, there's no evidence to support that right now. Looking at the data I've generated so far, there are no obvious patterns I could be exploiting.

    By the way, when I mentioned using a hundred trillion data points before, that much data shouldn't be enough to find patterns in this RNG system if we were using the raw output of it. We're not though. Remember, RNG systems create a long string of random characters as their output. The code I use simply converts those strings into one of three options. A huge amount of information is lost in the process. That means I not only have a small sample size (for trying to find patterns in an RNG system), but the data I do have is very coarse. There is no way I should be able to win by picking up on patterns in the AI's picks. Assuming there is no serious flaw in my approach, that would be impossible. In fact, it would be so impossible psychic powers are more plausible simply because they don't contradict known physical processes.

    I just don't believe in psychic powers so I am doing everything I can to avoid that conclusion.
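    The step of turning raw RNG output into one of three options can be sketched as follows. This is a minimal illustration, not the code used for the matches above: it assumes the random source is Python's `os.urandom`, and the names `CHOICES`, `byte_to_choice` and `ai_pick` are made up for the example.

```python
import os

# Hypothetical names for illustration; the actual program's code is not shown above.
CHOICES = ("rock", "paper", "scissors")


def byte_to_choice(value):
    """Map one random byte (0-255) to a choice, or None if it must be discarded.

    256 is not a multiple of 3, so the values 252-255 are rejected; keeping
    them would make "rock" slightly more likely than the other two options.
    """
    if value >= 252:
        return None
    return CHOICES[value % 3]


def ai_pick():
    """Draw bytes from the OS's cryptographic RNG until one maps to a choice."""
    while True:
        choice = byte_to_choice(os.urandom(1)[0])
        if choice is not None:
            return choice


if __name__ == "__main__":
    print([ai_pick() for _ in range(10)])
```

    Each pick keeps only which of three residues the byte fell into, which is the sense in which the recorded picks are a very coarse view of the underlying output.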

  52. I'm going to go curl up in a ball, wrap myself in a blanket and hope when I wake up I find this was all just a bad dream. I had taken a couple days off playing RPS, and tonight, I thought I'd go ahead and play it a bit to see what would happen. I figured I'd do 20 batches of 100 matches apiece and stop there so I wouldn't spend too much time on it. I did them in sets of five, and my results were (Wins/Losses/Ties):

    172/161/167
    169/167/164
    173/158/169
    191/167/142

    The last of those was such an extreme I didn't want to stop on it. I've been making graphs showing my results. I didn't want to end on an extreme outlier. That could mislead people about just how much "skill" I am showing at this game. So I did a thousand more:

    173/159/168
    188/164/148

    I was determined not to end on extremes like these so I decided to do a thousand more:

    167/157/176
    163/157/180

    I'm going to stop here for the night so I don't go overboard. Here are my net wins with the latest 4,000 matches added in:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_14_16000_wins.png

    My "skill" is starting to become apparent in the distribution of outcomes as well. Here are histograms showing how many Wins/Losses/Ties I got for each set of a hundred:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_14_16000_hists.png

    The distributions for Wins and Losses are unquestionably skewed. Even worse, it seems the more I do this, the better I get at it. Maybe that's not the case though. Maybe today was just a "lucky" day. I don't know. What I do know is my overall win rate since implementing the new approach is 34.08% while my win rate for today is 34.90%. I can't explain it.
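    For anyone who wants to check the day's 34.90% figure, the eight lines of Wins/Losses/Ties above can be added up with a few lines of Python. This is only a sketch: the numbers are copied from this comment, and the names `SETS`, `tally` and `win_rate` are invented for the example.

```python
# (wins, losses, ties) for each line of 500 matches listed in this comment.
SETS = [
    (172, 161, 167),
    (169, 167, 164),
    (173, 158, 169),
    (191, 167, 142),
    (173, 159, 168),
    (188, 164, 148),
    (167, 157, 176),
    (163, 157, 180),
]


def tally(sets):
    """Sum wins, losses and ties across all sets."""
    wins = sum(s[0] for s in sets)
    losses = sum(s[1] for s in sets)
    ties = sum(s[2] for s in sets)
    return wins, losses, ties


def win_rate(wins, losses, ties):
    """Fraction of all matches that were wins."""
    return wins / (wins + losses + ties)


if __name__ == "__main__":
    w, l, t = tally(SETS)
    print(w, l, t)                        # 1396 1290 1314
    print(f"{win_rate(w, l, t):.2%}")     # 34.90%
```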

  53. I mentioned a website with a simple ESP test upthread. To take a break from my recent annoyance with Rock, Paper, Scissors, I decided to try a systematic test of that site's game. I decided I would do 40 sets of 50 matches, figuring 2,000 data points should be a good start since the test has you pick one of five options, with one being a winner and the rest being losers. My results were 434 Wins, 1,566 Losses. Here is a visualization of my progress:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/02/2_14_2000_Wins_5_Card.png

    I know a win rate of 21.7% when the expected rate is 20% isn't sexy, but these results are statistically significant (at the 95% level). It appears this may be another test I can beat. I don't know what to make of that. I'll do another 2,000 matches to see if the results hold up (tomorrow), but I don't hold out much hope.
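    The significance claim can be checked with an exact one-sided binomial tail: the chance of getting at least 434 wins in 2,000 tries when each try has a 1-in-5 chance. The sketch below uses only the standard library and exact integer arithmetic; the function name `upper_tail` is hypothetical, and it assumes every guess is independent with the same win probability.

```python
from fractions import Fraction
from math import comb


def upper_tail(wins, matches, p_win):
    """Exact P(X >= wins) for X ~ Binomial(matches, p_win).

    p_win is a Fraction a/b, so every term is an integer count out of
    b**matches and no rounding happens until the final division.
    """
    a, b = p_win.numerator, p_win.denominator
    favourable = sum(
        comb(matches, k) * a**k * (b - a) ** (matches - k)
        for k in range(wins, matches + 1)
    )
    return favourable / b**matches


if __name__ == "__main__":
    p = upper_tail(434, 2000, Fraction(1, 5))
    print(f"P(at least 434 wins in 2,000 by chance) = {p:.4f}")
```

    A result below 0.05 corresponds to significance at the 95% level for a one-sided test. A two-sided test would roughly double the tail probability.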

    One thing that's unfortunate about this tool is we can't examine the data in fine detail like we can with the programs I wrote. I chose to do 50 match batches because the tool only records totals for how many cards you guessed correctly (it keeps a tally per card type), not the outcomes of individual matches. This means we can't do much to try to find patterns in what the AI chooses. I am keeping what the tool provides though.

    I wonder what the people who made the tool would think about these results.

  54. In the hope it makes it easier for people to check my work, I've decided to convert the Python code I'm using to an executable file. This is easy enough to do, but the change in format means I wanted to make a few changes. The main change is I've added code to the program which will, after every run, loop through all of your result files and tally them up to figure out what your total results are. Those results are then dumped into a file for you to view whenever (in addition to the result files themselves).

    While troubleshooting this code, I discovered two problems. First, the new code was inadvertently switching the entries for Ties and Losses. It took me about 20 minutes to figure this out because of the second problem. The second problem is it turns out I had failed to include the results of one of the sets I had done whose results were:

    Wins: 30
    Losses: 35
    Ties: 35

    The results were stored in a data file like they were supposed to be, but somehow, when collecting my stats for the charts I've posted above, this one got left out. It doesn't change anything in any meaningful way, but it was something that needed to be fixed. To do so, I've added it to the end of my current list. I'm doing this so I don't have to dig through my ~180 sets of results so far to figure out its proper place. The result is the charts showing how my results have progressed over time will have a 5 point drop at the current point instead of somewhere earlier.

    This sort of thing is inconsequential to any conclusions we might draw, but given how baffling my results are, I want to be up front and clear about all things regarding the data. Also, I'm annoyed it took me twenty minutes to figure out I had typed 'l' when I meant 'w.' Debugging programs sucks sometimes.
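    A tallying step like the one described in this comment could look roughly like the sketch below. The file layout is an assumption: it supposes each result file is a `.txt` file containing `Wins:`, `Losses:` and `Ties:` lines in the form shown above, which may not match the real files. The names `parse_result` and `tally_results` are hypothetical.

```python
from pathlib import Path

LABELS = ("Wins", "Losses", "Ties")


def parse_result(text):
    """Read 'Label: count' lines into a dict, ignoring any other lines."""
    counts = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in LABELS:
            counts[label] = int(value)
    return counts


def tally_results(directory):
    """Sum the counts from every *.txt result file in a directory."""
    totals = dict.fromkeys(LABELS, 0)
    for path in sorted(Path(directory).glob("*.txt")):
        counts = parse_result(path.read_text())
        for label in LABELS:
            totals[label] += counts.get(label, 0)
    return totals


if __name__ == "__main__":
    print(tally_results("."))
```

    Keying the totals by the full label, rather than by single letters or list positions, makes it harder to swap two categories by accident.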

  55. I'll probably write a post about this if the current trend in my results holds up, but it was suggested to me I use the website http://www.random.org to create the random numbers for my matches. I decided to give it a try. My first period of testing got messed up because a mistyped line in my code meant no results or data were recorded. I got that fixed and did another 4000 matches. My results were:

    Wins: 1398
    Losses: 1283
    Ties: 1319

    I had planned to stop at 4,000 data points, but the last few sets had been so extreme I didn't want to stop on them. I did another 1,500 matches, and now my results are:

    Wins: 1893
    Losses: 1781
    Ties: 1826

    From what http://www.random.org says about how they produce their random numbers, there is no way I should be able to beat its RNG. It looks like I may be able to though. That, or maybe there's a bug in my code screwing things up?
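    One quick way to gauge how unusual 1,893 wins in 5,500 matches would be under pure chance is a normal approximation to the binomial distribution. This is a rough sketch, not the analysis behind the charts in this thread: it assumes each match is an independent 1-in-3 chance of a win, and the function name `one_sided_p` is invented for the example.

```python
from math import erfc, sqrt


def one_sided_p(wins, matches, p_win=1 / 3):
    """Return (z, p): the z-score of the win count and the one-sided
    probability of a count at least that high, by normal approximation."""
    expected = matches * p_win
    sd = sqrt(matches * p_win * (1 - p_win))
    z = (wins - expected) / sd
    # Upper-tail probability of a standard normal variable at z.
    p = 0.5 * erfc(z / sqrt(2))
    return z, p


if __name__ == "__main__":
    z, p = one_sided_p(1893, 5500)
    print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```

    A small p-value here only says the win count would be unusual under chance. It cannot distinguish a real effect from a bug in how picks are generated or recorded, and it does not account for stopping points chosen after looking at the results.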
