Do the Laws of Probability Mean Anything?

Hey guys, as you may have picked up on from my last couple posts, I was fairly sick this week. I'm not completely over it, but I have had the energy to do more than lie around all day doing nothing. Naturally, one of my top priorities has been playing Rock, Paper, Scissors (RPS).

I'm not going to re-visit the history leading up to today's post. You can read the last post I wrote on this subject here. The short version is it seems no matter what I do, I keep beating a computer opponent that makes random choices. This shouldn't be possible. The odds of winning, losing or tying in RPS should be 1/3 when one opponent picks options at random.

Today's post is about an update to my methodology and the results it leads to. I've played 10,000 matches after the update, and I have won 3,454 of those matches. That gives me a win rate of 34.54%, a result that is "statistically significant" at the 99% level.
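As a rough check of that significance figure, here is a minimal sketch using a normal approximation to the binomial distribution. It assumes every match is independent with a 1/3 chance of a win; the function name `win_rate_z_test` is made up for illustration and is not part of the uploaded code.

```python
import math

def win_rate_z_test(wins, matches, p=1/3):
    """Return (z, one_sided_p) for seeing at least `wins` wins by chance alone."""
    expected = matches * p
    sd = math.sqrt(matches * p * (1 - p))
    z = (wins - expected) / sd
    # Upper-tail probability of a standard normal, via the complementary error function.
    one_sided_p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, one_sided_p

z, p_value = win_rate_z_test(3454, 10_000)
print(f"z = {z:.2f}, one-sided p = {p_value:.4f}")
```

Under those assumptions the z-score comes out a little above 2.5 and the one-sided tail probability is roughly half a percent, which is where the "99% level" comes from. A two-sided test would be slightly less favorable.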

There are three key details to the methodology I am using. First, I am doing my tests in batches of 100. That is, I will start a run and play RPS exactly 100 times and record the results of that run in a data file for that run. For completeness, I also store the outcome of each individual match.

Second, I generate the computer's selections prior to starting my run. What this means is at the start of each run, there is a list of 100 choices the computer will cycle through. This means any action or guess of mine cannot influence what comes later.

Third, I do not generate the random choices myself. This is the major update to my testing methodology. In previous posts and comments, I've discussed potential concerns that the two random number generators (RNGs) I've used might somehow have patterns in their output a human could pick up on in order to win at RPS. This seemed highly unlikely to me, but it was the best idea I had. To address it, I've started using a feature provided by the site www.random.org. You can see it in action by clicking this link:

https://www.random.org/integers/?num=100&min=1&max=3&col=1&base=10&format=plain&rnd=new

Clicking that link will provide you a list of a hundred selections ranging from 1 to 3 (inclusive). The site uses physical properties of the atmosphere to help it generate its RNG values. There is no way there should be any discernible pattern in its output. It should be impossible to beat a computer opponent using it to play RPS. I've done so anyway:

This chart provides a simple examination of my results. It shows the number of wins I've had minus the number of losses in each set. These results are summed together as the series progresses, giving a cumulative (or net) number of wins I've gotten. Since the outcome of this game should be completely random, we would expect this chart to show a line fluctuating about zero. It doesn't.
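The running total described above could be computed along these lines. This is only a sketch: the `(wins, losses)` tuples are made-up numbers, not values from the data files, and `cumulative_net_wins` is a hypothetical helper name.

```python
from itertools import accumulate

def cumulative_net_wins(runs):
    """runs: iterable of (wins, losses) per run. Returns the running total of wins minus losses."""
    return list(accumulate(wins - losses for wins, losses in runs))

# Three made-up runs, purely for illustration.
example_runs = [(36, 31), (33, 35), (38, 30)]
print(cumulative_net_wins(example_runs))  # [5, 3, 11]
```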

As a quick aside, I want to point out the last twenty sets in this chart were done earlier today. I hadn't played RPS in a week, and I'm still sick. It is interesting to see this timing coincides with my results flattening out. It may be a coincidence. I just wanted to put it out there so people could consider it for themselves. It could be similar to how under my last methodology (which handled the RNG on my computer), I got these results:

The first five thousand or so matches under that methodology showed no trend. Over time, it seemed my level of "skill" improved. One could choose to interpret the flat periods in these two charts as indicating periods where something was throwing off my ability to play RPS well. I'm not going to guess at that right now. Instead, I'd like to try using a slightly more in-depth analysis. Consider:

This is a histogram showing the outcomes of the 100 runs I've done using www.random.org for my RNG. In theory, the distribution of wins, losses and ties should be equal (save for random variance). That's because in a random selection, you are just as likely to get 33 wins as you are 33 losses.
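To make that expectation concrete: if every match is an independent 1/3 chance, the number of wins in a 100-match run follows a Binomial(100, 1/3) distribution, and so do the losses and the ties. The sketch below (with the hypothetical helper `wins_pmf`) computes that distribution along with its mean and standard deviation.

```python
import math

def wins_pmf(n=100, p=1/3):
    """Probability of exactly k wins in n independent matches, for k = 0..n."""
    return [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

pmf = wins_pmf()
mean = sum(k * q for k, q in enumerate(pmf))
sd = math.sqrt(sum((k - mean) ** 2 * q for k, q in enumerate(pmf)))
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
```

The mean is 100/3 (about 33.3) and the standard deviation is a bit under 5, so under random play all three histograms should be centered near 33 with most runs landing within roughly ten of it.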

That is not what we see in these histograms. In these histograms, there is a clear positive skew for winning. It is similar to what we saw in the histograms for my last methodology:

That shows results for 185 runs. That the number of wins is skewed toward the right even after 100/185 runs is remarkable. It is not something which should happen if the RNG is truly random. I don't have an explanation for this. It is simply not plausible that I can subconsciously pick up on patterns in three different RNG systems and beat them like I have.

At this point, the only explanation I can come up with is the program I'm using has a bug in it. As such, I've uploaded the code I've used for the recent tests and the data files for my last 100 runs here. I would like it if someone could check my code and verify there is no major bug that could be causing these results. For instance, please verify that the list used for the computer's choices is updated every run. Obviously, if the same set of choices were used over and over, a human could pick up on that.

The code is in Python, and it is somewhat haphazardly put together. You should be able to follow it, but if you have any questions, feel free to ask. I believe you will need to install the requests package for Python to make it work (and if you don't have it installed, Python as well). Just remember to make the directory to store your data files.
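For readers who want a picture of the flow before opening the files, here is a minimal sketch of one run. It is not the uploaded code. It assumes the mapping 1 = rock, 2 = paper, 3 = scissors, uses the standard library's `urllib` in place of the requests package, and every function name in it is made up for illustration.

```python
import urllib.request

# The URL given earlier in the post: 100 integers from 1 to 3, plain text.
RANDOM_ORG_URL = ("https://www.random.org/integers/"
                  "?num=100&min=1&max=3&col=1&base=10&format=plain&rnd=new")

def parse_choices(text):
    """Turn the plain-text response (one integer per line) into a list of ints."""
    return [int(token) for token in text.split()]

def fetch_choices(url=RANDOM_ORG_URL):
    """Download a fresh list of computer choices; call this once at the start of each run."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_choices(response.read().decode("ascii"))

def judge(player, computer):
    """Return 'win', 'loss' or 'tie' from the player's point of view."""
    if player == computer:
        return "tie"
    # Assumed mapping 1=rock, 2=paper, 3=scissors: each move beats the one before it, cyclically.
    return "win" if (player - computer) % 3 == 1 else "loss"

def play_run(computer_choices, get_player_move):
    """Play through a pre-generated list of computer choices and record each outcome."""
    return [judge(get_player_move(), computer) for computer in computer_choices]
```

The thing to look for in the real code is the equivalent of `fetch_choices` being called once per run, rather than once per program start, so that each run gets a new list.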

Assuming there is no major bug in my code, I'm not sure what to do. I have data from over 40,000 RPS matches I've played (across three different RNG systems). Across all of it, my win rate has consistently been higher than it should be. What should I do about that? Can I really ignore what appears to be a flagrant violation of the laws of probability and go on with my life like nothing has happened?

I don't know. One suggestion I've received from several people is to record, and perhaps even broadcast live, my test runs. This would provide some manner of evidence that I am not cheating. Would that be worthwhile? Would anyone watch? Would anyone care?

I don't know. What I do know is I'm at my wit's end. All I want is for the randomness around me to be random.

46 comments

  1. By the way, these histograms reminded me of a math problem I was curious about. Given 100 matches of Rock, Paper, Scissors, how would we figure out the probability the player would win, lose or tie exactly 33 times? I know how to figure out the odds of getting 33 for one specific outcome (it's a simple binomial distribution), but the outcomes aren't independent of one another. That means you can't just take three draws at an X% chance of getting 33 whatever.

    If anyone can explain how to solve that problem, it'd be great.

  2. The probability can be modeled using a multinomial distribution. Assuming an equal chance of each outcome (i.e. 1/3 for win, 1/3 for loss and 1/3 for tie), the answer is simply 100!/(33! 33! 34!) * (1/3)^100, which in other words is only about 0.8%.

    This may sound ridiculous, because if you consider that on average you will reach 33 wins, 33 losses and 34 ties (or 34,33,33 or 33,34,33, totaling up to about 2.4%), you would think it is highly likely. But in fact there are so many possibilities that it is unlikely any one of them has a high probability (in fact 0.8% is the maximum for any particular combination).
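The multinomial calculation in the comment above can be checked with a short sketch like this one. It uses exact fractions to avoid rounding, and the function name `multinomial_probability` is just an illustrative choice.

```python
import math
from fractions import Fraction

def multinomial_probability(counts, probs):
    """Exact probability of seeing exactly `counts` outcomes given per-outcome `probs`."""
    n = sum(counts)
    coefficient = math.factorial(n)
    for c in counts:
        coefficient //= math.factorial(c)  # stays an integer at every step
    probability = Fraction(coefficient)
    for c, p in zip(counts, probs):
        probability *= Fraction(p) ** c
    return probability

third = Fraction(1, 3)
p_one = multinomial_probability((33, 33, 34), (third, third, third))
print(f"one arrangement: {float(p_one):.4%}")
print(f"all three arrangements: {float(3 * p_one):.4%}")
```

Each of the three arrangements (33/33/34, 33/34/33, 34/33/33) has the same probability, so the combined figure is simply three times the single one.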

  3. Thanks Peter Tsul. I didn't realize it would be that easy to calculate. I guess that's how a lot of things in math are though. When you don't know how to do them, they can seem impossible but once you learn how to do them, they're easy.

  4. I've switched out the charts showing net wins as I messed up their titles. Nothing has changed except the description at the top of the charts.

  5. You should post this to a poker forum, namely twoplustwo. I'm sure everything you're seeing is well within the normal bounds of a probability experiment.

  6. Steve, I'm happy to discuss this anywhere I can get feedback as I'm at a total loss as to what to do now. It amazes me people keep saying things like this "is well within the normal behavior" of things. Exactly how far do I have to push this before people start acknowledging it's abnormal? Assuming there is no bug in my code and the choices the AI makes are random (enough that it shouldn't influence the results), the odds of my results are beyond one in a billion now. I believe we've reached the point they are beyond one in a trillion.

    Is there some tipping point where people will do more than hand-wave these results away? Do I need to maintain an elevated win rate over 100,000 runs? 200,000? 500,000? What if I could push the win rate higher? What if I could push it to 35.5% or maybe even 36%? Is there some value that would make people acknowledge how improbable my results are?

    I'm not trying to single you out Steve, but judging by the responses I've gotten in general, I don't think any set of results would affect what a lot of people think. A significant portion of the people I talk to seem hell bent on finding any excuse to dismiss this. I guess one's instinct might be to dismiss things that challenge fundamental beliefs out of hand, but I don't get how people can come up with the excuses I see. It's like listening to Young Earth Creationists arguing against evolution. You can tell they aren't even trying to get past their biases.

    This shouldn't need to be stated again, but I don't want these results to be true. I want there to be some simple explanation involving an external factor introducing a bias in the test. I'd love to hear there's a bug in my code that screws everything up. I would love anything that would let me move past this as it is seriously damaging my mental (and perhaps even physical) health. I just want the explanation to be right.

  7. Sven, I'm afraid that contest is, as that article quotes Richard Dawkins as saying, insincere. I wouldn't even be able to apply for the contest as I don't have enough notoriety, but even if I did, I wouldn't trust them to behave in a fair or reasonable manner. Just try finding a clear and explicit definition of what qualifies as "psychic powers" for that contest. You won't be able to. They don't have one. James Randi created the contest for rhetorical purposes, not a genuine interest in pursuing the truth.

    It's funny you bring this up. I spent some time looking into that contest a few years back because I was curious about something. Under our current medical understanding, a human shouldn't be able to "feel" the energy in light, radio waves and things like that. If a person were able to do so, would that be considered paranormal? I won't go into how that came up (unless someone is interested). The idea was we're constantly bombarded by energy from many different sources on many different wavelengths (and of many different intensities). What would it be like if a person could see, feel or otherwise observe such? (For what it's worth, this has been a topic of discussion in some fantasy settings involving magic.)

    The problem with this question is the fact we wouldn't have a medical explanation for such a phenomenon doesn't necessarily mean one couldn't exist. It's possible there would be some physical mechanism which hasn't been identified yet. A person could theoretically have such a "power" due to some sort of physical abnormality we don't understand right now. Given that, should we consider it a paranormal power?

    We can build from that question to one directly relevant to my current topic of interest. Suppose the RNG systems I've used were strong enough rigorous analysis couldn't find any patterns in them. That would not prove the absence of patterns in the data. In theory, it would be possible such patterns are so subtle our analyses aren't capable of identifying them. Theoretically, human intuition might be able to do something to identify patterns the mathematical analyses we've designed so far couldn't identify.

    These sorts of counter-explanations are highly improbable, and perhaps even completely implausible, but they offer scenarios in which things that appear to be "psychic powers" have identifiable physical causes. How could we ever know for sure something was caused by paranormal powers? Even if I could consistently beat any RNG system I was presented with, would that truly count as evidence of supernatural powers? Would I win the contest?

    If the contest were genuine and sincere, we would have clear guidelines that could help us answer these questions. We don't. The guidelines we have are of little use.

  8. Sorry for the long-windedness of that last comment. This is something which has bugged me for a while now. Is there really a physical reason why a human shouldn't be able to "feel" something like infrared light when we know the body can and does absorb energy from it? I get why humans might not "feel" things like that. The nervous system has to handle a constant barrage of signals from all over about all sorts of things. Many things could be expected to fall below the "noise floor" of the system and get ignored.

    But if that's all it is, then theoretically, a human could learn to "feel" such things. Is that possible? If not, why not?

  9. The poker community will give you in depth answers for all of your questions. They are probability experts. Please register and post at twoplustwo. You'll get your answers there. They're a bunch of probability nerds. Good luck!

  10. I had already found that. I'm just waiting to post there until this evening when I'll have more free time to engage in discussions.

  11. Brandon, you might consider adding a step in the random number generation process by setting up an array of, say, 30 elements, then assigning ten values of 1, 2 and 3 apiece to the elements.

    When you call the random function for a number, shuffle the values in the array 30 or more times by calling two random numbers (from 1 to 30 (or 0 to 29 if Python uses zero-based arrays)) and swapping the array elements corresponding to those numbers. Then choose the value in Nth array element as your computer pick. In this way you should be able to break up any serial correlations in the RNG output. I used to use this shuffling method in the early 1990s while performing gambling simulations for blackjack, video poker and other casino card games.

    With respect to http://www.random.org, it certainly sounds as though their numbers are truly random, but you might still consider adding the step outlined above, just in case.
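One possible reading of the pool-and-swap scheme described in the comment above is sketched here. The comment does not say how the "Nth" element is chosen, so this sketch assumes it is a random index; `make_pool` and `shuffled_pick` are hypothetical names, and Python's built-in `random` module stands in for whatever RNG is being tested.

```python
import random

def make_pool():
    """Thirty elements: ten each of the values 1, 2 and 3."""
    return [1, 2, 3] * 10

def shuffled_pick(pool, rng, swaps=30):
    """Swap `swaps` randomly chosen pairs in place, then return one element as the pick."""
    n = len(pool)
    for _ in range(swaps):
        i = rng.randrange(n)
        j = rng.randrange(n)
        pool[i], pool[j] = pool[j], pool[i]
    # Assumption: the "Nth" element is picked by a random index.
    return pool[rng.randrange(n)]

pool = make_pool()
rng = random.Random(1)  # seeded only so the example is repeatable
pick = shuffled_pick(pool, rng)
print(pick)
```

Because the pool always holds ten of each value, the swaps only rearrange it; the pick stays an unbiased 1, 2 or 3 as long as the index itself is uniform.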

  12. That's an interesting idea Doug. I'll give it some thought. Even if I didn't want to incorporate it as part of the general approach, it could certainly be a useful thing to test. If I could maintain the same win rate despite doing what you describe, that would suggest I'm not just seeing patterns in an RNG (unless one believes I'm working out patterns in multiple RNG systems at the same time). If I couldn't maintain the same win rate, then that might suggest I am just picking up on patterns in the RNG.

    I need to be careful about how I code it though. When you randomize order by swapping element positions, it can be easy to make things less random than you think. It's like how there are ways to shuffle a deck of cards that seem random yet will put the deck back in the exact same position after X repeats.

    Thanks for the feedback. I'll definitely give it a try.

  13. Come to think of it, for 52-card decks of cards, I ran a FOR loop for each of the 52 elements in my array, then swapped the i'th element with another element determined by one call to the RND function. This ensured that each element was swapped at least once.

    The swap routine used a temp variable to hold the current array element. Then the RND function chose the element with which to make the swap.

    After that, the value of the randomly chosen element was moved into the current element space, and the temp variable was moved into the randomly chosen element space. Easy peasy, and I was able to run billions of simulated hands to obtain probabilities that were correct to at least four decimal places, when compared to published calculated probs. This was in the days of the 486 microprocessor, with pseudorandom number generators that had (I seem to recall) much shorter period lengths.
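A per-element swap loop like the one described in the comment above is usually written today as a Fisher-Yates shuffle. The sketch below uses that variant rather than the exact routine from the comment: each position is swapped only with a position at or before it, because swapping every element with any position in the array is known to give slightly non-uniform orderings. The name `fisher_yates_shuffle` is illustrative.

```python
import random

def fisher_yates_shuffle(items, rng):
    """Shuffle `items` in place; every ordering is equally likely if `rng` is uniform."""
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends, so j may equal i
        items[i], items[j] = items[j], items[i]  # tuple swap, no temp variable needed
    return items

deck = list(range(52))
shuffled = fisher_yates_shuffle(deck[:], random.Random(0))  # seeded for repeatability
print(shuffled[:5])
```

Python's own `random.shuffle` does the same job, so in practice the hand-written loop is mainly useful when the source of random numbers has to be swapped out.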

  14. You could always generate 100 choices and then I'll run those against your RNG on my computer. That would eliminate any prediction capability if you generate your picks first.

    In fact, send a file of picks, any number you want, and let's compare that.

  15. Hey Peter Green, I appreciate the offer, but I don't think it's necessary. I have done some tests where I generate a series of guesses before the AI generates its. There was no signal. That might have some meaning to it, but honestly, I think it's a bad test. It's too boring. There's no feedback as to whether I have won or lost so it's not like I'm playing a game. I'm just sitting at a screen pressing one of three keys. Being bored makes it so I can't focus on trying to beat anything. Whether it's psychic powers, me subconsciously seeing patterns or something else, my tests make it clear I have to try to win in order to win. If I guess at random, I won't win.

    Right now, what I'm planning to do is set up an online version of the test. I had been mulling over how to make a webpage for this before, but I was recently given an idea I like more: chatbots. Whether it be via IRC (yes it still exists), Discord servers or whatever else, it is fairly easy to set up a bot in a chat room that can play RPS. The bot can be set up to keep score for each player using it. This would let anyone who's interested try the test for themselves while establishing some measure of documentation of the results, as the chat log would save every guess. Plus, it would let people monitor the test live if they were interested.

    Assuming it will work as I am hoping, I plan to have that set up two weeks from now. It'd be sooner but I need to take a break from RPS to try to work on some other projects.

  16. For your online test, you might consider reducing the number of trials in each run from 100 to 30. Thirty is a very manageable number of trials for a run, and it allows you to set the expected number of hits for each run to 10, which is a nice round number that's easy to work with.

    The other reason for using a shorter run length has to do with "position effects". For instance, if ESP has been a factor in your past success with the test, it is likely that, on average, and with enough runs, you'll find your hits are distributed non-randomly throughout the runs. More specifically, you'll probably discover better performance during earlier segments of the runs than later segments. You might also find more hits than are expected toward the very end of the runs. There are psychological reasons for such patterns, having to do with emotions like enthusiasm and motivation.

    For more information on the subject, see this article by Dr. Robert H. Thouless:

    "The Pattern of ESP" http://www.survivalafterdeath.info/articles/thouless/pattern.htm

  17. Doug, I'm not sure I'd want to have a chatbot do the games in sets. That seems like it'd be too cumbersome an approach for a live-chat feature. Given there's no apparent difference in my results whether I do things in batches or not, I think it should be fine to do it as a stream.

    Though it would be worth checking to see if the results do have an uneven distribution like you suggest. I should probably load the data from my runs into R and see if I can find any patterns like that.

  18. Okay, I misunderstood the chatbot format. I see your point now.

    For your personal data, I have to say that runs of 100 trials apiece might be too long to detect good evidence for position effects. That's because awareness of the starting and ending points for a run could be considerably reduced with so many trials, particularly if you do several long runs in rapid sequence. In any case, it would be best to use only the data from those runs whose length you stipulated in advance would consist of 100 trials.

    You probably don't have enough data to adequately resolve your hit probabilities for 100 individual trial positions, so it'll be necessary to bin the results into segments. You might try binning them into ten segments of ten trials each, or five segments of 20 trials. At some point, you might also switch to runs of 30 trials each instead of 100. 🙂
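The binning suggested in the comment above could look something like this sketch. It assumes each run is stored as a sequence of 'W', 'L' and 'T' outcomes in play order, which is a made-up format rather than the layout of the actual data files, and `win_rate_by_segment` is a hypothetical name.

```python
def win_rate_by_segment(runs, segment_length=10):
    """runs: equal-length sequences of 'W'/'L'/'T'. Returns the pooled win rate per segment."""
    run_length = len(runs[0])
    rates = []
    for start in range(0, run_length, segment_length):
        # Pool the outcomes at these trial positions across every run.
        outcomes = [o for run in runs for o in run[start:start + segment_length]]
        rates.append(outcomes.count("W") / len(outcomes))
    return rates

# Two made-up 20-trial runs, purely for illustration.
toy_runs = ["W" * 10 + "L" * 10, "W" * 5 + "T" * 5 + "L" * 10]
print(win_rate_by_segment(toy_runs))  # [0.75, 0.0]
```

With real data, a position effect would show up as win rates that drift across the segments instead of hovering around the same value.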

  19. Or I could just do more runs. Tons more runs. Like, 1,000 more. I mean, that'd only be... a million matches.

    While I say that as a joke, part of me is serious about it. I've had so many people tell me this is just a "fluke" and that I shouldn't care about it. It baffles me. Exactly what would it take for people to agree there is something bizarre going on? How many matches, what win percentage, what RNG? Every time I ask anyone these sorts of questions, they have no answer. I'm all for being skeptical of results like these, but skepticism doesn't mean blindly rejecting everything that fails to match expectations.

    Maybe I should retry the blind tests again. Who knows, if I can stay motivated, perhaps I can beat the RNG even when I receive no feedback from the program. If so, it should be pretty much impossible to dispute there is cause for concern.

  20. "Or I could just do more runs. Tons more runs. Like, 1,000 more. I mean, that'd only be... a million matches."

    I know you're joking, but I think your enthusiasm for the test would probably wane long before doing one-tenth that number of matches. If ESP *was* a factor in previous tests, it'll fade at that point and you'll obtain average results. Thereafter, you too might manage to brush off your previous stellar results as a fluke after all. Cognitive dissonance affects most of us at one time or another.

    "It baffles me. Exactly how what would it take for people to agree there is something bizarre going on? How many matches, what win percentage, what RNG? Every time I ask anyone these sorts of questions, they have no answer."

    You'll never get them to agree, so forget about trying. I can't say that I blame them, however. Their lives are every bit as complicated as yours, and they've had to adopt a worldview that excludes psi phenomena in order to cope with their circumstances. You're not going to nudge them even a little bit from that worldview. As a late critic of parapsychology, C.E.M. Hansel, once stated, “In view of the a priori arguments against it we know in advance that telepathy, etc. cannot occur.”

    "I'm all for being skeptical of results like these, but skepticism doesn't mean blindly rejecting everything that fails to match expectations."

    It does for those who have no interest in the anomalies you may have uncovered. To them your results are merely a curiosity. Apart from an understandable lack of inclination, they don't have the time or energy to immerse themselves in your work.

    "Maybe I should retry the blind tests again. Who knows, if I can stay motivated, purpose I can beat the RNG even when I receive no feedback from the program. If so, it should be pretty much impossible to dispute there is cause for concern."

    I wouldn't recommend doing this. You might kill your hypothetical psychic ability doing thousands of trials without feedback, and not be able to get it back again.

  21. " I've had so many people tell me this is just a "fluke" and that I shouldn't care about it. "

    You would need to demonstrate the effect in some kind of monitored setting with the data collected in an independent fashion, if you want people to take it seriously - surely.

  22. Doug, Szilard, I don't ask people to believe I'm psychic. I don't believe that. People could dismiss this as nothing more than a hoax if they wanted (though it would be a bizarre hoax). What baffles me is people who cannot even envision a situation which would change their mind. Similarly, there are people who latch onto any excuse to "discredit" these results, no matter how ridiculous.

    It's one thing to assume there is an explanation you aren't aware of; it's another thing to not require there be one. I don't get how people can be so close-minded they cannot imagine their beliefs are wrong.

    Szilard:

    "You would need to demonstrate the effect in some kind of monitored setting with the data collected in an independent fashion, if you want people to take it seriously - surely."

    If I wanted to convince people I had psychic powers, I would definitely expect I need a more rigorous form of testing. I don't ask people to believe anything in particular though. I'd be fine if people assumed I was lying. They could still discuss it as a hypothetical of, "If somebody actually did get these results, what would they mean?"

    If people are this reflexively dismissive of any results, I doubt there is any amount of rigor which would change their minds. That's worrying. Imagine if it turned out I did have psychic powers. Would I be able to convince anybody? I have to wonder.

    On the upside, if I did have psychic powers, I might be able to make some money off them via gambling. That'd make people's disbelief a bit more tolerable.

  23. "What baffles me is people who cannot even envision a situation which would change their mind. Similarly, there are people who latch onto any excuse to 'discredit' these results, no matter how ridiculous."

    "It's one thing to assume there is an explanation you aren't aware of; it's another thing to not require there be one. I don't get how people can be so close-minded they cannot imagine their beliefs are wrong."

    Brandon, who are these people? Are they friends, family members, colleagues at work? I've been intently following both of your blog posts on the subject since the beginning, and I can't recall coming across any comments of the type you describe. Have you deleted such unhelpful comments? Until I began posting here recently, I didn't visit the blog every day, so perhaps I missed them.

    In any case, at this point it certainly looks to me as though the staggering cumulative results with your test could be due to psychic ability. That's because of the way your reported mental/emotional states seem to correlate so well with your session outcomes. Similar results were reported by Rhine in the 1930s and '40s with his Zener card experiments, and it was well known even then that test performances correlated with mental/emotional states.

    If you'd like to branch out a bit into independent online psi testing, I think you'll do well to visit the "GotPsi?" website at:

    http://psiresearch.com/

    The test that most closely resembles the Zener card tests you've tried at Michael Daniels' website is the Card Test, but there are six additional tests to choose from. As far as I know, they all run smoothly with no program glitches.

    The GotPsi tests have been around since 2000, and they've attracted hundreds of thousands of visitors over the years. The neat thing about the site is that all user results are saved, so it's easy for users to accumulate lots of data for later analysis. Additionally, there are daily "Hall of Fame" pages for each test, displaying the usernames and test summary data (trials, hits, etc.) for each subject (providing they've done a certain minimum number of trials that day). The Halls of Fame start anew with blank tables each night at midnight, Pacific Time.

    One of the founders of the site, Dean Radin, wrote an explanatory paper about the tests in 2002:

    http://52.24.227.126/gotpsi/html/articles/GotPsi-public.pdf

    It's apparent from the paper that a PRNG was in use during the early years, but I believe that it was replaced by a true RNG before long.

    Dean Radin is also one of the very best communicators around when it comes to psi lab research. You might find his Google Tech Talk (2008) worth watching, especially since it deals in part with the refusal of so many of his scientific colleagues to take his (and others') research seriously. The title of the talk is "Science and the taboo of psi", and it can be found here:

    https://www.youtube.com/watch?v=qw_O9Qiwqew

  24. Doug, I thank you for a very informative comment. The psi test seems like something everyone would want to take at some point. Has there ever been anyone who produced results better than a 95% significance? Can they do it on a reproducible basis?

    Brandon, you know that extraordinary claims require extraordinary proof. This is part of the reason why the climate propaganda aims to normalize catastrophic warming as the default assumption, which makes the claim that it is not certain the extraordinary one, since proof is difficult.

    "On the upside, if I did have psychic powers, I might be able to make some money off them via gambling. That'd make people's disbelief a bit more tolerable."

    If you falsely convince yourself you have psychic powers that could be expensive. If you have controllable psychic powers it would be a good question to ponder what you would do. Making it known might make you famous or gain a visit from some government scientists interested in taking a sample of your brain. Say no. On further thought make sure that the "gotpsi" site does not get to see your results if you would not want the world and the CIA to know.

  25. Hey Doug, I think you might have missed a post or two about this topic on this blog, but I haven't deleted any comments. You probably haven't missed any discussion that's too important. What I'm talking about has been in other locations, including things like Reddit and a couple chat rooms.

    Anyway, thanks for the link to that site. I'll have to check out its tests. I don't believe in psychic powers, but given my results so far, I'm certainly open to trying new tests. For what it's worth though, the phrase "true RNG" is not accurate. There is no such thing. The phrase refers to RNG systems that use certain types of physical processes as seed values in their algorithms. They're good. They can be so good we can't distinguish them from being true RNGs. They aren't perfect RNGs though, and given enough time, humans will likely find non-randomness in them.

    And that's ignoring implementation issues. Using a physical process for a seed value in an algorithm doesn't guarantee issues won't crop up due to limitations in the algorithm itself. Additionally, those physical processes aren't necessarily random. One major problem with using hardware-based RNG systems is the hardware can fail. Often, that failure won't be instantaneous. Instead, the hardware can degrade over time and introduce gradual biases that won't be noticed unless one actively looks for them.

    One of the more interesting hardware based RNG systems I saw used lava lamps for its seed values. It's a cool idea, but I'm fairly certain one could work out patterns in the flow of lava lamps.

  26. Ron Graf, I don't expect people to believe my results. What I'd like to expect is for people who discuss them to do so in a manner that is sensible. For instance, on Reddit a couple people have said I'm dishonest because I ignore a "flaw" in my test - that if I don't have the program tell me whether I win/lose, my results go away.

    That's not a flaw in the test though. These tests were not designed to determine if I have psychic power. I started these tests because in high school I was able to win more than I should have on a graphing calculator. I wanted to see if the same thing would hold true now, with stronger RNG systems. That's all I've ever stated these tests as being for. That doesn't matter to them though.

    "If you falsely convince yourself you have psychic powers that could be expensive. If you have controllable psychic powers it would be a good question to ponder what you would do. Making it known might make you famous or gain a visit from some government scientists interested in taking a sample of your brain. Say no. On further thought make sure that the 'gotpsi' site does not get to see your results if you would not want the world and the CIA to know."

    That's the worst part of all this. If I could prove I had psychic powers, doing so would probably ruin my life. The publicity and attention would be nightmarish to me. I'd hate it. That's not my worry though.

    My worry is what it would mean if I don't have psychic powers yet these results are replicable. Think about what it would mean if I could beat RNG systems which are considered stronger than is necessary for cryptographic purposes. As in, imagine what it would mean if a human could somehow find patterns in RNG systems used for security of all sorts of data. Encrypted internet traffic, database passwords, security key codes. Practically all digital security is dependent upon "random" number generators.

    I know being able to beat the computer at RPS doesn't mean I could break the encryption used for bank transfers. On the other hand, if I can pick up patterns (whether it be by subconscious pattern recognition, psychic powers or whatever) in a "true RNG," then it should be possible for humans to pick up on patterns in any RNG system. The implications of that are frightening.

    Which is one of the reasons I really, really want to find an explanation that is tied to a fault in my code/methodology. That's the only type of explanation I can see which shouldn't scare me out of my mind.

  27. I went to the website Doug suggested and tried the sequential card test. In it, you are presented five cards and have to pick out the one that is different from the rest. On average, it should take you three guesses. For my first run, it took me ~25 guesses to get 10 cards right. I immediately felt a sense of dread at the possibility of beating another test. I then did another 30 trials and am now at 3.23 guesses per card.

    Maybe it's a coincidence, maybe it's not. I don't know. What I do know is I don't want to have psychic powers or anything else like that. Can I please find tests that I'll fail?
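
    The three-guess figure can be checked with a minimal sketch. It assumes the guesser never repeats a card within a trial and that the target is placed uniformly at random; the function name `guesses_needed` is hypothetical and is not taken from the test site.

```python
import random
from fractions import Fraction

def guesses_needed(rng, n_cards=5):
    """Guess cards in a random order without repeats; return how many guesses it took."""
    target = rng.randrange(n_cards)
    order = list(range(n_cards))
    rng.shuffle(order)
    return order.index(target) + 1

# Exact expectation: each guess count 1..5 is equally likely (probability 1/5 each).
expected = sum(Fraction(k, 5) for k in range(1, 6))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = [guesses_needed(rng) for _ in range(100_000)]
simulated = sum(trials) / len(trials)

print(expected)             # 3
print(round(simulated, 2))  # close to 3
```

    Under the same assumptions, the number of guesses in a single trial has a variance of 2, so the standard error of the mean over 40 trials is about 0.22. An average of 3.23 is roughly one standard error above 3.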

  28. Okay, I'm confused. After those 40 trials, I set out to do ten times that many (400 total). I did, and my average was ~3.1 guesses per trial, slightly elevated. I just checked the hall of fame board though, and it says I've only done 300 trials.

    On top of that, I somehow managed to do only 78 runs in a card draw test (take one guess at which of five cards is "right") despite all the batches I did being in multiples of five. Then, I did a 100 trial run, and I wound up with 181 trials. I don't get it.

  29. Okay, I'm confused. After those 40 trials, I set out to do ten times that many (400 total). I did, and my average was ~3.1 guesses per trial, slightly elevated. I just checked the hall of fame board though, and it says I've only done 300 trials.

    I've found your 300 trials for the Sequential Card Test. The raw data for the SC Test is too complicated for me to eyeball 300 trials to try to figure out what happened. To see what I mean, go back to the start page for the test (where you choose the number of trials per run) and click the "Results Summary" button at the lower-right of the page. Once there, click the date on which you did the trials to view the raw data.

    On top of that, I somehow managed to do only 78 runs in a card draw test (take one guess at which of five cards is "right") despite all the batches I did being in multiples of five. Then, I did a 100 trial run, and I wound up with 181 trials. I don't get it.

    That's the Card Test, the one I'm most familiar with. I see that, indeed, the HOF shows you did only 181 trials. If you visit your Card Test results summary page (see instructions above), you'll obtain your raw data for that session. You should save the page and post the HTML file to the blog (after first deleting your username!). I'll look it over and let you know what I find.

    I need to add that, except in extremely rare cases where I suddenly lost my connection, or a flaky mouse managed to fool the server with a double-click (that counted as two trials instead of one), I've never encountered a miscount in the number of trials done per run in any of my sessions. But having said that, I've overwhelmingly stuck with runs of 25 trials for all the tests. I guess it's possible there could be problems with runs of 5 and 100 trials. Or...maybe there's a weird server problem in which the trial totals in your dataset aren't syncing properly with your real-time, onscreen totals. Anyway, email me the html file or post it to the blog when you can, and I'll take a look at it.

  30. Hey Doug, I think you might have missed a post or two about this topic on this blog, but I haven't deleted any comments. You probably haven't missed any discussion that's too important. What I'm talking about has been in other locations, including things like Reddit and a couple chat rooms.

    Thanks Brandon. It makes sense that you'd receive lots of moronic replies to posts on Reddit.

    For what it's worth though, the phrase "true RNG" is not accurate. There is no such thing. The phrase refers to RNG systems that use certain types of physical processes as seed values in their algorithms. They're good. They can be so good we can't distinguish them from being true RNGs. They aren't perfect RNGs though, and given enough time, humans will likely find non-randomness in them.

    Sorry, but I must disagree. TRNGs (true random number generators) based on thermal noise or radioactive decay have been in use for decades, at least among parapsychologists. The late Helmut Schmidt built the first TRNG for psi research in the 1960s:

    http://deanradin.com/evidence/Schmidt1990PK.pdf (see page 234)

    A Dutch company named Orion produced (until 2010) TRNGs that used diode-based noise sources:

    http://web.archive.org/web/20080119133904/http://www.randomnumbergenerator.nl/rng/home.html

    Here's a paper from 2014 that discusses a variety of TRNGs:

    http://cs.ucsb.edu/~koc/cren/docs/w06/trng.pdf

    Additionally, those physical processes aren't necessarily random. One major problem with using hardware-based RNG systems is the hardware can fail. Often, that failure won't be instantaneous. Instead, the hardware can degrade over time and introduce gradual biases that won't be noticed unless one actively looks for them.

    Point taken and appreciated. In order to correct for first-order biases, Orion (see link above) recommended inverting every odd-numbered bit (from 0 to 1, and 1 to 0). They also recommended XORing the TRNG bytes with those from a PRNG. They claimed doing so would guard against "higher order bias effects." For more information about the XORing processes of Orion and Mindsong (another commonly used TRNG) look for "S.8 XOR processing for Mindsong and Orion RNGs" about halfway into this paper by Peter Bancel:

    https://www.researchgate.net/publication/295400221_Searching_for_Global_Consciousness_Supplementary_Materials_-_Experimental_and_analytical_details
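
    The two corrections described above (inverting alternate bits, then XORing with PRNG output) can be sketched as follows. This is an illustration only, not Orion's or Mindsong's actual processing: the bit mask, the function names, and the use of Python's `random.Random` as the PRNG are all assumptions.

```python
import random

def invert_alternate_bits(raw: bytes) -> bytes:
    """Flip every other bit of each byte (the mask 0b01010101 is an assumed bit numbering)."""
    return bytes(b ^ 0b01010101 for b in raw)

def xor_with_prng(raw: bytes, seed: int) -> bytes:
    """XOR each byte with a byte from a seeded software PRNG (a stand-in, not Orion's)."""
    prng = random.Random(seed)
    return bytes(b ^ prng.randrange(256) for b in raw)

def ones_fraction(data: bytes) -> float:
    """Fraction of bits that are 1."""
    return sum(bin(b).count("1") for b in data) / (8 * len(data))

# Hypothetical worst case: a failed source that emits only zero bits.
stuck = bytes(1000)
print(ones_fraction(stuck))                         # 0.0
print(ones_fraction(invert_alternate_bits(stuck)))  # 0.5
```

    Inverting alternate bits balances the count of ones and zeros even for a stuck source, but the output is then a fixed repeating pattern. The XOR with a PRNG stream is what would hide that kind of structure.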

  31. Hey Doug, thanks for explaining how to pull out the raw data. I didn't see that feature. It's cool. It's also helpful in identifying the problem. There are a number of entries in my results like:

    79: cadaeibf, 23:37:33, 3, 3*, 1/ 100, 1, 74
    80: cadaeibf, 23:37:33, 3, 5 , 1/ 100, 0, 0

    And:

    135: cadaeibf, 23:40:51, 2, 4 , 55/ 100, 10, 4
    136: cadaeibf, 23:40:51, 2, 4 , 55/ 100, 10, 0

    Which show I somehow managed to submit multiple answers for the same trial (1/100 and 55/100). In some cases, it's quite strange:

    20: cadaeibf, 23:34:00, 1, 2 , 6/ 10, 2, 1
    21: cadaeibf, 23:34:00, 1, 4 , 6/ 10, 2, 0
    22: cadaeibf, 23:34:00, 4, 3 , 6/ 10, 2, 0
    23: cadaeibf, 23:34:00, 5, 4 , 6/ 10, 2, 0
    24: cadaeibf, 23:34:01, 1, 5 , 6/ 10, 2, 1

    For this trial, I managed to pick five choices somehow, including the same choice twice (4). The target value was three different things (1, 4 and 5) and the time of the trial spanned 2 seconds. I'm not sure what I did that would lead to results like these, but I'll try making sure I play the games more slowly in the future. Maybe that will help.

    As for there only being 300 results for the one test, I'm going to just assume it's my memory being faulty. Maybe I somehow thought I was approaching 400 when I was really approaching 300. I would have sworn that wasn't the case, but...

  32. I just realized you had a comment stuck in moderation Doug. I've released it. You say in it:

    Sorry, but I must disagree. TRNGs (true random number generators) based on thermal noise or radioactive decay have been in use for decades, at least among parapsychologists. The late Helmut Schmidt built the first TRNG for psi research in the 1960s:

    But I'm not sure what the disagreement is. We cannot know any physical property is random. Many that have appeared to be random have turned out not to be, and there's no way to prove a lack of underlying properties influencing the outcomes in some deterministic manner.

    Anyway, I decided to play the sequence game some more today. I picked that one because I didn't have any seemingly-bugged results like in the other. I decided to go with 300 trials since that is how many I did yesterday, but I think that's too many for the future. Having to use the mouse in that game (and potentially having to click 5 times in a trial) makes 300 trials tiring. I think I'll try to do 200 a day for the next few days instead. In the meantime, here are the results in 100 trial sets:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/03/3_6_hists.png

    I mostly posted them in 100 trial sets because I wanted to show the results for the last 100 trials. After the first 200 trials for the day, I felt drained and didn't want to do any more. I pushed through because I had set 300 as the target in advance so I couldn't change that. Maybe this is just a coincidence, but take a look:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/03/3_6_sub.png

    I'm not saying this means anything. I just thought it was interesting. If I had stopped when I felt tired, my results for today would have almost perfectly mirrored my results for yesterday.

  33. Hey Doug, thanks for explaining how to pull out the raw data. I didn't see that feature. It's cool. It's also helpful in identifying the problem. There are a number of entries in my results like:

    79: cadaeibf, 23:37:33, 3, 3*, 1/ 100, 1, 74
    80: cadaeibf, 23:37:33, 3, 5 , 1/ 100, 0, 0

    And:

    135: cadaeibf, 23:40:51, 2, 4 , 55/ 100, 10, 4
    136: cadaeibf, 23:40:51, 2, 4 , 55/ 100, 10, 0

    Thanks for posting these results, Brandon. These are what I had in mind when I wrote about double-clicks in a previous post. In each instance above, the second trial is completely spurious and should be disregarded.

    Whenever I've obtained outcomes like this, it meant it was time to get a new mouse. It seems that for whatever reason, the contacts beneath the left-click button of an old mouse sometimes wind up touching multiple times within a span of so many milliseconds, thus triggering a spurious trial. The late Richard Shoup, the other designer, and maintainer of the original site (https://web.archive.org/web/20150213201842/http://www.boundaryinstitute.org/bi/index.html) acknowledged the problem several years ago but had no solution for it. For me, a new mouse always solved the problem.
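
    Disregarding the spurious second entries can be sketched in a few lines. The sketch assumes the fifth comma-separated field (for example "55/ 100") is the trial counter within a run, and that a counter repeated on consecutive lines marks a double-click. Neither assumption is documented by the site, and the function names are hypothetical.

```python
def parse_record(line):
    """Split one raw-data line into (index, fields). The field meanings are assumed."""
    index, rest = line.split(":", 1)  # only the first colon; the time field has its own
    fields = [f.strip() for f in rest.split(",")]
    return int(index), fields

def drop_repeated_trials(lines):
    """Keep the first entry of each trial; skip entries that repeat the previous counter."""
    kept = []
    previous_counter = None
    for line in lines:
        index, fields = parse_record(line)
        counter = fields[4]  # assumed to be "trial/run length", e.g. "55/ 100"
        if counter != previous_counter:
            kept.append(index)
        previous_counter = counter
    return kept

raw = [
    "79: cadaeibf, 23:37:33, 3, 3*, 1/ 100, 1, 74",
    "80: cadaeibf, 23:37:33, 3, 5 , 1/ 100, 0, 0",
    "135: cadaeibf, 23:40:51, 2, 4 , 55/ 100, 10, 4",
    "136: cadaeibf, 23:40:51, 2, 4 , 55/ 100, 10, 0",
]
print(drop_repeated_trials(raw))  # [79, 135]
```

    A run of length one repeated back to back would be wrongly collapsed by this rule, so it is only a rough filter.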

    Which show I somehow managed to submit multiple answers for the same trial (1/100 and 55/100). In some cases, it's quite strange:

    20: cadaeibf, 23:34:00, 1, 2 , 6/ 10, 2, 1
    21: cadaeibf, 23:34:00, 1, 4 , 6/ 10, 2, 0
    22: cadaeibf, 23:34:00, 4, 3 , 6/ 10, 2, 0
    23: cadaeibf, 23:34:00, 5, 4 , 6/ 10, 2, 0
    24: cadaeibf, 23:34:01, 1, 5 , 6/ 10, 2, 1

    For this trial, I managed to pick five choices somehow, including the same choice twice (4). The target value was three different things (1, 4 and 5) and the time of the trial spanned 2 seconds. I'm not sure what I did that would lead to results like these, but I'll try making sure I play the games more slowly in the future. Maybe that will help.

    Wow, these results are bizarre! Four spurious trials with three different target choices! I don't understand how the target choices could change like that.

    I'd suggest emailing the tests' current maintainer Arnaud DeLorme, but he's seriously buried with work not related to them. After Richard Shoup died in July, 2015, Dean Radin moved the tests to a new site, and the code for the monthly and yearly halls of fame was suddenly broken. In addition, the user results summary pages used to conveniently display summary data for each day of the month on pages that are now blank.

    Personally speaking, I'm grateful that the tests still exist (I believe there were plans to shut down the original site in late 2013, over lack of funding), so I don't like to pester Arno with too many complaints. He and a programmer Alton Moore are probably working for free, donating what little spare time and energy they have left after attending to their normal duties.

    As for there only being 300 results for the one test, I'm going to just assume it's my memory being faulty. Maybe I somehow thought I was approaching 400 when I was really approaching 300. I would have sworn that wasn't the case, but...

    Yes, that seems a reasonable explanation for what happened. I hope the number of glitches you experience on the site drops as you gain more familiarity with it.

  34. Ron Graf wrote:

    Doug, I thank you for a very informative comment. The psi test seems like something everyone would want to take at some point. Has there ever been anyone who produced results better than a 95% significance? Can they do it on a reproducible basis?

    Ron, while I understand the concept of statistical power, I've never applied power equations to my or anyone else's data. I've been content to simply eyeball the cumulative p-values (expressed as odds) of users who appeared to show promise in the monthly and yearly halls of fame (which, sadly, no longer work).

    For promising long-term users of the Card Test, I combined the results of two or more years' worth of data (as found in the yearly halls of fame) to see if their overall results rose in terms of statistical significance. As far as I can recall, they did not. But this doesn't necessarily mean they demonstrated no psychic ability.

    The reason I say this is that for most long-term users with too much time on their hands, the tests can become highly addictive. It's hard for them to discipline themselves to avoid testing on days when they don't feel well or are fatigued or preoccupied with pressing issues. It's also often very hard to quit a session after boredom or fatigue sets in and results begin trending toward chance levels. There's always the hope that the next run will be better. Consequently, multiple runs of bad results get mixed in with multiple runs of good results, and the overall averages wind up at chance levels. Short of using EEGs or fMRIs to monitor brain states, and hopefully achieve a high degree of correlation between those states and test outcomes, I can't see an easy way to untangle the issues inherent to online psi testing.

  35. I'm not saying I'm psychic because I don't believe I am. I don't even believe psychic powers are real. But what am I supposed to do when practically every test comes back the same? I took a couple days off RNG stuff to try to focus on catching up on science-y stuff I've been wanting to cover on this blog. Today was the first day I did any testing since my last comment on this page. My plan was to do 200 runs with the card sequence test like I said above, and I want to ensure I don't cherry-pick endpoints by stopping after a good/bad set of runs. Here are the results:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/03/3_10_sequencing.png

    In the sequence game, you have to pick from five cards until you pick the "correct" one. The expectation is that each possible number of guesses is equally likely. That is, you should have a 20% chance of needing 1 guess, 2 guesses, 3 guesses, etc. That means the distribution in my histogram should be even. The net number of guesses, shown on the right, should fluctuate about 0. That is not what happened here. My results were heavily skewed toward picking the correct card on my first or second guess. They were skewed enough that I made it to the top of today's leaderboard for the game:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/03/3_10_seq_leaderboard.png

    That is notable enough on its own, but the odds posted in that leaderboard are based solely upon the average number of guesses. They don't account for the distribution of guesses. I believe this means my results should be considered even more unlikely than the site says. That's not important though. This is one of the first sets of tests I've run with this site, and I've already gotten highly elevated results. It's creepy.

    I get this might be a coincidence. I'll try more runs and see what happens. I don't know what I'll do if I keep winning though.
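
    One way to account for the distribution of guesses, rather than only the average, is a chi-square goodness-of-fit test against a flat 20% expectation. The sketch below is not how the site computes its odds. It runs on simulated chance-level data, not on the results above, and the function name is hypothetical.

```python
import math
import random
from collections import Counter

def chi_square_uniform(counts):
    """Chi-square statistic and p-value for five guess counts against a flat expectation."""
    if len(counts) != 5:
        raise ValueError("the closed-form tail below only holds for five categories")
    total = sum(counts)
    expected = total / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    # Closed-form upper tail for 4 degrees of freedom (five categories minus one).
    p_value = math.exp(-stat / 2) * (1 + stat / 2)
    return stat, p_value

rng = random.Random(1)  # fixed seed so the sketch is repeatable
# 200 simulated chance-level trials: the number of guesses needed is uniform on 1..5.
tally = Counter(rng.randint(1, 5) for _ in range(200))
counts = [tally[k] for k in range(1, 6)]
stat, p_value = chi_square_uniform(counts)
print(counts, round(stat, 2), round(p_value, 3))
```

    Feeding the function the actual number of trials that took 1, 2, 3, 4 and 5 guesses would give a p-value that reflects the shape of the histogram, not just its mean.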

  36. Well, I don't think you are psychic. In fact, I believe I may have some evidence you aren't, though your pattern of play has proven effective against a computer RNG.

    Here is what I have done:
    I have your results from the previous files of all the games you played, with your moves and the computer's moves: 15,532 games.

    If we play a series of games what result do we expect? RNG vs RNG should have a normal distribution centered around 33.33%.

    So I have done 3 things. All of these consist of running your 15,532 games 10,000 times to build up a really good sample base.
    1) Played newRNG vs newRNG (in my computer) and plotted the result. In this cycle I substituted both your moves and the computer's moves with newly generated random selections.

    2) Played oldRNG vs newRNG (my computer vs your computer's previous moves) - I replaced your moves in the games with a newly generated selection via my RNG.

    3) Played BrSc vs newRNG (your previous selections against new RNG options) - I replaced your RNG opponent with a new RNG opponent randomly generating moves against your previous selections. I think this is the most important step.

    I have recorded the Win/Loss/Tie percentage for each of the 10,000 runs through the 15,532 games, and then provided the High, Low, Mean and Sd for each set.

    Here I will only report on the 'Win' being either my RNG (v my RNG), my RNG (v your RNG) or You (v My RNG).
    1) My RNG W v My RNG 10,000 cycles of 15,532 games:
    Win Max 34.74%, Win Min 32.00%, Mean 33.337% Sd 0.384 Looks like a very normal distribution.
    2) My RNG v your previous RNG choices, 10,000 cycles against 15,532 previous choices:
    Win Max 34.82%, Win Min 31.68%, Mean 33.335% Sd 0.378 Looks like a very normal distribution.
    3) Your previous choices vs my new RNG choices, 10,000 cycles of 15,532 games:
    Win Max 34.72%, Win Min 31.91%, Mean 33.331% Sd 0.38 Looks like this distribution is very slightly skewed to the right (i.e. Win)
    What I mean by that is that in the histogram frequency table, the first few values higher than the mean are very slightly higher than their corresponding value below the mean, and the tail goes a little further right (more outliers). The difference is very slight, but it has repeated over several runs.

    Since you made those choices in response to your own RNG - it would seem to me to be far more likely that some artifact of the patterns in your play (which we have already established) is enough to tilt the odds very slightly in your favour. Certainly there is no way that a psychic prediction over a series of 15,500+ games could continue to hold up for 10,000 sequential repeats against a different random number generator.

    I have saved the sets of runs (the resultant % W/L/T for each type of 10,000 runs) if you are interested in running them through R yourself.

    Anyway, FWIW, I certainly don't believe you are psychic, but I do believe in the power of the subconscious to observe subtle patterns. 😀
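
    The replay in step 3 can be sketched as below. The recorded moves are not reproduced here, so `player_moves` is a deliberately patterned placeholder, and the sketch runs 100 cycles, not 10,000, to keep it quick. All names are hypothetical. If the new opponent's moves are uniform and independent, any fixed sequence wins each game with probability 1/3, so the win rate should centre on 33.33% however patterned the sequence is.

```python
import math
import random
import statistics

MOVES = "RPS"
BEATS = {"R": "S", "P": "R", "S": "P"}  # each key beats its value

def win_rate(player_moves, rng):
    """Percentage of games the fixed player moves win against fresh random moves."""
    wins = sum(BEATS[m] == rng.choice(MOVES) for m in player_moves)
    return 100 * wins / len(player_moves)

def theoretical_sd(n_games, p=1 / 3):
    """Binomial standard deviation of the win percentage over n_games."""
    return 100 * math.sqrt(p * (1 - p) / n_games)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Placeholder for the recorded moves: a repeating pattern, 15,532 moves long.
player_moves = ["R", "R", "P", "S"] * 3883
rates = [win_rate(player_moves, rng) for _ in range(100)]

print(round(theoretical_sd(15_532), 3))  # 0.378
print(round(statistics.mean(rates), 2), round(statistics.stdev(rates), 2))
```

    The binomial figure of about 0.378 percentage points is close to all three standard deviations reported above (0.384, 0.378 and 0.38).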

  37. Peter Green, I'd be interested in looking at your results. Without being able to see them, I can't see any indication your idea is correct.

    In the meantime, I'm going to do some more runs of the sequence card game. I don't have high hopes for it. I had a bit too much to drink last night and am still recovering.

  38. When I'm right, I'm right. Check out these results:

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/03/3_11_sequencing.png

    I know this might be a coincidence, but it's an interesting one if so. Last night I was having fun, was energetic and in a good mood. Last night, my win rate was notably elevated. Tonight, I'm not in a bad mood but am tired and not really up for having fun. Tonight, my win rate is notably depressed.

    It's not like I just failed to win today. I lost by a significant margin. That's weird.

  39. Interesting factoid - the series played by Brandon against the RNG is the only set with a negative skew. All the others are either 0 or positive.

    Having said that, the skews are all very very small (0.0x typically).

  40. Hey Peter Green, sorry about not commenting on your data sets earlier. To be honest, I've been trying not to think about this stuff lately. For what it's worth, I get somewhat different values than you summarized. There might be more, but a couple examples:

    3) Your previous choices vs my new RNG choices, 10,000 cycles of 15,532 games:
    Win Max 34.72%, Win Min 31.91%, Mean 33.331% Sd 0.38 Looks like this distribution is very slightly skewed to the right (i.e. Win)

    The maximum I find for this data set is 34.78, not 34.72. The run with a 34.78 win rate is 2712. I get the same minimum you get, but my mean is 33.332. This one might be explained by the 33.33169 getting truncated to 33.331 instead of rounded to 33.332, but I don't know why our maximums are so different.

    In any event, here are histograms for the wins, losses and ties in that scenario (forgive the lack of proper labeling):

    http://www.hi-izuru.org/wp_blog/wp-content/uploads/2017/03/3_16_hists.png

    If there is any skew in the distributions, I'm not seeing it. One column (directly over 33.5) in the losses histogram is notably lower than in the wins or ties, but the very next column is notably higher. I suspect that's just random variance. Other than it, I'm not seeing much of anything.

    I had meant to do some further examination of your data before commenting, but since I haven't gotten around to it so far, I figured I should at least post up this.

  41. I think the skews are really trivial (only discernible at 2 decimal places), so the results, whilst interesting for a single run, are within the realm of the possible over many runs, even for RNG v RNG.

  42. Ah, okay Peter Green. If you're talking about that small a skew, I wouldn't be surprised if I missed it. I didn't look at the numbers. When the histograms came back the way they did, I didn't see anything that'd make me

    Anyway, for the next week or two I'm probably not going to do much, if anything, with RNG. I've been spending a lot of time delving into the type of work I discuss in my latest post. I've managed to find over two dozen papers based on the same fundamental abuse of correlation tests, and there are a ton more problems in these papers than what I've discussed so far. It's like a large portion of a field of "science" is based upon completely bogus methodologies that can only be used because the authors of these papers have absolutely no understanding of the math they're using.

    It's kind of annoying. Plus, I was supposed to have an eBook written about this a couple months ago. The RNG stuff has already pushed the book back quite a bit. I'd rather not have it postpone the book forever.
