My last couple of posts have examined how data used in two scientific papers, which make up a significant portion of a PhD dissertation by Kirsti Jylhä, appears to have been tampered with. I don't want that issue to dominate the discussion though. While data tampering would obviously be a serious problem, I want to remind people this work was complete nonsense even without concerns of data tampering.
In my last post, I asked for help explaining correlations between Rater IDs for people who took a survey and the responses they gave to that survey. The order in which people take a survey should not affect how they respond to the survey, yet according to a data set I was examining, they do.
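For anyone who wants to run this sort of check themselves, here is a minimal sketch of one way to do it. It is not the code I used, and the data in it is made up purely for illustration. The function names `pearson` and `permutation_p_value` are my own, hypothetical helpers. The idea is to compute the correlation between survey order and responses, then shuffle the responses many times to see how often chance alone produces a correlation that strong.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(ids, responses, trials=2000, seed=0):
    """Share of random shufflings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(ids, responses))
    shuffled = list(responses)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between order and answer
        if abs(pearson(ids, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)

# Made-up example: 200 respondents answering one item on a 1-5 scale.
rng = random.Random(1)
ids = list(range(1, 201))
responses = [rng.randint(1, 5) for _ in ids]

r = pearson(ids, responses)
p = permutation_p_value(ids, responses)
print(f"r = {r:.3f}, permutation p = {p:.3f}")
```

If survey order truly has nothing to do with the answers, small p-values from a check like this should be rare.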
Today I'd like to go further and show even more inexplicable results. I don't like accusing people of fraud or tampering with data, but I can't come up with any other explanation. Perhaps someone else can help me come up with one.
I do not like making accusations of dishonesty. I have done so plenty of times, but each time I did, I first put significant effort into trying to find an alternative explanation. Today's post is for that. I have encountered data with properties I cannot explain. I am hoping someone can find an explanation for me that isn't, "Someone fabricated data."
Hey guys. It's time to resume the series of posts I'm writing about a series of papers, and a PhD dissertation based on them, which got halted because I've been playing too many games of Rock, Paper, Scissors (if you want to know why I've been playing that, see here). Today I will be discussing how the results the authors published are not only based upon an inappropriate methodology, but also fail a basic sanity check.
Today I'd like to take a break from my recent topics of discussion and look at an example of why people should be skeptical of the messaging by global warming advocates. This post isn't about science. I'm not going to argue about any facts or theories. I'm not going to question or put forth facts or evidence.
None of those things matter today. Regardless of what one believes about global warming, everyone should be able to agree on a basic principle: Results should be presented in an accurate manner that does not create a misleading impression of what the results show. And based upon that principle, everyone should be able to agree this display is rubbish:
I'm not questioning the data used to make this display. The data doesn't matter today. What matters today is the data is being displayed in a misleading manner.
Hey guys, as you may have picked up on from my last couple posts, I was fairly sick this week. I'm not completely over it, but I have had the energy to do more than lie around all day doing nothing. Naturally, one of my top priorities has been playing Rock, Paper, Scissors (RPS).
I'm not going to re-visit the history leading up to today's post. You can read the last post I wrote on this subject here. The short version is it seems no matter what I do, I keep beating a computer opponent that makes random choices. This shouldn't be possible. The odds of winning, losing or tying in RPS should be 1/3 when one opponent picks options at random.
Today's post is about an update to my methodology and the results it leads to. I've played 10,000 matches after the update, and I have won 3,454 of those matches. That gives me a win rate of 34.54%, a result that is "statistically significant" at the 99% level.
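For anyone who wants to check the arithmetic, here is a rough sketch of the significance calculation. It assumes the null hypothesis is a win probability of exactly 1/3 and uses the normal approximation to the binomial, which should be reasonable for 10,000 matches. The function name `binomial_z_test` is just my own label, not something from a library.

```python
import math

def binomial_z_test(wins, games, p_null=1 / 3):
    """Return (z, one_sided_p, two_sided_p) under a normal approximation."""
    expected = games * p_null
    sd = math.sqrt(games * p_null * (1 - p_null))
    z = (wins - expected) / sd
    # Upper-tail probability of a standard normal, via the complementary
    # error function: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, one_sided, two_sided

z, p_one, p_two = binomial_z_test(3454, 10_000)
print(f"z = {z:.2f}, one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")
```

With these numbers the win count sits roughly 2.56 standard deviations above the 3,333 or so wins you would expect. That clears the one-sided 99% threshold (z above about 2.33) and falls just short of the two-sided one (z above about 2.58), so which test you have in mind matters here.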
I've been trying to finish my next post involving a case study of the misuse and abuse of statistics to claim to prove global warming skeptics possess certain negative traits (my last post regarding this can be found here). Unfortunately, a number of things are getting in the way. Of special note is it's difficult to talk about statistics as I've largely lost faith in the laws of probability.
I've owed you guys a post for a little while now, and I apologize for how long it's taken. I just can't get past a certain problem. As you may recall, I recently discussed how "correlation is meaningless" in relation to a paper which claimed to demonstrate climate change "deniers" possess certain characteristics. For a quick refresher:
The reason the authors can claim there is a "statistically significant" correlation between these two traits is they collected almost no data from anyone who "denies" climate change. The approach the authors have taken is to draw a line through their data, which is how you normally calculate the relationship between two variables, then extrapolate it out far beyond where their data extends.
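To make the extrapolation problem concrete, here is a small sketch using made-up numbers, not the authors' data. It assumes a 1-to-5 "denial" scale where every respondent scores between 1 and 2, fits an ordinary least-squares line, and then reads off a prediction at 5, far outside anything that was observed. The helper `fit_line` and all the values are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

rng = random.Random(0)

# Made-up survey: everyone scores between 1 and 2 on the "denial" scale,
# so there is no data at all from the high end of the scale.
xs = [rng.uniform(1.0, 2.0) for _ in range(200)]
ys = [2.0 + 0.3 * x + rng.gauss(0, 0.5) for x in xs]

slope, intercept = fit_line(xs, ys)
observed_max = max(xs)
predicted_at_5 = intercept + slope * 5.0  # extrapolated, not observed

print(f"highest observed score: {observed_max:.2f}")
print(f"fitted slope: {slope:.2f}")
print(f"'prediction' for a score of 5: {predicted_at_5:.2f}")
```

Nothing in the fit tells you the line keeps going in the same direction past the highest observed score. The number at 5 comes entirely from assuming it does.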
There are a lot of ways of describing this approach. When I've previously said correlation is meaningless, I used an example in which I demonstrated a "statistically significant" correlation between belief in global warming and support for genocide. It was completely bogus. I was able to do it because I used the same approach the authors used. Namely:
1) Collect data for any group of people.
2) Determine views that group holds.
3) Find a group which is "opposite" the group you study.
4) Assume they must hold the opposite view of the group you studied on every issue.
This will work with literally any subject and any group of people. You can reach basically any conclusion you want because this approach doesn't require you have any data for the group of people you're drawing conclusions about.
Today I want to move beyond simple correlation coefficients and get into some of the more complex modeling the authors performed. There's a problem though. You see, the results the authors published are impossible to achieve.
I've written a post titled "Correlation is Meaningless" once before. It makes the same basic point I made in a recent post discussing the PhD dissertation by one Kirsti Jylhä. I'm going to continue my discussion of Jylhä's work today to examine more of a phenomenon where people misuse simple statistics to come up with all sorts of bogus results. In Jylhä's case, it undercuts much of the value of her PhD.
A couple months ago I contacted a scientist asking to examine the data used in three papers which made up the bulk of her PhD dissertation. The initial response contained this:
Thank you very much for your email and interest in our publications.
We follow ethical guidelines from the American Psychological Association, and we are happy to share our data to other competent researchers. Would you please indicate your background and outline how you plan to use the data?
Which struck me as odd as I have no idea how one would determine which people are "competent researchers." I was pessimistic about this response as it seemed like this might be used as an excuse for not sharing data with me, but fortunately, the issue of whether or not I am a "competent" researcher never came up again.
After examining the data for these three papers, I came to the conclusion the papers were fundamentally flawed in a way which invalidated their analysis and conclusions. I informed the author of this thesis of my concerns and tried to give her time to examine the issue privately. I believe several months was long enough, so now I'd like to discuss the matter in public. Hopefully, this will demonstrate I am in fact competent.