I Hope I'm Dreaming

I remember when the blogger Anders, back when he was going under the name wottsupwiththatblog, responded to a person criticizing the infamous hockey stick graph by saying:

That graph was misleading and didn’t fairly represent the science.

I think you need to expand on this. I’ve been reading other, more recent, papers that seem to produce temperature reconstructions that, broadly, match that produced by MBH98. I have a recent post with comments from some experts (including Rob Wilson) who acknowledge that the overall picture hasn’t changed much since MBH98. So, in what way did it not fairly represent the science?

The post he links to says things like:

I’m aware that there are some issues with this work but, overall, it has been heavily scrutinised and the conclusions are, mainly, that these issues don’t influence the results particularly significantly. Furthermore, it appears to have been replicated in many other studies using many different proxies (although, I believe, that tree rings do dominate).

And:

Essentially, I can’t find anything that makes me think that there is a real concern that proxy reconstructions of our past temperature history are fundamentally flawed. There may be better ways to produce the reconstructions, but that’s progress and doesn’t imply earlier work is fundamentally flawed.

But just now I read a post he wrote yesterday about issues in another paper, in which he says:

I believe I’ve seen Richard argue that the errors were not significant because the updated result is statistically consistent with the result in the original paper. Personally, I would argue that a new result being statistically consistent with a result that is wrong, doesn’t really allow one to suggest that the original conclusions stand. If it did, you could publish a paper with lots of errors and draw erroneous conclusions. You could then publish another paper that corrects some of these errors but not enough to make the new result statistically inconsistent with the original result; hence arguing that the original conclusion stands. You then publish another paper correcting other errors, but not enough so that the second update is statistically inconsistent with the results for the first update; hence arguing that the original conclusion still stands (since it is the conclusion for the first update). You keep doing this until the final result is completely different to the original result, but in such a way that each update is statistically consistent with the previous update and hence that your final conclusions are the same as your original conclusions. It’s clever, but not really correct 😉
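To make that chain concrete, here is a minimal Python sketch. The sequence of results, the shared error of 1.0 and the two-sigma rule for "statistically consistent" are all made up for illustration; none of it comes from the papers being discussed.

```python
import math

def consistent(a, b, sigma, z=2.0):
    """Call two estimates 'statistically consistent' when they differ by
    less than z combined standard errors (each assumed to have error sigma)."""
    return abs(a - b) < z * math.sqrt(2) * sigma

# Hypothetical sequence of published results: each update moves by 2 units.
estimates = [0.0, 2.0, 4.0, 6.0, 8.0]
sigma = 1.0

# Every update is consistent with the one before it...
step_ok = [consistent(a, b, sigma) for a, b in zip(estimates, estimates[1:])]
# ...but the final result is not consistent with the original.
end_ok = consistent(estimates[0], estimates[-1], sigma)

print(step_ok, end_ok)  # [True, True, True, True] False
```

Each step passes the pairwise check, yet the first and last results fail it, which is exactly the problem with treating consistency with the previous result as confirmation of the original.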

This is what the original hockey stick looked like:

1-14-MBH

This is what Anders says confirms it:

1-14-confirmation

It's similar in design to a graph created by the Intergovernmental Panel on Climate Change for its Fourth Assessment Report. Because I have the data for that graph handy, I'll use it instead. And instead of throwing a whole bunch of lines together in one image, which hides the differences, I'll plot how each reconstruction "confirms" the original Michael Mann hockey stick (MBH) by showing MBH in black with the "confirmation" alongside it. I'll also exclude the post-1900 portion of the reconstructions, as they're all calibrated to the modern instrumental record and thus not independent in the modern period. The effect extends further back than 1900, but after 1900 is when it is most notable.
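As a rough illustration of why the modern portion can mislead, here is a Python sketch using two made-up series, not real reconstructions. They are unrelated wiggles before 1900 but share the same ramp afterward, standing in for calibration to the same instrumental record. The series shapes, the amplitudes and the `pearson` helper are all assumptions chosen for the example.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

years = list(range(1000, 2000))

def blade(year):
    """Shared modern warming (hypothetical): zero before 1900, then a ramp."""
    return 0.02 * (year - 1900) if year >= 1900 else 0.0

# Two synthetic "reconstructions": out-of-phase wiggles plus the same blade.
recon_a = [0.2 * math.sin(y / 30) + blade(y) for y in years]
recon_b = [0.2 * math.cos(y / 30) + blade(y) for y in years]

# Correlation over the whole record versus the pre-1900 portion only.
full_r = pearson(recon_a, recon_b)
pre = [i for i, y in enumerate(years) if y < 1900]
pre_r = pearson([recon_a[i] for i in pre], [recon_b[i] for i in pre])

print(f"full record: {full_r:.2f}, pre-1900 only: {pre_r:.2f}")
```

In this toy setup the full-record correlation should come out high while the pre-1900 correlation sits near zero, even though nothing about the two series agrees apart from the shared calibration period.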

The first comparison is with Mann & Jones 2003:

1-14-MJ03

The two are fairly similar, but there are a lot of similarities in how they were created (as one might guess by Michael Mann being the lead author on both). Briffa et al 2001 is more different:

1-14-BOS2001

There is no particular similarity between the two outside the portion affected by calibration. Briffa 2000:

1-14-B2000

Is closer. Jones 1998:

1-14-JB98

Is similar at least in the portions where neither reconstruction has any real variation. Esper et al 2002:

1-14-ECS02

Is practically nothing like MBH. Rutherford 2005:

1-14-RMO

Is fairly similar, but not only are all three authors of the original MBH hockey stick authors on this paper as well, this paper uses the same data as MBH. It's hardly relevant that people can get the same results by using the same data. It's far more interesting to look at what results other people get when using different data, such as in Moberg 2005:

1-14-MSH05

Which again, doesn't confirm much of anything. Neither does D’Arrigo 2006:

1-14-DWJ06

Nor does Hegerl 2006:

1-14-HCA06

In each case, what we find is that the further one gets from the original MBH hockey stick, the more the results change. The change is gradual, based on how different the authors are, how different the data used is and how recent the work is. It's only by mixing all of the different reconstructions together that one can pretend they "confirm" one another. But doing so isn't right. As Anders put it:

I would argue that a new result being statistically consistent with a result that is wrong, doesn’t really allow one to suggest that the original conclusions stand. If it did, you could publish a paper with lots of errors and draw erroneous conclusions. You could then publish another paper that corrects some of these errors but not enough to make the new result statistically inconsistent with the original result; hence arguing that the original conclusion stands. You then publish another paper correcting other errors, but not enough so that the second update is statistically inconsistent with the results for the first update; hence arguing that the original conclusion still stands (since it is the conclusion for the first update). You keep doing this until the final result is completely different to the original result, but in such a way that each update is statistically consistent with the previous update and hence that your final conclusions are the same as your original conclusions. It’s clever, but not really correct

The only thing he missed out on is the part where you use the fact all the results are calibrated to the same thing (the modern temperature record) to make them look like they agree more than they actually do.

I hope I'm just dreaming this all up.

11 comments

  1. Diogenes, glad to hear it!

    By the way, sorry your comment landed in moderation. I really don't like having to approve people's first comments. It's just the most effective way I have for combatting spam at the moment. I am looking into using a CAPTCHA system instead though. If I do, it won't be for every comment. It'll just be for the first a person makes.

    The other option would be to just let people register an account. Either way, I'd like to have it where a first-time commenter doesn't need to wait an unknown amount of time for me to spot their comment.

  2. I don't think so HaroldW. You can't adhere to a standard while switching between different standards!

  3. "You can’t adhere to a standard while switching between different standards!"

    Well I have to say Brandon S for one who has been following climate science for so many years you remain remarkably ignorant of the acceptable machinations within the science. From my own empirical observations it is perfectly OK to use two different standards if the first standard hasn't given the required results. The rationale is quite simple, science is a moving feast, so if you do have to use different standards it's as a result of knowledge gained in the use of the first standard i.e. it hasn't given the right result, if the second standard gives the right result it's because it's merely an adjustment to the first standard and therefore both standards are the same, the second one being the most recent instantiation of the standard. Once you've done this and interpolated the missing data that was giving the wrong result you get the right result even if you're two 15 year olds working out of a garage in Oldham you will be acclaimed as moving the science forward with shouts of "the latest Wannabe and Newbie paper proves that the observations have been wrong all this time" from all and sundry. And Julia Slingo proclaiming, "We're out of the woods."

    On a less serious point, is Anders seriously a scientist/physicist? He came onto my radar last year, we had a brief twitter exchange during which he abused me up hill and down dale before storming off and blocking me for abusing him. So I went to his site and drew the conclusion that he really wants to be accepted as a player in the warmist world but can't seem to get through the door into the gang hut that is SkS. In itself that's remarkable as it appears to me anyone prepared to hurl abuse or tell lies in the cause gets through that door with consummate ease. I wonder what they know that we don't?

  4. No prob! It always bothered me that so many people said later work "confirmed" the hockey stick when newer work looks nothing like a hockey stick yet nobody had actually bothered to demonstrate that, so I figured, why not? I wish I were more graphically oriented though. The visuals could be made so much better.

  5. 1) What you claim is the original hockey stick isn't the original hockey stick. That went back only to 1400.
    2) When you compare it to the other hockey sticks you don't crop it to the same 1400.
    3) When you plot the graphs, you don't compare like to like, including tropical or only extratropical and so on which would be like with like.
    4) No error bars. On any of them. Fail.
    5) The hockey stick demonstrated that the recent warming was far faster and probably higher than at any time in the period recorded in the graph. EVERY SINGLE ONE SHOWS THAT.

    So, apart from all THAT, what has the hockey stick done to show itself right, huh?

  6. Wow, I think you're confused:

    1) What you claim is the original hockey stick isn’t the original hockey stick. That went back only to 1400.

    Michael Mann, Raymond Bradley and Malcolm Hughes published a paper in 1998 claiming to reconstruct temperatures back to 1400 AD. In 1999, they published another paper extending their results back to 1000 AD. It is this extended reconstruction which is widely known as the hockey stick, having been the one which was prominently displayed by groups such as the IPCC.

    It would serve no purpose other than to be pedantic to call the 1998 reconstruction the "original hockey stick." You certainly wouldn't be correct to, as you suggest:

    2) When you compare it to the other hockey sticks you don’t crop it to the same 1400.

    Truncate it to 1400 AD. Even if one wants to call the 1998 paper the "original hockey stick" and the 1999 follow-up an "extended hockey stick," that is purely a matter of semantics. It has nothing to do with the issue being examined here - how well the widely popularized temperature reconstruction matches with other temperature reconstructions. As for your remark:

    3) When you plot the graphs, you don’t compare like to like, including tropical or only extratropical and so on which would be like with like.

    I am not the one who said these reconstructions should be compared to one another. The purpose of this post was to examine a common response from people who defend the hockey stick, such as the one by the blogger Anders which I quoted: that subsequent work replicated the hockey stick. By saying reconstructions replicated its results, he is the one who said these reconstructions should be compared to one another.

    In fact, the reconstructions I used in this post were plotted together in a paper which Michael Mann (and 16 other people) co-authored. He clearly felt it was appropriate to compare them to one another. A very similar graph was included in the IPCC AR4 report. This means you are criticizing me for doing something widely accepted in the paleoclimate field. If you would like to criticize the people of the field who make these graphs as well as people like the blogger Anders who make these comparisons, you're welcome to. However, it is strange to criticize me for merely showing the results of the comparisons these people say should be made.

    4) No error bars. On any of them. Fail.

    You can claim this is some damning criticism with no further explanation than, "Fail," but that won't convince anyone. The claim made was subsequent work replicated the original hockey stick. Whether or not these papers fell within the uncertainty intervals of one another does not address that claim. Uncertainty intervals are useful for seeing if results contradict one another, but they are not very useful for seeing if results replicate one another.

    Additionally, I'll note comparisons such as these are often made without showing uncertainty intervals, a point shown in this post which included a figure where many reconstructions were plotted together with no uncertainty intervals included.

    5) The hockey stick demonstrated that the recent warming was far faster and probably higher than at any time in the period recorded in the graph. EVERY SINGLE ONE SHOWS THAT.

    When reconstructions are calibrated to the modern instrumental record, the fact they show a significant amount of warming in the modern portion is largely irrelevant. There is much more to be said about the issue, but that is not the topic of this post. This post was not about whether or not recent warming is unprecedented. It was, quite simply, about whether or not the original hockey stick was replicated by subsequent work. Replicating the original hockey stick requires doing more than just getting a warmer modern period than previous periods. It requires getting a hockey stick - flat shaft; curved blade. None of these do.

    One could easily decide subsequent work did not replicate the hockey stick as it gave very different shapes yet also decide that work showed recent warming was unprecedented. I happen to think that position is wrong, but it is a position which would be perfectly in line with the point of this post.

    So, apart from all THAT, what has the hockey stick done to show itself right, huh?

    You mean, apart from you insisting we ignore the first 400 years of the hockey stick for no particular reason? From you insisting we must include uncertainty intervals when determining whether or not reconstructions replicated the shape of a previous reconstruction? From subsequent reconstructions having shapes nothing like a hockey stick...?

    I suspect you just made this comment of yours to score points, but if you really want to have a discussion, I think it'd be better if you changed your approach.

  7. This is an interesting post, thanks for the work you do on this site.

    As a layperson trying to make sense of this entire issue, and having some amount of experience in modeling complex datasets (albeit of financial phenomena, not "scientific" ones), it certainly seems that there has been quite a bit of liberty taken in the reconstructions.

    1) If error bars are so wide as to be able to tell two opposing stories with respect to historical temperatures, then the entire endeavor of reconstructing past temperatures has more work to do, full stop. This shouldn't be remotely controversial. This is an "if" statement, as I've not done a comprehensive survey of every graph with error bars, but I've seen a few dozen, and the bars seem ridiculously wide.

    2) If one smoothes out historical temperatures, one must smooth out modern temperatures as well. In some of the graphs, this doesn't appear to have been done. I don't think we can help the "splicing" issue except to make sure that data sets are properly disclosed. Regardless, consistent presentation is relevant as well.

    3) Updated proxy information would be helpful, now that we do have a brief thermometer-based history of temperature. I've sniffed some blogs indicating this exists, but haven't seen them myself, perhaps there's a good source for updated proxies? I'm guessing tree rings are still being counted somewhere.

    Thanks again, I just found your blog, and wish I had the time to analyze some of the data myself, but at present, only have time for armchair quarterbacking 🙂

  8. Justin, it's good to be appreciated. I agree with your general points, but I'd like to elaborate:

    1) One thing to keep in mind is the uncertainty listed by authors is generally not the true uncertainty. Some of the uncertainty calculations used for proxy reconstructions have been so flawed as to be bizarre. For instance, in one case uncertainty in results shrank as one went back in time. In another case, uncertainty was 0 for a particular period in the past (due to aligning proxies by that period). I don't think any reconstruction has actually shown the true uncertainty in its results. For instance, none address the uncertainty caused by the choice of which proxies to use. Whenever there's a reconstruction using various proxies, one needs to ask, "What other proxies are there which weren't used?" Unless one can be sure the choice of proxies was unbiased (hint: it never is), the stated uncertainty will always underestimate the true uncertainty.

    2) I've never seen the modern temperature record smoothed to account for the difference in temporal resolution between it and reconstructed temperatures. In fact, even when authors explicitly acknowledge the difference in temporal resolution (e.g. the Marcott reconstruction), they still show the two series together even though doing so creates a very misleading impression. In one case (again, the Marcott reconstruction), the authors explicitly acknowledged their reconstruction lacked the temporal resolution needed to compare past temperatures to current ones while going to the press and getting tons of attention... for the uptick given by comparing modern temperatures to past ones.

    3) Getting more up-to-date proxy data would definitely be good. There is some effort to collect new proxy data, but it is often unheeded. In some cases authors have intentionally used older proxy data rather than the updated versions, with the apparent explanation being the updated versions didn't give them the answer they wanted (as seen by comparing the different versions). In other cases, proxies from 20+ years ago have been used in a dozen or more temperature reconstructions. Again, these proxies match the "expected" results. The normal approach to this sort of field, I would think, would be to compare and contrast all the different proxies available to see what similarities and differences there are. That's generally not done.

    Anyway, I'm glad you found the site. I hope you continue to enjoy it!
