Mann's Screw Up #2.5 - 5%

In response to my last post in this series, a user (Dr. C) asked me an insightful question. This made me happy because I had intentionally left the issue open for my next post, and that showed some people noticed.

You'll remember that post discussed the sensitivity testing Michael Mann did, which proved his results were not as robust as he had claimed. He stored those results in a directory (infamously) titled CENSORED. Dr. C asked:

Apart from that, how does the censored directory play into this at all? According to Mann, the dataset on which the HS depends was one from Jacoby & D’Arrigo. But the CENSORED directory only included datasets from Graybill & Woodhouse.

This question is on the money. The directory Michael Mann referred to showed the results of removing twenty series from his data set. However, those twenty do not include a series by Jacoby & D'Arrigo. As I explained to Dr. C:

As you note, the “one set of tree ring records” is from Jacoby and D’Arrigo. If you remember, my original list included an item about the (misuse of the) Gaspe series. The Gaspe series is from Jacoby and D’Arrigo. That is the series Mann referred to in that quote.

I intend to write my next post (#2.1) on the Gaspe series, and I plan to delve into this matter more in it. But basically, Mann’s test found if he removed the 19 Graybill and 1 Woodhouse series, he could still get a hockey stick thanks to the Gaspe series. That’s why it was “of critical importance in establishing the reliability of the reconstruction,” not the results of the reconstruction.

In other words, if Mann removed 20 series, he could still get a hockey stick because he included a 21st series named Gaspe. I want to discuss that Gaspe series, but before I do, it's important we understand how Mann et al handled their data.

To begin, Mann et al had 415 data series. Many of these series were from similar areas. To better extract a signal, series from similar areas were combined via a method known as principal component analysis (we'll discuss how Mann implemented it incorrectly in a later post) which reduced the total number of series used to 112. Of these, 31 were created via PCA. The other 81 series were used on their own.

Now then, some of these 112 series extended further back in time than others. To address this, Mann et al broke their temperature reconstruction into periods, using series for whatever periods they covered. If a series didn't extend back to the beginning of a period, it wasn't used in that period.
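To make that rule concrete, here is a minimal sketch of it in Python. The series names and dates are made up for illustration, and the function `series_for_step` is hypothetical. It is not Mann's code, only the selection rule as described above: a series is used in a period only if its data begin on or before the start of that period.

```python
# Hypothetical proxy series: name -> (first_year, last_year).
# These names and dates are invented for illustration only.
series_coverage = {
    "proxy_a": (1400, 1980),
    "proxy_b": (1404, 1980),
    "proxy_c": (1450, 1980),
    "proxy_d": (1600, 1980),
}


def series_for_step(coverage, step_start):
    """Return the names of series whose data begin on or before step_start."""
    return sorted(
        name for name, (first, _last) in coverage.items() if first <= step_start
    )


for step_start in (1400, 1450, 1600):
    print(step_start, series_for_step(series_coverage, step_start))
```

Under this rule, a series that begins even a few years after the start of a period is left out of that period entirely.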

70 (out of 212) North American tree ring series covered the 1400-1450 period. They were combined via PCA into two series. Another 20 series were used along with those two (including one other created via PCA). Would you be impressed to hear Mann's hockey stick depended entirely upon 21/70 = 30% of the data used in 2/22 = 9% of the series?
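For anyone who wants to check the arithmetic, the two fractions can be reproduced in a few lines of Python. The counts are taken straight from the paragraph above; the variable names are mine and purely illustrative.

```python
# Counts as stated in the post (variable names are illustrative).
critical_series = 21      # series the hockey stick is said to depend on
na_series_in_step = 70    # North American tree ring series covering 1400-1450
pc_series = 2             # PCA series built from those 70
series_in_step = 22       # total series used in the 1400-1450 period

share_of_data = critical_series / na_series_in_step
share_of_series = pc_series / series_in_step

print(f"{share_of_data:.0%} of the data")      # 30% of the data
print(f"{share_of_series:.0%} of the series")  # 9% of the series
```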

What if I told you it's worse than that? You see, the Gaspe series doesn't actually extend back to 1400. It only extends back to 1404. As such, it was part of the 212 North American tree rings (as cana036), but it shouldn't have affected the 1400-1450 period at all. It shouldn't have "saved" the hockey stick at all.

So how did it, you ask? Simple. Mann cheated. If you look at the list of the 22 series used in the 1400-1450 period, you'll see one of the series is named treeline11.dat. Here's a graph of the two (black - cana036, red - treeline11.dat):


As you can see, they're both Gaspe. That means Mann used Gaspe twice. But if you look closely, you can see it is worse than that. The treeline11.dat line has a bit of flatness at its start that is absent from the cana036 version. You can verify that by checking the data file, which shows exactly four years of extra data were added to treeline11.dat. That's the exact amount needed to extend the series to 1400, and so the exact amount needed for it to be included in the 1400-1450 period.
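Here is a small Python sketch of what that kind of extension looks like. It assumes the padding simply repeats the first available value, which is what the corrigendum quoted in this post describes. The function `extend_back` and the ring-width values are made up for illustration; only the 1404 start year and the 1400 target come from the discussion here.

```python
def extend_back(series, target_start):
    """Pad a {year: value} series back to target_start by repeating its first value."""
    first_year = min(series)
    padded = dict(series)
    for year in range(target_start, first_year):
        padded[year] = series[first_year]
    return padded


# Made-up values; only the 1404 start year comes from the post.
gaspe_like = {1404: 0.8, 1405: 0.9, 1406: 0.7}

extended = extend_back(gaspe_like, 1400)
added_years = sorted(set(extended) - set(gaspe_like))

print(added_years)  # [1400, 1401, 1402, 1403]
```

Four repeated values are all it takes to turn a series that starts in 1404 into one that "covers" a period starting in 1400, and repeated values show up as a flat stretch at the start of a plot.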

Mann found if he removed 20 series, he could still get a hockey stick because of one other series. That series was an arbitrarily extended (no other series was extended like it) version of a series already included in his data set. The extension of that series was originally undisclosed, acknowledged only after Mann was forced by his critics to publish a corrigendum saying:

For one of the 12 ‘Northern Treeline’ records of Jacoby et al. used in ref. 1 (the ‘St Anne River’ series), the values used for AD 1400–03 were equal to the value for the first available year (AD 1404).

Even after it was acknowledged, no explanation was provided for the extension, and there has never been an acknowledgment of the fact the series was used twice.

Why was the series used twice? We don't know. Why was one use of the series arbitrarily extended back in time? We don't know. Why did these two unexplained decisions result in saving the hockey stick for the sensitivity testing Mann did? We don't know.

Was it fraud? I don't know. What I do know is that is a heck of a lot of coincidences which led to results Mann liked. One could certainly be excused for believing they weren't coincidences.

And for the record, 21/415 = 5%. That's how much of his data Mann says you need to remove to get rid of the hockey stick.


  1. I should point out the two data files I uploaded are not in their original formats. For some reason WordPress doesn't allow you to upload .txt or .dat files. It doesn't change anything other than the filename, but it is annoying.

  2. The Gaspe series is a cedar tree ring series. Cedar is also strip bark. There are some 1000-year old strip bark cedars near Toronto. Cedars grow best in cool moist summers. The Gaspe series used in Mann et al 1998 is atypical. I learned through backchannels that it had been updated but the update was neither archived nor reported. It did not have a HS shape. Jacoby and D'Arrigo refused to provide the data when I asked for it. It was eventually archived only a couple of years ago.

    Note that the plots shown here go to 1984, while the Mann reconstruction goes only to 1980.

  3. It was finally archived? Interesting. I don't know if I ever knew that. If I did hear about that, I must have forgotten.

    Anyway, my plan right now is to highlight how little data is responsible for the hockey stick, focusing on statements from Michael Mann himself. That's to show Mann's work is dependent upon very little data. Then I'll show how Mann manages to give so much weight to that little data. Once I've done that, I'll show that data is questionable, at best. My instinct is that is the clearest way to progress through the issues. I don't think you can combine any of them without confusing people.

    Of course, that's just my impression. I'm always open to suggestions.

  4. Don't forget to note that the Gaspe series starts in 1404, but with just a single tree core between 1404 and 1421, then a grand total of two trees until 1474. Truly remarkable.


  5. Bruce, I intend to. Right now I'm just showing how little data was relevant. Next I'll show how Mann managed to use so little. Then I'll show how that little data is questionable (if not worse).

    I don't want to bog down a post with a bunch of different details. I think most people would be bothered by what is said in this post even if they believe the 21 series referenced are acceptable for use.

  6. I feel like an idiot for the way I worded my comment, now. It should have read, " But the CENSORED directory only excluded datasets from Graybill & Woodhouse." But clearly you knew what I was getting at. I did NOT realize, as Mr McIntyre points out in a different comment, that there were multiple CENSORED directories.

    Brandon, I would also think that it's important to make the distinction between sloppiness and actual deceit. (I think by laying out these issues chronologically, you are implicitly doing this. But it needs to be said explicitly as well.) I think it is fair to characterize MBH98 as (perhaps) novel and interesting, yet fatally flawed due to sloppiness. I see now that the actual deception (and therefore fraud) begins with MBH99. (Previously, I had thought that the deception began with the CYA activities after MM03 came out.)

  7. Dr. C, the problems with MBH98 were not just ones of sloppiness. We get a hint of that in this post as we see Gaspe was artificially extended back in time without any explanation (or even notification). However, it becomes incredibly apparent once we get to the issue of verification scores. Mann et al published favorable r2 verification scores in the 1820 step yet hid adverse r2 verification scores in earlier steps. That wasn't caused by sloppiness.

    That won't be covered for a few more posts though (I'm thinking it's now third in line for this series). In the meantime, the important thing to realize is lies build. You tell a little lie, and that leads to a bigger one, which leads to a bigger one, etc. The chain can even start with a mistake rather than a lie. That doesn't change anything though. Once a chain of lies gets long enough, it stops mattering what started it.

    By the way, MBH98 was only "novel" in that it made several inexcusably wrong decisions which greatly exaggerated its results. It deserves no more credit than any other really bad paper.

  8. While I've been following the theory of Global Warming for a decade or more, I've only recently investigated the science in detail. As someone who has used statistics in the conduct of my business for the last thirty years, I remain baffled by the use of tree rings as a proxy for temperature. I am willing to accept that my opinion may be wrong on that subject, but I have yet to be convinced otherwise. I find myself perplexed when I learn more about the data normalization involved and the arbitrary acceptance and rejection of data sets.

  9. Fabi, there is nothing inherently wrong with using tree rings as a proxy for temperature. They can be just that. The problem is in the details. You need to ensure you can actually extract temperature information from them. That's true of any proxy. The problem is people rarely bother to try. There is no rigorous testing of data or set standards everyone follows. It's pretty much all ad hoc.

    What I find most interesting isn't the questionable data which gets included. What I find interesting is all the data that isn't included. Suppose you looked at 100 tree ring series, and you found you could extract temperature data from 50 of them. That sounds like a pretty good number, but what about all the tree ring series you didn't look at? How do you ensure the sample you're using is representative?

    Basically, the idea is fine. It's just far more complicated than people in paleoclimatology act.

  10. Brandon: Thanks for clarifying that. I should have mentioned, as you did, representative samples. That's a very important aspect and one of many ways to introduce bias into the process. And I don't necessarily mean intentional bias, but sample bias. From a scientific standpoint, there should have been a detailed and open examination of the data long before entities moved forward with model results. Perhaps that's idealistic, but given the high-level exposure and potential impacts to global economies - or, possibly the planet itself, as is claimed by some - it also seems indicated.

  11. Here's an example of the ridiculous way that Mann went about cherry-picking his proxies, from the Climategate series:

    This is a version we gave Tim Osborne when he was visiting here, and since Tim hasn't used
    it, and we haven't compared results from that code w/ our published results, I can't vouch
    for it--it may or may not be the exact same version we ultimately used, and it may or may
    not run properly on platforms other than the one I was using (Sun running ultrix). Scott
    Rutherford (whom I've cc'd on this email) has worked with the code more frequently.
    The code is not very user friendly unfortunately. For example, the determination of the
    optimal subset of PCs to retain is based on application of the criterion described in our
    paper, which involves running the code many times w/ different choices. So the "iterative"
    process has to be performed by brute force.
    The method, as outlined, is quite straightforward and others have implemented it
    themselves. SO you might prefer to code it yourself. That would be my suggestion. But you
    are, of course, free to use our code.
    That having been said, we have essentially abandoned that method now in favor of a
    somewhat more sophisticated version of the approach, which makes use of the RegEM method
    for imputing missing values of a field described by Schneider (J. Climate, 2000).

  12. "You need to ensure you can actually extract temperature information from them"

    What does that mean? Given there also seem to be tree ring studies that extract rainfall information, how does one take a set of tree rings and in one case extract temperature, and the other rainfall? Seems impossible given there is a single value, the width of a ring, or am I missing something obvious?

  13. Fabi, before using data, one should always take reasonable steps to ensure that data is appropriate for what one is doing. Even if someone doesn't believe it's necessary for the initial work, surely there's been plenty of time in the 10+ years since Mann's original hockey stick was published. Instead, we get reconstruction after reconstruction using mostly the same data without anyone having ever established that data is appropriate.

    My favorite example of this is the reconstructions by Christiansen and Ljungqvist a couple years ago. They made new reconstructions basically under the principle of, if other reconstructions have used this data, we'll use it too. I actually e-mailed Bo Christiansen about one of his series as it was published as a precipitation proxy, not a temperature one. I was told people believe proxies in that area can work as both, but when I looked at the paper he provided for this claim, it showed the relationship he claimed existed was often the opposite of what actually existed. I didn't get a response when I pointed that out.

    Paleoclimate reconstructions inevitably rely largely upon a small amount of data in their earlier periods. If that data is good, we should focus upon it rather than mixing it in with a bunch of non-informative data. The problem with that is no author will ever say, "Finding 95% of our data is unimportant, we now look at the 5% which actually support our results."

    Basically, paleoclimatological reconstructions are in their infancy and they're being done poorly. And because of the public impact Michael Mann's reconstruction had, there's every reason not to try to do them right. Why would anyone go through the extra work when shoddily done work gets fame and publicity? Establishing the validity and reliability of proxies won't get that, and if it calls the popular work into question, it'll cause problems for the authors.

  14. climatebeagle, many different things affect tree rings. The key is trying to find situations where we know the dominant effect. For example, suppose there is an area that gets enough rain for trees to survive, but no more. Tree rings in that area will not show much of a precipitation signal. Either the trees will die from a lack of rain, or they will survive. Temperatures could easily be the dominant signal in such a tree.

    Dendroclimatologists have done a great deal of work figuring out what things affect what sorts of trees. In most cases, it's impossible to extract a pure temperature signal. However, there are tons of trees. It being impossible in most cases still means there are lots of cases where it is (or at least may be) possible.

  15. Dr C writes: I did NOT realize, as Mr McIntyre points out in a different comment, that there were multiple CENSORED directories.

    All of the CENSORED directories only consider Graybill chronologies, but there are different directories for different time steps.

  16. I'm not following your 5% comment. Do the other 20 proxies similarly produce hockey sticks?

  17. HughMcdonough, assuming the "other 20 proxies" you're referring to are the 20 used in the 1400 step that weren't discussed in this post, the answer is no. They basically don't do anything.

  18. Brandon, actually the "other 20" proxies show an early 15th century that is warmer than the close of the Mann reconstruction. This was the effect that we observed in our 2005 papers (and even 2003). But, as you say, one can equally say that they don't do anything. If one replaces the other 20 proxies with red noise, one gets a reconstruction that has similar verification statistics to MBH.

  19. That is a good point. The same is true of Mann 2008. Some of Mann's defenders like to downplay the fact the no-dendro/no-Tiljander reconstruction fails verification, saying we shouldn't be surprised by that since so much data was removed and it still nearly passes. They ignore (or are unaware of) the change in shape you get when you remove that data.

  20. The Mann 2008 paper is what convinced me of the existence of a Medieval Warm Period. He does all that algorithmic trickery, and he still needs upside-down or bristlecones to produce a hockey stick. Means the data was against him.

  21. "Brandon Shollenberger

    Fabi, there is nothing inherently wrong with using tree rings as a proxy for temperature."

    Er, yes there is. There is no reason, a priori, to suppose that tree ring width will correlate with temperature. Indeed, the whole reason that tree rings are truncated in reconstructions is that tree ring growth does not correlate well with temperature.
    Trees are very complex biological entities and have evolved to optimize their breeding strategy. One hypothetical strategy would be to invest a larger amount of resources in generating seeds, rather than wood, when environmental conditions are very good for seedlings. If this is a viable evolutionary strategy, then a plot of temperature vs ring thickness would be a pair of overlapping Gaussian distributions, like these:

    Imagine temperature as the 'x'-axis and tree ring growth as the 'y'-axis. A tree is not in the business of getting bigger for the sake of it; it is in the business of having many progeny.

  22. "he still needs upside-down or bristlecones to produce a hockey stick."

    This is what Brandon and Steve want you to believe, but it isn't true.

  23. DocMartyn, if you wish to say it is inherently wrong to use tree ring data as a proxy for temperature, you can. I won't argue the point. I'll merely state there are thousands of people who say otherwise while spending much of their life studying trees, and if you want to convince me they're wrong, you'll have to rebut the work they've done to show they're right.

    Boris, it is true. It is even acknowledged by Gavin Schmidt and Michael Mann. If you wish to say they are wrong though, have at it.

  24. "Boris, it is true. It is even acknowledged by Gavin Schmidt and Michael Mann."

    You're confusing non-dendro with non-strip bark bristlecone. Two different things.

    And of course we're ignoring Salzer's work showing that there is no difference between strip-bark and non-strip bark bristlecone pines in the modern period. Since that was the reason the NAS report gave for avoiding strip bark BCPs, it is reasonable to include these trees in reconstructions.

  25. I'm not confusing anything, though yes, Mann et al tested their results by removing tree rings, not just the tree rings under dispute. That's great for people like you who need any excuse to defend Mann's work.

    "They removed all tree ring data, so your criticisms are invalid."
    "They did the tests because we criticized specific tree ring data."
    "That doesn't matter. That's not what they removed!"

    (And your paragraph about Salzer is just as wrong.)

  26. So you don't know if removing bristlecones and tiljander from Mann 2008 causes the reconstruction to fail to validate? Interesting.

  27. Do you know what happens to the reconstruction if you remove strip-bark bristlecones and Tiljander? How far back can it validate?

    And my paragraph about Salzer is right--as you'll see when you get around to reading it.

  28. Ok Brandon, but may I just draw your attention to Linah N. Ababneh's Ph.D. thesis (page 79, APPENDIX II) where she examines the strip-bark and whole-bark chronologies of bristlecone pine from the White Mountains of California, with respect to temperature and precipitation.

    García-Suárez, Butler and Baillie (2009) is worth a read too.

    Chen, Yuan and Wei (2011) demonstrate that even trees in mountains are water, not temperature limited, with respect to tree rings.

  29. DocMartyn, you can try to draw my attention to whatever you'd like, but I cannot begin to guess why you'd want me to look at those sources. At least one of them explicitly contradicts your view.

  30. "Boris, given you made an issue of people providing definitions, you’re going to have to say what “the reconstruction” is."

    The reconstruction of past temperatures in Mann '08.

    Do you know what happens to the reconstruction if you remove strip-bark bristlecones and Tiljander? How far back can it validate?

  31. I do have an answer. In fact, I have two. I have one answer for the CPS reconstruction in Mann 2008, and I have another answer for the EIV reconstruction in Mann 2008. Given there are two different reconstructions in the paper, and you've repeatedly referred to "the reconstruction" in Mann 2008, I can't possibly know what you're asking after.

    And I can be reasonably confident you have no idea what you're talking about. After all, if you don't know a basic point about a paper, that it has two different reconstructions, you couldn't have read it. You couldn't have even read the summary of it.

  32. Brandon,

    I concede that you can beat me at whatever language games you are trying to play. I'm not really interested in playing them. You have said that this statement:

    "he [Mann] still needs upside-down or bristlecones to produce a hockey stick"

    is true. You have stated that Gavin Schmidt and Michael Mann have admitted it. Of course, that isn't true because they were speaking about reconstructions that removed all tree ring proxies--not just the ones that you claim are invalid. So your claim that Michael Mann and Gavin Schmidt have admitted that they need "upside-down proxies and bristlecones to produce a hockey stick" is false.

    This is where you began to pretend to not know what a reconstruction was.

    You have shown no evidence that the statement in question is true. Perhaps you have evidence and continue to keep it to yourself. Perhaps you have no evidence at all.

  33. Boris, I have never pretended "to not know what a reconstruction" is. All I've done is mock you for not knowing Mann's 2008 paper contained two separate temperature reconstructions, a point you've still not acknowledged. Even now, you've failed to state which temperature reconstruction you are asking about, implying you still believe there was only one.

    You've made an issue of people providing definitions for the terms they use. You asked a question about what happens to "the reconstruction" when certain data is removed. It is perfectly reasonable for me to expect you to define which reconstruction you're referring to when you say "the reconstruction."

    So please, stop making things up. Alternatively, do a better job of it. At least then there'd be something to discuss.