Very Small Differences

As our last post discussed, one of the key issues in this whole Gergis et al affair is how one should screen proxies to decide which ones to use and which ones not to use. In 2012, Gergis et al claimed to have screened their proxies this way:

For predictor selection, both proxy climate and instrumental data were linearly detrended over the 1921–1990 period to avoid inflating the correlation coefficient due to the presence of the global warming signal present in the observed temperature record. Only records that were significantly (p<0.05) correlated with the detrended instrumental target over the 1921–1990 period were selected for analysis.
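To make the quoted test concrete, here is a minimal sketch of detrended correlation screening. Everything below is my own illustration with synthetic data, not code or data from the paper; the point is that two series sharing nothing but a warming trend can show a healthy raw correlation yet fail the detrended test:

```python
# Minimal sketch of detrended correlation screening (my own illustration,
# with synthetic data; not code from Gergis et al).
import numpy as np
from scipy import signal, stats

def detrended_correlation(proxy, target):
    """Pearson correlation after removing the linear trend from both series."""
    return stats.pearsonr(signal.detrend(proxy), signal.detrend(target))

rng = np.random.default_rng(0)
years = np.arange(1921, 1991)               # the 1921-1990 screening window
trend = 0.01 * (years - 1921)               # a shared warming trend
target = trend + rng.normal(0, 0.2, years.size)
proxy = trend + rng.normal(0, 0.2, years.size)  # "proxy" with no real signal
                                                # beyond the shared trend
r_raw, p_raw = stats.pearsonr(proxy, target)
r_dt, p_dt = detrended_correlation(proxy, target)
print(f"raw:       r = {r_raw:+.2f}, p = {p_raw:.3f}")  # typically "significant"
print(f"detrended: r = {r_dt:+.2f}, p = {p_dt:.3f}")    # typically not
```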

They wanted to take steps to avoid a problem commonly known as the Screening Fallacy. Of course, as it turned out, they hadn't actually detrended their data as they claimed, and if they had, their results would have been very different. This gives rise to an important question, namely, why do they say this in their latest paper:

Our results also show that the differences between using detrended and raw correlations to screen the predictor networks, as well as between using field mean and local correlations, are minor (Figs. S1.3 and S1.4 in the supplemental material). Given this insensitivity, local detrended correlations were used to select our final set of temperature predictors (see theoretical discussion in section S1 in the supplemental material).

According to their newest paper, it makes very little difference whether one detrends or not. They also say it makes little difference whether you use "local" temperatures for correlation testing or temperatures of the region, but I'll discuss that in a little bit. Just understand the quotation marks I place around the word "local" are important.

For now, the main question is: how do the authors conclude it doesn't matter whether or not you detrend your data before correlation testing? That's the exact opposite of what they concluded when they were forced to withdraw their 2012 paper.

Before I give you the answer, I should point out there is another option they tested. One problem with doing correlation testing is a thing called autocorrelation. Autocorrelation accounts for how last year's temperatures influence this year's temperatures and things like that. It can artificially inflate correlation scores, so it should be accounted for. Gergis et al did so in both their 2012 paper and this latest one. The method they used isn't important for today, but when looking at their figures, you'll see it referred to as "AR1."
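For the curious, here is a rough sketch of one common way such corrections work: shrink the effective sample size used in the significance test when both series are persistent. This follows the Bretherton et al. (1999) adjustment and is purely my illustration; I'm not claiming it is the exact procedure Gergis et al used:

```python
# Sketch of one common AR1 correction (the effective-sample-size adjustment
# of Bretherton et al. 1999): when both series are autocorrelated there are
# effectively fewer independent observations, so the significance test should
# use a smaller n. My own illustration, not necessarily Gergis et al's method.
import numpy as np
from scipy import stats

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def ar1_adjusted_pvalue(x, y):
    """Two-sided p-value for Pearson r with AR1-reduced effective sample size."""
    r, _ = stats.pearsonr(x, y)
    r1_product = lag1_autocorr(x) * lag1_autocorr(y)
    n_eff = len(x) * (1 - r1_product) / (1 + r1_product)
    t = r * np.sqrt((n_eff - 2) / (1 - r ** 2))
    return 2 * stats.t.sf(abs(t), df=n_eff - 2)

# Two unrelated but persistent (AR1, phi = 0.7) series: the adjusted p-value
# is typically more conservative than the naive one.
rng = np.random.default_rng(1)
def ar1_series(n, phi=0.7):
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + rng.normal()
    return x

x, y = ar1_series(70), ar1_series(70)
print(f"naive p = {stats.pearsonr(x, y)[1]:.3f}, "
      f"AR1-adjusted p = {ar1_adjusted_pvalue(x, y):.3f}")
```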

With that in mind, look at this figure from Gergis et al's Supplementary Material:

[Figure S1.4 from Gergis et al's Supplementary Material]

As this shows, whether or not you detrend these proxies prior to screening makes little difference when you use "local" temperatures. (Again, note those quotation marks. They'll be important in a minute.) The reason for this is simple. Gergis et al use only two proxies going back to 1000 AD. In fact, they use only two proxies going back to 1430 AD if you don't count this one:

[Figure: the Palmyra proxy record]

Given how much data the Palmyra proxy is missing, it has very little influence before 1430. Aside from it, there are only two proxies that extend before 1430 - Mount Read and Oroko Swamp. This figure basically just shows they both pass either form of screening when you use "local" temperatures. That is not exactly a surprising conclusion given the Oroko Swamp record actually has instrumental temperatures spliced onto it after 1957:

The only exceptions to this signal-free tree-ring detrending method were the New Zealand silver pine tree-ring composite (Oroko Swamp and Ahaura), which contains logging disturbance after 1957 (D’Arrigo et al. 1998; Cook et al. 2002a, 2006), and the Mount Read Huon pine chronology from Tasmania, which is a complex assemblage of material derived from living trees and subfossil material.

It is interesting that the two proxies Gergis et al have covering the 1000-1430 AD period were both processed differently from the rest of the proxies they used. I don't know whether or not that matters though, and I'm not going to worry about it today. The point is, if you only have two proxies covering the 1000-1430 AD period, it doesn't matter what test you use as long as both of them pass it.

But what if they didn't both pass? That would be a problem, right? Not according to Gergis et al. You see, the two proxies do not both pass the detrended correlation test if you use regional temperatures instead of "local" ones. Remember, that was the test used in Gergis et al's 2012 paper.* If we apply it to their new paper, the reconstruction becomes:

[Figure S1.3 from Gergis et al's Supplementary Material]

See that blue line? The legend for it says "AR1 detrending fldmean." That means it is what you get when you correct for autocorrelation (AR1), detrend the data prior to screening (detrending) and compare the proxies to regional rather than "local" temperatures (fldmean, for "field mean"). That was what Gergis et al did in their 2012 paper.* Look at what happens when they do it again. I'll zoom in:

[Figure S1.3 from Gergis et al's Supplementary Material, enlarged]

The blue line shows noticeably greater temperature variations, and it goes up quite a bit around 1600 AD, well beyond the uncertainty margins of the reconstruction. Maybe one could consider that "minor." The part I don't see how anyone could consider "minor" is that the blue line ends in 1577.

Maybe it's just me, but when you're making a 1000 year reconstruction, the difference between having and not having a reconstruction for 577 of those 1000 years seems more than a little "minor." I think the appropriate word to describe it would be "major." As in, this paper could not possibly have been published if Gergis et al had detrended their data and compared their proxies to regional, rather than "local" temperatures - the approach they had used in 2012.*

That makes it clear the 2012 approach wouldn't work. But maybe the 2016 approach is just better. Maybe one should use "local" temperatures for screening proxies rather than regional temperatures. After all, it seems reasonable to believe a proxy should describe the temperatures of the location it's at better than the temperatures of an area say... 500 kilometers away?

Each proxy was compared to local HadCRUT3v grid cells in the study domain (10°N–80°S, 90°E–140°W), defined as grid cells within 500 km of the proxy’s location

Oh. Um. The HadCRUT3v grid cells used by Gergis et al are 5°x5°. That means you can find four to nine grid cells within 500 kilometers of any given location. So when the authors say proxies match "local" temperatures, they really mean proxies match temperatures "Over there or over there or over there or over there or over there or over there or over there or over there or over there."
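If you want to see how the counting works, here is a quick sketch. The proxy coordinates are made up, and I'm assuming "within 500 km" means the cell center is within 500 km; counting cells that merely overlap the radius would give even more:

```python
# Count how many 5x5 degree cell centers fall within 500 km of a proxy site.
# Made-up coordinates; "within 500 km" taken as "cell center within 500 km"
# is my assumption about the definition.
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * np.arcsin(np.sqrt(a))

# Centers of a global 5x5 degree grid (HadCRUT3v-style layout).
centers = [(lat, lon)
           for lat in np.arange(-87.5, 90, 5)
           for lon in np.arange(-177.5, 180, 5)]

proxy_lat, proxy_lon = -40.0, 145.0   # hypothetical site near a cell corner
near = [c for c in centers
        if haversine_km(proxy_lat, proxy_lon, *c) <= 500]
print(len(near), "grid cells within 500 km")   # several cells qualify
```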

So if Gergis et al hadn't used the "let's compare proxies to every temperature series we can find" approach and had detrended their data "to avoid inflating the correlation coefficient due to the presence of the global warming signal present in the observed temperature record," they wouldn't have been able to publish this paper.

The only way Gergis et al could manage to publish this paper was to inflate "the correlation coefficient due to the presence of the global warming signal" when doing their screening tests and/or to test each proxy as many as 27 times (up to nine grid cells times three lags), as they even threw this extra feature into their tests:

To account for proxies with seasonal definitions other than the target SONDJF season (e.g., calendar year averages), the comparisons were performed using lags of -1, 0, and +1 years for each proxy.

To try to show their proxies match recorded temperatures, Gergis et al checked up to nine different grid cells, testing each at three different lags. Naturally, they didn't adjust their "significance" scores to account for any of these additional tests.
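To put a rough number on what that omission means: treating the up-to-nine grid cells times three lags as 27 independent tests (an upper bound, since neighboring cells are correlated), pure noise would pass at least one unadjusted p < 0.05 test about 75% of the time:

```python
# Back-of-the-envelope: treat the (up to) 9 grid cells x 3 lags = 27 screening
# tests as independent. They are not fully independent (neighboring cells are
# correlated), so this is an upper bound, but it shows how generous an
# unadjusted p < 0.05 threshold becomes under repeated testing.
alpha, n_tests = 0.05, 27
p_any_pass = 1 - (1 - alpha) ** n_tests         # chance pure noise passes somewhere
sidak_alpha = 1 - (1 - alpha) ** (1 / n_tests)  # per-test threshold that would keep
                                                # the family-wise rate at 5%
print(f"P(at least one of {n_tests} null tests passes) = {p_any_pass:.0%}")  # ~75%
print(f"Sidak-corrected per-test threshold = {sidak_alpha:.4f}")             # ~0.002
```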

But hey, it's all irrelevant because the difference between having a 423 year reconstruction and having a 1000 year reconstruction is "minor," right?

July 18th, 7:30PM Edit: I've added asterisks in a few spots where I referred to the Gergis et al 2012 paper as having used the test its authors said they used but didn't actually use. Please note when I say the paper used the test, I mean that's what the paper said was done and what the authors intended to do. They didn't actually do it though.

Comments

  1. Before anyone comments to point it out, yes, I am aware Gergis et al likely are relying on a semantic trick when they say:

    Our results also show that the differences between using detrended and raw correlations to screen the predictor networks, as well as between using field mean and local correlations, are minor (Figs. S1.3 and S1.4 in the supplemental material). Given this insensitivity, local detrended correlations were used to select our final set of temperature predictors (see theoretical discussion in section S1 in the supplemental material).

    They carefully say there is little difference "between using detrended and raw correlations" and there is little difference "between using field mean and local correlations." They never refer to what happens when you combine these tests. If you carefully refuse to look at detrended field correlations, the test used in Gergis 2012, you can claim there is only a "minor" difference.

    I just refuse to accept such an obvious trick has any validity. If someone wants to say this post is wrong because of it, they can, but I pity anyone who tries to resort to such a fig leaf.

  2. "That was what Gergis et al did in their 2012 paper."

    What they *said* they did but didn't actually do, right?

  3. Er, yeah. That should say something like "claimed they did." I'll try to update the post to fix that in a bit. Doing it from my phone could be tricky.

  4. I think the discussion on this at attp is quite interesting. In my view the site is an apologist echo chamber at the other extreme of WUWT (equally partisan and just as bad), but the linguistic gymnastics that some of the regulars have performed in order to try and justify this are just insane.

  5. I still think the best part is how Anders basically wrote a post promoting the article Joelle Gergis wrote, then turned around and said it didn't matter to him if she had lied - it's bad for people to call her a liar regardless. I'd like to think even he realizes how absurd that is.

    Oh well. It'll be interesting to see if people keep defending/promoting this paper over the next few weeks. There are definitely problems with it. Strangely, one of the problems is they didn't do what they claimed to do. It's eerily reminiscent of what happened with the last version of this paper. (Hopefully I'll have a post on this within the next few days.)
