The tweet which led to my last post also led to an exchange on Twitter which I found somewhat peculiar as it involved things like being told I was promoting conspiratorial nonsense. Despite my best efforts, I was unable to find out what this nonsense was, and alas, it appears we may never know what conspiracy theories I have been espousing.

That mystery aside, the exchange allowed me to state why I think people deriding the pursuit of e-mails from climate scientists via legal means like Freedom of Information requests are in the wrong. It's not that I care about the e-mails themselves. I don't. However:

There has been a long history of climate scientists involved in the global warming debate refusing to share information/data. One of the most famous examples was when climate scientist Phil Jones responded to a person asking for data by saying:

We have 25 or so years invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it.

This sort of reaction is not limited to climate science. Examples of it have been discussed in many fields. I think it's silly. If you're a scientist who believes in his/her work, you should have no problem with people examining it. Refusing to share information/data with people, especially because of perceived traits you believe that person has, is completely unscientific.

When scientists behave in such an unscientific manner, I see no problem with people trying to get access to information they were denied in other ways, such as using the legal system. I don't think that's a remarkable view, but my tweet above led to this response:

I think that question is silly, as it seems it should be easy to see at least some examples of what I referred to. If a person publishes a paper and refuses to archive the data used in it, the lack of such an archive can often be apparent. If an author of a paper fails to describe steps they took in their analysis, that can often be apparent. So forth and so on.

That said, the person I was exchanging these tweets with asked me several times to state what data I cannot find, so I offered to write a post highlighting some examples. What comes next is a list of just a few examples of data and/or information I have wanted to examine but been unable to because researchers refuse to share it.

Lonnie Thompson - I don't think any discussion of a failure to archive paleoclimate data could be complete without mentioning the Thompsons. Lonnie Thompson's wife has a far worse record on archiving data than he has, but his data has been far more influential for multiproxy temperature reconstructions. Four proxies seen in many such reconstructions which come from his work are Quelccaya 1984 (not to be confused with his later expedition there in 2003), Dunde, Dasuopu and Guliya.

Until 2012, practically no data was archived for these proxies. The latest of the expeditions was carried out in 1997, and the first data for them was archived only in ~2004. Even then, it was just decadally averaged δ18O (oxygen-18 isotope) data from a few cores. In 2012, more than 20 years after much of the data had been collected, more data was archived, but even now the archives contain very little information. When coring ice in expeditions like this, thousands of samples are taken with many different variables being measured. Thompson's current archives contain only a tiny portion of the data he collected.

And that's when he archives data at all. For at least one expedition, Thompson never archived anything. Back in 2002, Thompson went on an expedition to a place called Bona-Churchill to collect data. To this day, he has not released any of it. Some information about it is available due to it having been used (such as here). As a result, despite no actual data for the expedition being available, there is enough information to let interested parties see the data did not support conclusions Thompson was fond of promoting.

That seemingly "bad" data which fails to confirm results isn't publicized is worrying, but it also creates a strange situation where some people have access to some data while others do not. Thompson shared data with specific individuals which he did not archive, but because there was no public archive people could rely on, numerous different versions of proxies created from his work showed up in scientific publications. I know of at least four different versions of the Dunde proxy which have shown up in multiproxy reconstructions, including at least two different versions used by papers Thompson was an author of, in the same year (2006). Even worse, at least one version of the Dunde proxy goes back hundreds of years further into the past than the archived data.

There is far more to say about Thompson's archives and how inadequate/non-existent they are. I could also talk about data problems with them we can't examine because of the lack of archiving (don't get me started on the Dasuopu proxy), but I would like to also point out an example of him hiding information. In the well-known movie An Inconvenient Truth, former Vice President Al Gore repeats a claim made in the book it was based on:

The correlation between temperature and CO2 concentrations over the last 1000 years – as measured by Thompson’s team – is striking. Nonetheless the so-called global warming skeptics often say that global warming is really an illusion reflecting nature’s cyclical fluctuations. To support their view, they frequently refer to the Medieval Warm Period. But as Dr Thompson’s thermometer shows, the vaunted Medieval Warm Period (the third little red blip from the left below) was tiny in comparison to the enormous increases in temperature in the last half-century – the red peaks at the far right of the graph. These global-warming skeptics – a group diminishing almost as rapidly as the mountain glaciers – launched a fierce attack against another measurement of the 1000 year correlation between CO2 and temperature known as the “hockey stick”, a graphic image representing the research of climate scientist Michael Mann and his colleagues. But in fact scientists have confirmed the same basic conclusions in multiple ways with Thompson’s ice core record as one of the most definitive.

He says this while standing in front of a chart he claims proves the "hockey stick" is sound. This chart, which is explicitly identified as being taken from Thompson's work, is in fact that very "hockey stick." That is, Gore showed the original "hockey stick" while incorrectly attributing it to Thompson and claimed it proved the original "hockey stick" correct. Gore called Thompson his friend in An Inconvenient Truth, and Thompson was a scientific advisor for the movie, but to this day he has refused to make any public statement correcting this error.

Gergis et al. - I'll try to keep this section more brief. Back in 2012, a paper, Gergis et al. (2012), which had been accepted for publication, was taken down and never formally published. The paper had claimed to screen paleoclimate proxies against detrended instrumental temperature data in order to avoid the risk of choosing proxies simply because they went up in the modern period. This claim was false. The authors of the paper had inadvertently failed to detrend their proxies, and ultimately, the journal refused to publish the paper because of the error.
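For readers unfamiliar with the idea, here is a minimal sketch of what detrended screening is meant to guard against. This is not the authors' code or method. The two series are made up, and the helper names `detrend` and `pearson` are my own. Both series rise steadily, but their year-to-year wiggles are unrelated.

```python
# Illustrative sketch only: generic detrended screening on made-up numbers.
# Nothing here comes from the Gergis et al. data or code.

def detrend(values):
    """Subtract a least-squares straight line fitted against the time index."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    slope = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    return [v - (v_mean + slope * (t - t_mean)) for t, v in enumerate(values)]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical series: a shared upward trend plus unrelated wiggles.
temperature = [t + w for t, w in enumerate([1, -1, 1, -1, 1, -1, 1, -1])]
proxy = [t + w for t, w in enumerate([1, 1, -1, -1, 1, 1, -1, -1])]

raw_r = pearson(proxy, temperature)
detrended_r = pearson(detrend(proxy), detrend(temperature))
print(f"raw r = {raw_r:.2f}, detrended r = {detrended_r:.2f}")
```

The raw correlation is high purely because both series go up, while the detrended correlation is close to zero. A proxy like this would pass raw screening and fail detrended screening, which is why it matters which one was actually done.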

Four years later, the paper resurfaced. In the new version, the authors claimed to have used detrended screening on their proxies. They claimed to do so over the 1931-1990 period. As a first step of examining the reconstruction, I tried to replicate their screening. When I did, I found out the authors had failed to archive important data. You see, the authors archived proxy data and temperature data for the region they were examining (basically, Australia), but they didn't archive the gridded temperature data they used. This was a problem as their primary results were based on screening proxies by that gridded data.

Of course, the authors did say what data set they used. However, that data set had changed over the years since they accessed it, and the contemporary data set gave results which did not match those the authors published. This made it impossible to replicate/verify the authors' calculations involving the gridded data.

I examined other aspects of what they did, finding a number of data errors (I believe six or seven proxies had the wrong locations listed), but then I made a remarkable discovery. It turns out the authors had not used the 1931-1990 period for screening like they claimed. Instead, they had used the 1921-1990 period. This meant the paper's calibration and validation periods (1921-1990 and 1900-1931) overlapped with one another. And because the authors chose not to archive their gridded temperature data (or provide it when requested), it was impossible for me to check to see what effect this mistake had on their results.
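The overlap is easy to check from the year ranges alone. This is a small sketch using only the two periods quoted above. Treating both endpoints as inclusive is my assumption, and `shared_years` is just a helper name I made up.

```python
def shared_years(period_a, period_b):
    """Return the years two (start, end) periods share, endpoints inclusive."""
    start = max(period_a[0], period_b[0])
    end = min(period_a[1], period_b[1])
    return list(range(start, end + 1))  # empty list when there is no overlap


calibration_used = (1921, 1990)  # period actually used for screening
validation = (1900, 1931)        # validation period as listed above

overlap = shared_years(calibration_used, validation)
print(len(overlap), "shared years:", overlap[0], "to", overlap[-1])
```

Under that assumption, every year from 1921 through 1931 is used both to calibrate and to validate.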

Kevin J. Anchukaitis - Okay, this section isn't really about Kevin. I just thought it was interesting an example I wanted to use involved two papers he is an author on as he is the person I was talking to on Twitter. I didn't realize that when I discussed the papers before. Small world. Anyway, a 2012 paper I read while trying to track down some data said:

The tree-ring chronology network available for reconstruction of summer temperatures is shown in Fig. 1 as a series of filled red triangles. It is comprised of 323 annual tree-ring chronologies previously identified as potential predictors of the Palmer Drought Severity Index (PDSI) over monsoon Asia (Cook et al. 2010a) and 99 new treering records contributed by members of the PAGES Asia2k project, for a total of 422 chronologies.

This confused me as later the same paper said:

Although neither property is desirable for our specific application, and almost certainly limits the quality of the temperature reconstruction that is currently achievable, this new tree-ring network is actually denser than the 327-chronology network used to successfully reconstruct drought over monsoon Asia (Cook et al. 2010a).

Both passages cite the same (Cook et al. 2010a) paper for the data in question, even though one paragraph said it contained 323 series while the other said it contained 327. When I looked at the cited paper, it said there were 327 series. No explanation was provided as to why the network of 327 series was pared down to 323. This naturally raised the question, "Which series were removed, and why?"

At the time, that discrepancy was just a mild peculiarity to me. Later, I looked at how that 2012 paper influenced a much larger project (PAGES2k). It turns out that project used the 2012 network for its temperature reconstruction of Asia. It took the 323 series I mentioned, added 99 more, then removed 193 which did not extend back beyond 1600, and that's what it went with.
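To keep the numbers straight, here is the bookkeeping as a quick sketch. Every count comes from the passages quoted or described above. The variable names are mine.

```python
# Series counts as reported in the text; variable names are illustrative.
cook_series_first_mention = 323   # "323 annual tree-ring chronologies"
cook_series_second_mention = 327  # "327-chronology network"
new_asia2k_series = 99
removed_short_series = 193        # did not extend back beyond 1600

full_network = cook_series_first_mention + new_asia2k_series
retained_series = full_network - removed_short_series
unexplained_gap = cook_series_second_mention - cook_series_first_mention

print(full_network, retained_series, unexplained_gap)
```

That gives a full network of 422, a retained set of 229, and four series whose removal is never explained.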

To their credit, all 229 proxies can be found in the PAGES2k data archive. However, I have been unable to find the full 422 network or any listing of the 323/327 series networks. Additionally, while the data for the 229 proxies is provided, these are all tree ring chronologies. They are created by taking a number of measurements and combining them via a mathematical process. There is no universally accepted process for making tree ring chronologies, and differences in the methodology used can impact one's results. Verifying this sort of work requires access to the measurements that go into the calculations, not just the results of the calculations.

Doran and Zimmerman - This example isn't paleoclimate related, and I apologize for dragging this post out even more (good god, I did not expect this to go on so long). However, I think this is an important example. In 2009, Peter Doran and Maggie Kendall Zimmerman published a paper titled Examining the scientific consensus on climate change. It reported results for a survey of 3,146 people.

This survey asked people if they felt temperatures had risen or fallen since 1800 or remained relatively constant. Those who said temperatures had risen or fallen were then asked if they thought human activity was a significant contributing factor. People who said temperatures remained relatively constant were filtered out and not counted. The result was:

Results show that overall, 90% of participants answered “risen” to question 1 and 82% answered yes to question 2. In general, as the level of active research and specialization in climate science increases, so does agreement with the two primary questions (Figure 1).

Without knowing how many people were excluded, though, it is difficult to interpret what that "82%" means. This can be seen in the authors' reporting of results for active climate scientists:

In our survey, the most specialized and knowledgeable respondents (with regard to climate change) are those who listed climate science as their area of expertise and who also have published more than 50% of their recent peer- reviewed papers on the subject of climate change (79 individuals in total). Of these specialists, 96.2% (76 of 79) answered “risen” to question 1 and 97.4% (75 of 77) answered yes to question 2.

If you don't filter two people out with the first question, the consensus would be 75 of 79, or 94.9%. That is a trivial detail. What is crucial is the authors report these further results:
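The effect of the filter on the specialist numbers can be reproduced in a few lines. This sketch uses only the counts from the quoted passage. The `percent` helper and the one-decimal rounding are my own choices.

```python
def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


specialists = 79      # all specialists in the group
answered_risen = 76   # said "risen" to question 1
asked_question_2 = 77 # remained after the question 1 filter
answered_yes = 75     # said yes to question 2

q1_share = percent(answered_risen, specialists)
q2_share_filtered = percent(answered_yes, asked_question_2)
q2_share_unfiltered = percent(answered_yes, specialists)

print(q1_share, q2_share_filtered, q2_share_unfiltered)
```

The same 75 people give 97.4% or 94.9% depending on whether the denominator is 77 or 79. That is a small shift here, but it shows why the size of the filtered-out group matters for the larger sample.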

The two areas of expertise in the survey with the smallest percentage of participants answering yes to question 2 were economic geology with 47% (48 of 103) and meteorology with 64% (23 of 36).

And no more. Those are survey results for 218 people from a survey of 3,146. Results for 93% of the respondents were not published. They were not even published in the thesis written on the survey's results (which I bought in the hopes of finding such information). Even worse, a follow-up paper seeking to do a meta-analysis on the "consensus" on global warming had these researchers as authors. That paper lists the three subsets of people highlighted above as the only "consensus estimates" from this paper. Two dozen other groups of people responded to the survey, yet because Doran and Zimmerman never published their data, they are now capable of co-authoring papers which pretend 93% of their data doesn't exist.
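For anyone who wants to check the 218 and 93% figures, here is the arithmetic as a short sketch. It simply adds up the denominators from the quoted results, as I did above. The dictionary keys are my labels.

```python
total_respondents = 3146

# Group sizes taken from the denominators in the quoted results.
reported_groups = {
    "climate specialists": 79,
    "economic geology": 103,
    "meteorology": 36,
}

reported = sum(reported_groups.values())
unreported = total_respondents - reported
unreported_share = round(100 * unreported / total_respondents, 1)

print(reported, unreported, unreported_share)
```

The three groups add up to 218 people, leaving 2,928 respondents, or about 93%, with no subgroup results reported.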

I think this is more than long enough so I'm going to stop here. There are many more examples I could discuss, but the reality is anyone involved in climate science (or any number of other fields) should know full and complete data archives are often not available. Indeed, climate science is hardly the only field where some researchers have refused to share any data at all.

I don't think me saying this makes me a conspiracy nut either.

  1. As a word of caution to readers, I tried to be careful while writing this post, but there were a lot of different details which went into it. As I noted in this post, sometimes data has been archived more than a decade after it was collected. My experience is when that happens, it is generally not announced clearly or prominently. Given that, and given how many details are involved, it is possible someone will find some data archived that I was unaware of.

    I did try to check to make sure that wasn't the case as I wrote this, but for every detail I mentioned in this post, there were half a dozen more I considered discussing but cut for space. There's a lot of material, and I'm not perfect. Please keep that in mind before going, "Gotcha!"

