2012-03-21 10:14:15 Why HadCRUT3 is wrong
Kevin C


OK, this is a bit weird, so I'll explain what I'm doing first.

There are lots of ways to convince. No idea how many. But here are 3:

1. Evidence-based reasoning. However, we know that the information deficit model doesn't work, so there is limited scope here. I think it's a valid approach for SkS because I think that SkS probably serves more to equip climate communicators than to address deniers directly.

2. Personal testimony/narrative journey (what evangelical Christians call 'sharing your testimony'). Works because it's more human, more personal, and you can't argue with experience. Andy S's recent post is an excellent example. Barry Bickmore does it well (Mormon influence?), and Admiral Titley too.

3. What I call the Dan Brown/Da Vinci Code approach. Yes really, hear me out.

DVC was really effective in establishing a nutty conspiracy theory in the heads of lots of otherwise reasonable people. How? Partly by effective storytelling. But also because he continuously reveals just enough of where he is going for you to leap ahead and get there first. That gives you ownership of the idea. It's not his idea, it's yours, which creates a stronger attachment to it.

In the post below, I'm experimenting with this idea. I don't know if it's workable. Also I can't judge the tone at all - I need to know your impressions on reading it. But I think it's worth a try.

2012-03-21 10:17:55
Kevin C


Why HadCRUT3 is wrong

The UK Meteorological Office have for many years published estimates of the global mean surface temperature record from 1850. Over the last decade it has been noted that this record has shown little or no warming. Foster and Rahmstorf (2011) have shown that two natural cycles, the El Nino Southern Oscillation (ENSO) and the solar cycle, have contributed to this apparent slowdown in global warming. However, there is another very significant factor:

HadCRUT3 is wrong

Well, to be fair, all the records are wrong. That's the nature of data in every field of science. However, most of the time we get by with data which is 'good enough'. HadCRUT3 fails that test; it is obviously and provably wrong, which is why the Met office are updating it.

Why is HadCRUT3 wrong? There are a couple of reasons. This post highlights one of them; to see why, all that is required is one statistical principle and two pieces of data.

The statistical principle: Sampling a stratified population

Suppose you want to determine some statistic on a large dataset, say the average height of the population of the USA. You could simply measure everyone. But that would be impractical. So normally you would measure the heights of a representative sample group. If the group is large enough, the average height of the sample group will give a good estimate of the average height of the population as a whole.

Or will it? Suppose three quarters of your sample group are women. Women make up approximately half of the population as a whole. But women are on average shorter than men. If women make up three quarters of your sample group, then the average height of the sample group (the 'sample mean') will be lower than the average for the population as a whole (the true 'population mean'). The sample group is not representative of the population, and as a result produces a biased estimate.

The problem is that the population is stratified - it is divided into groups with different statistics. A representative sample from this population must be both 'big enough', and contain appropriate proportions of the different strata - in this case men and women.
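The size of this bias is easy to work through with made-up numbers. The sketch below assumes mean heights of 176 cm for men and 162 cm for women (illustrative values, not survey data) and compares a sample that is three quarters women with a population that is half women. The function names are hypothetical.

```python
# Assumed stratum means, for illustration only (not survey data).
MEN_MEAN_CM = 176.0
WOMEN_MEAN_CM = 162.0


def mixed_mean(frac_women):
    """Mean height of a group containing the given fraction of women."""
    return frac_women * WOMEN_MEAN_CM + (1.0 - frac_women) * MEN_MEAN_CM


def stratified_mean(women_mean, men_mean, frac_women_population=0.5):
    """Weight each stratum by its population share, not its sample share."""
    return (frac_women_population * women_mean
            + (1.0 - frac_women_population) * men_mean)


population_mean = mixed_mean(0.50)   # population: about half women
sample_mean = mixed_mean(0.75)       # sample: three quarters women
bias = sample_mean - population_mean

# Re-weighting the stratum means by population share removes the bias.
corrected = stratified_mean(WOMEN_MEAN_CM, MEN_MEAN_CM)

print(f"population mean {population_mean:.1f} cm")
print(f"sample mean     {sample_mean:.1f} cm (bias {bias:+.1f} cm)")
print(f"stratified mean {corrected:.1f} cm")
```

With these assumed numbers the unrepresentative sample reads low, while averaging each stratum separately and weighting by the known population shares recovers the population mean.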

Now consider a more complex case. Samples are to be taken a decade apart to determine the rate at which the population are growing taller. The first sample consists of 50% men and 50% women. The second sample, a decade later consists of 25% men and 75% women. The first sample is unbiased, the second is biased low. The resulting trend may erroneously suggest that the population are growing shorter, when in fact they are growing taller.
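A minimal sketch of this two-sample case, using assumed heights and an assumed true growth of 1 cm per decade for both sexes (all numbers are illustrative):

```python
def sample_mean(frac_men, men_cm, women_cm):
    """Mean height of a sample with the given fraction of men."""
    return frac_men * men_cm + (1.0 - frac_men) * women_cm


# Assumed mean heights (cm) at the two sampling dates.
men_then, women_then = 176.0, 162.0
men_now, women_now = 177.0, 163.0    # both sexes 1 cm taller

# Population is 50% men at both dates.
true_change = (sample_mean(0.5, men_now, women_now)
               - sample_mean(0.5, men_then, women_then))

# First sample is 50% men, second sample only 25% men.
apparent_change = (sample_mean(0.25, men_now, women_now)
                   - sample_mean(0.50, men_then, women_then))

print(f"true change     {true_change:+.1f} cm")
print(f"apparent change {apparent_change:+.1f} cm")
```

Under these assumptions the population really grows taller, but the shift in sample composition is large enough to make the measured change negative.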

Two pieces of data concerning HadCRUT3

1. Land surface temperatures have been increasing more quickly than sea surface temperatures, as would be expected given the higher heat capacity of water. The following figure shows the area-average temperature anomalies from CRUTEM3 and HadSST2:
(Alternatively, look at this figure from GISTEMP.)

2. Land coverage in the HadCRUT3 record has been declining over the past 50 years. The following figure shows the proportion of the HadCRUT3 global sample drawn from land measurements. The actual proportion of the Earth's surface covered by land is about 29%.


(You can get a reasonable estimate of this graph based on the coverage figures given on alternate lines of the CRUTEM3 and HadSST2 data files, however the figures themselves are slightly peculiar: The land and ocean coverage exceed the fractions of the surface covered by land and ocean, and in some cases add up to more than 100%. This is due to the coarse 5 degree grid, and the fact that coastal cells are treated as both 100% land and 100% ocean. The figure above is a more accurate estimate based on the gridded datasets and a high-resolution land mask.)
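The double counting of coastal cells can be illustrated with a toy grid. This is only a sketch: the five cells, their land fractions, and the assumption of equal cell areas are invented for illustration and are not taken from the CRUTEM3 or HadSST2 files.

```python
# Fraction of each toy cell that is land (assumed values, equal-area cells).
land_fraction = [1.0, 0.5, 0.2, 0.0, 0.0]
n_cells = len(land_fraction)

# Coarse counting: any cell containing some land counts as 100% land,
# and any cell containing some ocean counts as 100% ocean.
coarse_land = sum(1 for f in land_fraction if f > 0.0) / n_cells
coarse_ocean = sum(1 for f in land_fraction if f < 1.0) / n_cells

# Counting with a land mask: each cell contributes its actual fractions.
masked_land = sum(land_fraction) / n_cells
masked_ocean = sum(1.0 - f for f in land_fraction) / n_cells

print(f"coarse: land {coarse_land:.0%} + ocean {coarse_ocean:.0%}")
print(f"masked: land {masked_land:.0%} + ocean {masked_ocean:.0%}")
```

In the coarse count the two coastal cells appear in both totals, so land and ocean coverage together exceed 100%; with the mask they sum to exactly 100%.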

Putting it together

The proportion of land readings in the HadCRUT3 sample has been dropping since the 1960s, and has dropped from ~25% to less than 23% since 1995. Over the same period the land temperature anomalies have been increasing faster than the sea surface temperature anomalies, with the greatest differences occurring since 2000.

The resulting bias in the HadCRUT3 data due to this effect should be clear. This is however only one source of bias in the HadCRUT3 data. Another source will be examined in a future article.

2012-03-21 11:34:06
Tom Curtis


Kevin C, I think the approach is a good one.  A couple of general points:


1)  I assume the surges in the proportion of land based temperature measurements in approximately 1914-1920 and 1940-1950 are primarily a result of a reduction in SST coverage due to the disruption of shipping in WW1 and WW2.  This is not directly germane but is probably worth mentioning as it may otherwise be a distraction.


2)  From a denier point of view, the first thing they will look at is 1998, which is the hottest year in both CRUTEM3v and HadSST2.  It follows that even with adjustments for the change in distribution, 1998 would still be the hottest year in the record.  Deniers will argue that it follows that there has been no significant warming since 1998 and hence that your argument is a simple distraction.


3)  To counter the denier argument above, it would be nice to have a corrected HadCRUT3v generated using UEA's methods but modified by weighting land stations inversely to the land representation in the record to compensate for the effect, showing at least annual values and the trend.  Incorporating such a corrected HadCRUT3v into the trend calculator might also be useful.


4)  It would be very nice to have a statement of the actual effect on the temperature trend of this biasing, even if the suggestion in comment (3) is too difficult/time consuming.


5)  Also, given the argument mentioned in (2), I think it is desirable to have a follow up post mentioning some of the other problems with HadCRUT3v which are biasing it low, particularly as some of them that you have mentioned are carried forward into HadCRUT4v (if I remember correctly).  One of them is lack of spatial coverage, which has been partially corrected with HadCRUT4v, and so can be discussed in the post which will greet its release.


6)  Given your stated objective of letting people leap ahead and get ownership of the idea, it seems to me that your first section, suitably reworded, belongs at the end of the post, not the start.  For that purpose, the title should probably also be changed to not present the key conclusion before people have reached that conclusion themselves.  Unfortunately I can make no suggestion of a suitable replacement.


Finally, as you know I have been very interested in your analysis of HadCRUT3, and am very excited to see it finally reaching the point of public presentation.  Well done.  Climate science communicators are going to be in your substantial debt for this work.

2012-03-21 17:57:12
Kevin C


Thanks Tom, that's exactly the sort of feedback I was looking for.

I agree with all of it I think. I'll try moving all of the first section from 'HadCRUT3 is wrong' to the end. And think of a new title.

 HadCRUT3: does it need fixing?

 HadCRUT3: If it's not broke, don't fix it?

I had no idea how to write the ending, and so left it dangling, but you're right, I'll add a simple equation for the bias to show how trivial it is, plot a bias graph, give some indication of how the results change, and look forward to the second post, which will be on latitude bias.

However I don't want to present the intermediate results in the trend calculator until I can show the results from the second step too.

What I'd like to do is release data and methods as well - but to get this out quickly (and I think current events in the deny-o-sphere make that germane), that may have to wait for a follow up post of its own, like Tamino's data and methods post.

(It would still have been nice to do this in a paper, but events overtook me.)

2012-03-21 21:50:52
Tom Curtis


Kevin, I'm not sure events have overtaken you.  It is generally known that HadCRUT3 has poor coverage of the Arctic, which has now been fixed with HadCRUT4.  But it also has poor coverage of the Sahara, the location of many of the 19 national temperature records set in 2010, a major reason why 2010 did not set a record.  There is also the land/ocean balance which you have identified in this post, the hemispheric averaging issue and (apparently) the latitude problem (which you will have to refresh me on).  So of several significant problems contributing to the low trend, HadCRUT4 has fixed one, based on the information I currently have access to.  That means much of your analysis is still topical and publishable in a scientific journal.


One thing that has changed is that with the prospective release of HadCRUT4, there is going to be a major denier attack on the UEA and HadCRUT4.  Given that, where earlier I would have suggested you not place your chance at publication at risk by doing a blog post, the clarity you bring to the issues with HadCRUT3 is too important tactically not to blog about now, IMO.  But to the extent that you are able to, I would still recommend that you publish a paper.


With regard to the title, "If it ain't broke" doesn't sell it to me.  Also, if you do go with it, I would keep it shorter to make it punchier.  Something like "HadCRUT3:  If it ain't broke ..." (or indeed, just, "If it ain't broke ...").  Personally I would prefer something along the lines of "Trends and Temperatures".  I am even fonder of "Gone to water", but that's just because I am an inveterate punster.


A very small point, switching colours on your two trend sample graphs would maintain a greater visual consistency in the analogy.

2012-03-22 01:54:33
Dana Nuccitelli

I think the title should be something about HadCRUT improving their data set.  It's not a matter of being broken or not, just incomplete (in the case of CRUTEM) and needing an adjustment (in the case of SSTs).

Maybe 'HadCRUT4 Corrects for Known Biases' or something like that?

2012-03-22 02:42:16
Doc Snow
Kevin McKinney

OK, this is where a naive question (or rather, set of questions) comes in.

I'm not really familiar with the inner mechanism of HADCRUT--I've read the description of the algorithm/process in the past, but the details have receded in memory.  I know that the high Arctic is largely left unsampled, and that that gap is left largely unfilled in any way.

Are you saying that land grid cells are simply omitted when there is no data available?  (E.g., the Saharan data mentioned in comments above.)  And there's no area-weighting function at all?  Surely that would be too obvious (and large) a bias to be overlooked hitherto?--after all, that's roughly a 10% decrease in land coverage since '95. 

Deniers will be all over that--I'm remembering 'Chiefio' here, with his fortunately very careless claims about 'cut' stations and resulting bias.

I expect I'm confused--but if I am, I probably won't be the only one.  So I think the HADCRUT part of the article will need to be clarified, and probably expanded.

That said, the setup portion of this article is really excellent--if one fails to grasp sample bias after this exposition, then--!

2012-03-22 08:16:35
Tom Curtis


Dana, I don't think HadCRUT4 does correct for this bias.  We won't know for sure until the paper launching it is released and we can see detailed discussion, but so far pre-launch comment has focussed on the inclusion of a large number of new Arctic stations, and on better corrections for the transition from bucket based to engine intake methods of measuring SST in the 1940s.  Therefore I don't think the title should refer to HadCRUT4 at all.  Nor do I think discussion of HadCRUT4 should be included in this blog post.

If Kevin C can get the land-ocean sample ratio over time for HadCRUT4, which presumably has improved, and if he can calculate the effect of that improvement on the trend by itself, then a brief discussion of the effect of HadCRUT4 at the end would be appropriate.  If he is not able to do both, discussion of HadCRUT4 just confuses an elegant discussion of a significant, and little thought of problem.

As noted above, I think a follow up post discussing some of the other problems with HadCRUT3 is desirable, and I think Kevin has agreed to do one.  Depending on how general that post is, discussion of HadCRUT4 would be appropriate in it. 

2012-03-22 08:38:48
Kevin C


Current plan:

Post 1: Land/ocean bias in HadCRUT3. This is really simple and obvious, and the bias term is very clear. (Will update the version above to show this as soon as I can). However it doesn't on its own make 2010 hotter than 1998. Trying to get a metric which makes a better narrative. Will also try and get a private version of the trend calculator up for my adjusted data before bed.

Post 2: Latitude bias in HadCRUT3. This is messier - it is a cooling bias over the past 15 years, but before that shows big wiggles. But if you put the two corrections together, 2010 is hotter than 1998.

Post 3 or 4: Methods and code post. No real science communication, it's a formality needed to complement the rest. Better not to detract from the communication posts, and it won't be ready anyway.

Post 3 or 4: Apply the same methods to HadCRUT4, and show that it either has or hasn't dealt with the bias problems. I won't know until they release the gridded data.

2012-03-22 09:11:30
Tom Curtis


Doc Snow, HadCRUT does simply omit land cells with no data.  Everybody who has examined the difference between GISTEMP and HadCRUT has noted that, and has further noted that the effect is equivalent to treating those land cells as having the same temperature trend as the global temperature trend.  So far as I know, Kevin C is the first to notice the additional effect of biasing the sample over time as discussed above.

As I understand it, Kevin C and others have found the following problems with HadCRUT3:


1)  Sample bias over time (discussed above).

2)  HadCRUT3 obtains a global average by taking the mean of their NH and SH indices.  Because of the very different Land/Ocean ratios in the two hemispheres this introduces a low bias over time.

3)  HadCRUT3 weights the different cells for area in taking a mean, but does not weight them based on the number of cells with data in each latitude band.  As temperature increase is partly a function of latitude, and missing cells are disproportionately located at high latitudes, ie, areas of rapid warming, this introduces a low bias over time.

4)  HadCRUT3 has data from very few high Arctic stations, and as the High Arctic is the latitude band with the most intense warming, this also introduces a low bias over time.

To extend Kevin's example above, (2), (3) and (4) introduce a bias by having a constant but underrepresented proportion of boys, with the boys growing faster.  Taking a simple mean will under represent the growth of the overall population.  Because (2), (3) and (4) are related, increasing the number of Arctic cells (ie, correcting (4) as is done by HadCRUT4) will reduce the bias introduced by (2) and (3), but not eliminate it.  Of course, if HadCRUT had truly global coverage, (2) and (3) would not be a problem.

There are other problems in addition to (1)-(4):

5)  HadCRUT3 has low coverage in other sparsely inhabited regions of the world, including central Australia and, most importantly, the Sahara.  Because these regions have temperature trends close to the global mean, they do not introduce a significant long term bias (unlike the lack of Arctic stations).  However, because annual temperature variations are significant on a regional level, this can introduce a significant error in year to year comparisons over the short term.  In particular, North Africa was exceptionally hot in 2010 (almost as much so as Russia), whereas it was not unusually hot in 1998.  The result is that HadCRUT3 (and 4) understates the temperature of 2010 relative to 1998.  As noted, however, this effect would be inconsequential for long term (30 year plus) trends because it would average out.

6)  There is a problem with how HadCRUT handle cells which contain both Ocean and Land.  I am not up on the details of this one at all, however, so I will leave further explanation to Kevin.

7)  There is an additional latitude based problem in that HadCRUT cells are defined by latitude.  That means that in areas of sparse coverage, a single station near the equator will have its temperature extrapolated over a much larger area than a single station at high latitudes.  This is compensated for by area weighting, but it also means that there will be more cells with no data for a similarly sparse network at high latitudes when compared to the low latitudes.  This problem exacerbates (1)-(4) rather than introducing an independent bias by itself.

Please note that the order of listing is not the order of the strength of the effect.

Kevin C is of course far more up on this than I am, and I made that list from memory.  Consequently if he could extend it (and correct it where I am wrong) I would appreciate it. 

2012-03-22 09:29:38
Kevin C


Doc, I'll try to clarify.

HadCRUT3 has two bias problems: The land/ocean problem and the latitude problem. The missing Africa data is probably also a bias, but not one I can pick up with anything I've done so far.

Fixing these problems should (and I think does) put HadCRUT somewhere between NOAA and GISTEMP. Adding more data is better, but adding more data and not using biased methods is best. I think they should use the BEST method, with GISS as a poor-man's approximation.

HadCRUT weights correctly by cell area (i.e. with latitude). The fact that the cells vary in area with latitude is a bit silly though. It exacerbates the latitude bias problem, but is corrected by stratified sampling (my unwritten post #2).
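For a sense of how much the cells vary in area, here is a short sketch using the standard formula for the area of a latitude-longitude cell on a sphere. The 5 degree cell size comes from the discussion above; the function name and the choice of a unit sphere are assumptions made for illustration.

```python
import math


def cell_area(lat_south_deg, lat_north_deg, dlon_deg=5.0):
    """Area of a latitude-longitude cell on a unit sphere."""
    dlon = math.radians(dlon_deg)
    return dlon * (math.sin(math.radians(lat_north_deg))
                   - math.sin(math.radians(lat_south_deg)))


equatorial = cell_area(0, 5)     # cell touching the equator
arctic = cell_area(80, 85)       # high-latitude cell
ratio = equatorial / arctic

# Sanity check: 36 latitude bands x 72 cells should cover the whole sphere.
total = sum(72 * cell_area(lat, lat + 5) for lat in range(-90, 90, 5))

print(f"equatorial / arctic cell area: {ratio:.1f}")
print(f"total area / 4*pi: {total / (4 * math.pi):.6f}")
```

A cell at the equator covers several times the area of a cell at 80-85 degrees, so a station network of the same density leaves more empty cells at high latitudes.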

I don't think the NH/SH problem is a HadCRUT3 issue - or if it is it is minor because NH/SH coverage is roughly equal once you have land and SSTs. I ought to check what they do. However the NH/SH problem is a huge issue for the CRUTEM3 land index. It's not wrong per se - it's just not what you would expect and is not comparable to other indices.
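A toy calculation shows why an average of hemispheric means can differ from a direct area-weighted average for a land-only index. The land areas and anomalies below are assumptions chosen for illustration (the only property used is that the NH contains more land than the SH); this is not a description of the actual CRUTEM3 code.

```python
# Assumed relative land areas and land anomalies (illustrative only).
nh_land_area, sh_land_area = 2.0, 1.0      # NH has more land than SH
nh_anomaly, sh_anomaly = 0.8, 0.4          # degrees C, made-up values

# Method A: simple mean of the two hemispheric indices.
hemispheric_mean = 0.5 * (nh_anomaly + sh_anomaly)

# Method B: weight each hemisphere by the land area it contains.
area_weighted = ((nh_land_area * nh_anomaly + sh_land_area * sh_anomaly)
                 / (nh_land_area + sh_land_area))

print(f"mean of hemispheres: {hemispheric_mean:.3f}")
print(f"area-weighted:       {area_weighted:.3f}")
```

Neither number is wrong in itself, but they answer different questions, which is why indices built the two ways are not directly comparable.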

The lack of a high resolution land mask similarly - it means that the land and SST data are not really true land-only and ocean indices, which is only really a problem if you are comparing with NCDC or BEST, which are. And if you are calculating coverage. However when you compute a global land/ocean index it largely disappears.

So there are lots of little gotchas when comparing datasets. But the big bias issues in HadCRUT3 are due to unrepresentative land/ocean sampling and latitude sampling.

2012-03-22 09:47:35 Revised version
Tom Curtis


1)  I like the title very much.


2)  The layout is good.


3) The layout in your third formula creates an optical illusion that you are taking Wland to the power of -0.29, and likewise for Wocean.  I'm not sure how to fix that.  Inserting spaces and capitalizing the W, as below, appears to dispel the illusion.

Δbias = Tbiased - Tunbiased = Tland(Wland - 0.29) + Tocean(Wocean - 0.71)

You will need to capitalize in your first equation as well for consistency if you use this fix.


4)  Laying out the problems with HadCRUT for Doc Snow has clarified an issue for me.  In particular, there are two biases introduced by incorrectly weighted samples, one from the correct weighting changing over time as per your discussion, and the other from the different growth rates of the two samples.  They are two different (though related) effects, and should both be discussed because both are involved in introducing the bias you have detected.

To make this clear,

a)  If you had constant but different heights for girls and boys, and changed the sample by reducing the proportion of the taller sex, you would introduce a spurious negative trend; but

b) If you have constant but incorrectly weighted sample sizes with different trends in height, if the sex with the lower growth trend is over-weighted, it will spuriously reduce the overall trend.

Your explanation of the statistical effect is an explanation of (a).  Your illustration of the effect shows the same trend in height for both men and women, but that makes no difference mathematically.  The negative trend would still be introduced with changing sample proportions but no growth in either population.  However, in addition to that effect, in the temperature example you also have the effect of the different trends for land and ocean, and the effect of that should also be explained.

Further, without explaining that effect, you will be unable to explain why the huge change in ratios in the 1940s has relatively little effect on the bias, nor why a linear decrease in the proportion of land stations has a non-linear effect on the bias.
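Both effects can be read off the bias formula quoted in point (3). The sketch below is an illustration, not Kevin's code: the 0.29 land share comes from the formula, the 0.23 land weight is roughly the recent HadCRUT3 land share mentioned in the draft post, and every temperature value and the 0.40 weight are made-up assumptions. It also uses the fact that when the two weights sum to one, the formula factors into (Wland - 0.29) x (Tland - Tocean).

```python
LAND_TRUE = 0.29    # land share of the Earth's surface (from the formula)


def bias(t_land, t_ocean, w_land):
    """Tbiased - Tunbiased, with Wocean = 1 - Wland."""
    w_ocean = 1.0 - w_land
    return (t_land * (w_land - LAND_TRUE)
            + t_ocean * (w_ocean - (1.0 - LAND_TRUE)))


def bias_factored(t_land, t_ocean, w_land):
    """Same quantity written as (weight error) x (land-ocean difference)."""
    return (w_land - LAND_TRUE) * (t_land - t_ocean)


# (a) Constant land-ocean difference, land weight falling over time.
effect_a = [bias(0.8, 0.4, w) for w in (0.29, 0.26, 0.23)]

# (b) Constant (wrong) land weight, land-ocean difference growing over time.
effect_b = [bias(t_land, 0.4, 0.23) for t_land in (0.4, 0.6, 0.8)]

# Large weight error but equal land and ocean anomalies: no bias at all.
no_difference = bias(0.1, 0.1, 0.40)

print("effect (a):", [round(b, 3) for b in effect_a])
print("effect (b):", [round(b, 3) for b in effect_b])
print("equal anomalies:", round(no_difference, 3))
```

Because the bias is a product of two terms, a large change in weights has little effect when land and ocean anomalies are close together, and a steady change in weights has a growing effect once the anomalies diverge. That is at least consistent with the behaviour described for the 1940s and for recent decades.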

2012-03-22 10:23:54
Kevin C


Here's the trend calculator for

 1. the HADCRUT3 official data,

 2. the grid average

 3. the grid average corrected for land-ocean bias

 4. the grid average corrected for land-ocean and latitude bias

The big change due to the land/ocean correction is in the 1990-2000 period, which is not so useful from a narrative point of view, but it is the one which is really clear and easy to explain. If you're looking for a short term trend change, the latitude bias makes the big difference, and puts 2010 above 1998.


From a narrative point of view, my feeling is to use this first post to establish the principle, and show that if HadCRUT3 is biased, it is probably low, but not give the corrected data. The second post gives the latitude bias. Either give the gradients or year figures in this post, or save it as a kick for the methods post.

I've added an ending to the post above accordingly. Is it enough?

2012-03-22 10:49:53
Kevin C


Tom: yes, the analogy doesn't go far enough. Not sure how to fix that without starting from scratch or losing the clarity of the earlier section. I've added a parenthesis in the penultimate paragraph to try and cover it.

I think I've made the other changes.

2012-03-22 11:44:20
Tom Curtis


Kevin, I think you can extend the analogy by simply introducing one small paragraph and a third diagram illustrating the effect of a constant biased sample with a different trend over time between the two sub-samples.  I don't think that would subtract from the clarity.  It may also be useful to plot the bias introduced by the change in sample ratios using a calculation based on an assumed land trend equal to the SST trend, although I am not convinced of the benefit of this.  I would definitely do it for a technically literate audience, but if you are aiming at a grade 10 reading level, less cluttered graphs are better.

Having said that, by my eyechrometer, the difference in the trend rates with the biased sample ratio accounts for about two thirds of the land/ocean bias after 1960, but is not discussed in your post.  If you were only to discuss one of the sources of the land/ocean bias, I think it is the more important, but much better to discuss both.


With regard to your ending, and plans for future posts, both are good IMO.


I'm giving you a thumbs up now because I think the post is definitely publishable as is.  I have two caveats: that it would be much improved by a paragraph discussing the second way a biased sample will distort the trend; and I am lousy at proof reading, so I make no claims about lack of spelling errors, grammatical errors and misprints.

2012-03-22 18:26:09
Kevin C


Tom: I want to keep it simple. How about this: I'll switch it to boys/girls growing over a year, with the girls growing slower than boys. I'll update the second figure to show this, and take out the middle sample. Then I can cover both effects in the text. Then I'll turn it into a blog post.

I'm running out of time on this - been burning both ends while at a workshop. Once I'm back home time will be very limited, but hopefully this is nearly ready.

And for your entertainment... Here are my fully unbiased annual temps from the HadCRUT3 data:

1998     0.51958
1999     0.29966
2000     0.28491
2001     0.42391
2002     0.46091
2003     0.47775
2004     0.44091
2005     0.51858
2006     0.45475
2007     0.44025
2008     0.38183
2009     0.4645
2010     0.54266

They make an interesting comparison with the HadCRUT4 press release:



2012-03-22 19:46:20
Kevin C


Here's the uploaded version:


2012-03-22 19:58:58


How is this article NOT going to be described as, "SkepticalScience trashes the work of the UK Meteorological Office - makes an excellent case for defunding."?

2012-03-22 20:04:23
Kevin C


Good point Neal. (Although to be fair, the strata problem is in all the indices, but is only noticeable in HadCRUT due to its smaller data and sampling).

I'll add a comment at the end that the best solution is the one they've adopted - get more data, but we can estimate the bias using statistics. That hopefully covers it a bit.

2012-03-22 20:30:52
Tom Curtis


First, I like the revision, so definitely good to go from my point of view.  Thank you for the acknowledgement, especially as I do not think it was necessary.


With regards to Neal's concern, I do not think it is a major worry.  In order to make that attack, the deniers will first have to accept that Kevin's analysis is accurate, and that HadCRUT3 is biased low as a result.  That kills too many denier myths for it to be considered a viable strategy by any halfway rational denier.


Finally, thanks for data, and I was highly entertained:


Year   Adj 3v   4v
1998   0.52     0.52
1999   0.30
2000   0.28
2001   0.42     0.43
2002   0.46     0.49
2003   0.48     0.49
2004   0.44     0.44
2005   0.52     0.53
2006   0.45     0.49
2007   0.44     0.48
2008   0.38
2009   0.46     0.49
2010   0.54     0.53


Rank   HadCRUT3               HadCRUT4                                  Adj HadCRUT3
       Year   Anomaly (°C)    Year   Anomaly (°C)   Uncertainty (°C)    Year   Anomaly
1      1998   0.52            2010   0.53           0.10                2010   0.54
2      2010   0.50            2005   0.53           0.10                1998   0.52
3      2005   0.47            1998   0.52           0.09                2005   0.52
4      2003   0.46            2003   0.49           0.10                2003   0.48
5      2002   0.46            2006   0.49           0.09                2009   0.46
6      2009   0.44            2009   0.49           0.10                2002   0.46
7      2004   0.43            2002   0.49           0.09                2006   0.45
8      2006   0.43            2007   0.48           0.09                2004   0.44
9      2007   0.40            2004   0.44           0.09                2007   0.44
10     2001   0.40            2001   0.43           0.09                2001   0.42
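The ordering in the "Adj HadCRUT3" columns can be checked against the unrounded annual values Kevin posted earlier in the thread. The sketch below simply copies those values and sorts them; the variable names are arbitrary.

```python
# Kevin's adjusted HadCRUT3 annual anomalies, copied from his earlier post.
adjusted = {
    1998: 0.51958, 1999: 0.29966, 2000: 0.28491, 2001: 0.42391,
    2002: 0.46091, 2003: 0.47775, 2004: 0.44091, 2005: 0.51858,
    2006: 0.45475, 2007: 0.44025, 2008: 0.38183, 2009: 0.4645,
    2010: 0.54266,
}

# Sort years from warmest to coolest and keep the top ten.
ranked = sorted(adjusted, key=adjusted.get, reverse=True)
top_ten = ranked[:10]

for rank, year in enumerate(top_ten, start=1):
    print(f"{rank:2d}  {year}  {adjusted[year]:.2f}")
```

Working from the unrounded values also settles the order of years that tie at two decimal places, such as 1998 and 2005.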

From a quick look at the data, I now strongly suspect that HadCRUT4 has not corrected for the biases you found, but that their extra data has helped ameliorate those biases.

2012-03-23 03:47:05
Dana Nuccitelli

Looking good.  One issue:

"(Alternatively, look at this figure from GISTEMP.)"

I presume there's supposed to be a link in there?

Another issue is that this post references the trend tool, which it sounds like won't be ready to go for a while.  I referenced it as the 'soon-to-be-released' tool in the post I just published, so you could either do something similar, or we could put this post on hold (I'd suggest the former).  What would you prefer?

2012-03-23 07:12:53
Kevin C


Dana: Fixed the link, thanks.

Both posts are ready to go. The trend calculator works in any recent browser except IE9. I've sent John a fix which will solve that as soon as he has a chance to apply it.

Ideally that would happen first. However IE9 doesn't seem to have huge penetration, so I don't think it is a disaster if the posts go out and the fix goes in later.

2012-03-23 07:37:27
Dana Nuccitelli

Oh, good stuff.  In that case we'll aim to publish both posts early next week, or possibly the calculator tomorrow if it's ready to go.