01 November 2011

Anatomy of a Cherry Pick

[UPDATE: NOAA has a detailed and informative new webpage up on the Russian heatwave that responds specifically to claims made by RC11 and Real Climate.  The new NOAA analysis provides information entirely consistent with the arguments laid out below.]

Given the great interest in my earlier post on the recent PNAS paper by Rahmstorf and Coumou (RC11), I have decided to summarize various issues for those who are interested and would like to continue the discussion.

NOAA has posted some very interesting graphs of temperature change in Russia, which are extremely useful for documenting the extreme cherry picking found in RC11, which claimed as a top-line result an 80% probability that the Russian heat wave was caused by a general warming trend.

The graphs above show linear trends (top) and statistical significance (bottom) for any combination of start and end date 1880 to 2010 for the GISS dataset. At the NOAA website linked above you'll find similar graphs for the other available datasets. These graphs are the opposite of cherry picking.
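To make concrete what each point in graphs of this kind encodes, here is a minimal sketch, in Python, of the calculation behind a single (start year, end year) pair: the OLS slope of July temperature against year, and a two-sided test of whether that slope differs from zero. The temperature series below is synthetic and purely illustrative (the real diagrams use the GISS and other records), and the normal approximation to the t distribution is my simplification, adequate for the multi-decade segments at issue here.

```python
import math
import numpy as np

def trend_and_p(years, temps):
    """OLS slope (deg C per year) and a two-sided p-value for slope != 0.

    Uses a normal approximation to the t distribution, which is close
    enough for the multi-decade segments discussed in the post."""
    x = np.asarray(years, dtype=float)
    y = np.asarray(temps, dtype=float)
    n = x.size
    xm, ym = x.mean(), y.mean()
    sxx = ((x - xm) ** 2).sum()
    slope = ((x - xm) * (y - ym)).sum() / sxx
    resid = y - (ym + slope * (x - xm))
    se = math.sqrt(float((resid ** 2).sum()) / (n - 2) / sxx)
    t = slope / se if se > 0 else float("inf")
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(t) / math.sqrt(2.0))))
    return slope, p

# A 31-year segment warming at 0.03 C/yr with small deterministic wiggles:
years = list(range(1980, 2011))
temps = [0.03 * (y - 1980) + 0.1 * math.sin(y) for y in years]
slope, p = trend_and_p(years, temps)
change = slope * (years[-1] - years[0])  # NOAA's "change" = trend x segment length
print(f"slope = {slope:.3f} C/yr, change = {change:.2f} C, p = {p:.1e}")
```

The NOAA diagrams simply repeat this calculation for every start/end combination longer than two years and color the result; the white regions are the pairs where the p-value fails the significance test.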

These graphs also help us to clearly identify the various cherries that fill the bowl that is RC11. Here is a quick summary:

1. Linear trend cherry pick. In western Russia (a large region that includes Moscow, defined as 50-60N and 35-55E) there are no statistically significant warming trends running to the present that begin in 1880 (look at the huge area of white in the lower right-hand part of the bottom graph), or in fact that begin anytime before the 1930s.

RC11 are able to argue for a long-term linear trend by beginning their analysis of the Russian heat wave in 1910, which you can see from the top graph is one of the select start dates for which there is a positive trend. They explain that 1910 was chosen because it reflects 100 years, a nice round number.

2. Station cherry pick. But even that linear trend, though positive, is not significant in the region. So RC11 perform another act of selectivity by focusing attention on one station -- Moscow -- rather than the broader, partial-continent-sized area that was the focus of the paper RC11 seeks to refute. In a blog post (but not in the paper) Rahmstorf appears to want to discount the entire lack of warming over western Russia based on a claim of a single improper station adjustment in a single data set. It is not a coincidence that this analysis did not appear in the paper, as it is a stretch, even for PNAS.

3. Data set cherry pick. RC11 look only at the NASA GISS dataset and its adjustment, even though there are multiple other datasets that use different adjustments. Do they really want to imply that, based on claims about one station's adjustments in the GISS data, all data sets for the entire region are flawed? Ironically enough, they may find some sympathy for such arguments in Anthony Watts' Surface Stations work! ;-) By contrast, the information provided by NOAA shows that the lack of long-term warming can be seen in western Russia across the various temperature records, which utilize distinct adjustment procedures.

4. Non-linear trend cherry pick. With linear trends on a shaky foundation, RC11 adopt in their analysis an unconventional "non-linear trend" unique to the climate attribution literature. The "non-linear trend" is really just a highfalutin smoothing procedure that makes history irrelevant outside a window of roughly 15 years before and after the year in question. The effect of the highfalutin smoothing is essentially equivalent to using the linear trends over a much shorter period (i.e., those that appear in the top-right corner of the top graph above) where there has been strong and significant warming.
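The locality of such a smoother can be seen in a few lines. RC11's actual smoothing routine differs in detail; here a simple centered moving average with a 15-year half-width stands in for it, applied to a synthetic series that is flat until a step in 1980. The point is that the smoothed value at any year depends only on the data within the window, so the long record contributes nothing.

```python
# Stand-in for a "non-linear trend": a centered moving average whose
# half-width limits how much history each smoothed value can see.
def smooth(values, half_width=15):
    """Centered moving average; the window shrinks at the record's edges."""
    out = []
    n = len(values)
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

# Flat series with a step in 1980: the smoothed value at any year reflects
# only the +/- 15 years around it, not the long record.
years = list(range(1880, 2011))
temps = [0.0 if y < 1980 else 1.0 for y in years]
sm = smooth(temps)
print(f"smoothed 1900: {sm[years.index(1900)]:.2f}")  # untouched by the step
print(f"smoothed 2000: {sm[years.index(2000)]:.2f}")  # sees only recent data
```

Near the end of the record the window is one-sided, so the smoothed "trend" there is governed entirely by the most recent couple of decades -- which is the equivalence to a short linear trend noted above.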

These various selective methodological choices lead RC11 to conclude:

    "We conclude that the 2010 Moscow heat record is, with 80% probability, due to the long-term climatic warming trend."
As explained above, there is no long-term warming trend. There is a short-term warming trend, which may not even reach climate time scales of 30 years or longer.

What RC11 is, in a nutshell, is an analysis that does the following:
RC11 takes a short-term trend along with an estimate of variability, and calculates the probability that particular thresholds will be exceeded over a 10-year time frame.
That is it -- this is textbook ball-and-urn probability, padded with a lot of faux complexity.
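The structure of that calculation can be sketched in a few lines. This is not RC11's actual code or their numbers -- the trend, variability, and record threshold below are made up for illustration -- but it shows the ball-and-urn logic: draw noisy years around a trend line and count how often the old record falls within a decade.

```python
# Monte Carlo sketch: probability that some year in a 10-year window
# exceeds a prior record, given a linear trend plus Gaussian variability.
# All parameter values here are illustrative, not RC11's.
import random

random.seed(0)

def p_new_record(trend_per_yr, sigma, prior_record, n_years=10, n_sims=50_000):
    """Fraction of simulations in which at least one of n_years beats the record."""
    hits = 0
    for _ in range(n_sims):
        if any(trend_per_yr * (t + 1) + random.gauss(0.0, sigma) > prior_record
               for t in range(n_years)):
            hits += 1
    return hits / n_sims

# Steeper trends make a new record more likely, all else equal.
low = p_new_record(trend_per_yr=0.00, sigma=1.0, prior_record=2.0)
high = p_new_record(trend_per_yr=0.10, sigma=1.0, prior_record=2.0)
print(f"no trend: {low:.2f}, strong trend: {high:.2f}")
```

Which probability you get out depends almost entirely on the trend you feed in -- which is why the choice of start date and smoothing window, discussed above, does all the real work.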

That some climate scientists are playing games in their research, perhaps to get media attention in the larger battle over climate politics, is no longer a surprise. But when they use such games to try to discredit serious research, then the climate science community has a much, much deeper problem.

Postscript: For those new here, I believe that the human influence merits our concern and we should be taking various actions. This post should be read in the context of issues of climate science policy, not climate policy per se.


  1. The real injustice is perhaps to the language, calling the smoothing a "non-linear trend." When I hear something like that, I think of some sort of Fourier transform or logarithm or SOMETHING.

    The paper really seems to beg the question.

  2. Roger,

    I love graphs, but I'm struggling with the first one. If I pick a point, say at the year 1977 and look at the length of time of 54 years, am I then averaging together the July temperatures of the 54 years preceding 1977? Or is it a slope of a trend or some other parameter?

    You say it shows 'linear trends' for any start and end date, but I'm still not sure what's being calculated.


  3. -2-maxwell

    For the case you are describing that would be the linear trend for the 54-year period ending in 1977.

  4. Do you think it will be possible to do an 'attribution study', using methodologies like the one presented here that have already gained acceptance, that 'discovers' that various other events are not predicted by a warming world?

    For example, the recent snowstorm on the east coast comes to mind. Michael Mann has recently been quoted as stating events like these are fully consistent with our changing climate. And indeed they may be -- BUT, what if the accepted methodology doesn't bear this out (perhaps in part due to things like chosen start dates of data, etc.)?

    Would the literature be permitted to contain a mess of events that are successfully attributed to AGW alongside a number of others that are diametrically counter using the same methodology?

  5. Roger,

    thanks, but I'm still confused. If I look at the point (1977,54) on the first graph, it's showing a trend? Doesn't a trend need more than one point?

    If that point is the slope of the trend for that period, ending in 1977, I get it.

    Or is it that in order to correctly read the graph I necessarily need to use more than one point? Whereby if I wanted to see the trend in the July temperatures ending in 1977 for the previous 54 years I have to start at the diagonal and read up (vertically) until I hit the imaginary line at 54 years on the y-axis?

    It seems like there is a lot of information in this graph and I just want to make sure I'm seeing it correctly.

    Thanks again.

  6. -5-maxwell

    Have a look at the NOAA page linked in the update, from that page:

    "Figure 2 presents diagrams of time varying trends in July temperatures for the period 1880-2010. In such plots, every possible trend longer than two years and its associated linear temperature change are calculated for the available record. Statistical significances of the trends, based on a t-test, are shown in the right side diagrams. For further details, see Liebmann, et al 2010."

    And specifically ...

    "The change is defined as the trend (°C yr-1) multiplied by length of segment."

  7. Roger -
    Isn't it rather unusual for NOAA to post what is essentially a rebuttal to RC11 online, instead of responding in the literature?

  8. Roger,

    Ok, thanks. That's what I had thought. It's the slope of the trend multiplied by the amount of years to give the total change in T in Julys over that time range. I confused myself, as is often the case when looking at data that's new to me.

    It would be nice if NOAA provided a similar graph with a close-up view of the trends that end in the year 2010, just for good measure. The bottom graph from this post clearly shows there are no statistically significant trends longer than 70 years, but it would be nice to give the raw data the old eyeball test up close.

    I'll see if one can download that data and try to reproduce a similar graph. That will also allow me to procrastinate with respect to the work I already have...

    Thanks again.

  9. Roger,

    one other note about the new NOAA analysis. They seem to have used the Hadley-CRU temperature data rather than GISS to show that there are no statistically significant trends in the July temperatures of western Russia longer than 70 years. The trends don't look qualitatively different from those you have posted up top, but I thought I would point that out for completeness.

    Thanks again. This is an interesting topic of discussion and really shows how the peer-review process did not work as well as it can in the instance of the PNAS paper in question.

  10. -7-HowardW


    No, not at all. The NOAA guys have had an ongoing series of pre-publication pages on attribution (they haven't yet caught up to more advanced social media ;-), including on the Russian heat wave. More generally within NOAA, the GFDL hurricane guys have done the same.

    While it is unlikely to generate blogstorms, the NOAA response is fairly devastating, and specialists will see that. I doubt it's a coincidence that since last week Real Climate has been putting up a bunch of posts making the Moscow post disappear down the chain ;-)

  11. Is there an editor who's going to step down, saying "it shouldn't have been published"? ;)

  12. Seriously, if PNAS fired an associate editor every time they published a piece of junk, in a couple of years that just-born 7 billionth human being would be getting a job offer from the National Academy.

    OK, I exaggerate just a little. :-) PNAS is an odd journal. It is prestigious, but because of its peculiar reviewing system, also publishes an unusually high number of, ahem, scientifically dubious articles.

    And a disclaimer: I have published there.

  13. NOAA link doesn't work for me?


    Sorry, we cannot find the page or resource you requested.

  14. -13-Roddy

    Try this: