16 August 2010

Why Rare Events are a Certainty

[UPDATE 8/25: See this interview of NOAA's Marty Hoerling.] 

[UPDATE 8/17: See this interview of Peter Stott by Tom Yulsman.]

The Russian heat wave has finally broken, but people will be talking about it for a long while.  In this post I am going to discuss the statistics of rare events.

Consider the following statement:
Meteorologist Rob Carver, the Research and Development Scientist for Weather Underground, agrees. Using a statistical analysis of historical temperature records, Dr. Carver estimates that the likelihood of Moscow’s 100-degree record on July 29 is on the order of once per thousand years, or even less than once every 15,000 years — in other words, a vanishingly small probability.
How rare is a 1 in 15,000 year event?  It is not as rare as you might think, and here is why.

Imagine that you have a fair coin (50-50) and you flip it three times.  Suppose that you want to know the chances of observing one or more heads in that sequence.  The odds can be calculated by noting that the only sequence with no heads is tail-tail-tail, which will occur, on average, only 1/8 of the time.  So the odds of observing at least one head in that series of three flips are 7/8, or 87.5%.  You can generalize this approach to coins with different odds and to any number of flips.  The generalized formula is called the binomial probability distribution, and there are many useful calculators for the distribution on the web (e.g., here).  (Technical note: The binomial distribution can be approximated by other distributions, such as the Poisson distribution, which has been shown to approximate well the occurrence of certain weather extremes.)
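If you'd rather not use a web calculator, the same arithmetic takes only a few lines of Python. This is a minimal sketch of the "at least one success" shortcut and the full binomial formula; the function names are mine, not anything from the calculator.

```python
from math import comb

def prob_at_least_one(n: int, p: float) -> float:
    """P(at least one success in n independent trials) = 1 - P(zero successes)."""
    return 1.0 - (1.0 - p) ** n

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n trials): C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

# Three flips of a fair coin: only tail-tail-tail has no heads,
# so P(at least one head) = 1 - (1/2)^3 = 7/8.
print(prob_at_least_one(3, 0.5))  # 0.875
```

The "1 minus the probability of zero successes" trick is all that the examples below require.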

We can use the binomial probability distribution to evaluate how rare the Russian heat wave was under a variety of assumptions.  But to do so we need two numbers.

One number that you need to know is the odds of an event.  In this example I'll use the 1 in 15,000 year event provided by Rob Carver.  Whether that number is accurate or not doesn't really matter for this example.  If you'd like to use another, you can, and below I'll show you how.

The second number that you need is the number of relevant events.  This is a bit tricky, and it is not at all clear to me what an "event" is according to Carver.  One possibility would be to use the number of meteorological stations in Russia or the number of grid points in a spatial reanalysis.  But this has some problems as the "event" that we are discussing is not just an extreme at a point, but a more systemic event associated with a persistent atmospheric pattern.  So we could ask how many high pressure systems typically occur in the northern hemisphere over a summer season.  I asked my father this question and he suggested perhaps 10-12 per season at the latitude of Moscow.  Again, if you don't like this number you can alter it to your liking.

So next, open up the Vassar binomial probability calculator.  We can use it to answer a few questions.

1) What are the odds of at least one 1 in 15,000 year heat wave event occurring over a 1,000 year period (picked because Russian meteorologists say that nothing of this magnitude has been observed over the past 1,000 years)?

To answer this enter the following into the calculator:

n = 10,000 (1,000 years * 10 high pressure systems per year)
k = 1 event
p = 0.00006667 (that is, 1/15,000)

The odds of such an event occurring over 1,000 years are 48.7%! Given these statistics, it is not at all surprising to see one such event in Moscow over the past 1,000 years.

2) Part of the problem, of course, is that the Russian heat wave has already occurred, and this can create a form of hindsight bias in our consideration of rare events.  So looking forward, what are the odds of another such event occurring in Russia over the next decade, assuming these same odds (which, again, may or may not be accurate)?

To answer this enter the following into the calculator:

n = 100 (10 years * 10 high pressure systems per year)
k = 1 event
p = 0.00006667 (that is, 1/15,000)

The answer is 0.7%, pretty small, but not zero.  If you were laying odds in order to bet on such an occurrence, they would be about 143 to 1.  A longshot, but not impossible.

3) With weather there are all sorts of events that can be classified as extreme -- floods, hurricanes, droughts, temperatures, and so on.  I have no idea how many such weather "events" there might be in a year.  But for fun, let's assume that there are 1,000 weather "events" in a year.  We might ask: what is the probability of seeing a 1 in 15,000 year event (of any type) over the course of a year?

n = 1000 (1 year * 1,000 events)
k = 1 event
p = 0.00006667 (that is, 1/15,000)

The odds of at least one 1 in 15,000 year event are 6.4% for one year.  How about over the next 10 years?  28.3%!!
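The same one-liner handles questions 2 and 3, again as a sketch under the assumed numbers above rather than a definitive calculation.

```python
p = 1 / 15_000  # assumed 1-in-15,000 probability per "event"

# Question 2: 10 years * 10 high pressure systems per year = 100 chances.
print(f"{1 - (1 - p) ** 100:.1%}")    # about 0.7%

# Question 3: 1,000 weather "events" of any type in one year.
print(f"{1 - (1 - p) ** 1_000:.1%}")  # about 6.4%
```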

If you want to test out different numbers than I used above, it is easy to do so.  But whatever numbers you use, you'll find that individual rare events are not so rare when considered over time and space.  This is one reason why the issue of attribution of causality is so frustratingly difficult (it is also difficult because of uncertainties in both "n" and "p" in the calculations above).

More specifically, there are three reasons why the question of whether extreme events are increasing due to specific causal factors is difficult to answer with certainty: (1) the data record is short, (2) specific extremes occur infrequently, and (3) there is a range of legitimate methodological approaches to the issue.  In such circumstances it would be easy to be fooled by randomness and black swans.  The good news is that the best policies in these conditions do not require certainty about causality; they instead emphasize robustness to uncertainty and ignorance.


  1. Part of the problem with these types of estimations is that extreme events may not be random. For example, the "dust bowl" wasn't one year with very light rain, it was five out of six years. Pete Levitt, CCM, did some work back in the 1970s (I don't know if it was formally published) that indicated these events tend to cluster because of the persistence of some meteorological regimes.

    So, the chance of more hot weather in the next ten years might be higher than random chance might lead one to believe.

  2. I am not a statistician by any means but I must say I am a little suspect of 100 year events, 500 year events and 1,000 year events. I live in Maryland and one winter in the mid-90's there was a major snowstorm followed within 4 days by a major rainstorm and it flooded a town called "Point of Rocks". Since the town had never been flooded that severely before and the flood was rated a "500 year flood" they decided to rebuild the town right where it was. The following September, a major tropical storm hit the watershed again, flooding the town with the second "500 year flood" of that year. (Mercifully, they decided not to rebuild the town in the same location.) My point of the story is that the probability of the floods seems to assume a Gaussian distribution on event intensity. However, with blocking patterns in the circulation of the jet stream, mother nature plays her own game of Rosencrantz and Guildenstern odds making. Is it possible that the statistical treatment of rare weather events is off the mark?

  3. Ah, yes. Probability is kind of counter-intuitive and very perplexing for the average Joe. I had a year-long bet with a colleague many years ago, when the Washington State lottery began. I bet him that I would get just as many "hits" if I used the numbers 1-6 each time--as he would get using any "magic formula, birth dates, etc." he had in mind. We compared results for over a year and it was a total tie.

  4. A 48% chance of something happening in a 1000 years doesn't sound that bad... for a single spot.

    But, when you look at the tens of thousands of simultaneous experiments that get run every year (i.e. just about every spot on the planet where they have somebody with a thermometer) then the probability that you DO NOT get a "worst of event" somewhere falls to close to zero for any year.

  5. Roger,

    are you sure you haven't made an error?

    They said once every 15,000 YEARS, not one out of every 15,000 STORM SYSTEMS.

  6. There is a 99% chance that someone will win the lottery in the next month.

    There is less than a 1 in 14 million chance that you will win.

  7. -5-Jason S

    Thanks, as I said in the post, the nature of "event" is not clear. My interpretation is defensible, but there are certainly others that are as well.

    Over 15,000 years there will be ~150,000 high pressure systems. I'll let you do the math ;-)

  8. Roger,

    I wonder what you think about determining highly improbable chances based on very limited data, i.e. talking about chances every so many thousands of years when you have only a couple of hundred years of data - at most. You can do it, but your results will be determined by the statistical model that you assume describes the variability of the data. Which could be totally wrong (heck: 15,000 years ago we were in an ice age); many processes in geophysics are autocorrelated, have long tails and/or behave as red noise rather than white noise. However, given the shortness of our records we simply don't know how to statistically describe the record over 15,000 years. So I would be very reluctant to use statistics in such a case, and would even consider it an inappropriate use of statistics. (Yes, I did read Taleb's book.)

    Furthermore, a colleague often uses the example of rare flooding events in Europe. A once every 100 years flood would be considered quite rare and have disastrous effects. Yet, since there are about 100 river basins in Europe, you can be pretty sure that every year somewhere in Europe there will be a once in a lifetime (100 years) flood.


  9. I wonder if some are hoping for this to be the mythical sudden, shocking event that will convince everyone that "climate change is real" and cause an abrupt change in climate policy.(http://www.grist.org/article/2009-12-23-the-coming-climate-panic/)

  10. The meteorological extremes are in my book quite pointless. Whether it's 40 C or 38 C is not really the issue, but that is what it all comes down to with this climate record breaking nonsense perspective. The issue is hot and dry.

    To me normal weather/temperatures is not daily average temps but what's within a 95% confidence interval.

    For example, 35 C is quite hot over here in Sweden but not "extreme" in any way; I have seen it on my own thermometer quite a few times during my life.

  11. Maybe OT, maybe not...


  12. I'm with Sean.

    The problem is that our "1 in a " chance events are all done with a small amount of historical data and nowhere near enough understanding of the base system. Nor any great grasp of probability.

    Events that occur often, by chance, are assumed to actually be common whereas they might be rare things. Conversely things that occur infrequently are assumed to be rare, whereas they might just by luck have been avoided.

    If you want to figure out if something is a 1 in 15,000 year event, you have got to have a lot more than 15,000 years to work from. You could not tell if a die was loaded from 6 rolls, because rolling a 6 is a "1 in 6" event.

    And the chance that it is a 1 in 15,000 event is bogus anyhow. Something that unlikely would be rather worse than a "heatwave". We're talking Armageddon for that sort of event. Seriously, it wasn't even that hot. It was dry, which caused all sorts of fire issues, but people weren't even keeling over. Australians wouldn't even think twice about temperatures like that.

  13. Roger,

    Your original point is well taken.

    I'm objecting to your calculations.

    The input to your equation is one event every 15,000 YEARS (not per system).

    To the extent that the number of systems per year has an impact on this number, it has ALREADY been factored in. Multiplying (again) by the number of systems per year is double counting and produces an erroneous result.

  14. Roger,

    The definition of event is not the problem.

    The number that you are starting off with is one EVENT per 15,000 YEARS.

    To the extent that the number of SYSTEMS per year matters, it has already been factored into the given number.

    When you adjust for the number of systems, you are applying this adjustment for a second time (once by you, and once by the person who originated the fanciful one storm per 15ky number)

  15. -13, 14-Jason S

    Thanks, I probably won't object to your interpretation, but it is unclear what it actually is ... How about redoing my 1, 2, and 3 above using your approach?

  16. -8-Jos

    Yes, this is indeed methodologically problematic because it moves the analysis further from being an empirical test about the real world and toward being an exploratory investigation of the consequences of various assumptions.

    I have seen the Russian heat wave event referred to by climate scientists as a 1 in 15,000 year occurrence (as above) and a 1 in 10 year occurrence (M. Mann) ...

    As I said in an earlier post there is no way to empirically resolve such claims, leaving us to other ways of resolution!

  17. Quote, "Using a statistical analysis of historical temperature records"

    A true historical record would be different from an instrumental record; it would also be different from any proxy reconstruction.

    Studying the climatic past historically could easily take you back to the start of the written word.

    So what is the historical temperature record, and can anyone agree upon it?

  18. There is also the existence of 24/7 catastrophe gorging news channels and the deep love of the Russian government for carbon trading. Can we believe the hype ?

    Russia got a particularly good deal from Kyoto.


    The Kyoto pact requires participating countries to cut back greenhouse gas emissions by 2012 to 5.2 percent below 1990 levels. By a quirk of history, Russia stands to benefit from Kyoto, because after a decade of economic dislocation, its emissions today are already substantially below what they were in 1990, when Russia was part of the Soviet Union.

  19. Mandelbrot and Hurst showed pretty clearly that floods in particular are multifractal and therefore extreme events should be clustered together to make "super extreme" events. I wonder if any work along these lines has been done on precip / heat waves? Precip is much harder to forecast since it typically relies on changes in pressure or temperature, and taking derivatives of a model always winds up worse than the original model.

  20. Jason S is right. You've calculated the odds of an event in 1000yrs x 10 systems/yr = 10,000 systems. However, you used a frequency of 1/15,000 years without translating to systems. Therefore your calculation is dimensionally inconsistent.

    Another symptom of the problem should be obvious: if the 1000-yr probability is about 50%, then you'd expect 7.5 events in 15,000 years, which obviously isn't the same thing as 1 in 15,000.

  21. -20-Tom Fiddaman

    Yes, as I said to Jason S, I will not quibble with his interpretation of "event." As I said to him I await your updated answers to my 1, 2 and 3 using your definition of "event". What answers do you get? ;-)

  22. Re: Jason S. I got about a factor of 10 less probability for 1) and 2) when using the correct inputs:
    1) 6.7%
    2) 0.07%

    3) is unchanged

  23. Per your suggestion to Jason to redo 1,2,3 with corrected numbers, it's easy to do by either eliminating the 10 systems/year, or by reestimating the assumed frequency as 1/150,000 (1/15,000/10). Either way, your numbers are roughly 10x too big.

    Also, your analysis of the probability of an extreme, given the diversity of possible events, is likely biased in the same way. In addition to the 10x bias from storms vs. years, 1000 events per year seems a bit high, given that there aren't many widely reported climatic variables, and that Russia is a tenth of the earth's land area. If you define an event as a record in a major variable over a very large area, 100 might be a more reasonable upper bound, in which case your odds are too large by a factor of 100. (If you don't do that, you have to start tracking records by station, accounting for spatial correlations, and it gets messy.)

    No wonder that "Probability is kind of counter-intuitive and very perplexing for the average Joe."

  24. I leave the details as an exercise. :)

  25. -22,23- Toms

    Ok, sure, sure, but what if the heat wave started on a Tuesday?


  26. The big caveats on Carver's statement are worth noting:

    That however, is a tricky assumption to make. We know that the climatic properties of CFSR and GDAS data have to have some correspondence with what actually happens in the atmosphere, otherwise weather models wouldn't work. What becomes difficult to quantify (in the time constraints of writing for the public) is how the statistics of the climate properties line up between observations and reanalysis. And at these extremes, it doesn't take much change in the average and standard deviation of a property to dramatically change how unusual an event is. Another possible source of error is the assumption that the climatology of CFSR is the climatology of the operational GDAS. Which is not a slam-dunk, since NCEP shifted to a higher-resolution model on July 28. Now, I don't have any information to say the post July 28 GDAS data has different climatological characteristics, but it's a possibility. Another big assumption I make is that daily maximum temperatures follow a Gaussian (normal) distribution and that from 30 years of CFSR data, I can adequately characterize such a distribution.

    I think your last line is really the key here, "the best policies in these conditions do not require certainty about causality, they instead emphasize robustness to uncertainty and ignorance."

  27. This may well be a rare event. However, judging from the GISS record for Moscow and "rural" locations, 1938 and 1972 also seem to have produced extended heat waves in July and for JJA, if not as fierce as the current one. Before succumbing to the equivalent of numerology, shouldn't we first check the likelihood of the blocking event and then examine the consequences?
    As a thought experiment, if '38 and '72 were the same meteorological phenomena in and around Moscow, then this suggests that such events occur every 35 to 40 years - the issue then becomes one of understanding the severity.

  28. Roger,

    Tom and I are in agreement.

    But ultimately, the inclusion of "chances per year" in the original number does nothing to actually validate the original number (which I am deeply skeptical of, even ignoring black swan type effects).

    Making precise calculations based on fanciful assumptions is tricky business and best left to economists, climatologists and astrologers.

  29. -Tom, Tom and Jason-

    I see your logic, but the argument is problematic because it is a point forecast. If it is 1/15,000 at the Moscow station, what about such records at other stations in the region? Given an event of the type observed the odds of a very rare record at any one station will vary due to factors ... so to calculate odds in this manner requires aggregating stations and considering their spatial correlation etc. (as Tom F suggests) ...

    That said I'd guess that there is as much uncertainty in the 1 in 15,000 number as there is in how to quantify the definition of "event" ... but the good news is that the various uncertainties don't alter the lesson of the exercise ...

  30. I agree that there's a lot of uncertainty in the 1 in 15,000. However, there's absolute certainty that a dimensionally inconsistent interpretation of event probability is wrong.

  31. -30-Tom

    If you want a dimensionally consistent event probability according to your definition of "event", then simply use a 1 in 1,500 event probability in my example above.

  32. That would be 1 in 150,000 - 1 event in (15,000 yrs)*(10 systems/yr).

    Are you going to revise your example? Leaving it dimensionally inconsistent doesn't seem helpful.

  33. -32-Tom

    Absolutely not. I said that I understand your logic, but also explained that your approach is physically problematic.

    There is no physical basis for estimating that the probability of an intense high pressure system is 1/15,000 from a single point temperature extreme. In fact, it is logical that a point extreme probability would be much higher than that of a systemic atmospheric feature, since you can get the former without the latter.

    I offered 1 in 1,500 as a more realistic number to address your dimensionality issue -- which in fact does exactly that.

    To make strident assertions of right and wrong when everyone involved is performing WAGs is a bit silly, no?

  34. -32-Tom

    Since I'll be away from the blog much of the day, let me offer a few last words from my end ...

    Given that there are four orders of magnitude of difference expressed by scientists as to the frequency of this event, and perhaps more on the spatial scale of the event itself, there is an enormous range of assumption space that could plausibly be sampled.

    In the example above I chose to modify the nature of "event" being described to maintain some physical plausibility. You prefer to keep the event as described ... fine. You have said so and shown a different approach.

    The order of magnitude difference between various plausible approaches says something about this issue ;-)

    Feel free to have the last word, that is mine!

  35. Jason S is correct. Mr. Pielke did set 1000 = 10000.

  36. Over at the Weather Underground Ryan Maue explains the (one of the) problems with equating a point temperature with a heat wave for calculating return periods:

    "13. RyanFSU 5:34 PM GMT on August 07, 2010
    Dr. Carver, are you using the methodology of Hart et al. (2001)? It would be appropriate to reference this paper from MWR.

    Also, normalized anomalies cannot be interpreted as you are suggesting, because 30 years is way too short to establish the appropriate distributions at every grid point.

    A complementary (and better) way to approach this problem is to calculate the actual distribution at each grid point and plot the percentile (%). So with the CFSR reanalysis and 500 mb height, for example, include all 31 years and then allow for a centered 21-day mean. This gives 31 x 4 x 21 realizations to create the distribution. The reanalysis is too short to be representative of the climate variability at each grid point.

    Thus, you cannot really establish "rarity" of the heat wave event from the CFSR normalized anomalies because you are operating in the very long tail of extreme events associated with the phenomena you are addressing.

    An example from my website, shows the mean sea-level pressure normalized anomalies for the GFS forecast. Thus, any tropical storm is shown as a huge anomaly -- something that only appears once every million years according to your stdev analogy. Not so."


  37. Roger, the correct statement is "oops, I made an error of a factor of ten".

    Determining how peculiar the recent events were is difficult and fraught. I am very interested in the question. There are lots of ways to argue how one should address a particular event. Is this a temperature anomaly, or a summer temperature anomaly? Is it a warm anomaly, or should we count cold anomalies too? Is it the amplitude that makes it matter, the duration, the scale? (In the present case all three were very large.)

    My intuition that it was extremely odd has been confirmed by some first rate meteorologists, the type of people who look at current and historical weather maps all the time. They think it's weird.

    I don't know if anybody knows how to characterize the exact weirdness of an extremely weird event. We should think about it, perhaps not in terms of the present event but in terms of a future event. How extreme does an event have to get before we conclude it is realistically outside the realm of unforced natural variation?

    "No such event can be imagined" seems to me an untenable position. While it has the virtues of clarity and simplicity, it doesn't actually make sense.

  38. -37-Michael Tobis

    No, I actually explained in the post what I did and why I did it, as follows:

    "The second number that you need is the number of relevant events. This is a bit tricky, and it is not at all clear to me what an "event" is according to Carver. One possibility would be to use the number of meteorological stations in Russia or the number of grid points in a spatial reanalysis. But this has some problems as the "event" that we are discussing is not just an extreme at a point, but a more systemic event associated with a persistent atmospheric pattern. So we could ask how many high pressure systems typically occur in the northern hemisphere over a summer season. I asked my father this question and he suggested perhaps 10-12 per season at the latitude of Moscow. Again, if you don't like this number you can alter it to your liking."

    I did not change Carver's probability, but I did use a different definition of event (extreme temp per high pressure system vs. his use of extreme temp per year) because I reject using the point estimate. To use the point value would be "mistaken" in my view. However, I can see the logic in Tom's argument (as I said), I just happen to disagree with it.

    It's really not so difficult to just read what I did.

    If you want to explain to me how to quantify the rarity of this event in terms that make scientific sense, have at it;-) But from your comments, you don't know how to do it either.

    So as I said in my post, if you don't like what I've done, then make up your own numbers, as Tom did, but stop the silliness about factors of 10, you know better.

    None of this actually matters for this exercise, as it all comes out in exactly the same place -- rare events are not so rare over time and space.

  39. Sigh. I reread the original and since I advocate admitting errors, I was wrong and Roger was sort of right.

    He is discussing similar systems *at the latitude of Moscow*. Counting high pressure centers as potential blocking ridges is peculiar and dissatisfying given the statistic we are starting from, but if you are considering such events globally, you do get some multiplying factor.

    We could better ask how many "places" a blocking ridge could occur, and ten per hemisphere seems like the right ballpark there.

    Roger's #29 threw me off. There it appears like he is arguing about the probability *in the region* and that convinced me that he wasn't making sense. The probability in the region is specified as 1 per 10K years by the problem statement. So that's my defense.

    Also, I'd point out that a component of the unusualness of an event like this might be where it occurred. Say a twenty inch rainfall is rare, but a twenty inch rainfall in Utah in January is unheard of. So there's the basis for some more confusion; are we talking about comparable heat waves, or comparable heat waves in European Russia?

    Enough quibbling. For future reference, let me demonstrate how it's done. I was wrong. Oops.