My weather app says that there’s “a 50% chance of rain in your location today”. But what does that mean?

*That it will rain for half of the day?*

*That it will rain all day in 50% of my location?*

*That there’s a 50/50 chance that I’ll see some rain today?*

Each of those interpretations could be correct and, without further detail, it’s impossible to know for certain which one is accurate. Although we all generally have an intuitive grasp of what these figures indicate, there’s still plenty of room for interpretation – and manipulation.

__Intuition vs Interpretation__

A famous example of interpretation which goes against public intuition is the definition of a ‘White Christmas’ in the UK. It would be fair to assume that seeing snow on Christmas Day would be enough to define that Christmas as ‘White’. In fact, the official definition of a White Christmas used by bookmakers depends on whether or not it snows at midday on December 25th within a small, paved area about three metres square on the roof of the UK Meteorological Office headquarters in central London. If it does, the Met Office will officially declare a White Christmas. If not, even if you’re standing in three feet of snow outside your house, they won’t.

Statistics can be used in many ways. The famous quote "There are three kinds of lies: lies, d****d lies, and statistics" (popularized by Mark Twain, though the original author is unknown) dates back to the late 19th century, to a time when advertisers were first starting to understand how to sway public opinion, and illustrates how people were becoming aware of the power of data.

During the Covid-19 pandemic of 2019 – 2021, statistics were thrown around like confetti, always in an attempt to illuminate a specific point, but often doing the complete opposite and just confusing people.

*The R rate has increased by x%*

*The economy has shown a z% drop*

*Infection rates have doubled in the over 65s in the last three weeks*

*y% of infections need hospitalization*

These are quite probably all statements of fact, but the information contained beneath the figures can be used very differently, depending on what the person using it wishes to illustrate.

__Misleading presentation of facts__

Those looking to influence public opinion like to present statistics in easy-to-understand formats. Often, statisticians will be asked to produce charts that indicate increases or decreases. The figures used in these will often give a very misleading impression. Unless the viewer sees through the instant visual impact, the chart could be completely misrepresenting the facts. For example, a chart showing an increase from 50 to 60 would be represented as a relatively horizontal line if the axis was marked from 0 to 100. If, on the other hand, the vertical axis was marked from 40 to 70, the line would be much steeper and give a false impression of sudden growth – which is quite possibly the intention!
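To put rough numbers on that example, here is a minimal Python sketch. The helper `visual_rise` is a made-up name for illustration, and it assumes a simple line chart where steepness depends only on how much of the vertical axis the change covers.

```python
def visual_rise(start, end, axis_min, axis_max):
    """Fraction of the chart's height covered by a change from start to end."""
    return (end - start) / (axis_max - axis_min)

# The same increase from 50 to 60, drawn against two different vertical axes.
full_axis = visual_rise(50, 60, axis_min=0, axis_max=100)
cropped_axis = visual_rise(50, 60, axis_min=40, axis_max=70)

print(f"Axis 0 to 100: the line climbs {full_axis:.0%} of the chart height")
print(f"Axis 40 to 70: the line climbs {cropped_axis:.0%} of the chart height")
```

Nothing about the data has changed between the two calls. Only the axis limits differ, yet the line on the second chart climbs more than three times as far up the page.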

__How reliable is the data?__

Another element to consider is the sample size of the data, and how biased or otherwise that sample is. We’ve all seen the claims made by manufacturers that “85% of consumers agree” that their product is in some way improved. What many people fail to read is the small print which says that the study was based on 60 customers – consumers who were already actively buying the product beforehand, rather than random members of the public. The headline figure of 85% can look quite impressive, but when you drill down and calculate that, in fact, nine existing customers out of the 60 (15%) didn’t think there was an improvement, it’s not entirely convincing!
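The arithmetic behind that headline figure is easy to check. This short Python sketch only uses the numbers quoted above; the variable names are illustrative.

```python
sample_size = 60      # customers in the study, from the small print
agree_percent = 85    # the headline figure

agreed = sample_size * agree_percent // 100   # customers who saw an improvement
disagreed = sample_size - agreed              # customers who did not

print(f"{agreed} of {sample_size} agreed, {disagreed} did not")
print(f"That is {disagreed / sample_size:.0%} of an already friendly sample")
```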

__When the numbers matter__

The weather-focused examples above are quite frivolous, as it probably isn’t vital to know for certain whether or not it will rain today. But in other areas, for example medicine, understanding odds and probability is literally a matter of life and death. Although it sounds cold-hearted, oncologists often have to make a cost-benefit analysis when it comes to prescribing chemotherapy or radiotherapy for cancer sufferers – is the potential increase in life expectancy worth the cost of treatment? How do you evaluate that? In cases such as these, they’re having to act more like actuarial scientists than doctors.

__Probability vs Possibility__

In trying to understand odds and probabilities, it’s helpful to turn to an area where such things are in daily usage. In the world of casino gambling and sports betting, odds are represented in many ways – percentage, probability, bookmakers’ odds, decimal odds and moneyline.

The last two are used mainly in sports betting and are slightly outside the remit of this article, but the first three are very useful to help explain how to read odds.

To illustrate, we’ll use the example of a dice throw.

Probability measures outcomes on a scale of 0 to 1, where 0 is impossible and 1 is definite.

The probability of throwing a six on a single die is 1 in 6 (there’s one positive outcome out of the six possible results). This can be expressed as a probability of 0.16667.

It could also be shown as a 16.667% chance.

Another way to express it would be as a 5 to 1 chance – there are five negative outcomes to one positive.
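The three formats are just different views of the same fraction. As a minimal sketch, the Python below converts between them; `describe_chance` is a hypothetical helper name, and it assumes every outcome is equally likely, as with a fair die.

```python
from fractions import Fraction

def describe_chance(favourable: int, total: int) -> dict:
    """Express one chance as a probability, a percentage and odds against.

    Assumes equally likely outcomes and 0 < favourable <= total.
    """
    p = Fraction(favourable, total)
    odds_against = (1 - p) / p  # negative outcomes per positive outcome
    return {
        "probability": round(float(p), 5),
        "percentage": round(float(p) * 100, 3),
        "odds": f"{odds_against.numerator} to {odds_against.denominator}",
    }

six_on_one_die = describe_chance(1, 6)
print(six_on_one_die)
```

For the dice throw this gives a probability of 0.16667, a percentage of 16.667 and odds of 5 to 1, matching the figures above.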

This last formulation often gives rise to a lot of confusion as it’s the way that bookmakers often quote odds for sports events such as horse racing. The most important thing to bear in mind when looking at these odds is that they’re set by the bookmaker. Although they will be influenced by probability, they are not an accurate representation. Bookmakers set their prices with two intentions: to attract money for unlikely results and to minimise their losses should the predicted winner win. Unlike the dice throw example given above, betting on a horse at 5/1 does not mean that it has a 1 in 6 chance of winning!
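One way to see this is to add up the probabilities implied by every price in a race. The Python sketch below uses a four-horse race whose prices are invented purely for illustration, and the rescaling step at the end is only one simple assumption about how to strip out the bookmaker's margin, not a method described in this article.

```python
from fractions import Fraction

def implied_probability(against: int, stake: int) -> Fraction:
    """Probability implied by fractional odds of against/stake (5/1 gives 1/6)."""
    return Fraction(stake, against + stake)

# Hypothetical prices: evens, 2/1, 5/1 and 5/1. Not real market data.
prices = {"Horse A": (1, 1), "Horse B": (2, 1), "Horse C": (5, 1), "Horse D": (5, 1)}

implied = {name: implied_probability(a, s) for name, (a, s) in prices.items()}
book_total = sum(implied.values())  # above 1 when the prices include a margin

# Assumption: spread the margin evenly by rescaling so the chances sum to 1.
rescaled = {name: p / book_total for name, p in implied.items()}

print(f"Implied probabilities add up to {float(book_total):.1%}")
print(f"Horse C at 5/1: implied {implied['Horse C']}, rescaled {rescaled['Horse C']}")
```

In this made-up race the implied probabilities total more than 100%, so the horse priced at 5/1 comes out closer to a 1 in 7 chance than 1 in 6, even before considering how the bookmaker chose the prices.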

__Conclusion__

When reading statistics, it’s helpful not just to understand the numbers, but also to know where those numbers come from. Often, the source will explain the way the data is being presented.

The ‘glass half full’ / ‘glass half empty’ comparison is a classic example of putting a positive or negative spin on the same factual information.

With all that in mind, I can see that my weather app is still saying that there’s “a 50% chance of rain in your location today”.

Should I take an umbrella, or not?!