Direction of Implication of High Probability
A huge number of anomalies and biases can be understood as confusing two claims: that the probability of p conditional on q is high, and that the probability of q conditional on p is high. One is diagnostic expectations: the conditional probability that a person has red hair if the person is Irish is relatively high. People incorrectly guess that the conditional probability that the person is Irish if he or she has red hair is also high. This also helps explain the gambler's fallacy. If one flips a fair coin and it comes up tails 3 times, the chance that the next flip will come up heads is 50%. People guess more than 50%. They are confusing forecasting and inference. The true statement is that, for any prior on fair or unfair, the chance that the coin is fair is higher if the next flip comes up heads than if it comes up tails. People are confusing “if it is fair, is the chance of heads high?” with “if it comes up heads, is the chance that it is fair high?”. According to Bayes the probability of p conditional on q is
(probability of p AND q)/(probability of q) and the probability of q conditional on p is
(probability of p AND q)/(probability of p). It seems that people ignore unconditional probabilities. This explains diagnostic expectations. It also explains the gambler's fallacy. It even explains an anomaly so strong as to be almost unbelievable: when people are asked about the joint probability of p and q when the probability of q is low and the probability of p conditional on q is high, they often claim that p AND q is more likely than q. So let’s say q is the event that Michael plays in the NBA and p is the event that Michael is taller than 6’3”. People often rate the claim that Michael both plays in the NBA and is taller than 6’3” as more likely than the claim that he plays in the NBA.
This is crazy. It would be correct only if the conditional probability of p given q were greater than 1. This crazy response would occur if people undercorrect for unconditional probabilities, that is, baseline frequencies.
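To make the arithmetic concrete, here is a minimal Python sketch using the Michael example. All three input numbers are made up for illustration (they are assumptions, not real statistics about height or the NBA). It derives both conditional probabilities from the same joint probability and checks that the conjunction cannot be more likely than q alone.

```python
# Hypothetical probabilities, chosen only for illustration (not real data).
p_q = 0.0001         # P(q): Michael plays in the NBA
p_p = 0.05           # P(p): Michael is taller than 6'3"
p_p_given_q = 0.8    # P(p | q): an NBA player is taller than 6'3"

# Joint probability: P(p AND q) = P(p | q) * P(q)
p_joint = p_p_given_q * p_q

# Reverse conditional: P(q | p) = P(p AND q) / P(p)
p_q_given_p = p_joint / p_p

print(f"P(p | q)   = {p_p_given_q:.4f}")
print(f"P(q | p)   = {p_q_given_p:.4f}")
print(f"P(p AND q) = {p_joint:.6f}")
print(f"P(q)       = {p_q:.6f}")

# The conjunction is never more likely than q alone, because P(p | q) <= 1.
assert p_joint <= p_q
```

With these assumed inputs, the probability of p conditional on q is hundreds of times larger than the probability of q conditional on p, even though both come from the same joint probability. The only difference is which unconditional probability goes in the denominator.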
Why do we do that? One explanation is that we speak a different dialect than mathematicians do. When we say that it is likely that two claims are true, we mean that there is a positive correlation between their being true (or maybe a high positive correlation), so we mean
(probability of p AND q)/square root of ((probability of p)(probability of q)). The messiness of my pure ASCII equation and the difficulty of explaining what I mean in plain English are hints that there is a simpler (but confusing) way to say it (such as “p AND q is likely”, or something much simpler still).
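As a rough illustration of this reading, the sketch below computes the ratio for two hypothetical events that are each rare but almost always occur together. The function name `association_ratio` and all the numbers are assumptions made up for the example. Algebraically, the ratio is the geometric mean of the two conditional probabilities, which the code also checks.

```python
import math


def association_ratio(p_joint: float, p_p: float, p_q: float) -> float:
    """Return P(p AND q) / sqrt(P(p) * P(q)), the ratio written out in the text."""
    return p_joint / math.sqrt(p_p * p_q)


# Hypothetical numbers: two rare events that usually occur together.
p_p = 0.001       # P(p)
p_q = 0.001       # P(q)
p_joint = 0.0008  # P(p AND q)

ratio = association_ratio(p_joint, p_p, p_q)

# The same quantity expressed through the two conditional probabilities.
p_p_given_q = p_joint / p_q
p_q_given_p = p_joint / p_p
geometric_mean = math.sqrt(p_p_given_q * p_q_given_p)

print(f"P(p AND q)          = {p_joint:.4f}")
print(f"association ratio   = {ratio:.4f}")
print(f"geometric mean form = {geometric_mean:.4f}")
```

Under these assumed numbers the joint probability is tiny while the ratio is large, so a speaker who means the ratio when saying “p AND q is likely” and a mathematician who means the joint probability would disagree sharply about the same pair of events.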
Another is that people are willing to do only a tiny amount of arithmetic in their heads. Even very simple probability calculations require doing some multiplication and remembering the result while doing some more multiplication and some addition and division. People may go to (probably not, so approximately zero probability) or (probably so, so approximately certain) to avoid arithmetic. This can lead to many biases, even if not the most amazing bias I discussed.
