Here is the abstract for a draft of the paper. Comments welcome!
Dilation occurs when the upper and lower probability estimates of some event E are properly included in the upper and lower probability estimates of E conditional on another event F, resulting in a change from a more precise estimate of E to a less precise estimate of E upon learning F. Strict dilation occurs when E is dilated by every event in a partition, which means that there are cases where E becomes less precise no matter how the experiment turns out. Many think that strict dilation is a pathological feature of imprecise probability models, while others have thought the problem lies with Bayesian updating. However, a point often overlooked in critical discussions of dilation is that knowing that E is stochastically independent of F (for all F in a partition) is sufficient to avoid strict dilation. Since the most sensational alleged examples of dilation are those which play up independence between dilator and dilatee, the sensationalism traces to a mishandling of imprecise probabilities rather than to a genuine puzzle about imprecise probabilities.
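To make the abstract's definitions concrete, here is a small numerical sketch of the familiar two-coin illustration of strict dilation. The rendering in Python is my own toy construction, not an example taken from the paper: H is a toss of a fair coin, G is a toss of a second coin whose bias a is unknown (each candidate bias contributes one distribution to the set), the coins are independent under every distribution, and E is the event that the two coins agree.

```python
# Hypothetical sketch (my own, not from the paper) of the two-coin
# dilation example.  H: a fair coin lands heads.  G: a second coin of
# unknown bias a lands heads, independently of H under every member
# of the set.  E: the two coins agree.

def joint(a):
    """Joint distribution over (h, g) with p(h=1) = 1/2, p(g=1) = a."""
    return {(h, g): 0.5 * (a if g else 1 - a)
            for h in (0, 1) for g in (0, 1)}

def prob(p, event):
    return sum(q for w, q in p.items() if event(w))

E = lambda w: w[0] == w[1]      # the coins agree
H = lambda w: w[0] == 1         # the fair coin lands heads

# One distribution per candidate bias of the second coin.
credal_set = [joint(i / 100) for i in range(1, 100)]

# Unconditionally, every member gives p(E) = (1/2)a + (1/2)(1-a) = 1/2,
# so the estimate of E is precise.
pE = [prob(p, E) for p in credal_set]

# Conditional on H, p(E | H) = a, so the estimate dilates toward [0, 1].
pE_given_H = [prob(p, lambda w: E(w) and H(w)) / prob(p, H)
              for p in credal_set]

print(min(pE), max(pE))                    # both 1/2, up to rounding
print(min(pE_given_H), max(pE_given_H))    # roughly 0.01 and 0.99
```

By symmetry, p(E | not-H) = 1 - a, so E is dilated by both cells of the partition {H, not-H}: strict dilation in the abstract's sense.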
The paper hinges on a point about how probabilistic independence behaves in an imprecise probability setting, which I think is a source of confusion in some of the recent literature. I draw out this point below the fold.
Start with the textbook definition of stochastic independence. With respect to a classical probability function p, we say that event E is stochastically independent of event F just in case the probability of their joint occurrence is equal to the product of their individual probabilities; that is:
(IND) p(E,F) = p(E)p(F).
Now, so long as p(F) is non-zero, we may also say that E is stochastically independent of F when F is epistemically irrelevant to the probability of E, that is:
(IR) p(E | F) = p(E), when p(F) > 0.
And, indeed, we might just as well switch F and E, so long as we make the appropriate accommodations to avoid conditioning on zero-probability events. Let’s then say that E is epistemically independent of F just when:
(EI) p(E | F) = p(E) when p(F) > 0, and p(F | E) = p(F) when p(E) > 0.
When working with a single probability distribution, p, these notions are equivalent. Indeed, if p is given a behavioral interpretation, arguably we learn that two events are stochastically independent by observing that one event is epistemically irrelevant to the other. That is, it is common to infer (IND) from observing that (IR) holds.
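To see the collapse concretely, here is a quick sanity check with numbers of my own choosing (not from the paper): for a single distribution over the four cells E-and-F, E-and-not-F, not-E-and-F, not-E-and-not-F, the three conditions stand or fall together.

```python
# For a single distribution p over the four cells, (IND), (IR), and (EI)
# pick out the same joints.  Toy numbers of my own, not from the paper.

def checks(p_ef, p_enf, p_nef, p_nenf):
    pE, pF = p_ef + p_enf, p_ef + p_nef
    tol = 1e-9
    ind = abs(p_ef - pE * pF) < tol                      # (IND)
    ir  = pF > 0 and abs(p_ef / pF - pE) < tol           # (IR)
    ei  = ir and pE > 0 and abs(p_ef / pE - pF) < tol    # (EI)
    return ind, ir, ei

# Independent joint: p(E) = 0.3, p(F) = 0.6, p(E,F) = 0.18 = p(E)p(F).
print(checks(0.18, 0.12, 0.42, 0.28))   # (True, True, True)

# Correlated joint with the same marginals: p(E,F) = 0.28 > 0.18.
print(checks(0.28, 0.02, 0.32, 0.38))   # (False, False, False)
```

With one distribution the three tests agree on every joint, which is why textbook treatments can speak of "independence" without qualification.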
Yet, while this inference from (IR) to (IND) is sound for a single distribution, it is fallacious when instead there is a set of distributions. (See the paper for details.) What’s more, it turns out that (IR) does not entail (EI) and (EI) does not entail (IND), for versions of these principles adapted to sets of distributions. These failures persist even if you assume that your set of distributions is convex; convexity does, however, secure the converse chain: (IND) entails (EI) and (EI) entails (IR) for the set-versions of these principles. If you drop convexity, then this chain breaks as well. (But that is another story.)
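A minimal sketch of how the set-versions come apart, using a three-member credal set of my own construction (not an example from the paper): the lower and upper probabilities of E are unchanged by conditioning on F or on not-F, so the set-version of (IR) holds; yet conditioning on E moves the upper probability of F, so (EI) fails, and one member of the set violates (IND) outright.

```python
# Toy credal set (my own construction, not the paper's).  Each member
# lists p over the cells (E∧F, E∧¬F, ¬E∧F, ¬E∧¬F).
p1 = (0.1, 0.1, 0.4, 0.4)   # independent: p(E)=0.2, p(F)=0.5
p2 = (0.4, 0.4, 0.1, 0.1)   # independent: p(E)=0.8, p(F)=0.5
p3 = (0.4, 0.1, 0.1, 0.4)   # correlated:  p(E,F)=0.4 != 0.25 = p(E)p(F)
credal = [p1, p2, p3]

def envelope(vals):
    """Lower and upper probability across the set."""
    return (min(vals), max(vals))

pE      = envelope([ef + enf for ef, enf, _, _ in credal])
pE_F    = envelope([ef / (ef + nef) for ef, _, nef, _ in credal])
pE_notF = envelope([enf / (enf + nenf) for _, enf, _, nenf in credal])
pF      = envelope([ef + nef for ef, _, nef, _ in credal])
pF_E    = envelope([ef / (ef + enf) for ef, enf, _, _ in credal])

print(pE, pE_F, pE_notF)   # all (0.2, 0.8): set-version of (IR) holds
print(pF, pF_E)            # (0.5, 0.5) vs (0.5, 0.8): (EI) fails
```

So F is epistemically irrelevant to E in the envelope sense, but E is not irrelevant to F, and p3 fails factorization: (IR) without (EI), and (IR) without (IND).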
The surprising point is that there are several independence concepts rather than a single unitary notion, and these distinctions are hidden from view when you work with a single distribution. What’s more, familiar and sound principles of reasoning about independence properties break down in the imprecise probability setting, leading to (understandable) confusion.
In any event, this observation raises two questions.
- For orthodox Bayesians: Imprecise probability theory reveals a fact about (IND), (EI), and (IR), namely that they are distinct properties which are collapsed when working with a single probability distribution, p. Are you confident that your elicitation procedure for determining numerically precise degrees of belief (aka, credences) warrants collapsing these distinctions?
- For convex Bayesians: Insofar as you rely on a behavioral interpretation of your convex set of distributions (aka, credal states), how do you provide a behavioral justification for treating two events as completely stochastically independent, given that inferring (IND) from (EI) is fallacious?
[Cross-posted at Choice and Inference]