Ian Hacking, in The Emergence of Probability (1975), calls attention to a central assumption of Bayesian theory, namely that uncertainty is always measured by a single, additive probability measure, and that utilities are always measured by a precise utility function. Hacking calls this assumption the dogma of precision.
The point to emphasize here is that a Bayesian always assumes that there is an ideal, precise probability model. A consequence of this assumption is that imprecision is viewed as arising from uncertainty over what the precise model is. This point is important because it underpins both how imprecision is viewed within a Bayesian framework and how it is handled. Roughly, what happens in statistical modelling is that Bayes' rule is applied to pairs of precise prior distributions and precise likelihood functions to yield a set of precise posteriors, which are then checked to see whether their distribution is what we expected it to be. Imprecision, then, is viewed the way textbook epistemicists view vagueness: as a by-product of incomplete information.
So a Bayesian models imprecision by a set of precise probability measures of uncertainty. Sensitivity analysis is a common method of using imprecise models to check the robustness of a (Bayesian) statistical model. (Never mind for the moment that the reasons the Bayesian gives to justify the axioms of probability are themselves based on precision, and so are inconsistent with sensitivity analysis.)
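To make the idea concrete, here is a minimal sketch of sensitivity analysis for a binary hypothesis: the same evidence updates each prior in a set, and the spread of the resulting posteriors is what gets inspected for robustness. All the numbers are invented for illustration.

```python
# Sensitivity analysis over a set of precise priors (illustrative numbers).
# Each prior P(H) is updated with the same likelihoods via Bayes' rule,
# and we report the spread of the resulting posteriors P(H | E).

def posterior(prior_h, like_e_given_h, like_e_given_not_h):
    """Bayes' rule for a binary hypothesis H given evidence E."""
    joint_h = prior_h * like_e_given_h
    joint_not_h = (1 - prior_h) * like_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

# A set of precise priors standing in for one imprecise judgement.
priors = [0.2, 0.3, 0.4, 0.5]
posteriors = [posterior(p, 0.9, 0.3) for p in priors]

lower, upper = min(posteriors), max(posteriors)
print(f"posterior interval: [{lower:.3f}, {upper:.3f}]")
```

Note that the interval here is generated from a set of precise measures; the question raised next is whether every interval must be understood that way.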
Setting aside this particular incoherence of Bayesianism, there is another issue regarding how imprecision is handled. The trouble isn't generating upper and lower probabilities from a set of precise probability measures, but rather going in the other direction. That is, given a probability interval marked by (non-linear) upper and lower probabilities, must we always assume that there is an ideal, precise probability model (or a set of them) just out of reach? The answer is No. And Peter Walley's highly recommended Statistical Reasoning with Imprecise Probabilities (1991) demolishes the orthodox reasons for thinking otherwise.
There is a philosophical point here, one that applies to the discussion of conditionals.
In pursuing a theory of natural language conditionals, the orthodox view appears to assume that
(i) there is a formal structure underlying natural language (Montague’s Thesis);
(ii) that the semantic/pragmatic boundary is precise, where pragmatics is the contextual filler standing between us and this core structure, between a sentence and the proposition it expresses, between the information conveyed by an utterance and the semantic value of what is expressed;
(iii) and that it is necessary to carve away to the core semantic structure, by disambiguation and precisification, before we can settle issues of what follows from what, what supports what, et cetera.
There is something substantial to (i), which is what formal semanticists are after. May the wind be at their backs. (iii) is false; I'll return to it in a moment. Condition (ii) is wrong on the point of precision, which makes it more complicated than our philosophy of language teachers told us, and also pivotal. Let's start there.
It is commonplace in any science to match the thoroughness of an analysis to the importance of the problem at hand. We then approximate values for features that are not that important. This is true in physics, true in bridge-building.
It is assumed by many that this commonplace trade-off doesn't hold in most of philosophy, or shouldn't hold in its 'core' areas, since there we are primarily interested in getting at some basic structure or some set of metaphysically necessary principles, and to think otherwise is to confuse theory with practice. But this assumption is often wrong in particular cases; the tacit acceptance of this view as sound philosophical methodology is very much like the dogma of precision that we see in statistics.
To be sure, this dogma is a natural disposition for theoretical-minded people to have, particularly mathematicians, logicians, and philosophers. It is a disposition to abstract away from questions of fit: the relationship that the models we imagine, the pretty structures we see, bear to the data, to the world itself. (Indeed, even this problem is too often addressed in broad generality.) Much more argument is required here. These remarks are to serve as motivation for softening up (ii).
Maybe an example would help. The Natural Language Processing research program attempts to build software systems that understand natural language and reason with the information so understood. The problem is how to build a computer that works like Kirk's does on the Enterprise, or HAL in 2001, for that matter. Before fantasizing or worrying over machines helping us or turning against us, however, machines have to understand what we, and they themselves, are saying. This has turned out to be much harder than was imagined in the 1960s. Here is one highlight of what we've learned.
NLP set off like gangbusters trying to implement a solution based on the (i)-(iii) idea. The architecture was roughly this: First, a parser works out the syntactic structure of a natural language sentence, like those tree diagrams that they should be teaching in grade school but stopped teaching for some reason. Second, with this set of word-syntactic category pairs, the system consults an enormous dictionary of word meanings for each of the pieces of syntax it has identified. This gives you the atomic elements of your semantic model, elements that you may then combine in ways that obey the syntax of one or another of the sentence parses you've generated. Third, the system picks the semantic structure that is the most likely combination of these semantic parts, yielding the meaning of each sentence parse. Then, it picks the most likely semantic structure as the winner, that is, the most likely meaning of the natural language sentence. It was a great idea. But it didn't work.
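The sequential architecture just described can be caricatured in a few lines. The grammar, lexicon, and sense probabilities below are all invented for illustration; a real system would use a broad-coverage parser and a proper sense inventory.

```python
# Toy sketch of the sequential pipeline: parse, look up word senses,
# then pick the highest-scoring combination. Lexicon and scores invented.
from itertools import product

LEXICON = {
    # word -> list of (sense, standalone probability of that sense)
    "kicked": [("struck-with-foot", 0.9), ("died", 0.1)],
    "the":    [("definite-article", 1.0)],
    "can":    [("food-tin", 0.6), ("be-able", 0.3), ("toilet", 0.1)],
}

def parse(sentence):
    """Stage 1 (stubbed): one parse, tagging each word with a category."""
    return [(w, "N" if w == "can" else "X") for w in sentence.split()]

def best_reading(parse_result):
    """Stages 2 and 3: enumerate sense combinations, keep the best."""
    sense_lists = [LEXICON[w] for w, _cat in parse_result]

    def score(reading):
        p = 1.0
        for _sense, prob in reading:
            p *= prob
        return p

    return max(product(*sense_lists), key=score)

reading = best_reading(parse("kicked the can"))
print([sense for sense, _p in reading])
```

The point of the caricature is the strict ordering: syntax is finished before any semantics is consulted, and no information flows back up the pipe. That ordering is exactly what turned out not to work.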
In getting something like this to work, and to work rather well, it turns out that you cannot run through this procedure sequentially: syntax, semantics, pick the winner; next sentence: syntax, semantics, pick the winner; next sentence…. The picture is much more complicated. But let’s focus on imprecision. One thing James Allen and his teams at Rochester and at the IHMC, in Pensacola, have done over the years is to enable reasoning with partially interpreted sentences. It turns out that it is necessary to have some imprecise semantic structure to guide the parsing of the sentence. (For example, in a conversation with Tommy Chong, the noun ‘can’ is much more likely to mean toilet than food tin.) Yet to know which semantic structure to prefer, you have to know something about the general domain and specific situation in which your sentences appear. This, by the way, is the reason why NLP models on this architecture work best on fairly restricted domains, and also why they are better suited to spoken dialog than to written language. It is also an important reason why there was a tidal shift in the early 1990s away from language understanding to purely statistical models based on very little or no grammatical structure for written language. This was also around the same time that two Stanford graduate computer science students thought ‘Google’ might be a catchy name for a start-up company that ran with the linguistic-structure-is-useless idea.
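The 'can' example can be put schematically: the same lexicon combined with a different contextual prior yields a different preferred sense. The context labels and numbers here are, of course, made up.

```python
# Context-dependent sense preference (illustrative numbers): the winning
# sense of 'can' depends on a prior over the conversational situation.
SENSE_PRIORS = {
    "default":     {"food-tin": 0.6, "be-able": 0.3, "toilet": 0.1},
    "chong-movie": {"toilet": 0.7, "be-able": 0.2, "food-tin": 0.1},
}

def preferred_sense(context):
    priors = SENSE_PRIORS[context]
    return max(priors, key=priors.get)

print(preferred_sense("default"))      # food-tin
print(preferred_sense("chong-movie"))  # toilet
```

Of course the hard part, which this sketch assumes away, is knowing which context you are in; that is precisely the domain knowledge the text says the parser needs before it can finish its job.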
Which brings us back to this idea that getting an exact, precise structure is necessary before one can do anything else philosophers might be interested in, before logicians can understand what should follow from what, before epistemologists can understand what supports what. The working assumption of orthodoxy, to continue, is that we may bracket the problem of how to fit our model to practice and pass it off to someone else down the line to solve; we may confidently work away at our epistemic modelling based on precise structures, since this idealization will be useful to that guy down the line as a normative standard to keep in mind when he's cutting the corners necessary to fit together theory and practice. I don't think that this picture is the right one to hold, however. The search for the logic of natural language "conditionals" is a good example: there is very little that we've learned about entailment or about epistemic support from studying conditionals. And what mysteries remain about them are for the linguists to work on, not philosophical logicians.
Walley’s observations about statistical reasoning and Allen’s experience with natural language understanding systems (including “conditionals”!) are examples from two very different areas that offer evidence against the general methodological idea behind (iii). Finally, each case offers some evidence that the so-called pragmatic turn in epistemology is an important one; and it suggests that arguments against this turn that appeal to the dogma of precision ought not to be left unchallenged.