Dogma of Precision

Ian Hacking, in The Emergence of Probability (1975), calls attention to a central assumption of Bayesian theory, namely that uncertainty is always measured by a single, additive probability measure, and that utilities are always measured by a precise utility function. Hacking calls this assumption the dogma of precision.

The point to emphasize here is that a Bayesian always assumes that there is an ideal, precise probability model. A consequence of this assumption is that imprecision is viewed as arising from uncertainty over what the precise model is. This is an important point since it underpins how imprecision is viewed within a Bayesian framework and also how it is handled. Basically, what happens in statistical modelling is that Bayes’ rule is applied to pairs of a precise prior distribution and a precise likelihood function to yield a set of precise posteriors, which are then checked to see if the distribution of these values is what we expected it to be. Imprecision then is viewed as textbook epistemicists view vagueness: it is a by-product of incomplete information.

So a Bayesian models imprecision by a set of precise probability measures of uncertainty. Sensitivity analysis is a common method of using imprecise models to check the robustness of a (Bayesian) statistical model. (Never mind for the moment that the reasons the Bayesian gives to justify the axioms of probability are based on precision, and so are inconsistent with sensitivity analysis.)
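To make the sensitivity-analysis procedure concrete, here is a minimal sketch. It assumes a Beta-binomial model, which is my choice for illustration and not something the argument depends on; the priors and the data are made up. Each precise prior is updated by Bayes’ rule, and the spread of the resulting posterior means is what the analyst inspects.

```python
from fractions import Fraction

def posterior_mean(alpha, beta, successes, failures):
    """Posterior mean of a Beta(alpha, beta) prior after binomial data."""
    return Fraction(alpha + successes, alpha + beta + successes + failures)

# Hypothetical set of precise priors, each given as a Beta(alpha, beta) pair.
priors = [(1, 1), (2, 8), (8, 2), (5, 5)]

# Made-up data: 7 successes and 3 failures.
successes, failures = 7, 3

# One precise posterior per precise prior; the likelihood is the same throughout.
posterior_means = [posterior_mean(a, b, successes, failures) for a, b in priors]

# The robustness check looks at how far apart the precise answers are.
lower, upper = min(posterior_means), max(posterior_means)
print(f"posterior mean ranges from {lower} to {upper}")
```

On this picture the interval is only a summary of disagreement among precise models; it is not itself treated as the model.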

Setting aside this particular incoherence of Bayesianism, there is another issue regarding how imprecision is handled. The trouble isn’t generating upper and lower probability from a set of precise probability measures, but rather going in the other direction. That is, given a probability interval marked by (non-linear) upper and lower probability, are we to always assume that there is an ideal, precise probability model (or a set of them) just out of reach? The answer is No. And Peter Walley’s highly recommended Statistical Reasoning with Imprecise Probabilities (1991) demolishes the orthodox reasons for thinking otherwise.
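The unproblematic direction, from a set of precise measures to upper and lower probability, can be sketched in a few lines. The two measures below are hypothetical, chosen only so that the numbers are easy to check by hand. The lower probability of an event is the minimum over the set and the upper probability is the maximum, and neither envelope is additive.

```python
from fractions import Fraction as F

# Hypothetical set of two precise measures over the outcomes a, b, c.
measures = [
    {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)},
    {"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
]

def prob(measure, event):
    """Probability of an event (a set of outcomes) under one precise measure."""
    return sum(measure[outcome] for outcome in event)

def lower(event):
    """Lower probability: the minimum over the set of measures."""
    return min(prob(m, event) for m in measures)

def upper(event):
    """Upper probability: the maximum over the set of measures."""
    return max(prob(m, event) for m in measures)

A, B = {"a"}, {"b"}

# For these disjoint events, lower(A) + lower(B) falls short of lower(A | B),
# while upper(A) + upper(B) exceeds upper(A | B).
print(lower(A), lower(B), lower(A | B))
print(upper(A), upper(B), upper(A | B))
```

Nothing in this sketch bears on the reverse direction. Given only the upper and lower values, the code gives no reason to suppose that some particular set of precise measures generated them.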

There is a philosophical point here, one that applies to the discussion of conditionals.

In pursuing a theory of natural language conditionals, the orthodox view appears to assume that

(i) there is a formal structure underlying natural language (Montague’s Thesis);

(ii) that the semantic/pragmatic boundary is precise, where pragmatics is the contextual filler standing between us and this core structure, between a sentence and the proposition it expresses, between the information conveyed by an utterance and the semantic value of what is expressed;

(iii) and that it is necessary to carve away to the core semantic structure, by disambiguation and precisification, before we can settle issues of what follows from what, what supports what, et cetera.

There is something substantial to (i), which is what formal semanticists are after. May the wind be at their backs. (iii) is false. I’ll return to this in a moment. Condition (ii) is wrong on the point of precision, hence (ii) is more complicated than our philosophy of language teachers told us, hence pivotal. Let’s start there.

It is commonplace in any science to match the thoroughness of an analysis to the importance of the problem at hand. We then approximate values for features that are not that important. This is true in physics, true in bridge-building.

It is assumed by many that this commonplace trade-off doesn’t hold in most of philosophy, or shouldn’t hold in its ‘core’ areas, since here we’re primarily interested in getting at some or other basic structure, or some set of metaphysically necessary principles, and that to think otherwise is to confuse theory with practice. But this assumption is often wrong in particular cases; the tacit acceptance of this view as a sound philosophical methodology is very much like the dogma of precision that we see in statistics.

To be sure, this dogma is a natural disposition for theoretical-minded people to have, particularly mathematicians, logicians, and philosophers. It is a disposition to abstract away the fit between the models we imagine, the pretty structures we see, and the relationship these structures bear to the data, to the world itself. (Indeed, even this problem is too often addressed in broad generality.) Much more argument is required here. These remarks are to serve as motivation for softening up (ii).

Maybe an example would help. The Natural Language Processing research program attempts to build software systems that understand natural language and reason about the information so understood. The problem is how to build a computer that works like Kirk’s does on the Enterprise, or HAL in 2001, for that matter. Before fantasizing or worrying over machines helping us or turning against us, however, machines have to understand what we, and they themselves, are saying. This has turned out to be much harder than was imagined in the 1960s. Here is one highlight of what we’ve learned.

NLP set off like gang-busters trying to implement a solution based on the (i)-(iii) idea. The architecture was roughly this: First, a parser tears down the syntactic structure of a natural language sentence, like those tree diagrams that they should be teaching in grade school but stopped teaching for some reason. Second, with this set of word-syntactic category pairs, the system consults an enormous dictionary of word meanings for each of the pieces of syntax it has identified. This gives you the atomic elements of your semantic model, elements that you may then combine in ways that obey the syntax of one or another of the sentence parses you’ve generated. Third, the system picks the semantic structure that is the most likely combination of these semantic parts, yielding the meaning of each sentence parse. Then, it picks the most likely semantic structure as the winner, that is, the most likely meaning of the natural language sentence. It was a great idea. But it didn’t work.
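The sequential architecture just described can be caricatured in a few lines. Everything here is a toy: the lexicon, the sense probabilities, and the stand-in parser are all hypothetical, and no historical system is being reproduced. The point is only to show the order of operations: parse first, look up senses second, then pick the highest-scoring reading with no feedback between stages.

```python
from itertools import product

# Hypothetical lexicon: (word, syntactic category) -> [(sense, probability)].
LEXICON = {
    ("kick", "V"): [("strike-with-foot", 0.9), ("quit", 0.1)],
    ("the", "DET"): [("definite", 1.0)],
    ("can", "N"): [("food-tin", 0.7), ("toilet", 0.3)],
}

def parse(sentence):
    """Stage 1 stand-in: return candidate parses as lists of (word, category)."""
    tags = {"kick": "V", "the": "DET", "can": "N"}
    return [[(word, tags[word]) for word in sentence.split()]]

def readings(tagged):
    """Stage 2: look up senses for each pair and combine them into readings."""
    options = [LEXICON[pair] for pair in tagged]
    for combo in product(*options):
        senses = tuple(sense for sense, _ in combo)
        score = 1.0
        for _, p in combo:
            score *= p
        yield senses, score

def interpret(sentence):
    """Stages 3 and 4: score every reading of every parse and keep the best."""
    candidates = [r for tagged in parse(sentence) for r in readings(tagged)]
    return max(candidates, key=lambda r: r[1])

senses, score = interpret("kick the can")
print(senses)
```

Note that the sense probabilities are fixed in advance, so the winner is the same whoever is speaking and whatever the conversation is about.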

In getting something like this to work, and to work rather well, it turns out that you cannot run through this procedure sequentially: syntax, semantics, pick the winner; next sentence: syntax, semantics, pick the winner; next sentence…. The picture is much more complicated. But let’s focus on imprecision. One thing James Allen and his teams at Rochester and at the IHMC, in Pensacola, have done over the years is to enable reasoning with partially interpreted sentences. It turns out that it is necessary to have some imprecise semantic structure to guide the parsing of the sentence. (For example, in a conversation with Tommy Chong, the noun ‘can’ is much more likely to mean toilet than food tin.) Yet to know which semantic structure to prefer, you have to know something about the general domain and specific situation in which your sentences appear. This, by the way, is the reason why NLP models on this architecture work best on fairly restricted domains, and also why they are better suited to spoken dialog than to written language. It is also an important reason why there was a tidal shift in the early 1990s away from language understanding to purely statistical models based on very little or no grammatical structure for written language. This was also around the same time that two Stanford graduate computer science students thought ‘Google’ might be a catchy name for a start-up company that ran with the linguistic-structure-is-useless idea.
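The ‘can’ example can be sketched as follows. This is an illustration under loud assumptions, not a description of Allen’s systems: the domain names and the numbers are invented. What it shows is the shape of the idea, namely that a word may be left partially interpreted, as a set of live senses, until knowledge of the domain narrows the set.

```python
# Hypothetical sense probabilities for the noun "can", by conversational domain.
SENSES_BY_DOMAIN = {
    "grocery-list": {"food tin": 0.9, "toilet": 0.1},
    "tommy-chong": {"food tin": 0.2, "toilet": 0.8},
}

def interpret_can(domain=None):
    """Return the set of senses still in play for the noun 'can'."""
    all_senses = {s for senses in SENSES_BY_DOMAIN.values() for s in senses}
    if domain is None:
        # No domain information yet: keep the word partially interpreted.
        return all_senses
    senses = SENSES_BY_DOMAIN[domain]
    # Domain known: prefer the sense that is most likely in that domain.
    return {max(senses, key=senses.get)}

print(interpret_can())
print(interpret_can("tommy-chong"))
```

Reasoning can proceed on the unresolved set in the first case; it need not wait for the second.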

Which brings us back to this idea that getting an exact, precise structure is necessary before one can do anything else philosophers might be interested in, before logicians can understand what should follow from what, before epistemologists can understand what supports what. The working assumption of orthodoxy, to continue, is that we may bracket the problem of how to fit our model to practice and pass that off to someone else down the line to solve; we may confidently work away at our epistemic modelling based on precise structures since this idealization will be useful to that guy down the line as a normative standard to keep in mind when he’s cutting the corners that are necessary to fit together theory and practice. I don’t think that this picture is the right one to hold, however. The search for the logic of natural language “conditionals” is a good example: there is very little that we’ve learned about entailment or about epistemic support from studying conditionals. And what mysteries that remain about them are for the linguists to work on, not philosophical logicians.

Walley’s observations about statistical reasoning and Allen’s experience with natural language understanding systems (including “conditionals”!) are examples from two very different areas that offer evidence against the general methodological idea behind (iii). Finally, each case offers some evidence that the so-called pragmatic turn in epistemology is an important one; and it suggests that arguments against this turn that appeal to the dogma of precision ought not to be left unchallenged.


Dogma of Precision — 4 Comments

  1. Greg, thanks for the exciting post. I’ll need some help to appreciate its importance – actually, help to just understand some of the things you say at a basic level. Hopefully, my questions will also be helpful to others.

    1. I would have thought that it has long been established that (ii) is false. The falsehood of (ii) can be shown in a number of ways. Maybe the case of conditionals has peculiarities that are not apparent to me now. I was thinking of the Davidson of “A Nice Derangement of Epitaphs”. You will certainly find greater fans of Davidson than me, particularly greater fans of the “radical interpretation” project. But one of the most intriguing suggestions of that paper is that the literal/non-literal divide is a mirage. (I like that very much. It gives us a nice opportunity to show how pragmatics is either largely epistemology in disguise or else largely a pile of baffling explanations. The thoroughly epistemological explanation for the rational interpretation of metaphorical discourse that Ingrid Finger and I developed several years ago recommends ignoring that divide in Davidsonian spirit. But that’s for some other time.) For present purposes, the reason why I thought the Davidson reference mattered is that it is hard to believe that anybody — and especially an author of D’s sophistication in what was his home turf — would boldly assert any such view if there were much room for doubt as to whether (ii) is false. What have I missed?

    2. As I understand, you’re discussing an idea according to which having a definitive account of semantic structure, “getting an exact, precise structure is necessary before one can do anything else philosophers might be interested in, before logicians can understand what should follow from what, before epistemologists can understand what supports what”. I think I understand the part about how an account of semantic structures affects an account of what follows from what. But how did epistemology get in there? Can you elaborate a little?

    3. This is related to the previous point. You write that “there is very little that we’ve learned about entailment or about epistemic support from studying conditionals”. What, exactly, do you think people expected to learn about entailment from the study of conditionals? Much of what we think we know about entailment comes from an age when even the dream of fully developed semantics may not have disturbed anybody’s sleep. Are you talking about the validity of specific rules, like Contraposition, Hypothetical Syllogism?

    4. I’m perplexed by your claim that “what mysteries that remain about [conditionals] are for the linguists to work on, not philosophical logicians”. I understand that you’re trying to be provocative here. But I just don’t even basically get it. Are you suggesting that the mysteries of conditionals are somehow to be solved by _empirical_ investigation?

    I hope these questions are not a total waste of time for CD readers.

  2. Hi Gregory,

    Your discussion of the difficulties encountered by the NLP program suggests to me the need for something like a recursive process of semantic refinement. In other words, we start with a rough interpretive guess based on context and background information and gradually work toward a more precise understanding as a limit. We rarely if ever get to the semantic limit, though, because the refinement process yields progressively less and less semantic bang for the cognitive processing buck.

    Does that sound on the right track for what you’re driving at?

  3. Greg, I think you’re making very good points here. But the puzzle is how to keep the comments you make with respect to (ii) and (iii) from infecting (i) as well.

  4. Claudio, Alan and Steve: thanks very much for your comments. I putter around at home most weekends, away from the net. Hence my delay in replying.

    Alan & Claudio’s (1): I am primarily interested in how (ii) fails to be true rather than pointing out that it is false. I think the picture that many people have for why it fails to be true, and how to remedy its being false, is roughly along the lines of Alan’s suggestion. That is, the failure of precision here is viewed as an instance of the problem of vagueness and that we can then idealize following Alan’s suggestion, even if only in principle. But this is what, surprisingly, doesn’t work so well in practice. And it fails for a deeper reason than that we’re simply imperfect creatures with limited time on our hands. (It might be worth mentioning that Davidson’s ideas have been influential in this branch of NLP; one approach to structuring partially interpreted sentences is largely based on Davidson’s theory of events.)

    There are other epistemologists who have offered similar warnings. Isaac Levi has long warned against making a fetish of language, calling philosophy’s widespread failure to take heed ‘the curse of Frege’. I have come to think that there is a very serious point behind this jab.

    Claudio’s (2): This is brief, but return to Slim and his two friends for a moment. The question was to figure out whether his friends’ utterances were warranted. There is a tension here then between the way the facts are arranged in this story (as it is normally read and understood), the semantic structure of the natural language sentences reportedly used by each friend, and competing philosophical accounts of the logical structure of those “types” of sentences. The discussion predictably changes from whether F1 and F2 are warranted to assert A and B in English to whether F1 and F2 are warranted to assert something bearing the structure of A* and B* in one of these philosophical accounts of conditionals. This is a bad move; it impacts epistemology by letting a fictitious structure of A and B play a role in framing the discussion of warranted assertion.

    Claudio’s (3): We are in a golden age of logic at the moment, and logic is the study of what follows from what, i.e., entailment, i.e., logical consequence. We are still learning a great deal about this relation. Moreover, we are learning a great deal about how to apply logical languages with various expressive capacities to model entailments. We are also learning about restricted forms of entailment; entailments that “preserve” something other than truth; consequence relations that extend entailment (and are thus classically invalid) while enjoying some rather nice properties of (classical) logical consequence. And much, much more. One consequence of this of moment for epistemologists is that this is increasing our vocabulary to articulate, and hence to understand, epistemic relations.

    Claudio’s (4). I am being provocative here, yes. My hope is that the underlying point is important enough to warrant the theatre.

    I don’t know what the problem of “conditionals” is, exactly. For a logician to work on this I would assume that the presumption is that there is some logical structure to “if…then…” constructions. It isn’t a truth-functional connective, and we know that it isn’t from decades of theoretical and empirical study of natural language, including contributions from many philosophical logicians. So, what about the grammatical feature of mood (i.e., indicative; subjunctive)? Does this feature reveal stable logical structure in the sense that if we fix these values, then we can evaluate “if…then…” sentences by evaluating their constituents? No. Tense can change the semantic meaning of constituents. Likewise aspect. I don’t know why a logician would be interested in natural language “conditionals” now, given what we know about them. Let the linguists take over from here, is what I’m saying.

    A quick caveat: One kind of work on conditionals that is very important is studying the properties of connectives corresponding to some consequence relation or other. For instance, logical consequence in linear logic has certain types of properties and it is very helpful to have a conditional in the object language having many, if not all, of these properties too. Figuring this out is the kind of thing logicians do, including philosophical logicians.

    Finally, Steve’s nice point about (ii) and (iii) undermining (i): The most informative and accurate work in natural language formal semantics that I’ve seen is not at all the kind of thing that anyone could begin to use to work on any remotely interesting question concerning what follows from what. The short answer then is that whether or not semanticists are successful in confirming Montague’s thesis is beside the point: the formal semantics project is (largely, not entirely) irrelevant to the main concerns of epistemology.

    Finally finally: Steve, Alan: A small point, by way of follow-up: natural language might have (interesting) formal structure without being recursively enumerable.
