“Contextualism, Contrastivism, and X-Phi Surveys”

Here is the paper I gave at the Oberlin Philosophy Colloquium last month.  In it I discuss recent experimental work relevant to (or at least often enough taken to be relevant to) epistemic contextualism.  I believe the papers for that conference will be appearing in a special issue of Philosophical Studies.  Comments welcome (either left as comments to this post or e-mailed to me at my-first-name dot my-last-name at yale dot edu).

Some readers may get a kick out of footnote #18, at the bottom of p. 22:

I guess it’s a sign that I’m a true armchair philosopher at heart that I’m here discussing the bearing of the results of an experimental philosophy survey—that I’ve only imagined has been conducted!


“Contextualism, Contrastivism, and X-Phi Surveys” — 15 Comments

  1. Keith, I enjoyed this talk and paper very much, though I had to miss the discussion to catch a plane.

    1. Here’s a question I had, which we might have discussed briefly in conversation the night before, but I don’t recall very well.

    One of your criticisms of how some of the questions are asked concerns the fact that, as I believe you said, “It is easier for a true claim to be inappropriate than for a false claim to be appropriate” and that “there is a typical presumption that a statement is true” (I’m quoting from my notes from the talk, but I think there are statements to the same effect in the paper).

    As a result, you say it’s very important–to test *your* theory anyway–that the cases “feature a speaker making a denial of knowledge in HIGH and a speaker making an affirmation of knowledge in LOW” (Page 3 of the MS from Oberlin).

    You note that May, Sinnott-Armstrong, Hull, and Zimmerman don’t do that, nor does Buckwalter. As I understand it, the problem is that the results will be skewed for the reason mentioned above, which I think you referred to as the “Lewis point” about how people ordinarily try to find a construal on which the claims of others are true. Thus it seems, in fact, that one could get just about whatever results they wanted by making sure that they framed the questions in a way that POSITIVE evaluations of attributions would support their thesis. Is that too strong? If so, isn’t it still the essence of your criticism? If it’s too strong, why is that so?

    The Buckwalter results, at least, support this. His subjects were presented with versions of HIGH that feature positive attributions of knowledge. As you note, “survey takers tend to rule both attributions of knowledge to be true” (Oberlin MS, 5).

    Let’s call this the “Accommodation Bias Effect” (ABE). Why is it that *your* results are not vitiated by ABE in the same way Buckwalter’s are?

    I’m sure experimentalists have ways of handling such all-forms-of-testing-are-subject-to-the-same-bias cases, and I have some intuitive idea of how they’d have to go, but I wonder if any X-Phi folks out there are re-tooling (I noticed James Beebe was taking a lot of notes…) 🙂

    2. You really made painfully clear how hard it would be to test your theory in your discussion about how salience works on your view (though the air of mystery was not fully dispelled for me). Your description of what’s going on in the mind of a survey taker was quite persuasive. However, again, I have some concern that we philosophers reading the cases are not so different from the hapless “surveyees” you describe. This, like the above, reinforces my own predisposition that none of these kinds of results can play much of an evidential role pro or con. In other words, I came away from your talk pretty convinced of three things: 1. No current X-Phi results offer much evidence at all against contextualism. 2. None are likely to do so. 3. There is no ordinary language basis for contextualism.

    3. Do you think interest relativists are more open to criticism from the kind of results you defend your view against?

    Sorry for all the questions, man, but you know I’m super-interested in this stuff.

  2. Thanks for posting this, Keith! It’s great to hear you weigh in on this work. I haven’t had a chance to go through the entire paper carefully yet, but I did look at your discussion of MSH&Z in some detail. 🙂

    As you note, contextualism wasn’t really our target. But I do worry about your setting these results entirely aside when it comes to contextualism. A couple of initial comments for what they’re worth:

    (1) As you point out, you make claims about the truth-values of knowledge attributions, not about whether people will say someone in the vignette knows. You place a great deal of weight on this difference. In fact, you say: “It may turn out, for all I know, that this difference will not matter much to how rushed survey-takers will tend to answer the questions they are given. But this would mean that they’re probably not very competent in answering at least one of the questions” (p. 4).

    I worry, along the lines that Trent does (in his comment above), that such a response will erode the ordinary language basis for contextualism. If this difference is so crucial, why wouldn’t it appropriately guide the judgments of ordinary speakers? Perhaps there is some plausible error theory here, but without one in hand this seems merely speculative.

    You seem to suggest an error theory when you say the survey takers were “rushed.” But what evidence do you have that our participants were rushed? I have pretty good evidence that they weren’t. 🙂

    (2) You mention that the context of reading a vignette is “exceedingly odd” (p. 5). I’m not sure why it’s odd, let alone exceedingly. Are you suggesting that the ordinary language basis concerns only contexts such as natural verbal attributions of knowledge? Or are you suggesting that it’s less odd for ordinary people to talk about the truth of knowledge attributions rather than just ascribing knowledge? I’d submit the reverse is the case.

    You do at one point suggest at least one problem with the context is that we “have no idea what kind of use this judgment is going to be put to” (p. 5). Even if this is a really odd context (for this or other reasons), is it not the one philosophers are in when reading the cases in books and journal articles? Are such contexts less odd for philosophers? If so, again there seems to be a significant difference between ordinary speakers and philosophers, and so the ordinary language basis seems less stable.

    Anyway, those are just some initial thoughts. I look forward to going through the paper in more detail!

  3. Trent: Re your (1): There are a couple of different things to distinguish here, both of which get involved in my complaints (two distinct complaints) about changing the HIGH bank case so that it involves a positive claim to know in section 1.3 (pp. 5-8) of the paper.

    First, there’s a methodological point: that there’s a presumption that warranted claims [not based on false beliefs the speaker has about relevant underlying matters of fact] are true — that’s a lot stronger than any presumption there might be to the effect that inappropriate claims are false. This presumption supports the intuition I rely on in making my case: that the denial of knowledge in HIGH is true. By changing HIGH so that it features a positive claim to know (which they’re thinking the contextualist will hope will be, and will be intuited to be, false), the case for contextualism loses that source of (legitimate) support, and is thereby (illegitimately) weakened. This (in brief) is the complaint I make in the paragraph that straddles pages 6-7 and the two short paragraphs that follow it.

    Then there’s accommodation–which, as I’m using it, refers to a (c.p. & within certain limits) tendency for the content of context-sensitive terms to adjust so as to make what’s said true. By changing HIGH in the way in question, one turns accommodation from a force that works in favor of the intuition I appeal to concerning HIGH into a force that works against me, thereby illegitimately weakening the case. This (again in brief) is the complaint I raise in the paragraph that straddles pp. 7-8, and the paragraph that follows it.

    “Thus it seems, in fact, that one could get just about whatever results they wanted by making sure that they framed the questions in a way that POSITIVE evaluations of attributions would support their thesis. Is that too strong? If so, isn’t it still the essence of your criticism? …. Let’s call this the ‘Accommodation Bias Effect’ (ABE).”

    No, that’s not my criticism. Supposing a “bias” is unreliable, *I’m* not positing any bias here. Supposing that there really is a tendency for the content of context-sensitive terms to adjust so as to make what’s said true, then, about the relevant cases, when we intuit that the relevant claims are true, we are reliably discerning the correct truth-value of the claims. And yes, it is (way) too strong. Again, I’m just claiming that the change to the case illegitimately changes a force that’s working in my favor to a force that works against me. I’m not claiming that you can get just any results you want just by constructing cases so that judgments to the effect that speakers are making true claims in the cases are the judgments that will support your thesis! After all, subjects do often enough intuit that claims made within cases are false.

  4. Hey Keith, I really enjoyed the paper!

    I was wondering though about this one point. F&Z seemed to have run the cases you claimed to be crucial in the first part of the paper: their DV in terms of truth-value, and the knowledge denial in HIGH verbatim from Stanley. It seemed dialectically strange then to argue that the intuitive basis is not threatened because some people have not run certain experiments properly to test your empirical predictions about OLPs, when it happens that some other people actually have run the relevant experiments (and interestingly, comparing all the studies in that section may give at least some reason to think that manipulating the DV between attribution/truth value as well as the statements evaluated in HIGH between attribution/denial in bank cases doesn’t make a ton of difference to the direction of participants’ responses in the various tests, though I agree the pressures you (and Lewis) hypothesize here need testing directly).

    In that one section, you said that F&Z’s results are prima facie trouble for the view, but I was wondering if you could say a little more about why they don’t make lots of prima facie trouble for the view. Was it due to shallow processing and a confound with the bank cases when you claim that “a survey taker might well suppose that the purpose behind the questions is to find out how much confidence respondents tend to have in the stability of banking hours”? (Importantly, we should note here that F&Z have run other studies using various other cases, for instance the bridge cases, where incredibly salient stakes make little difference to the judged truth-values of knowledge sentences, suggesting that the pattern does not arise due to a factor specific to the bank cases).

    Or was the problem with the results more to do with the issue that their vignettes did not manipulate the relevant context-altering factors that the view hypothesizes make the difference in OLPs? Or was it more due to the general concerns of survey methodology you raise in the following section?

  5. Trent: Re your (2): You draw three conclusions:

    1. No current X-Phi results offer much evidence at all against contextualism. 2. None are likely to do so. 3. There is no ordinary language basis for contextualism.

    1 is pretty much what I was arguing for (at least wrt the studies I was discussing; there may be others [perhaps very recent] that I’m unaware of).

    I have little idea what your basis is for 2-3. 2 of course is friendly to me, but I wouldn’t take my paper to support it, & my only reason for agreeing with it (well, at least that no forthcoming results will count very strongly against contextualism) is my faith that contextualism is true & that ordinary language considerations count strongly in its favor, so it’s unlikely that good future survey results will count very strongly against it. I suspect there are good ways to check relevant intuitions experimentally, and we’ll just have to wait & see which way that wind blows. My suspicion that it will blow in a pro-contextualist direction is based on my thinking contextualism is right. If I were like you & didn’t think that contextualism is right, I wouldn’t think I have any good basis for 2.

    I of course strongly disagree with your 3, but probably don’t understand your reasons enough to address it well. The criticisms I was advancing myself in my paper were specific problems with the way the surveys were set up that don’t apply to how I have made the pro-contextualist case (which, unsurprisingly, is set up just the way I’d like it!). I think what you have in mind are the kind of problems with survey methodology that I discuss in section 1.6. As I write, those problems are better pressed by someone more expert in survey methodology than I am: I’m reporting on them in order to discuss their interactions with the problems I am raising. But my impression is that at least a lot of the biggest problems there don’t apply (either at all or with as much force) to philosophical readers of the philosophical literature. Which isn’t to say that there’s no cross-over. As Patrick Rysiew pointed out in his comments on the paper at Oberlin (& I believe the comments to the Oberlin papers are supposed to be appearing in the PHIL STUDS issue), one very good thing about x-phi is that it draws attention to a lot of issues about how to construct examples and treat (sometimes suspiciously) intuitions about them that apply to the use of examples in old-fashioned philosophy as well.

  6. Trent: Re your 3:

    3. Do you think interest relativists are more open to criticism from the kind of results you defend your view against?

    The complaints I was advancing myself in my paper were quite specific to the issue of how to test the case for contextualism, and I think offer no relief to interest relativists. But in my (very non-expert) opinion, the kinds of issues I discuss in section 1.6 also very strongly affect the experimental case against IR.

  8. Josh: Re the importance of the distinction between asking about the truth-values of knowledge claims made within the story vs. asking subjects directly whether characters in the story know:

    The key point I’m making here (certainly sufficient if it’s right) is that contextualism as it’s so far been developed (at least in any version I know) simply makes no predictions whatsoever about whether survey-taking subjects will describe characters in these stories as knowing or not; it only predicts (on the assumption that subjects will judge correctly) what truth-values the survey-taking subjects will assign to knowledge claims made by characters within the story. (I won’t repeat the explanation of why this is: it’s in the longish paragraph of the paper that straddles pp. 4-5.) What I’m doing in the passage you quote is expressing agnosticism about whether the distinction, crucial to the philosophical purposes at play, would make a difference to how non-philosophical subjects would answer a relevant survey question. I just don’t know.

  9. Josh: Re “[exceedingly] odd contexts”:

    You write: “You do at one point suggest at least one problem with the context is that we “have no idea what kind of use this judgment is going to be put to” (p. 5). Even if this is a really odd context (for this or other reasons), is it not the one philosophers are in when reading the cases in books and journal articles?”

    Yes! (And a very good point.) I think such philosophical contexts are likewise odd in ways that make judgments about the claims made within them especially problematic. I’ve long thought this (and expressed it several months ago in a comment on Jonathan Ichikawa’s blog here). I’ll hopefully be going into this in more depth in my second volume (that I’m now writing up).

  10. Wesley (#4): Thanks: This seems one of the things I should make clearer in the paper. I do want to (& think I do) acknowledge the prima facie trouble, but also to make clear at least some of the factors that lessen the threat. In the end, it’s for everyone to decide for themselves how worried to be (for the contextualist), and this is especially so because that evaluation crucially involves some matters that your humble author is no expert in.

    But the threat-lessening factors to my thinking are, yes, 1a: issues of general survey methodology like those mentioned in section 1.6 (issues that others, expert in survey methodology, are better positioned to discuss than I am, and that I think will soon enough be publicly discussed by such people), but also, in conjunction with that, 1b: the fact that the p.f. problematic (to contextualism) survey results from F&Z are so neutral. 1b is very briefly discussed in the last paragraph of section 1.6 at the bottom of p. 14. I don’t want to overstate its importance: It’s 1a that should be stressed, esp. since the importance of 1b depends a lot on just what’s going on in 1a. [This all reminds me that I still have to read the paper Jennifer Nagel sent me to read in connection with the issue of neutral results. (Sorry, Jennifer.) In case others might be interested, the reference is: Wändi Bruine de Bruin, et al., “Verbal and Numerical Expressions of Probability: ‘It’s a Fifty-Fifty Chance’,” Organizational Behavior and Human Decision Processes 81 (2000): 115-31. Some of the quotations in table 1 of the paper are fun. I especially like the first one, from a story in the New York Times: “Larry Robideaux Jr., the trainer of the speed horse Fox Trail, watched today as 11 rather evenly matched colts were entered in Saturday’s $1 million Travers Stakes and observed: ‘Everyone has a 50-50 chance of winning this race.’”]

    But also: 2: Josh & Jonathan’s results, discussed in section 1.7 of my paper. This is a bit delicate, since, I think, in the relevant study, J&J asked subjects directly whether characters in the stories knew things, rather than asking them about the truth values of knowledge claims made within the stories. This jeopardizes the extent to which their results really provide relief to contextualisms (of either the standard or the contrastivist variety, I would think). Still, if J&J’s seemingly very plausible explanation for why their results differ from those of earlier studies is right (very roughly: at least by philosophers’ standards, you really have to hit people over the head with things before they really register), that would seem to have significant trouble-relieving implications for studies (like F&Z’s) that do ask about the truth-values of claims made within the stories. It suggests that in such studies, as well, you might well get different & better results by not being so subtle, but really driving factors home, by using such devices as J&J’s inserted remarks about poor, poor Leon and all the trouble he suffered when his bank changed its hours.

  11. Hi Keith,

    I finally got entirely through your excellent paper. There is much to digest here, but it was primarily your discussion of the context of taking a survey that sparked some ideas I thought I’d express here. Your nicely labeled “WTF?! Neutral Response” (p. 14) is certainly important to address. In short, your idea seems to be that our data merely show (via the means near the mid-point of 4.0) that participants were unsure what was being asked of them and so of how to respond, so they put down something near 4.0 (neither agree nor disagree). If this is the case, then we shouldn’t rely much on these data reflecting the intuitive judgments of ordinary folks about the bank cases. If true, this certainly is a serious problem. But I doubt that it is true.

    We can address this worry directly by examining two things: (1) the details of how we went about administering the surveys and (2) the data from different angles (e.g. histograms which show the frequencies of each type of response). For our purposes here I’ll just report these for our first experiment (the between-subjects one that tested both contextualism and SSI, specifically of the Schaffer- and Stanley-variety). I can only report this information for our (MSH&Z) studies, but I suspect the same holds for Feltz & Zarpentine and Buckwalter.

    1. Survey Context

    We ran the vast majority of our experiments in a classroom environment. I contacted a professor, typically lower-division and non-philosophy, and asked if I could come administer the survey at the beginning of the class. So students were in a calm situation with nowhere to go afterward, and so had no reason to rush just to finish the survey and leave. Participants were all told verbally what the survey was generally about (e.g. “We’re interested in what you think about claims about knowledge. There are no tricks; we just want to know what you think. Take your time, we’ll collect them in about 5-10 minutes”). I then waited until they all seemed finished and asked them to pass them up. So I gave them ample time and didn’t rush them throughout. Subjects were not offered any compensation (no McDonald’s coupons, etc.). Their only motivation to participate in the survey (if they chose to do so) was to help the researchers.

    Thus, contrary to what you suggest on p. 14, the participants likely weren’t rushed and we had the relevant “preamble” about what the survey was about. (We could have said more about knowledge attributions, but we of course don’t want to turn our folk into theorists!) We did approach some people just around campus. But, as I mentioned, the vast majority of participants were solicited in a classroom setting as described above.

    As you mention, you’ve been warned that this might not entirely solve the problems of “respondents not having a good feel for why they’re being asked” such questions (pp. 13-14). But it at least lets us know that some of the worries don’t apply here. There are certainly many things that could cause problems. But looking at averages alleviates many worries of this type. There will no doubt be participants that lie, don’t pay attention, and so on. But we’re only looking for trends based on the widespread assumption in the social sciences that such students won’t on the whole be so devious or lazy.

    2. Modes, Frequencies, Histograms

    I can only report more data from MSH&Z, but we can get an idea of whether many people were providing neutral responses in a rather direct way. We can look at the most frequent response (mode) and the frequencies of each response. Again, these are only for our first experiment.

    The most frequent response in our Between-Subjects Experiments was 6.0 (moderately agree) for each condition except Low Stakes + Alternative (LS-A), which was 7.0 (strongly agree). In each condition, the percentage of participants who gave the “Neither Agree nor Disagree” response (4.0) was only between 5 and 12 percent (8.3%, 9.8%, 5.0%, & 11.7%). So in each condition, very few people gave the “WTF?! Neutral Response” you describe. Furthermore, the majority either moderately or strongly agreed with the knowledge attribution. This suggests that the vast majority of subjects were not unsure about what to say about their given vignette.
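    For readers who want to run the same kind of check on their own survey data, here is a minimal sketch of the tally described above: the mode, the share of neutral (4.0) answers, and the frequency of each point on a 7-point scale. The function name `summarize_likert` and the response list are hypothetical placeholders made up for illustration; they are not the MSH&Z data.

```python
from collections import Counter


def summarize_likert(responses, neutral=4, scale=range(1, 8)):
    """Return (mode, share of neutral answers, frequency of each scale point)."""
    counts = Counter(responses)
    # most_common(1) gives the single most frequent answer and its count.
    mode = counts.most_common(1)[0][0]
    # Fraction of respondents who picked the neutral midpoint.
    neutral_share = counts[neutral] / len(responses)
    # Include scale points nobody chose, so the histogram has no gaps.
    freqs = {point: counts.get(point, 0) for point in scale}
    return mode, neutral_share, freqs


# Hypothetical responses for one condition (illustrative only, not real data).
hypothetical_responses = [6, 6, 7, 4, 6, 2, 7, 6, 5, 6]

mode, neutral_share, freqs = summarize_likert(hypothetical_responses)
print("mode:", mode)
print("neutral share:", neutral_share)
print("frequencies:", freqs)
```

    A low neutral share alongside a mode well above the midpoint is the pattern that would count against the worry that respondents simply defaulted to the middle of the scale.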

    The full details of the frequencies can be found here:

    Data Supplement (PDF)

    I can’t reproduce here the graphs of the frequencies of each response, so I’m making them available at this link as well. In short, the histograms again show that there is a strong trend toward agreement with the knowledge attribution. Neutral responses are about as frequent as disagreement responses.

    I think these data help assuage worries about participants being largely unsure about what’s being asked and how to respond. Of course, it’s possible that they still were unsure about the question and how to respond. But attaching plausibility to such a possibility seems rather speculative. I submit that the evidence positively favors the view that our participants were not confused, puzzled, or otherwise in a state that would render their responses untrustworthy as expressing their intuitive judgments about the bank cases.

    Those are just some thoughts on part of your discussion of x-phi and survey problems. I look forward to engaging more with your extremely rich paper!

  12. I can report (and I believe F&Z can as well) a similar distribution. It seems doubtful then that these data support the worry raised in the paper of interpreting null results on this particular occasion. Though I think the best thing to do here is treat this not just as a philosophical objection, but rather as a further testable hypothesis relative to the F&Z bank cases. However, just mentioning some possible way in which the study failed to detect intuitional variance due to the factors hypothesized by the view is pretty unconvincing without testing it. To me it doesn’t seem like the burden of proof ever belonged to F&Z, and especially so here.

    A general point here about the method section: Everyone agrees that it is important to scrutinize experimental practices in the social sciences. There’s no such thing as a perfect experiment; there will always be limitations. However, here’s the crucial bit: there are times when these research limitations impede progress in a particular research area or question, and there are times when such research limitations, while present, do not threaten the outcome of an experiment. The goal in discussing research methodology here (and in the general discussion we see of xphi), I think, should be to show through testing how one specific limitation directly influences the results in a certain way, so that we can make genuine scientific progress (in this case by moving closer to discovering what/how factors play meaningful roles in ordinary knowledge practices).

    When someone runs an experiment and finds results I don’t like, the first thing I’d probably do is try to pick apart the experimental design in an effort to invalidate the conclusion. But there is a distinction to be made between identifying flaws in the design that confound the conclusion of a particular outcome in a particular way, and criticizing an experiment because my theory is inconsistent with the implications of that conclusion.

  13. re 11 & 12: The concern about neutral results (see the bottom paragraph on p. 14) was raised only about the one F&Z survey.

    re the last two paragraphs of 12: This paper doesn’t undertake to pick apart any of the studies on the basis of issues of general survey methodology (the issues mentioned in section 1.6: I’m assuming this is the section you are referring to as the “method section”). As I write, those are issues better pursued by others (p. 12). The concerns I was raising myself about the pre-S&K experimental work are those articulated in sections 1.2-1.5. (And wrt N&P and MSH&Z, these aren’t even concerns with the studies themselves, but with people taking the results as going against contextualism.) I did think those fairly well-known methodological concerns did need to be mentioned so as to discuss how the concerns I was raising fit in with those other, better-known worries. Thus, section 1.6.
