More X-Phi on Fake Barn Intuitions

Experimental results in social psychology are plagued by failure to replicate and also by being based on WEIRD samples. In line with the latter concern, Wesley Buckwalter, Stephen Stich, Edouard Machery and Dave Colaco have a recent paper in Episteme, titled ‘Epistemic Intuitions in Fake-Barn Thought Experiments’.

They found that, while participants respond generally in line with the intuition that fake-barn cases are cases of knowledge, older individuals seem to be the exception; the older the participant is, the less likely they are to think that the subject in the fake-barn style case knows. Here is a visualization of the findings (the Y-axis represents a 0-6 Likert scale anchored at 0 with ‘Doesn’t Know’ and at 6 with ‘Knows’; the X-axis represents age in years):

[Figure: scatterplot of knowledge ratings (0-6 scale) against participant age in years]

Note that they did not find a similar trend in the control; it is not the case that older participants were simply less likely to attribute knowledge in general.

Next: attempts to replicate?


Comments

More X-Phi on Fake Barn Intuitions — 25 Comments

  1. Given that barn facade cases obviously aren’t cases of knowledge, the data here seems to support the general hypothesis that elders know best 🙂

  2. The basic finding that people attribute knowledge in fake-barn-style cases has been replicated many times — by different researchers, with different cover stories, and with different dependent measures. By this point, it’s clear that these are viewed as cases of knowledge.

    The age-finding is, as far as I’m aware, the first of its kind, so we’ll probably want to wait for more evidence before drawing firm conclusions on that. I follow the authors in treating this as a sign that further investigation is warranted. In the meantime, I applaud them for this valuable work and for raising our consciousness about the potential importance of controlling for age-related differences.

  3. This is a really nice study, very well-designed and with a super-surprising result. (It’s pretty exciting to see these age effects on epistemic intuitions.) I would be happy to run a quick replication on mturk to see if we can get this demographic effect to come out again.

    David et al., if you send me the full materials, I can run the replication and then post the results here on this blog later this week.

    (By the way, if any of you are interested in general in seeing which experimental philosophy studies have successfully replicated and which have not, you can find all the results up at http://pantheon.yale.edu/~jk762/xphipage/Experimental%20Philosophy-Replications.html)

  4. I think younger people are more likely to have had significant exposure to virtual reality in a broad sense. I think that, with the advent of social media, online relationships are considered more meaningful, more like the relationships that were once developed by meeting in person at a coffee shop. I think more people are receptive to the idea of artificial life (AL), for instance Conway’s cellular automaton (CA) “Game of Life”, which is lifelike enough that people speculate about whether CAs can support sentience; here the line between the tangible (real) and the ephemeral blurs and is hard to draw. I just read that the Turing Test, one lasting 5 minutes, has been passed. The movie “Terminator” was famous for Skynet achieving self-awareness and trying to destroy humanity. Two more movies in which AIs achieve transcendence were released in 2013. The movement (which I consider a cult) that warns about the Evil lurking in the heart of Super AIs has gained momentum over the years. I think more people are living in their heads, with concepts that have less physical reality to ground them, and this is a cultural swell inundating common sense, which relies most on acknowledging limitations, including one’s own. In the old days, the remedy was humility, a virtue that took considerable experience and failure to acquire.

    So I think there is a strong cultural geek speak influence that younger people are growing up under which has more impact now than 30 years ago, which would be hard to measure.

    I think the fake barn counterexample is a natural consequence of a widespread class of concepts that defy complete definition: Truth, Beauty, Knowledge, etc. The older folk are likely to have read John Myhill, whereas maybe for the younger folk, he was before their time.

    Hofstadter in Metamagical Themas describes Myhill’s Beauty as an example outside of mathematical sets,
    “We finally come to the prospective, also known as the productive. Myhill’s characterization of it is this: “A prospective character is one which we cannot either recognize or create by a series of reasoned but in general unpredictable acts.” Thus it is neither effective nor constructive. It eludes production by any finite set of rules. However-and this is important-it can be approximated to a higher and higher degree of accuracy by a series of bigger and better sets of generative rules. Such rules tell you (or a machine) how to churn out members of this prospective category. In mathematical logic, works by Tarski and Gödel establish that truth has this open-ended, prospective character. This means that you can produce all sorts of examples of truths -unlimitedly many- but no set of rules is ever sufficient to characterize them all. The prospective character eludes capture in any finite net.
    As his prime example outside of mathematical logic of this quality, Myhill suggests beauty. As he puts it:”

    “Not only can we not guarantee to recognize it [beauty] when we encounter it, but also there exists no formula or attitude, such as that in which the romantics believed, which can be counted upon, even in a hypothetical infinitely protracted lifetime, to create all the beauty that there is.”

    SH: I don’t think a complete definition exists for Justified or True or Belief (JTB) so a definition using those terms (or any such similar collection of conceptually abstract related grouped terms) will always leave finite gaps for exceptions to the rule to escape.

  5. David Colaco just wrote to me with some very helpful points about this proposed replication. Participants in Mechanical Turk studies tend to be in the 18-40 age range. I had originally been thinking that this range would be sufficient to replicate the effect, but David rightly points out that the only way to do this adequately would be to include participants in their 60’s, 70’s and 80’s.

    An inspection of the scatterplot shows that there is almost no effect of age within the 18-40 range. Participants within that range pretty much all see this as a case of knowledge. The entire effect is basically driven by the participants who are considerably older. Their responses are all over the place, yielding a mean response that is right about at the midpoint of the scale.

    So, sadly, just as Colaco suggests, it seems that mturk will not be appropriate for this study, and it would be necessary to use some other method. (Personally, my guess is that this is a real effect and that it would replicate successfully.)

  6. Just eyeballing it, it looks to me like any effect starts at 50 rather than 60. Anyway, about 5% of my mturk participants tend to be ~60+, and it looks like David and colleagues had 7 participants (out of about 85) in the fake-barn condition who were 60+. So these seem to be roughly comparable, no?

  7. Thanks, everyone, for the comments so far. I would like to attempt to replicate the findings of our study. If it is the case that m-turk has an age range comparable to that of the original study, then I would be fine running a replication study using it. Repeating the original protocol is both laborious and costly; if this is a viable alternative (I have only marginal experience using m-turk), then it is worth replicating online.

  8. Hi, Josh. As I recall, in the studies I’ve done on epistemic judgments, I’ve only seen one age-effect and it was small. Given the number of comparisons involved in this work, I’d have expected to see a false positive more than once by now, and that has led me to suspect that neither age nor things closely correlated with age significantly affect epistemic judgments. (Caveats: [1] a lot of this work has not used scaled knowledge probes, whereas David and colleagues did use these; [2] this work was done online, whereas David and colleagues administered their study in person.) That’s part of the reason why I think that David and colleagues were wise to conservatively interpret their finding in this case. Another part of the reason is that demographic effects can be pretty volatile and it often takes a number of studies to figure out just what’s going on with them.

    I would like to emphasize, again, that I think very highly of this paper, that I am 100% convinced by the overall pattern of high knowledge attribution, and that the authors responsibly reported and interpreted the surprising demographic difference that they observed. It’s important that people not shy away from reporting such observations and that we, as a community, evaluate them with care, respect, and gratitude.

  9. Just chiming in with a few additional points:

    – I wanted to point out how impressive and gracious David’s response (and the discussion as a whole) has been. It stands in stark contrast to almost all of the discussion surrounding replication in psychology, if anyone’s been following that.

    – I also think that replicating this experiment online is an excellent idea. As a couple of people have pointed out, the number of participants older than 50 was small, so in replicating, it would be good to collect around 2.5 times the original sample size. This has become pretty standard when replicating experiments, and luckily mturk is very cheap.

    – Regardless of what happens with the replication of this one unpredicted effect, the central findings reported in David’s nice paper also seem worthy of further discussion!
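
To put rough numbers on the 2.5x rule of thumb, here is a small sketch. The inputs come from earlier comments in this thread and are approximate: about 85 participants in the original fake-barn condition, and roughly 5% of mturk participants aged 60 or over (one commenter's estimate). The function names are hypothetical and only illustrate the arithmetic.

```python
import math


def replication_sample_size(original_n, multiplier=2.5):
    """Target sample size, rounded up to a whole participant."""
    return math.ceil(original_n * multiplier)


def expected_older_participants(sample_size, share_60_plus):
    """Expected number of participants aged 60+ given their share of the pool."""
    return sample_size * share_60_plus


# Assumed inputs: ~85 original participants, ~5% of the pool aged 60+.
target_n = replication_sample_size(85)
older = expected_older_participants(target_n, 0.05)

print(f"target sample size: {target_n}")
print(f"expected participants aged 60+: {older:.1f}")
```

On these assumptions, a replication of about 213 participants would be expected to include only around ten or eleven people aged 60+, which is why the age range of the mturk pool matters for this particular effect.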

  10. Hi All,

    I realise that this thread died off a while ago, but I had some questions about these surveys because of some recent discussion on Facebook about them.

    It seems that some results indicate that the folk are likely to ascribe knowledge in some fake barn cases or cases said to be similar to them. There are also results, however, that indicate that the folk are not likely to ascribe knowledge in some fake barn cases. Nagel et al. found 59% of respondents saying that the subject didn’t know that the building they saw was a barn.

    Looking at Nagel’s paper and Colaco’s paper, some differences jumped out and I thought that these might deserve some discussion/consideration.

    Let’s look at Nagel’s prompt first. In “Lay Denial of Knowledge for Justified True Beliefs”, Nagel et al offered this case:

    (NC) Emma is shopping for jewelry. She goes into a nice-looking store, and selects a diamond necklace from a tray marked “Diamond Earrings and Pendants”. “What a lovely diamond!” she says as she tries it on. Emma could not tell the difference between a real diamond and a cubic zirconium fake just by looking or touching. In fact, this particular store has a very dishonest employee who has been stealing real diamonds and replacing them with fakes; in the tray Emma chose almost all of the pendants had cubic zirconium stones rather than diamonds (but the one she chose happened to be real).

    Her comprehension question asked what kind of stone Emma tried on. Respondents were asked whether or not Emma knew that the stone was a diamond.

    Compare this to the prompt and questions Colaco et al offer:
    (CC) Gerald is driving through the countryside with his young son Andrew. Along the way he sees numerous objects and points them out to his son. ‘That’s a cow, Andrew,’ Gerald says, ‘and that over there is a house where farmers live.’ Gerald has no doubt about what the objects are. What Gerald and Andrew do not realize is the area they are driving through was recently hit by a very serious tornado. This tornado did not harm any of the animals, but did destroy most buildings. In an effort to maintain the rural area’s tourist industry, local townspeople built house façades in the place of destroyed houses. These façades look exactly like real houses from the road, but are only for looks and cannot be used as actual housing.

    In the high-defeater scenario, the story ends as follows:
    Though he has only recently entered the tornado-ravaged area, Gerald has already encountered a large number of house façades. However, when he tells Andrew ‘That’s a house,’ the object he sees and points at is a real house that has survived the tornado.

    Participants were then asked the following three questions about one of the objects in the brackets:
    (1) Comprehension Question: Does Gerald think he saw a house?
    (2) Comprehension Question: Did Gerald see a house?
    (3) Knowledge Question: Does Gerald know he saw a house?

    There’s a difference in the questions that might matter here. Colaco et al asked a comprehension question about what Gerald saw and the knowledge ascription they focus on is knowledge of what he saw. Nagel’s comprehension question doesn’t focus on what Emma saw, but on what Emma wore. Her focus is on whether Emma knew the stone was a diamond.

    Lots of people seem to run together objectual seeing with propositional seeing (seeing a diamond vs seeing that the stone is a diamond) and lots of people seem to think that propositional seeing is either sufficient for knowing or sufficient for being in a position to know. (I did a survey last night on one non-philosopher just to confirm that the folk often get confused when trying to work out the difference between objectual and propositional seeing.) I wouldn’t be surprised if respondents are often disposed to confuse the two kinds of seeing and take propositional seeing to be nearly sufficient for knowledge. If that’s right, though, isn’t there a worry here about using objectual seeing questions as a comprehension question in probing folk intuitions?

    I get the sense that the data from Colaco et al is being used to challenge a philosophical view that seems to be confirmed by data that we get from Nagel et al, but if the questions differ in significant ways, I’m not sure that this is how we should be using the data until we have a clearer picture of the influence that judgments of objectual seeing have on respondents’ disposition to ascribe knowledge.

    If we step back from this for a second and just think about Nagel’s case for a moment, it is weird that there would be people who’d say that Emma cannot distinguish visually a diamond from the non-diamonds she sees and then insist that she nevertheless knows that the stone is a diamond, no? If you put it that way to the folk and we tell them that that’s what they think, I’d be surprised if they said this captured the way they thought about the cases.

    • Hi Clayton,

      These are good questions. I have a couple thoughts in response, along with some new results.

      First, it would help if you said more about the Nagel et al. results. In particular, was there a minimally-matched control and, if so, what was the rate of knowledge attribution in it?

      Second, you suggest that it could be problematic to ask a comprehension question about objectual seeing because people are apt to confuse objectual and propositional seeing. However, it’s not clear that there is any confusion at all on the point in this case. Instead, it seems that the person sees that it’s a barn. (If you asked me why, I’d say it’s because he sees it and this causes him to believe that it’s there, in the typical way perception causes beliefs.)

      Third, I just ran a quick follow-up (N = 43) on the high-defeater fake-barn case without the comprehension questions. The scale ran 0 (“doesn’t know”) – 6 (“knows”). Mean response to the knowledge question (M = 4.14, SD = 1.70) was significantly above the midpoint (=3), p < .001 (two-tailed), and it did not differ significantly from what Colaco et al. originally reported (=4.51, p = .160, two-tailed). Mode response was 6 (i.e., the highest score). Over 67% of people selected options higher than the midpoint (i.e., 4, 5, or 6).

      So the basic finding can’t be due to asking the comprehension question. And even if it were, it’s unclear that it would be problematic.
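
The two comparisons reported in this reply can be rechecked from the summary statistics alone. The sketch below is not the commenter's analysis; it assumes both comparisons were one-sample t-tests with N - 1 degrees of freedom, and it treats the originally reported mean (4.51) as a fixed reference value. The helper names are hypothetical. It uses only the Python standard library and integrates the t density numerically to obtain a two-tailed p-value.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 2 * (0.5 - central_half))


def one_sample_t(mean, sd, n, reference):
    """t statistic, degrees of freedom and two-tailed p from summary statistics."""
    t = (mean - reference) / (sd / math.sqrt(n))
    df = n - 1
    return t, df, two_tailed_p(t, df)


# Values reported in the reply above: M = 4.14, SD = 1.70, N = 43.
t_mid, df_mid, p_mid = one_sample_t(4.14, 1.70, 43, reference=3.0)
t_orig, df_orig, p_orig = one_sample_t(4.14, 1.70, 43, reference=4.51)

print(f"vs midpoint 3:    t({df_mid}) = {t_mid:.2f}, p = {p_mid:.5f}")
print(f"vs original 4.51: t({df_orig}) = {t_orig:.2f}, p = {p_orig:.3f}")
```

Under these assumptions the mean sits well above the midpoint (p below .001), and the comparison with 4.51 gives a p-value close to the reported .160. Treating 4.51 as fixed ignores sampling error in the original study's mean, so this is a consistency check on the reported numbers, not a full two-sample analysis.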

  11. Hi John,

    Thanks for your reply. For the details of Nagel’s control, you’d have to ask her or check her work.

    A quick thing about this:
    “Second, you suggest that it could be problematic to ask a comprehension question about objectual seeing because people are apt to confuse objectual and propositional seeing. However, it’s not clear that there is any confusion at all on the point in this case. Instead, it seems that the person sees that it’s a barn. (If you asked me why, I’d say it’s because he sees it and this causes him to believe that it’s there, in the typical way perception causes beliefs.)”

    Isn’t it controversial that the person sees that it’s a barn? I know that you think that. Colin McGinn says something similar in “The Concept of Knowledge”. I guess that’s also Sosa’s view. Others deny it. McDowell and Pritchard, for example, deny it. I think Craig French does, too. Millar’s another possible denier. One argument for denial might just be this: you cannot see that a is an F if you cannot tell from a’s look whether a is an F. When a is surrounded by non-Fs that share a’s look, you cannot tell from a’s look whether a is an F. That’s why I lean towards denying that subjects in fake barn cases can see that the building is a barn, although they clearly can see a barn.

    The data you offer seems to suggest that the comprehension question isn’t likely to skew results greatly (although it would also be interesting to look at rates of response for the different target propositions since your questions were about knowing something about what’s seen and Nagel’s weren’t), but you can see now why I’d worry about the inclusion of a question about objectual seeing. I think the question as to whether the fake barn case is a case of propositional seeing is as controversial as the question as to whether it’s a case of propositional knowledge.

    • Hi Clayton,

      Thanks for your further thoughts!

      Starting with your last point, I’m unwilling to infer much here about potential differences without putting them into an experimental design alongside minimally-matched controls. What you’re interpreting as a difference between questions could be due to lots of other differences across the stories, procedures and analyses.

      Coming back to your first point, I recognize that some philosophers have denied that the person sees that it’s a barn, but I don’t think it actually is controversial.

      To test this, I ran the case again with all new participants. This time, I had people agree on a 0 (completely disagree) – 6 (completely agree) scale with two statements:

      (See) Gerald sees that the object he’s pointing to is a house.
      (Know) Gerald knows that the object he’s pointing to is a house.

      Participants always answered them in that order and on different screens. I asked a comprehension question on a separate, subsequent screen (“Did Gerald actually see a barn?”).

      After filtering for comprehension failures, N = 40. Here is the mean response to each statement, along with standard deviations and the p-value for a comparison against the neutral midpoint (=3):

      See: M = 5.28, SD = 0.82, p < .001
      Know: M = 4.58, SD = 1.32, p < .001

      No one "completely disagreed" with either statement. In fact, no one even "slightly disagreed" with See: the lowest response to See was "neutral." The mode response to each statement was "completely agree" (=6).

      Comparing scores for See and Know, it turns out that people more strongly agree with See than with Know, p < .001.

      From these results, I conclude two things. (1) Both the seeing-that attribution and the knowledge attribution are uncontroversial. (2) The seeing-that-attribution is even more uncontroversial than the knowledge attribution.

  12. Hi John,

    “From these results, I conclude two things. (1) Both the seeing-that attribution and the knowledge attribution are uncontroversial. (2) The seeing-that-attribution is even more uncontroversial than the knowledge attribution.”

    Uncontroversial if you set aside the philosophers!

    I find these results really interesting, but I don’t know what to make of them. Since I think fake barn cases are clear non-knowledge cases, I’m really curious to know more about what’s going on in the head of the folk who respond differently. I can’t tell from your comment, but do you think we should listen to the folk? I’m happy to take account of what they say, but I’m not inclined to say that the answer to a question is uncontroversial simply because there’s folk agreement.

    Here’s a case that worries me:

    Lucky Penny
    Jill has a lucky penny that she’s dubbed ‘Lucky Penny’. She hasn’t seen other pennies before. Her brother stole her lucky penny and took it with him to school. He dropped it. Someone picked it up but later dropped it. It worked its way across the city. A week later Jill was on a school trip when she looked down and saw a penny that happened to be her lucky penny. “It’s my lucky penny!” she said. The penny caused her belief that it’s her lucky penny in the normal way.

    I don’t think Jill knows that this is her lucky penny. I don’t think Jill saw that this was her lucky penny. On the gloss of propositional seeing above, though, Jill meets the conditions for seeing that this is her lucky penny. If the folk ascribe knowledge in this kind of case and say that Jill can see that this is her lucky penny, I’d be very worried about their competence in the authentic evidence cases.

  13. Hey Clayton,

    I’m 100% with you in wanting to understand what’s going on to cause this intuitive disagreement! For my part, I suspect it comes down to whether people think you can see the barn (or otherwise detect it). As it turns out, most people straightaway agree that you can and do detect it. A small minority of people think otherwise. The disagreement about knowledge reduces to a prior, implicit disagreement about perceptual ability. (This is not surprising because knowledge just is true belief through ability.)

    As to whether we should listen to the folk, I think it’s important to keep in mind how epistemologists got fixated on fake barn cases.

    When Goldman popularized the case, he motivated it by making two behavioral claims (“Discrimination and Perceptual Knowledge,” pp. 772-3). On the one hand, when Henry looks at the barn in an ordinary setting, “Most of us would have little hesitation in saying” that “Henry *knows* that the object is a barn.” On the other hand, when information is added about all the nearby fakes, “We would be strongly inclined” to not attribute knowledge. Then he asks, “How is this change in our assessment to be explained?”

    But the second behavioral claim is demonstrably false: people are actually strongly inclined to attribute knowledge. The only way to demonstrate this, though, is to actually listen to the folk. So, in this respect, we not only should but must listen in order to know what’s what.

    It’s possible that there could be compelling arguments for counting the high rates of knowledge attribution in these cases as performance error. But I’d say we’re well past the point of any quick or easy way to that conclusion.

    For what it’s worth, the way Jill’s case is described, I think that she obviously sees and knows that it’s her lucky penny. I’d be surprised if most people thought otherwise.

  14. I ran the “Lucky Penny” case, with two small changes (omitting “that she’s dubbed ‘Lucky Penny’” and “The penny caused her belief that it’s her lucky penny in the normal way”).

    Jill has a lucky penny. She hasn’t seen other pennies before. Her brother stole her lucky penny and took it with him to school. He dropped it. Someone picked it up but later dropped it. It worked its way across the city. A week later Jill was on a school trip when she looked down and saw a penny that happened to be her lucky penny. “It’s my lucky penny!” she said.

    People rated two statements and answered one comprehension question:

    (See) Jill sees that it’s her lucky penny. (0-6)
    (Know) Jill knows that it’s her lucky penny. (0-6)
    (Comp) Did Jill see her lucky penny? (Y/N)

    I used the same 0 (strongly disagree) – 6 (strongly agree) scale as above, one question per page, in that same order, no going back. After eliminating comprehension failures, N = 40. Here is the mean response to each statement, along with standard deviations and the p-value for a comparison against the neutral midpoint (=3):

    See: M = 4.80, SD = 1.36, p < .001
    Know: M = 4.18, SD = 1.63, p < .001

    Mode response was “agree” (=5) for See, and it was “slightly agree” (=4) and “strongly agree” (=6) for Know. 85% of participants agreed to some extent with See, as did 73% with Know.

    Comparing scores for See and Know, it turns out that people more strongly agree with See than with Know, p < .007.

    In sum, this was an excellent conceptual replication of the findings reported above and of previous findings on fake barn cases.

  15. Hi John,

    That is a very interesting result! I’m thinking of writing something up on the use and misuse of barns. Is it alright if I mention this data?

    I was a bit tired yesterday because I was working on a number of different things at once and had made the terrible mistake of taking a very powerful night time cold medicine that’s probably been banned in most Western nations, so there was something that I should have caught earlier but didn’t. My initial worry was that exposing subjects to a comprehension question that uses ‘sees’ might muddy their intuitions. If I’m following, all the data from above (from Colaco’s paper? Not sure) involved subjects who were exposed to a see-question prior to a knows-question. Is that right? If so, it doesn’t quite address the question as to whether a subject’s disposition to ascribe knowledge will be influenced by the salience of a sees-judgment. So, the hypothesis that the inclusion of a sees-comprehension question will partially account for the differences between these results and Nagel’s hasn’t been disconfirmed (not if I’m tracking, which I might not be).

    Anyway, the larger project is this. I recently wrote a paper explaining why the virtue-theorists shouldn’t think of fake barn accuracy as attributable to ability (http://philpapers.org/rec/LITFBA). (I thought I’d be following your lead here and defending a view that you liked, but I guess I failed! I hadn’t seen the empirical stuff when I wrote the paper up.) I did this to defend some conditionals that I think we do both like, conditionals that link knowledge to things like warranted assertability and the like. I’m worried about the use of fake barn cases as counterexamples to various k-accounts (e.g., k-accounts of warranted assertion, evidence, F-ing for reasons, etc.), so the aim is to explain why we shouldn’t think of fake barn cases as cases of warranted assertion without k, evidence without k, F-ing for the reason that p without k, etc. So, one way to go is to argue that the authentic evidence cases are actually cases of k, in which case the purported counterexamples fail. The other, which I’m exploring, is to try to offer an error-theory that explains why philosophers are mistaken in treating these as cases of warranted assertion, evidence possession, F-ing for the reason that p, etc. Looks like we’ll have to disagree about something, but I’m glad that we’ll always have our conditionals!

  16. Hi Clayton,

    Sure, it’s fine to mention this data (though keep in mind that these were pointed follow-ups designed to answer very specific questions about the original findings, and I did not include control conditions).

    Not all of the data above were from people initially exposed to a prior see-question. The follow-up reported on Sept 4, 10:57am included no comprehension questions. Mean response was above midpoint and not significantly different from Colaco et al.’s original result. The basic finding is not driven by the see-question.

    I do like the conditionals approach!
