Knowing the Semantic Web

There is a story in today’s New York Times about Web 3.0, otherwise known as the W3C’s Semantic Web.

The idea behind the Semantic Web is to link content, or data, rather than to link syntactic strings. An example illustrates the difference. Typing ‘Spaniard logicians’ into Google turns up pages on centuries-old dead guys. Why? There are far more historical pages on the web using the syntactic string ‘Spaniard logicians’ than there are (if any at all) that list current logicians from Spain. So, if you are looking for a Spanish logician, or wondering how many logicians are from Spain, typing either ‘Spanish logician’ or ‘How many logicians are from Spain?’ will not return pages that answer your question. The reason is that search engines don’t understand the content of your question. The ambitious aim of the W3C is to tackle this problem by developing tools to “understand” the content of web pages and the meaning of search queries.
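
To make the contrast concrete, here is a minimal sketch in Python using the rdflib library; the facts, the names, and the http://example.org/ namespace are all invented for illustration, not taken from any real dataset. A string search misses a page that never contains the exact phrase, while a query over structured data can still recover the answer from its content.

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical namespace and facts, invented for illustration.
EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.alice, RDF.type, EX.Logician))
g.add((EX.alice, EX.nationality, Literal("Spanish")))

# Syntactic search: look for the literal phrase in page text.
page_text = "Alice is a logician. She works in Madrid."
print("Spanish logician" in page_text)  # False: the string never occurs

# Semantic search: query the content rather than the string.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?person WHERE {
        ?person a ex:Logician ;
                ex:nationality "Spanish" .
    }
""")
for row in results:
    print(row.person)  # http://example.org/alice
```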

Google is extraordinarily clever at exploiting the structural features of natural language syntax. It can recognize ‘Wh-’ questions, for instance, and deliver answers to questions like ‘What is the GDP of Spain?’ But there are limits to this method, as evidenced by searching on ‘What is the number of Spanish logicians?’ That is why there is interest in working on W3C Semantic Web technologies, and it is what the Times article is about.

What’s this got to do with epistemology? A lot. The thing that’s being worked on here is content: how to find it, how to recognize it when you do find it, how to combine it, what relationships it bears to other things you’ve found, how to recognize those relationships, how learning one thing changes other things you have discovered. I see no fundamental difference between answers to these questions and the answers to their epistemic corollaries.

I doubt that the W3C project will be a complete success, and I suspect my skepticism is shared by most readers of this blog. However, there is little doubt that there will be many partial successes. And I am certain that much will be learned about epistemic model building from the successes and failures to come.


Comments


  1. Reaction #1: I agree that the successes of the W3C project will be less than “complete” (although it is interesting to note that even with one’s fellow human language users there will not be “complete” success either). This brings up an interesting question: what standard or metric would one use to judge progress in this research? Wouldn’t this depend upon what you are interested in? Google may be playing a different “language-game” here, and bringing in an over-arching word like “understanding” to link what Google wants to do with what philosophers are interested in may blur two issues. Suppose that Google not only “passed the Turing test” with respect to questions like the one about Spanish logicians, but was statistically more likely to generate “correct” (speaking loosely) responses than a large sample of fully linguistic adults. This would be complete success for the purposes Google has for this technology. But I can still hear Searle or somebody like that saying that “it is only behaving as if it understands…etc.”.

    Reaction #2: Assuming that the “blurring” charge from reaction #1 is not true… Donald Davidson was tireless in emphasizing that his proposal for using the Tarskian truth-theory machinery to explicate the type of recursive pattern that must exist (in our brains, let’s say) in order to explain some key characteristics of our linguistic abilities was simply ONE POSSIBLE way that we might ACTUALLY perform such recursive functions. In other words, as he often put it, “if I can show that here is one possible way that it can be done then this might shed some light on how we actually do it” (I think that is a direct quote from the Davidson/McDowell “In Conversation” video). This strikes me as similar to the relation between Google’s W3C work and the philosopher. If Google can map out some possible way in which it could work, it may very well shed some light on how it is that we actually do this type of thing. And if we take a Davidsonian or Dennettian view about how content is derived, then it doesn’t really matter that much anyway if Google’s algorithms aren’t “locatable” somewhere within our “hardware”. Ok, I’ll stop because now I’m just thinking out loud. I hope this helps.

  2. Hi Joe,

    Thanks for your comments. The short of my reply is that I think there is a closer relationship between epistemology and AI than there is between the philosophy of mind and AI. So, we might be focusing on two different sets of questions.

    My point about the semantic web is that there is a cluster of problems standing between where we are now and what the W3C wants to achieve that is structurally similar to a cluster of problems in epistemology. And there is good reason to think that the methods and results generated in attacking these problems will have bearing on their epistemological cousins. I think they will have a direct bearing on our model building.

    So, to answer your first question, the progress will come in having a wider variety of methods to model and articulate problems that exercise epistemologists. I don’t know how to measure that.

    My answer to your second question is similar to Davidson’s, but without making any commitments about the workings of human brains. I think that we do not clearly understand the problems that still stand between us and answer(s) to how we actually reason/plan/change view/change preferences/understand natural language/use formal languages…

    The results of AI do not necessarily amount to empirical conjectures about the actual working of human minds, despite (often) enthusiastic claims you’ll find in abstracts and motivating examples. At bottom they are modeling proposals, which can be evaluated both empirically and formally. And now the field is poised to turn its sights on a class of problems that bear remarkable similarity to core concerns of traditional epistemology. This is where we stand to learn a great deal.

  3. Nice post. I think that it is very clear that there is a tight connection between epistemology and traditional AI. The Semantic Web (I’m refraining from the fairly new Web 3.0 buzzword) can be seen as an application of traditional AI to the Web. The interesting question for me is whether the Semantic Web raises any new philosophical issues — whether in epistemology or not — that did not come up in traditional AI. In my view, the answer is No, with some exceptions.

    A lot of the work by the researchers described in the article you link to differs from the mainstream approach in the Semantic Web community. A primary goal of the research described in the article is to “data mine” the Web and infer the structure of web pages. This is certainly relevant to the Semantic Web, but I think the much more dominant approach to achieving the same results (a more intelligent web, an understanding of content, etc.) in the SW community is logic-based knowledge representation. The Web Ontology Language (OWL) has become a W3C standard, and I think the common view is that people will create ontologies in this language to model their domains of interest, that users will mark up data using these ontologies, and that tools will support certain inference services over this marked-up data. OWL is a KR language, and two of its three “flavors” happen to be decidable subsets of first-order logic.
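
    As a concrete illustration of this “mark up data with an ontology” picture, here is a minimal sketch in Python with rdflib; the ex: namespace and all of the class names are invented for illustration. The subclass axiom is the kind of statement an OWL reasoner (not shown here) would use to infer that ex:alice is also an ex:Person.

    ```python
    from rdflib import Graph, Namespace, RDF, RDFS
    from rdflib.namespace import OWL

    # Hypothetical ontology namespace, invented for illustration.
    EX = Namespace("http://example.org/ontology#")

    g = Graph()
    g.bind("ex", EX)

    # Ontology: declare classes and a subclass axiom.
    g.add((EX.Person, RDF.type, OWL.Class))
    g.add((EX.Logician, RDF.type, OWL.Class))
    g.add((EX.Logician, RDFS.subClassOf, EX.Person))

    # Marked-up data: an individual typed with an ontology class.
    g.add((EX.alice, RDF.type, EX.Logician))

    # Serialize the whole thing as Turtle.
    print(g.serialize(format="turtle"))
    ```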

    It is hard to find agreement in the community on how exactly the above goals will be met. This is not surprising, since many of these are regurgitated disagreements from previous AI debates. What is the right knowledge representation? Do we want to represent commonsense knowledge, and is representing our knowledge the right path to intelligent systems at all? If it is, do we want to use (first-order?) logic as the main tool for doing so? Etc.

    So far these are unoriginal questions. What makes the Web different in a way that may raise new problems? I can think of two things: (i) sheer size, and (ii) the ‘free-for-all’ aspects of it. (i) is obvious. By (ii) I mean that right now nothing prevents anyone from linking to any document and commenting on it, whereas traditional KR systems seemed to be much more controlled. Currently my comments, posted on my web page, about one of your documents will merely be text in some natural language. But what if, instead of linking to your document, I borrowed a concept from your ontology? For example, from a project like CYC, which attempts to build a huge “commonsense” ontology, I may import the concept “Cat”. Naturally, Cat is defined in terms of other concepts, like Animal and Mammal. Does my importing of this concept imply acceptance of CYC’s view of these other concepts? (And what if these concepts are inconsistent with other things I have already defined?) If CYC changes the definitions of concepts — just as you can change the text on your web page — am I still committed to them? The intuition is that the answer to both should be No, but it is hard to work out the technical details to make this happen.
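
    To see why the details are awkward, here is a sketch of the two ways one might “borrow” a concept, again in Python with rdflib; the CYC-style URIs are invented stand-ins, not Cyc’s actual identifiers. Merely mentioning a foreign URI carries none of its home ontology’s axioms with it, while owl:imports commits you to the imported ontology wholesale, including whatever it later becomes; neither behavior matches the intuitive “No” answers above.

    ```python
    from rdflib import Graph, Namespace, RDF
    from rdflib.namespace import OWL

    # Invented stand-ins for CYC-style identifiers, illustration only.
    CYC = Namespace("http://example.org/cyc#")
    MY = Namespace("http://example.org/mine#")

    g = Graph()

    # Option 1: merely mention the foreign concept. None of CYC's axioms
    # about Cat (Mammal, Animal, ...) come along; 'Cat' means only what
    # my own triples say about it.
    g.add((MY.felix, RDF.type, CYC.Cat))

    # Option 2: owl:imports pulls in the imported ontology wholesale:
    # every axiom, including later revisions and any that contradict
    # my own definitions.
    g.add((MY.ontology, RDF.type, OWL.Ontology))
    g.add((MY.ontology, OWL.imports, CYC.ontology))
    ```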

    Many of these issues have been labelled “Social Meaning” problems; see http://www.xml.com/pub/a/2003/03/05/social.html for a nice discussion.

    There are other issues that came up in traditional logic-based KR but that get a new “twist” in a Web context. One is the old issue of open- versus closed-world assumptions. In a database, the closed world makes sense, and we can usually employ negation-as-failure. Does it make sense to think of the Web as a database? Again, the common answer is that it doesn’t, since OWL has an open-world semantics, but this is actively contested. Some have proposed a “local” (or “scoped”) negation-as-failure, relativized to documents, but I have not seen the technical details worked out on this either. In fact, anything having to do with scoping and negation in this context is really hard to get right.
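
    A toy contrast between the two assumptions, in plain Python with invented facts: under the closed-world assumption, failure to derive a fact licenses its negation (negation-as-failure), while under OWL’s open-world semantics the same failure only yields “unknown”. The document-scoped variant mentioned above would relativize the check to a single document’s facts.

    ```python
    # Invented toy knowledge base: everything here is for illustration.
    facts = {("alice", "nationality", "Spanish")}

    def derivable(triple):
        """Stand-in for a real inference procedure: a bare lookup."""
        return triple in facts

    query = ("bob", "nationality", "Spanish")

    # Closed world (database-style negation-as-failure):
    # not derivable => false.
    closed_world = derivable(query)                    # False

    # Open world (OWL-style): not derivable => unknown, not false.
    open_world = True if derivable(query) else None    # None = "unknown"

    # "Scoped" negation-as-failure: close the world per document only.
    docs = {"doc1": {("alice", "nationality", "Spanish")}}

    def naf_in(doc, triple):
        return triple not in docs[doc]

    print(naf_in("doc1", query))  # True: *within doc1*, bob is not Spanish
    ```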

  4. Your comment is very rich, Yarden. I’ll try to be brief.

    I think the contribution that several (but not all) branches of AI can make to epistemology is to offer very clean and clear modeling tools. There is a comment Robert Stalnaker has made about modal logic; I think it is in one of the Blackwell handbooks, perhaps the one on philosophical logic. The comment is that modal logic allows us to articulate problems involving modalities much more clearly than we could otherwise manage, and that having this very clean and clear tool for representing modalities (relational structures) opens new philosophical ground. I think this is exactly right.
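
    For readers who want the tool named: the “relational structures” here are Kripke models, and the clean representation comes down to truth clauses like the standard one for necessity (my formulation, not a quotation from Stalnaker):

    ```latex
    % Standard truth clause for \Box in a relational structure M = (W, R, V):
    M, w \models \Box\varphi
      \quad\text{iff}\quad
    M, v \models \varphi \ \text{for every } v \in W \text{ such that } w R v .
    ```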

    Does AI truly offer improved methods? I think it does, but it has a marketing problem: it still has a snake-oil salesman’s reputation in philosophy. This is mostly due to sociological noise: “progressives versus the hand-wavy old guard,” or “technological know-nothings tossing computers at problems that have withstood resolution by giants standing on the shoulders of the giants of Western Civilization.” (Sigh.)

    But, if you were to look at the similarity of structure of many problems in both fields, the clarity of constraints demanded of solutions in many AI problems, and the results that are coming out of AI, I think there are signs of progress on methods that are relevant to epistemology.

    Let me try a very general example. It concerns “systematic approaches” to problems. AI researchers are much less willing to collapse a diachronic system into a static model, and will (often) insist on dynamic models from the start, studying how the model’s attributes behave through time and interact with one another. Right? And we do this because enough experience has taught us that building up a model and then tossing time in later rarely goes well.

    Now turn back to philosophy. You’ve always had “systems” philosophers making more or less similar points about analogous epistemological issues, but they have tended to take a back seat in analytic philosophy over the last 50 years or so. The reason, I suspect, is that the field’s standards for clarity are keyed to a few basic tools (logic, probability, and basic modal logic), which are essentially methods for static models. One person’s modus ponens is another’s modus tollens, to be sure, but you’ve got the upper hand if your negated consequent is necessarily false in the field’s Style Guide. And I note some continued resistance in the field to natural extensions of the latter two that would allow them to behave more dynamically.

    AI, despite its enthusiasm, or maybe because of it, doesn’t have this institutional habit of tying its hands behind its back regarding methods. It will try just about anything as a representation language, study that language qua language, and tell you what it is and is not good for.

  5. PS: Yarden, do you know what is going on with Cycorp lately? I’ve lost touch with news on them in the last few years. [Perhaps that is an email question.] Also, do you have a reference for a survey paper for scoped negation as failure?

  6. More detailed reply in email (sorry for the delay, I seem to have failed to pass the “CAPTCHA” for your address on your homepage the first time :)).

    A relevant reference is:

    Axel Polleres, Cristina Feier, and Andreas Harth. Rules with contextually scoped negation. In Proceedings of the 3rd European Semantic Web Conference (ESWC2006), volume 4011 of Lecture Notes in Computer Science, Budva, Montenegro, June 2006. Springer.

  7. Pingback: CrunchyLogic » Blog Archive » AI, Semantic Web and Epistemology
