Aristotle (384-322 BCE) was remarkable for the depth and scope of his knowledge, which included mastery of a wide range of topics from medicine and philosophy to physics and biology. Aristotle not only had command over a significant portion of the world's knowledge, but he was also able to explain this knowledge to others, most famously, though briefly, to Alexander the Great.
Today, the knowledge available to humankind is so extensive that it is not possible for a single person to assimilate it all. This is forcing us to become much more specialized, further narrowing our worldview and making interdisciplinary collaboration increasingly difficult. Thus, researchers in one narrow field may be completely unaware of relevant progress being made in neighboring disciplines. Even within a single discipline, researchers often find themselves drowning in new results. MEDLINE, (1) for example, is an archive of 4,600 medical journals in 30 languages, containing over 12 million publications, with 2,000 added daily.
Making the full range of scientific knowledge accessible and intelligible might involve anything from simply retrieving facts to answering a complex set of interdependent questions and providing appropriate justifications for those answers. Retrieval of simple facts might be achieved by information-extraction systems that search a large corpus of text and extract information from it, such as those described by Voorhees (2003). But aside from the simplicity of the types of questions such advanced retrieval systems are designed to answer, they are only capable of retrieving "answers"--and justifications for those answers--that already exist in the corpus. Knowledge-based question-answering systems, by contrast, though generally more computationally intensive, are capable of generating answers and appropriate justifications and explanations that are not found in texts. This capability may be the only way to bridge some interdisciplinary gaps where little or no documentation currently exists.
Project Halo is a multistaged effort aimed at creating Digital Aristotle (DA), an application encompassing much of the world's scientific knowledge and capable of answering novel questions through advanced problem solving. DA will act both as a tutor capable of instructing students in the sciences and as a research assistant with broad interdisciplinary skills, able to help scientists in their work. The final DA will differ from classical expert systems in four important ways.
First, in speed and ease of knowledge formulation. Classical expert systems required years to perfect and highly skilled knowledge engineers to craft them; Digital Aristotle will provide tools to facilitate rapid knowledge formulation by domain experts with little or no help from knowledge engineers.
Second, in coverage. Classical expert systems were narrowly focused on the single topic for which they were specifically designed; DA will over time encompass much of the world's scientific knowledge.
Third, in reasoning techniques. Classical expert systems mostly employed a single inference technology; DA will employ multiple technologies and problem solving methods.
Fourth, in explanations. Classical expert systems produced explanations derived directly from inference proof trees; DA will produce concise explanations, appropriate to the domain and the user's level of expertise.
Adoption by communities of subject matter experts of the Project Halo tools and methodologies is critical to the success of DA. These tools will empower scientists and educators to build the peer-reviewed, machine-processable knowledge that will form the foundation for Digital Aristotle.
The Halo Pilot
The pilot phase of Project Halo was a six-month effort to set the stage for a long-term research and development effort aimed at creating Digital Aristotle. The primary objective was to evaluate the state of the art in applied knowledge representation and reasoning (KR&R) systems. Understanding the performance characteristics of these technologies was considered to be especially critical to DA, as they are expected to form the basis of its reasoning capabilities. Further objectives were to identify and engage leaders in the field and to develop suitable evaluation methodologies; the project was also designed to help determine a research and development roadmap for KR&R systems. Finally, the project adopted principles of scientific transparency aimed at producing understandable, reproducible results.
Vulcan undertook a formal bidding process to identify teams to participate in the pilot. Criteria for selection included a well-established and mature technology and a world-class team with a track record of government and private funding. Three teams were contracted to participate in the evaluation: a team led by SRI International with substantial contributions from Boeing Phantom Works and the University of Texas at Austin; a team from Cycorp; and a team from Ontoprise.
Significant attention was given to selecting a proper domain for the evaluation. It was important, given the limited scope of this phase of the project, to adapt an existing, well-known evaluation methodology with easily understood and objective standards. First, a decision was made to focus on a "hard" science and, more specifically, on a textbook presentation of some part of that science. Several standardized test formats were also examined. In the end, a 70-page subset of introductory college-level advanced placement (AP) chemistry was selected because it was reasonably self-contained and did not require solutions to other hard AI problems, such as representing and reasoning with uncertainty, or understanding diagrams (Brown, LeMay, and Bursten 2003). This latter consideration, for example, argued against selecting physics as a domain.
Table 1 lists the topics in the chemistry syllabus. Topics included stoichiometry calculations with chemical formulas; aqueous reactions and solution stoichiometry; and chemical equilibrium. Background material was also identified to make the selected chapters more fully self-contained. (2)
This scope was large enough to support a large variety of novel, and hence unanticipated, question types. One analysis of the syllabus identified nearly 100 distinct chemistry laws, suggesting that it was rich enough to require complex inference. It was also small enough to be represented relatively quickly--which was essential because the three Halo teams were allocated only four months to create formal encodings of the chemistry syllabus. This amount of time was deemed sufficient to construct detailed solutions that leveraged the existing technologies, yet was too brief to allow significant revisions to the teams' platforms. Hence, by design, we were able to avoid undue customization to the task domain and thus to create a true evaluation of the state of the art of KR&R technologies.
Nevertheless, at the outset of the project it was completely unclear whether competent systems could be built. In fact, Vulcan's secret intent was to set such a high bar for success that the experiment would expose the weaknesses in KR&R technologies and determine whether these technologies could form the foundation of DA. The teams accepted the challenge with trepidation caused by several factors, including the mystery of working in a new domain with the novel performance task of answering hard, and highly varied, advanced placement questions; and generating coherent explanations in English--all within four months.
The Technology
The three teams had to address the same set of issues: knowledge formation, question answering, and explanation generation (Barker et al. 2004; Angele et al. 2003; Witbrock and Matthews 2003). They all built knowledge bases in a formal language and relied on knowledge engineers to encode the requisite knowledge. Furthermore, all the teams used automated deductive inference to answer questions. Despite these high-level similarities, the teams' approaches differed in some interesting ways, especially with respect to explanation generation.
Knowledge Formation
Each system achieved significant coverage of the parts of the domain represented by the syllabus and was able to use that coverage to answer substantial numbers of novel questions. All three systems used class taxonomies, such as the one illustrated in figure 1, to organize concepts such as acids, physical constants, and reactions; represented properties of classes using relations; and used rules to represent complex relationships.
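To make the three ingredients concrete, the following is a minimal sketch, not any team's actual representation. It assumes a toy taxonomy stored as parent links, relations stored as a dictionary of facts, and a single hand-written rule; every name in it (TAXONOMY, FACTS, is_a, dissociates_completely) is hypothetical and chosen only for illustration.

```python
# Hypothetical mini knowledge base; class names and facts are illustrative only.

# Class taxonomy: each class maps to its parent class.
TAXONOMY = {
    "Acid": "Substance",
    "Base": "Substance",
    "StrongAcid": "Acid",
    "WeakAcid": "Acid",
}

# Relations: (subject, relation) -> value.
FACTS = {
    ("HCl", "instance_of"): "StrongAcid",
    ("CH3COOH", "instance_of"): "WeakAcid",
}


def is_a(cls, ancestor):
    """Walk the parent links upward to test whether cls falls under ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = TAXONOMY.get(cls)
    return False


def dissociates_completely(substance):
    """Rule: any instance of StrongAcid dissociates completely in water."""
    cls = FACTS.get((substance, "instance_of"))
    return cls is not None and is_a(cls, "StrongAcid")


if __name__ == "__main__":
    for name in ("HCl", "CH3COOH"):
        print(name, dissociates_completely(name))
```

The conclusion about HCl is not stored as a fact; it is derived from the taxonomy and the rule. At a vastly larger scale, this is the sense in which a knowledge-based system can produce answers that appear nowhere in its input.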
Domain-Driven Versus Question-Driven Knowledge Formation