Archaeological Data Integration for the Study of Long-Term Human and Social Dynamics



A lack of millennial- or centennial-scale data seriously impairs scientific investigations of social and socioenvironmental systems. In developing and testing socioecological models, we must do more than project recent observations, which reflect at most a few decades, into the past or future. Archaeology can provide the long-term data on societies and environments that are needed to better illuminate such critical topics as demography, economy, and social stability. The complexities of archaeological data, lack of data comparability across projects, and limited access to primary data have crippled current efforts to understand phenomena operating on large spatiotemporal scales. Nonetheless, the potential for archaeological insights to contribute to the study of long-term human and social dynamics is enormous; the fundamental challenge is to enable scientifically meaningful integration and use of the expanding corpus of archaeological data.

Intellectual Merit. A two-year investigation of the information-integration demands of archaeology, supported by a National Science Foundation HSD exploratory grant, revealed fundamental technical challenges that cannot be handled by a straightforward adaptation of existing technologies. In response, a team of archaeologists and computer scientists proposes to implement a Knowledge-Based Archaeological Data Integration System (KADIS) that employs novel, query-driven ad hoc data integration strategies. Once archaeologists have registered datasets through KADIS, researchers across scientific disciplines could, over the Web, extract sensibly integrated and appropriately scaled databases of analytically comparable observations from numerous archaeological datasets gathered using incommensurate recording protocols. Although initial development of KADIS will focus on fauna from archaeological contexts, it establishes an open-source, extensible foundation for a global archaeological information infrastructure. The project will establish the capacity to build and access a worldwide archive of primary data representing the full history of human use of animals. Concept-oriented queries of this archive will advance socioecological modeling efforts and allow scientists to address large-scale and long-term social and natural science questions with empirical support that has heretofore been unthinkable. Testbed research will investigate the socioenvironmental conditions that lead to depressed abundance of preferred game over two millennia in two US regions.

Broader Scientific Impacts. The query-driven, ad hoc integration architecture will be applicable to many other science informatics domains in which complex inferences need to be made over multiple heterogeneous, inconsistent, and context-dependent sources. Using KADIS, specialists in other fields could use intermediate-level archaeological knowledge to obtain primary data scaled to the scope of their inquiries. By providing scholars in diverse fields with meaningful access to long-term data on society, population, and environment, archaeology can help explain the complex human and social dynamics that have constituted today's social world and shaped the modern environment.

Impacts on the infrastructure of social and natural science extend far beyond the traditional boundaries of academia. KADIS addresses critical needs of private, tribal, and governmental archaeology programs. In addition, it enables serious archaeological research by individuals outside academia and those lacking physical or financial capacity to do fieldwork. It provides a means to maintain the long-term utility and accessibility of irreplaceable primary data in the face of inadequate metadata and rapidly changing technology.

Broader Societal and Educational Impacts. This research will engage a multidisciplinary team of graduate assistants and undergraduate interns and will serve as a testbed for Computer Science students to explore key issues of science informatics. Undergraduates worldwide can become a new community of users as critical thinking exercises in anthropology courses are redesigned to employ large-scale research datasets accessed through KADIS, rather than the artificial data usually analyzed.


National Science Foundation


November 2006 - October 2010