History

Overview

tDAR has undergone significant changes over the past few years, starting out as an NSF project and growing into an international archive for archaeological data.
 

Data Integration & tDAR: the Early Years

tDAR originated in an attempt to solve a major research challenge in archaeology: how to synthesize systematically collected data recorded using different coding conventions across multiple data sets and sites. A team of archaeologists and computer scientists at Arizona State University (ASU), led by archaeologist Keith Kintigh and computer scientist K. Selçuk Candan, initiated this work. Based on a proposal from ASU faculty, the National Science Foundation (NSF) funded a 2004 workshop held in Santa Barbara, California. The 31 participants, drawn from archaeology and computer science and many of them nominated by professional organizations, developed recommendations concerning archaeology’s need for information infrastructure, which were published in American Antiquity in 2006. The recommendations of that report were endorsed by the Society for American Archaeology, the Society for Historical Archaeology, and the American Association of Physical Anthropologists.

Based on these recommendations, in 2006 NSF funded an initiative to develop a prototype digital information infrastructure: tDAR, the Digital Archaeological Record. The goal of this research was to develop tools for synthetic and comparative research based on novel, on-the-fly, ontology-based data integration, to be deployed and tested in the context of the prototype infrastructure. The grant used archaeological fauna (animal bones recovered in archaeological contexts) as the material focus for the data integration development. The grant was unique in making practical applications of the technology a core requirement from day one and in its interdisciplinary approach. It also engaged a national group of faunal experts to assist in developing the necessary knowledge base and to test the system through infrastructure-enabled research on resource depression (e.g., overhunting).

The NSF effort required development of a user interface that supported discovery, access, and ingest of information resources. The initial version of the infrastructure and its interface was based on GEON, a geosciences infrastructure implemented at the San Diego Supercomputer Center. GEON, however, proved not to be an appropriate fit for tDAR's requirements, and the application was migrated to a J2EE enterprise development platform based on Struts2, Hibernate, and PostgreSQL. The interface implementation was led and largely executed by Allen Lee, a professional software engineer at ASU, with expert consultation by Dr. Candan.

The Andrew W. Mellon Foundation’s interest in supporting scholarly communication among archaeologists led it, in 2006, to convene a multi-institutional group of archaeologists to plan the development of a digital repository for archaeological data. That group, led by Kintigh and then called archaeoinformatics.org, wrote a planning grant proposal that the Foundation funded in 2007. The planning grant focused largely on developing an organizational structure and business model that could support a self-sufficient digital repository centered on preservation and access.

Becoming a Production Digital Repository

In 2008, the Andrew W. Mellon Foundation funded the proposal developed by archaeoinformatics.org. That grant enabled tDAR to move from the prototype phase into a production digital repository, and established Digital Antiquity as an organization explicitly designed to ensure the self-sufficiency and overall sustainability of the project. The effort was overseen by a Board of Directors, the core of which was formed by the archaeoinformatics.org group that wrote the proposal. Digital Antiquity hired Frank McManamon as a full-time Executive Director. The professional staff, led by Director of Technology Adam Brin, Allen Lee, and Matt Cordial, developed the production version of tDAR. It was designed to build upon the existing tDAR infrastructure while adding a digital repository backend. This backend was initially envisioned as leveraging the Fedora software, but it ultimately used a more lightweight microservices approach for the storage, management, and preservation of the repository. Initial goals for the production software included enhancing the data entry, storage, and preservation architecture, as well as adding tools for confidentiality and resource management.

Building a Trans-Atlantic Gateway

Funded jointly by the UK’s JISC (Joint Information Systems Committee) and the US National Endowment for the Humanities, the Trans-Atlantic Gateway Project (TAG) was developed to create interoperability between tDAR and the Archaeology Data Service (ADS) repository in two stages. The first stage was the creation of an infrastructure to enable basic cross-search of Dublin Core compatible metadata records for digital resources covering the archaeology of the USA and UK. This built on earlier work on the EU-funded ARENA project, which demonstrated that such an approach was achievable within Europe. Nonetheless, mapping European to North American metadata schemes posed real challenges, particularly with regard to periodization and subject type. The second stage of TAG was an attempt to develop a much deeper and richer level of cross-searching for faunal data from North America and Europe. This sub-discipline was chosen because there is a relatively high level of agreement over basic classifications, and because providing deep data mining across contents and datasets is truly ground-breaking. With development primarily led by Matt Cordial on the Digital Antiquity side, the result is a Web-services gateway that allows for the discovery of materials in both repositories. A web interface for the gateway was developed by ADS and is available at: http://archaeologydataservice.ac.uk/TAG/intro.jsf.

Adding the National Archaeological Database

The National Archaeological Database (NADB) Reports module was created by the National Park Service to identify and catalog the large number of reports generated by archeological investigations for public projects across the United States. This type of material, often described as “grey literature,” is critical to archaeology but is typically published in small numbers, limited in its distribution, and difficult to discover or locate. Over the initial phase of NADB’s development, the database captured and catalogued over 350,000 citations for archaeological reports or related materials. The integration of NADB into tDAR occurred in 2011 and was a major milestone in tDAR’s development. Beyond significantly increasing the depth and breadth of tDAR’s content, NADB also provided a large body of data on how archaeologists describe and use archaeological keywords and data. Prior to adding NADB to tDAR, Digital Antiquity spent months analyzing and enhancing the NADB database. This included reconciling duplicate people, places, terms, and other information, and adding more complete bibliographic information and spatial references when available. Technologically, the addition of NADB prompted the development of a Web Services interface for the programmatic addition and management of records within tDAR.

Current version of tDAR

Through a renewal grant, the Mellon Foundation has continued to support the initial phase of tDAR’s operation as it moves to financial independence. In the summer of 2012, Digital Antiquity embarked on a major plan to enhance and update tDAR.  As part of this process, Digital Antiquity worked with Fervor Creative to re-envision the public interface for tDAR, with the primary goals of simplifying and enhancing the interface. Paired with the front-end work, Jim deVos led the redesign effort for the data entry and management interface. In 2012, NSF funded refinement of tDAR’s data integration interface and a major research application of these tools to large datasets of archaeological fauna from the Southwest.  That research is ongoing.