I appropriated, and over-simplified, the title of this blog from a recent article, Archaeological Analysis in the Information Age: Guidelines for Maximizing the Reach, Comprehensiveness, and Longevity of Data by Kansa et al. 2020, because it’s so true.

As part of the 2020 Kansa et al. article, which lays out a series of guidelines to improve “data management, documentation, and publishing practices so that primary data can be more efficiently discovered, understood, aggregated, and synthesized by wider research communities” (specifically in archaeology), the authors state, “…we strongly emphasize the importance of viewing data as a first-class research outcome that is as important as, if not more important than, the interpretive publications that result from their analysis.” I personally interpret this to mean that the data you produce are a more valuable contribution to your discipline than are many of your interpretations. Though this may sting a little for some to hear, I’m quite content with this idea. Science is bigger than me and my little sphere of knowledge and expertise, and I’m good with that.

As much as any academic is ever truly pleased with the final manuscript that’s been accepted to a peer-reviewed journal, there’s a good chance you know that the results and conclusions drawn from that study are not even close to being 100% correct, especially in archaeology. We attempt to reconstruct past human behavior from an incomplete archaeological record, so the conclusions we draw about an event that created a biface or how North America was initially colonized can never be completely accurate. We know that there are flaws in our data (taphonomic and spatial bias), problems with the way statistical tests are interpreted, and even incongruences with the theories we employ to explain sets of behaviors. However, the data that we generate to reach our conclusions can help create even better results if they are synthesized with data from others and reused. The more data we generate, and then reuse, the better our ability to resolve some of the issues stated above. Ultimately, we can paint a more accurate picture of any past event if our data are made available so that future generations of scientists can augment our past research. This is how science works: each study builds upon the foundation laid by earlier ones. But this work doesn’t come without some costs.

Though it is not always fully acknowledged, the data that we generate are, in fact, very valuable. From a monetary perspective, data can cost a lot of money to generate, from lab and field equipment to labor; those who practice archaeology know that our work is a spendy endeavor. And, of course, there is the intrinsic value of the information itself: the enhanced understanding of some past phenomenon or occurrence that our research seeks to achieve.

Other ways to measure the worth of data include:

  1. Its existence can save time. For example, graduate students can spend up to 80% of their time searching for data and fixing formatting issues to make it suitable for analysis. Data that’s been vetted and placed in trusted repositories can be reused by students, thus saving them a tremendous amount of time during their graduate career.
  2. It enables new research. Research often requires more data than one individual can collect. Thus, sharing data provides resources for future research by other teams, which should in turn advance our knowledge about a given topic.
  3. It allows you to get credit for data creation. Researchers who share data are more likely to be asked to be co-authors on publications using their data, or will be more frequently cited because their data are available in a trusted repository.
  4. Data availability establishes trust. Surveys show public trust in research is enhanced when data are available, because, unfortunately, some science has been less than trustworthy.

This last point is an important one. Regardless of our results, outcomes, and/or conclusions, the data that we generate hold onto their intrinsic and independent value, because these raw data can be reused in the scientific method.

In the scientific method, the reuse of existing datasets is paramount. The methods and hypotheses we generate are built on this ideal, which is why the FAIR movement (findable, accessible, interoperable, and reusable) currently has momentum across the sciences. And though FAIR is in vogue, it’s still a struggle to get scientists to share and provide data in trusted, open-access repositories for others to reuse or even evaluate.

Recently, the editor of the journal Molecular Brain, Tsuyoshi Miyakawa, penned an editorial entitled “No raw data, no science: another possible source of the reproducibility crisis”, a commentary on data availability in peer review. He writes, “….97% of the 41 manuscripts did not present the raw data supporting their results when requested by an editor, suggesting a possibility that the raw data did not exist from the beginning, at least in some portions of these cases.” So not only is there no metadata that describes what the data are, but the data themselves did not exist? That’s really hard to stomach, but it seems to be an unfortunate truth. And Miyakawa is not alone in his observations.

The editors of the Lancet, one of the most prestigious journals in medicine (impact factor = 59.102 versus Journal of Archaeological Science impact factor = 3.030), recently retracted a very high-profile article titled “Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis”. The authors claimed to use an international database of patient data created by the company Surgisphere Corporation to evaluate the efficacy of hydroxychloroquine in treating Covid-19. Their results were notable and led the WHO to recommend discontinuing it as a treatment. However, this may have been done in haste. It turns out Surgisphere “declined to make the underlying data…available for an independent audit”, calling into question the article’s results. It’s entirely possible that their data did not really exist. How is it even possible that this type of academic fraud occurs in today’s science community? Will we ever be able to truly say that science abides by the FAIR principles when there are academics and journals that don’t hold scientists as accountable for their data as they do for their interpretations of those data?

As someone who works for a Center whose mission it is to archive, preserve, and make data available, I hope that the archaeology community begins to embrace the idea that the quality and availability of their data are just as important as the original interpretations of that data. There is good science and bad science. Good science, and scientists, make data available. This enables others to reproduce and corroborate, or even dispute, conclusions drawn in a study. That’s how real science is intended to work.

Findable, accessible, interoperable, and reusable/reproducible data are the foundation of good science, and are more important than our interpretations. Our ability to serve as trusted scientists lies not only in our ability to push the frontiers of knowledge, but also in our willingness to be transparent and accountable about our data. In this way, our conclusions, while no doubt thought-provoking, pale in comparison to the manner in which we generate and contribute data. This paradigm shift is paramount for good science to flourish, and the first step is letting our data loose to be reused and productive in someone else’s hands.

When I started at Digital Antiquity (way back in late 2019), I was delighted to learn that the office had established a “shared” drive (our own mini-cloud) which we could access via VPN from just about anywhere we had decent internet access. Since I frequently get “epiphanies” on the weekend, I really like being able to access my office documents remotely from home; I often tinker and re-tinker with documents or save work that pertains to a project I’m currently focused on. When the Covid-19 virus event initially cropped up, our office staff discussed what it might look like if we had to work from home. We are physically housed in the newly remodeled Hayden Library at Arizona State University, and at first there was no real indication that we would need to work from home. What began more as a thought exercise turned out to be good planning on our part. Though we weren’t thinking that we would really need to work remotely for an extended period of time, this has turned out to be our “new normal” (though I’m not a fan of that phrase).

It turns out that all the work that Digital Antiquity staff had undertaken over the past several years to 1) create a shared drive on our university server, 2) establish a Slack Channel™ and Zoom™ meetings for communication, 3) facilitate Remote Desktop access, and 4) create administrative log-in protocols on tDAR has paid off 100-fold. Though these platforms have been in use for years, the manner in which they’ve become intertwined with our daily workflows as a result of our enforced social isolation is pretty new to most of us who were working in an office setting. Given how constantly our digital world changes, it was wise to plan and enact procedures that gave us the flexibility to access work-related documents remotely. For now, Digital Antiquity staff is able to work efficiently from home despite the absurdity of the world around us, all of which has me thinking more about the importance of planning, archiving, preservation, access, and later reuse. Our reliance on technology is as prevalent as ever.

I’m hyper-aware of how selfish this sounds given that we are facing a global lockdown in the face of a pandemic. But since I’m at home and able to work (and I realize that many are not as fortunate), it has given me a chance to reflect and have a new appreciation for that ability. The technological capacity to access digital documents and data, either from home or remotely in a field setting, can truly be invaluable. Because we have digital information stored in a service like tDAR (and there are many others), and the tools to access it, we can continue to work and provide a platform for others to do the same. The same cannot be said for many other archaeologists/historic preservationists.

CRM work is very much a client-based professional service based on federal regulations and is steeped in technology. In the CRM world today, if a SHPO isn’t physically open for an extended period of time and has no online presence, how do you go about conducting gray literature reviews, even if you are able to go into the field? Having documents archived digitally and made accessible thus becomes incredibly important for compliance work to continue. Likewise, SHPOs that receive digital documents may not be in a position to store them in a manner that facilitates access within the organization, limiting their ability to help companies or agencies meet their compliance requirements.

The financial impact on archaeology likely pales in comparison to the overall economic impact that the Covid-19 pandemic will ultimately have on our nation’s economy, but the impact on the field of archaeology and historic preservation will nonetheless be felt. This period of confinement/social-distancing has given me a chance to reflect a bit on many things personally and professionally, but from a strictly professional perspective, working from home reminds me that not all organizations have the cyber-infrastructure that allows them the flexibility to work remotely with digital documents.

Archaeologists need to continue efforts in 1) converting our physical documents (those reports sitting on shelves gathering dust) to digital formats, 2) creating online platforms to access these items, and 3) planning for future work interruptions, whether they be from pandemics or other reasons.  Making information and data accessible, either intra- or inter-office, truly is as important as ever. 

A rather dystopic article recently published by the MIT Technology Review entitled “We’re not going back to normal” (https://www.technologyreview.com/s/615370/coronavirus-pandemic-social-distancing-18-months/; accessed 4/6/2020) postulates that this pandemic (and maybe future ones) will be cyclical, necessitating multiple periods of social-distancing over extended periods of time. If that does occur, as a discipline we need to be innovative and consider different ways of accessing and sharing information, documents, and data across groups, both within our own offices and with our customers. Archaeology firms, SHPOs, university staff, students, and the like will need to consider how their workflows will look different, both today and in the coming months, and how important it is to have access to a variety of materials so that we can continue to be productive. Not to mention, we may find that the investment in getting these resources online during disruptions pays off in “normal” times as well.

Our work and research are so intricately woven into cyber-infrastructure and data that we “need” the ability to access information; this period of human history, if nothing else, highlights our continued need to flex our phenotypic capacity to adapt to changes in our economy. If there is a return to normal, will we look back at the economic ramifications of working remotely and take steps to adapt by modifying our existing platforms? Is accessing documents and data remotely, despite major societal disruptions, important enough for many organizations to make significant changes to how they operate? Only time and/or another disruption will ultimately tell.