Message of the day
“Data that is mobile, visible and well-loved stands a better chance of surviving” ~ Kurt Bollacker
Things to consider
Legacy, heritage and at-risk data share one common theme: barriers to access. Data recorded by hand (field notes, lab notebooks, handwritten transcripts, measurements or ledgers), stored on outdated technology, or saved in proprietary formats are at risk. Born-digital files can be at risk too, since they are susceptible to poor management, bit rot, or even deliberate attempts to reduce access.
Securing legacy data takes time, resources and expertise, but it is well worth the effort: old data can enable new research, and the loss of data could impede future research. So how do you approach reviving legacy or at-risk data?
How do you eat an elephant? One bite at a time.
- Recover and inventory the data
  - Format, type
  - Accompanying material: codebooks, notes, marginalia
- Organize the data
  - Depending on discipline/subject: date, variable, content/subject
- Assess the data
  - Are there any gaps or missing information?
  - Triage: consider the nature of the data along with ease of recovery
- Describe the data
  - Assign metadata at the collection/file level
- Digitize/normalize the data
  - Digitization is not preservation. Choose a file format that will retain its functionality (and accessibility!) over time: “Which file formats should I use?”
- Review
  - Confirm there are no gaps or indicate where gaps exist
- Deposit and disseminate
  - Make the data open and available for re-use
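For files that are already digital, the first bite (recover and inventory) can be partly automated. The snippet below is a minimal sketch, not a prescribed workflow: it walks a folder and records each file's relative path, format (guessed from the file extension only), size and a SHA-256 checksum, which can later be recomputed to detect bit rot. The function names and the CSV column names are assumptions made for illustration.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root):
    """List every file under root with its format, size and checksum."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rows.append({
                "path": path.relative_to(root).as_posix(),
                # Format is inferred from the extension, not the file contents.
                "format": path.suffix.lower().lstrip(".") or "unknown",
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return rows


def write_inventory(rows, csv_path):
    """Save the inventory as a CSV file."""
    fields = ["path", "format", "bytes", "sha256"]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

For example, `write_inventory(build_inventory("legacy_data"), "inventory.csv")` would inventory a hypothetical `legacy_data` folder. Save the CSV outside the folder being inventoried so it is not listed on the next run, and keep it with the collection-level metadata so the checksums can be compared again during review.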
Stories
- Check out the #datarefuge efforts
- NSIDC wins data rescue award
- Bringing historical climate data into the 21st century (Retraction Watch)
- How Not to Build a Digital Archive: Lessons from the Dark Side of the Force
- Data Rescue in Cuba
- International Data Rescue Award in the Geosciences
- Alberta Hail Project Meteorological and Barge-Humphries Radar Data Archive
Resources
- CODATA Data at Risk Task Group
- RDA Data Rescue Interest Group
- International data rescue portal
- Center for International Earth Science Information Network: Curation of Scientific Data at Risk of Loss: Data Rescue and Dissemination
- Curating a 23-year oceanographic time-series
- Unlocking GATE: Gaining Access to Analog Data in a Digital World
Activities
There are many opportunities to rescue at-risk or legacy data. Locally, as faculty retire, reach out to departments to assist in curating existing yet inaccessible data. Regionally and nationally, partner with other stakeholders to revitalize at-risk data. Think: Citizen Science.
Get involved with the #datarefuge project
Share a tweet about today’s message with #LYD17, or use #WhyILYD17, and you’ll be entered in a raffle for a book from Facet.