Finding the right data

Message of the day

Need to find the right data? Have a clear question and know how to locate quality data sources.

Things to consider

In a 2004 Science Daily News article, the National Science Foundation used the phrase “here there be data” to highlight the exploratory nature of traversing the “untamed” scientific data landscape. The phrase harkens back to older maps of the world, where unexplored territories bore the warning ‘here, there be [insert mythical/fantastical creatures]’ to alert explorers to the dangers of the unknown. While the research data landscape is (slightly) less foreboding, there’s still an adventurous quality to looking for research data.

Stories

Resources

1. Formulate a question

The data you find is only as good as the question you ask. Think of the age-old “who, what, where, when” criteria when putting together a question: specifying these elements narrows the map of available data and helps direct where to look!

  • WHO (population)
  • WHAT (subject, discipline)
  • WHERE (location, place)
  • WHEN (longitudinal, snapshot)

This page from Michigan State University Libraries’ “How to find data & statistics” guide does a great job of further articulating these key elements to forming a question and putting together a data search strategy.
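As a rough illustration, the four elements above can be written down as a small record and turned into a starting set of search keywords. This is only a sketch: the class name, field names, and example values are made up for this post, not part of any guide or tool.

```python
from dataclasses import dataclass


@dataclass
class DataQuestion:
    """The four elements of a data question (names are illustrative)."""
    who: str    # population
    what: str   # subject or discipline
    where: str  # location or place
    when: str   # time frame: a longitudinal range or a single snapshot

    def search_terms(self) -> str:
        """Join the non-empty elements into one keyword string."""
        parts = [self.who, self.what, self.where, self.when]
        return " ".join(p.strip() for p in parts if p.strip())


# Hypothetical example question.
question = DataQuestion(
    who="high school students",
    what="graduation rates",
    where="Oregon",
    when="2010-2015",
)
print(question.search_terms())
```

Leaving one element blank simply drops it from the keywords, which is a quick way to see how much each element narrows the search.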

2. Locate data source(s)

After you’ve identified the question, then you can begin the scavenger hunt that is locating relevant source(s) of research data. One way to find data is to think about what organization, government, industry, discipline, etc., might gather and/or disseminate data relevant to your question.

Below are some good suggestions. You might also want to check out the UO Libraries guide to locating data.

  • A growing number of city and state data portals (for example, New York City, Hawaii, and Illinois) provide access to regional data on everything from traffic patterns to restaurant inspection results.
  • Science data tend to be distributed among a vast array of repositories, usually by specific discipline. See this page for some recommended repositories, or go to an Open Access Data Repositories list.

Check out this post from Nathan Yau, data viz whiz and creator of FlowingData — his post includes some of the sources listed above, but also highlights tips like scraping data from websites and using APIs to access data.
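To give a feel for what “using APIs to access data” can look like, here is a minimal sketch using only the Python standard library. The endpoint URL, the filter names, and the sample response are all hypothetical; a real portal documents its own URL and parameters, and those are what you should use.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; substitute the URL
# documented by the portal you are actually querying.
BASE_URL = "https://data.example.org/api/inspections.json"


def build_query_url(base_url: str, **filters: str) -> str:
    """Append the filters to the base URL as a query string."""
    return f"{base_url}?{urlencode(filters)}" if filters else base_url


def parse_records(payload: str) -> list:
    """Decode a JSON array of records such as an API might return."""
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


url = build_query_url(BASE_URL, city="Springfield", year="2016")

# A made-up response body standing in for what a portal might send back.
sample = '[{"restaurant": "Example Diner", "result": "pass"}]'
records = parse_records(sample)

print(url)
print(records[0]["result"])
```

Against a real portal, the response text would come from a request such as `urllib.request.urlopen(url)` rather than from a hard-coded string. Check the site's terms of use before scraping or making bulk requests.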

3. Cite accordingly  

Reusing data depends on its quality; finding relevant data depends on its discoverability. For producers of data, that means following many of the practices articulated in earlier posts. For consumers of data, it means being a good citizen and citing your data sources.

In general, citing data follows the same template as any other citation: include elements such as author, title, year of publication, edition/version, and persistent identifier (e.g., Digital Object Identifier, Uniform Resource Name). Check with your data source as well; they may provide guidance on how they want to be cited!

See DataONE and ICPSR pages on data citation for examples and more guidance.
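The citation elements listed above can be assembled with a few lines of code. The sketch below is one plausible layout, not an official citation style: the function name, the ordering, and the punctuation are assumptions, and every value in the example is an invented placeholder. When your data source gives its own citation guidance, follow that instead.

```python
def format_data_citation(author, year, title, version=None, identifier=None):
    """Assemble a data citation from its basic elements.

    The ordering and punctuation are illustrative only; they do not
    reproduce any particular style guide.
    """
    parts = [f"{author} ({year}).", f"{title}."]
    if version:
        parts.append(f"Version {version}.")
    if identifier:
        # A persistent identifier, e.g. a DOI written as a URL.
        parts.append(identifier)
    return " ".join(parts)


# All values below are invented placeholders.
print(format_data_citation(
    author="Doe, J.",
    year=2016,
    title="Example Survey of Campus Bike Use",
    version="2",
    identifier="https://doi.org/10.1234/example",
))
```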

Activities

BYODM: build your own (research) data map!

Ask yourself:

  • What data sources are most relevant to my research?
  • Are there relevant data sets generated or held locally that I have access to?
  • What information do I need to retrace my steps back to these data (e.g., contact information, URLs, etc.)?
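If you would rather keep your data map in a file than on paper, the questions above translate into a simple table. The sketch below is one possible layout, assuming a CSV file is good enough for your needs; the field names and the two entries are placeholders to replace with your own.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class DataSource:
    """One entry in a personal data map (field names are illustrative)."""
    name: str
    relevance: str      # why this source matters to the research
    held_locally: bool  # generated or held locally?
    contact: str        # person or office to ask about the data
    url: str            # where to find the data again


def to_csv(sources) -> str:
    """Serialize the data map to CSV text so it can be saved or shared."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(DataSource)]
    )
    writer.writeheader()
    for source in sources:
        writer.writerow(asdict(source))
    return buffer.getvalue()


# Placeholder entries; replace with your own sources.
data_map = [
    DataSource("Example city portal", "traffic counts", False,
               "portal help desk", "https://data.example.org"),
    DataSource("Lab field notebook scans", "site observations", True,
               "lab manager", ""),
]
print(to_csv(data_map))
```

The resulting text opens in any spreadsheet program, which makes it easy to add columns later (for example, a date last checked).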

Share a tweet about today’s message with #LYD17, or use #WhyILYD17 and you’ll be entered in a raffle for a book from Facet.

The 2017 Love Your Data Week is February 13 – 17, 2017.
Adopted with permission from the international Love Your Data Week 2017 materials.
Image credits:
Unagar, Pravin. (n.d.) “Romantic Location.” The Noun Project. https://thenounproject.com/term/romantic-location/611259/
Sáenz, D. (n.d.) “Map and compass.” The Noun Project. https://thenounproject.com/term/map-and-compass/113305/