DataCite Blog

It’s all about Relations

April 14, 2016 Martin Fenner
https://doi.org/10.5438/pe54-zj5t

In a guest post two weeks ago, Elizabeth Hull explained that only 6% of Dryad datasets associated with a journal article are found in the reference list of that article, data she also presented at the IDCC conference in February [@https://doi.org/10.5281/zenodo.32412]. This number increased from 4% to 8% between 2011 and 2014, but is still low. One important reason is missing incentives: we don’t yet have the same automated citation linking between articles and data that exists between articles thanks to Crossref.

Wouldn’t it be nice if a data publisher such as the Oak Ridge National Laboratory were automatically informed about journal articles citing one of its datasets [@https://doi.org/10.3334/CDIAC/00001_V2017]?

Global, Regional, and National Fossil-Fuel CO2 Emissions.

The challenge: both DataCite and Crossref collect metadata as part of the respective DOI registration services they provide. These metadata describe the information required for a citation (title, authors, publication date, etc.) [@https://doi.org/10.5438/0010]. And the metadata can contain references to related resources. But what is missing is an automated exchange of the information collected by Crossref and DataCite.
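To make "references to related resources" concrete, here is a minimal sketch of how such a link could be read from a dataset's own metadata. The record is illustrative only: the key names are loosely modeled on the relatedIdentifier property of the DataCite Metadata Schema, the `citing_dois` helper is hypothetical, and the values are not taken from any registered record.

```python
# Illustrative record only. Key names loosely follow the DataCite Metadata
# Schema's relatedIdentifier property; the values are assumptions made for
# this example, not real registered metadata.
record = {
    "doi": "10.5061/DRYAD.7BQ5T",
    "related_identifiers": [
        {
            "related_identifier": "10.1098/rspb.2015.2857",
            "related_identifier_type": "DOI",
            "relation_type": "IsCitedBy",
        },
    ],
}


def citing_dois(record):
    """Return the DOIs that the record's own metadata lists as citing it."""
    return [
        rel["related_identifier"]
        for rel in record.get("related_identifiers", [])
        if rel["related_identifier_type"] == "DOI"
        and rel["relation_type"] == "IsCitedBy"
    ]


print(citing_dois(record))
```

A link recorded this way exists only if the data publisher adds it to its own record, which is why an automated exchange with Crossref is needed to surface citations the publisher does not yet know about.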

We can’t simply store information coming from Crossref in the DataCite Metadata Store (MDS) for two reasons:

  1. Only the organization publishing the DOI can update the metadata, and it is important to keep it this way to have a single authoritative source.

  2. The DataCite MDS stores information about DataCite DOIs, but can’t store metadata (again title, authors, publication date, etc.) for other resources such as Crossref DOIs.

DataCite thus needs a service to enhance its DataCite Metadata Store (MDS). Data citations are the most important use case, but this service should be flexible enough to also handle information coming from other providers besides Crossref, for example claims of DataCite DOIs in the ORCID registry or links of DataCite DOIs to code repositories such as GitHub.

The new service is called DataCite Event Data, and the screenshot above shows six data citations coming from Crossref. The software powering the service is called Lagotto, open source software originally developed in 2009 by the Open Access publisher Public Library of Science. While Lagotto provides the basic functionality needed for the Event Data service, significant development effort was required to enable the full functionality described above. This work was done, and will continue, in close collaboration with Crossref, as Crossref wants to address similar use cases. Although the core Crossref infrastructure is built around citation linking of publications, Crossref is working on registering other online events associated with Crossref DOIs, e.g. a Wikipedia page referencing one or more journal articles.

This Tuesday we released version 5 of the Lagotto software [@https://doi.org/10.5281/ZENODO.49516] with support for what we need for the Event Data service. The release would not have been possible without developer Joe Wass from Crossref. The list of changes is long and is described in detail in the release notes. The highlights include:

  1. A deposits API allowing anyone with a valid API key to push events into the system using a JSON object which can be (almost) as simple as
{ "subj_id": "https://doi.org/10.1098/rspb.2015.2857",
  "obj_id": "https://doi.org/10.5061/DRYAD.7BQ5T",
  "relation_type_id": "cites",
  "sourceid": "europepmc_fulltext" }
  2. A contributor model to aggregate resources by contributor, using the ORCID ID as persistent identifier.
  3. Support for GitHub, describing the relations between software release, code repository, and repository owner, for the more than 7,000 DataCite DOIs for software that are by now linked to a GitHub release.
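As a rough illustration of the deposits API highlight, the event object shown above could be assembled and serialized like this. The four key names are copied from the example in this post; the `build_deposit` helper is hypothetical, and the sketch stops at producing the JSON because the endpoint URL and the way the API key is passed are not given here.

```python
import json


def build_deposit(subj_id, obj_id, relation_type_id, sourceid):
    """Assemble the minimal event object from the post (hypothetical helper)."""
    deposit = {
        "subj_id": subj_id,                    # the resource making the statement
        "obj_id": obj_id,                      # the resource it refers to
        "relation_type_id": relation_type_id,  # e.g. "cites"
        "sourceid": sourceid,                  # where the event was found
    }
    # Reject empty values so an incomplete event is never serialized.
    missing = [key for key, value in deposit.items() if not value]
    if missing:
        raise ValueError("missing deposit fields: " + ", ".join(missing))
    return deposit


deposit = build_deposit(
    subj_id="https://doi.org/10.1098/rspb.2015.2857",
    obj_id="https://doi.org/10.5061/DRYAD.7BQ5T",
    relation_type_id="cites",
    sourceid="europepmc_fulltext",
)

# JSON body that a client would push to the deposits API.
payload = json.dumps(deposit, indent=2)
print(payload)
```

Read as a sentence, the object says that the article (subject) cites the Dryad dataset (object), and that this relation was found in Europe PMC full text.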

In the coming months DataCite and Crossref will continue developing the platform to build out their Event Data services, so stay tuned for updates. And if you don’t mind minor bugs and incomplete data (currently about 1.2 million events for about 400,000 DataCite DOIs), take a look at DataCite Event Data and send us your feedback.

A real life lagotto. Credit: Anke Büter and Najko Jahn (Exeter)

Martin Fenner
Technical Director at DataCite

© 2016 Martin Fenner. Distributed under the terms of the Creative Commons Attribution license.

