DataCite Blog

Publishing tabular data as blog post

May 20, 2016 Martin Fenner
https://doi.org/10.5438/1z7j-h7vw

CSV in many ways is for data what Markdown is for text documents: a very simple format that is both human- and machine-readable, and that – despite a number of shortcomings – is widely used. Given the popularity of Markdown for writing blog posts, using CSV to publish blog posts with tabular data should be an obvious thing to do, and we have just published our first blog post using CSV data. The blog post shows Table 3 from the DataCite Metadata Schema [@https://doi.org/10.5438/0010], describing the mandatory properties.

Periodic table of elements. From: Wikipedia

The DataCite blog uses the Jekyll static site generator, and all blog posts are written in Markdown format. All posts have their metadata in YAML format at the beginning of the file (separated by --- from the main text).

---
layout: post
title: Publishing tabular data as blog post
author: mfenner
tags:
 - csv
 - metadata
 - blog
---

Markdown is a nice format for writing text, but it doesn’t work so well for tabular data: the current Markdown table implementations are difficult for humans to edit and read for all but the simplest tables. CSV is a much better fit for tabular data, and can be written with a general text editor, a spreadsheet program, or another specialized tool.

To add the metadata required for every Jekyll blog post, we again add a YAML header. The resulting file format is CSVY, which we have discussed before [@https://blog.datacite.org/thinking-about-csv]. Jekyll can be extended to understand many file formats beyond Markdown. As a CSVY converter did not exist yet, we wrote one and released it as the Ruby gem jekyll-csvy [@https://github.com/datacite/jekyll-csvy], so that CSVY support can be easily added to any Jekyll-powered blog.
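To make the CSVY layout concrete, here is a minimal Python sketch that separates the YAML header from the CSV body. It is not how jekyll-csvy works internally (that gem is written in Ruby); the function name `split_csvy` and the sample content are hypothetical, and the sketch assumes the header is delimited by lines containing only `---`, as in Jekyll front matter. The header is returned as raw text, and a real tool would pass it to a YAML parser.

```python
import csv
import io

# Illustrative CSVY content; not the actual table from the blog post.
SAMPLE = """---
layout: post
title: Example table
---
ID,Property
1,Identifier
2,Creator
"""


def split_csvy(text):
    """Return (header_text, rows) for a CSVY document (hypothetical helper)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No YAML header: treat the whole input as plain CSV.
        header, body = "", text
    else:
        # Look for the line that closes the header.
        for end in range(1, len(lines)):
            if lines[end].strip() == "---":
                break
        else:
            raise ValueError("YAML header is not closed by '---'")
        header = "\n".join(lines[1:end])
        body = "\n".join(lines[end + 1:])
    # Parse the CSV body, skipping blank lines.
    rows = [row for row in csv.reader(io.StringIO(body)) if row]
    return header, rows


header, rows = split_csvy(SAMPLE)
print(header)
print(rows)
```

Because the body after the second `---` is ordinary CSV, any CSV reader can process it once the header has been skipped.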

In HTML, tabular data are typically displayed as tables, and this is what the CSVY converter produces. This works well for tables that are not too wide, and the converter supports inline Markdown formatting (bold, italic, links, etc.) in table cells. Block formatting (e.g. lists) is on our list of future improvements, and we will polish the converter based on user feedback. We are of course also interested in embedding CSV tables within Markdown documents, as this is a common use case.
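The idea of turning CSV rows into an HTML table with inline formatting can be sketched in Python as follows. This is an illustration only, not the jekyll-csvy implementation: the helper names `inline_markdown` and `rows_to_html_table` are hypothetical, the first row is assumed to be the column header, and the regular expressions cover only simple bold, italic, and link syntax rather than full Markdown.

```python
import html
import re


def inline_markdown(cell):
    """Convert a small subset of inline Markdown to HTML (hypothetical helper)."""
    text = html.escape(cell)  # escape <, >, & and quotes first
    # [label](url) becomes a link
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', text)
    # **bold** must be handled before *italic*
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text


def rows_to_html_table(rows):
    """Render rows as an HTML table, treating the first row as the header."""
    if not rows:
        return "<table></table>"
    head, *body = rows
    out = ["<table>", "<thead>", "<tr>"]
    out += [f"<th>{inline_markdown(c)}</th>" for c in head]
    out += ["</tr>", "</thead>", "<tbody>"]
    for row in body:
        out.append("<tr>")
        out += [f"<td>{inline_markdown(c)}</td>" for c in row]
        out.append("</tr>")
    out += ["</tbody>", "</table>"]
    return "\n".join(out)


table = rows_to_html_table([
    ["ID", "Property"],
    ["1", "**Identifier**"],
    ["2", "[Creator](https://example.org/creator)"],
])
print(table)
```

Escaping each cell before applying the Markdown patterns keeps raw HTML in the CSV from leaking into the page; a production converter would more likely delegate cell rendering to a full Markdown library.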

One important feature of using CSVY for blog posts is that the CSV remains available, and can be ingested and processed by tools that can read CSVY, e.g. the R package rio [@https://cran.r-project.org/web/packages/rio/index.html].



© 2016 Martin Fenner. Distributed under the terms of the Creative Commons Attribution license.

