Wednesday, 19 March 2014

Three publications describing the Open Citations Corpus | Open Citations and Related Work

Last September, I attended the Fifth Annual Conference on Open
Access Scholarly Publishing, held in Riga, at which I had been invited
to give a paper entitled The Open Citations Corpus – freeing scholarly citation data.  A recording of my talk is available here, and my PowerPoint presentation is separately available here.  My own reflections on the major themes of the conference are given in a separate Semantic Publishing Blog post.

While in Riga preparing to give that talk about the importance of
open citation data, I received an invitation from Sara Abdulla, Chief
Commissioning Editor at Nature, to write a Comment piece for
their forthcoming special issue on Impact.  My immediate reaction was
that this should be on the same theme, an idea to which Sara readily
agreed.  The deadline for delivery of the article was 10 days later!

As soon as the Riga conference was over, I first assembled all the
material I had to hand that could be relevant to describing the Open Citations Corpus (OCC)
in the context of conventional access to academic citation data from
commercial sources.  That gave me a raw manuscript of some five thousand
words, from which I had to distil an article of fewer than 1,300 words. 
I then started editing, and asked my colleagues Silvio Peroni and Tanya
Gray for their comments.

The end result, enriched by some imaginative artwork by the Nature
team, was published a couple of weeks later on 16th October [1], and
presents both the intellectual argument for open citation data, and the
practical obstacles to be overcome in achieving the goal of a
substantial corpus of such data, as well as giving a general description
of the Open Citations Corpus itself and of the development work we have
planned for it.

Because of the drastic editing required to reduce the original draft
to about a quarter of its size, all material not crucial to the central
theme had to be cut.  I thus had the idea of developing the original
draft subsequently into a full journal article that would include these
additional themes, particularly Silvio’s work on the SPAR ontologies described in this Semantic Publishing Blog post [2], Tanya’s work on the CiTO Reference Annotation Tools described in this Semantic Publishing Blog post,
and a wonderful analogy between the scholarly citation network and
Venice devised by Silvio.  I also wanted to give authorship credit to
Alex Dutton, who had undertaken almost all of the original software
development work for the OCC.  For this reason, instead of assigning
copyright to Nature for the Comment piece, I gave them a license
to publish, retaining copyright myself so I could re-use the text.  I
am pleased to say that they accepted this without comment.

Silvio and I then set to work to develop the draft into a proper
article.  The result was a ten-thousand-word paper submitted to the Journal of Documentation a week before Christmas [3].  We await the referees’ comments!


[1]     Shotton D. (2013).  Open citations.  Nature 502: 295–297. doi:10.1038/502295a.

[2]     Peroni S and Shotton D (2012). FaBiO and CiTO: ontologies for describing bibliographic resources and citations. Web Semantics: Science, Services and Agents on the World Wide Web 17: 33–43. doi:10.1016/j.websem.2012.08.001.

[3]     Peroni S, Dutton A, Gray T and Shotton D (2014). Setting our bibliographic references free: towards open citation data.  Journal of Documentation (submitted for publication).
