[Humanist] 28.327 text-mining for bibliographic references

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Fri Sep 12 07:35:10 CEST 2014

                 Humanist Discussion Group, Vol. 28, No. 327.
            Department of Digital Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

        Date: Thu, 11 Sep 2014 15:06:18 +0200
        From: Elodie Faath <elodie.faath at openedition.org>
        Subject: Deployment of the text-mining tool Bilbo on Revues.org

Dear all,

We are pleased to announce the deployment of Bilbo, our automatic annotation
tool for bibliographic references, on almost 80% of the journals on Revues.org.

It has been developed by OpenEdition Lab, a research and development
programme launched in 2011 by teams at OpenEdition and the Avignon IT
Laboratory (LIA  http://lia.univ-avignon.fr/ ), later joined by the
Information Sciences and Systems Laboratory (LSIS  http://www.lsis.org/ ,
Aix-Marseille University – CNRS), and initially financed by a Google Grant
for Digital Humanities  http://oep.hypotheses.org/312 .


Best regards, 
Élodie Faath

Deployment of the text-mining tool Bilbo on Revues.org

For a few weeks now Revues.org has boasted a new feature: the text-mining
tool Bilbo, which automatically annotates journals’ bibliographic
references.

Bilbo identifies the bibliographic references in journal articles and
semantically tags their constituent parts. It then identifies the DOIs
corresponding to these references and, where they exist, adds them to the
end of the reference as a hyperlink, making it possible to directly access
the cited resource. Developed by OpenEdition Lab
http://lab.hypotheses.org/?lang=en_GB , Bilbo is now deployed on almost 80%
of the journals on the Revues.org platform.

How does Bilbo work?

Bilbo (“Bibliographical Robot”) is a piece of software that detects,
identifies, analyses and encodes bibliographic references in articles. Bilbo
uses data mining and machine learning to identify the first name and surname
of the author(s) and the title, publisher, year and place of publication of
each bibliographic reference. The first version of Bilbo focuses on article
bibliographies. A second version will extend the identification of
bibliographic references to footnotes. Finally, a third stage will involve
identifying implicit references within the body text of articles.
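
To make the "encodes" step concrete, here is a minimal sketch in Python of
how labelled tokens might be turned into an annotated reference. It assumes
a sequence labeller has already assigned one label to each token; the label
names, the "c" label for punctuation, the <bibl> wrapper and the sample
reference are all hypothetical illustrations, not Bilbo's actual output
format or model.

```python
from itertools import groupby
from xml.sax.saxutils import escape

# Hypothetical output of a sequence labeller: one (token, label) pair per
# token. "c" marks punctuation that belongs to no field. Illustrative only.
LABELLED = [
    ("Eco", "surname"), (",", "c"), ("Umberto", "forename"), (".", "c"),
    ("The", "title"), ("Name", "title"), ("of", "title"),
    ("the", "title"), ("Rose", "title"), (".", "c"),
    ("Milan", "place"), (":", "c"), ("Bompiani", "publisher"), (",", "c"),
    ("1980", "date"), (".", "c"),
]


def encode(tokens):
    """Wrap each run of identically labelled tokens in an element."""
    parts = []
    # groupby merges consecutive tokens that share a label into one field.
    for label, group in groupby(tokens, key=lambda pair: pair[1]):
        text = escape(" ".join(tok for tok, _ in group))
        parts.append(text if label == "c" else f"<{label}>{text}</{label}>")
    return "<bibl>" + " ".join(parts) + "</bibl>"


print(encode(LABELLED))
```

The sketch covers only the encoding of an already labelled reference; the
labelling itself is the machine-learning part and is not shown here.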

Bilbo will re-analyse the same bibliographies regularly, both as the
algorithm improves and because DOIs are assigned to thousands of new
publications each day. Automatically identified references

Using the author and title of an article, Bilbo can query the search engine
maintained by Crossref  http://www.crossref.org/ , the official registry for
Digital Object Identifiers (DOIs), whose database contains millions of
academic references. Bilbo can thereby retrieve the article’s DOI, where
it exists, and add it to the reference in the article’s bibliography. The
DOI is added as a hyperlink, allowing the reader to directly access the
cited resource.
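
The lookup described above can be sketched in Python against Crossref's
public REST API. This is a minimal illustration under stated assumptions,
not Bilbo's own code: the announcement does not say which Crossref interface
Bilbo uses, the helper names are hypothetical, and the DOI in the usage note
is a placeholder. Taking the first result is a simplification; a real system
would need to check that the match is good enough before adding a link.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works"


def build_query_url(author, title, rows=1):
    """Build a Crossref works query from an author and a title."""
    params = {
        "query.author": author,
        "query.bibliographic": title,
        "rows": rows,
    }
    return CROSSREF_WORKS + "?" + urlencode(params)


def first_doi(response):
    """Return the DOI of the first item in a parsed response, or None."""
    items = response.get("message", {}).get("items", [])
    return items[0].get("DOI") if items else None


def annotate(reference, doi):
    """Append the DOI as a resolvable link, leaving the text alone if absent."""
    return f"{reference} DOI: https://doi.org/{doi}" if doi else reference


def lookup(author, title):
    """Query Crossref over the network and return the best-match DOI."""
    with urlopen(build_query_url(author, title), timeout=10) as resp:
        return first_doi(json.load(resp))
```

For example, annotate("Eco, U. (1980).", "10.1000/xyz123") appends the
placeholder DOI as a link, while annotate("Eco, U. (1980).", None) returns
the reference unchanged, matching the "where it exists" behaviour.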

Richer references

Once the reference is identified, Bilbo is able to enrich it with
complementary data and display it in different formats. Readers consulting
an article via a library or institution that has subscribed to one of
OpenEdition’s *Freemium* programmes
http://www.openedition.org/8873?lang=en  will be able to download references
for which Bilbo has found DOIs in APA, MLA and Chicago formats. The list of
subscribing libraries and institutions can be consulted on this page: 

Who runs Bilbo?

Bilbo is developed by OpenEdition Lab, a research and development programme
launched in 2011 that aims to develop features related to reading, writing,
navigation and recommendation systems. Two teams work closely together on
the project: the OpenEdition team and the Information Sciences and Systems
Laboratory team (LSIS, Aix-Marseille University – CNRS). Initial funding for the
project was provided by a Google Grant for Digital Humanities
http://oep.hypotheses.org/312 .

Further information

- Recommendations and information about Bilbo for editorial teams on *La Maison des Revues* (in French):  http://maisondesrevues.org/680
- The *OpenEdition Lab* research blog, which traces the progress of the
Bilbo project since its inception (in French): <
- A description of how DOIs work, on *La Maison des Revues* (in French):
http://maisondesrevues.org/253 .

lab at openedition.org

Élodie Faath
Chargée des projets recherche et développement en fouille de textes -
Lab  http://lab.hypotheses.org/

Cléo  http://cleo.openedition.org/  / OpenEdition <http://openedition.org/>
École Centrale de Marseille - Technopôle de Château Gombert
38 rue Frédéric Joliot Curie
13451 Marseille Cedex 13

Courriel : elodie.faath at openedition.org

CNRS - Aix-Marseille Université - EHESS - Université d'Avignon
