[Humanist] 29.565 encoding code for preservation?

Humanist Discussion Group willard.mccarty at mccarty.org.uk
Sat Dec 19 10:02:34 CET 2015

                 Humanist Discussion Group, Vol. 29, No. 565.
            Department of Digital Humanities, King's College London
                Submit to: humanist at lists.digitalhumanities.org

        Date: Fri, 18 Dec 2015 11:01:28 -0600
        From: Carlos Monroy <cm13 at rice.edu>
        Subject: Encoding for programming source code (similar to TEI)

As part of our ongoing project, we are mining a large corpus of programming source code (C, C++, Java, etc.). One of our goals is to curate and maintain a corpus encoded in such a way that broader analyses can be carried out in the future. As you can imagine, it is difficult to know beforehand all the metadata that might be needed, since this is an evolving process. I have worked with TEI for literary works in the past, and although it is not intended for source code, it offers good ideas for what we want to do. One of my goals within our group is to advance the notion that original sources (programming code, in our case) are essential for preservation, research, and dissemination. I want to bring to source code the perspective that scholars in the digital humanities bring to literary texts. If anyone has suggestions, please contact me. Thanks in advance.


Carlos Monroy, Ph.D.
Research Scientist
Department of Computer Science
Rice University
cm13 at rice.edu
http://monroy.blogs.rice.edu/home/
