Dear All,


I would like to draw your attention to the open beta launch of AVOBMAT (Analysis and Visualization of Bibliographic Metadata and Text), a multilingual text and metadata mining platform developed for DH research and teaching.

Designed in close collaboration with DH scholars, AVOBMAT supports transparent and reproducible workflows across a wide range of textual and bibliographic corpora. It handles large-scale datasets that are difficult or impossible to analyse with commercial LLMs. The service runs on an extensible, scalable, and modular cloud-based infrastructure, hosted by the Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen (GWDG). AVOBMAT is developed at the University of Szeged, Hungary.  

Key Features:

  • Multilingual preprocessing, analysis, and visualization in 24+ languages

  • Bibliographic metadata analysis with network visualizations and gender analysis

  • Named entity recognition, disambiguation & linking

  • Topic modelling, POS tagging, N-gram viewer, corpus comparison, KWIC, lexical diversity

  • Export/import of configurations and results for reproducible workflows

  • Support for both public and private corpora

  • Integrated Help with interface overview, workflow, configuration settings, glossary and appendices

  • Upload templates, example corpora, and corrected/enriched metadata for ELTeC and DraCor are available on this GitHub repository

Current content:

  • 1,708 novels in 15 languages (ELTeC)

  • 4,113 dramas in 12 languages (DraCor)

  • Upcoming: corrected/enriched corpora including CoNSSA (Spanish novels), ECCO TCP, EEBO TCP, and Early American Imprints TCP

Free to use during the pilot phase with GWDG (until 25 March 2026).

Community & webinars:

GWDG will host the first AVOBMAT webinar on 12 November (15:00-16:30 CET). To register, please visit the event page.

Explore AVOBMAT: https://avobmat.hu

Read our intro article in the Journal of Open Humanities Data.

See AVOBMAT-related research projects and publications here.

We warmly welcome your feedback, as we aim to continue refining and enhancing AVOBMAT in close dialogue with the DH community.

I would also be grateful for your help in recommending curated, open-access text collections that we could process and make publicly available for teaching (multilingual) DH. Although AVOBMAT currently supports 24 languages, we have sample databases for only 15 of them. You can view the list of supported languages and available features in this chart. Please feel free to reply via private message.

We’re also happy to collaborate on research and teaching projects related to multilingual DH.

Please feel free to share this on social media.

Thank you in advance for your support and suggestions.

All the best,

Róbert

**********************************************
Péter Róbert, Ph.D. / Róbert Péter, Ph.D.
habilitált egyetemi docens / associate professor
Angol-Amerikai Intézet / Institute of English and American Studies
Digitális Bölcsészeti Laboratórium vezetője / Head of Digital Humanities Laboratory
Szegedi Tudományegyetem / University of Szeged
Bluesky: @robertpeter.bsky.social 
Fedihum.org/@Robert_Peter
ResearchGate / Academia.edu