Cite, preserve, connect: Wikidata as a metadata layer for a Neo-Latin collection

Neven Jovanović, University of Zagreb
Digital Neo-Latin studies: ideas and perspectives, Aarhus, 24–25 September 2025
Zenodo DOI: 10.5281/zenodo.17191691

The paper is part of the project AdriArchCult that has received funding from the European Union's Horizon 2020 Research and Innovation Programme (Architectural Culture of the Early Modern Eastern Adriatic, GA n. 865863 ERC-AdriArchCult).

URL of the page:

http://temrezah.ffzg.unizg.hr/del/deliciae/2025-cite-wikidata/

Proposal

Using examples from the Croatiae auctores Latini (CroALa) collection, I will show how Wikidata, the free and open structured knowledge base of Wikimedia, can be used and enriched to provide biographical, bibliographical and lexicographical information about Neo-Latin authors and texts. When information is published in Wikidata, the platform guarantees computational manageability, persistence and long-term preservation.

The plan

  1. Introduction
  2. Wikidata as a structured, language-independent, interconnected and citable encyclopedia supported by literature
  3. The two states of CroALa: the repository and the database
  4. The need for a bibliographic level in CroALa – beyond full-text search
  5. Why Wikidata, and not VIAF, ISNI, DNB, NSK (Croatia)?
  6. CroALa indices: Auctores (with Wikidata IDs), Opera, Genera, Tempora
  7. Deliciae auctorum Croatiae as an anthology which uses Wikidata IDs
  8. Historiae: Antonio Rosaneo: Vauzalis sive Occhialinus, Algerii Prorex, Corcyram Melaenam terra marique oppugnat nec expugnat
  9. How was this done, or XML as an archival format
  10. Wikidata and lexicography: Wikidata Lexemes (L), Forms (F) and Senses (S)
  11. Wikidata describes polysemy in CroALa: columna
  12. Ideas, plans, visions... and a conclusion

Introduction

Today I will show two ways in which Wikidata can support a collection of Neo-Latin texts. First I will briefly explain what Wikidata is, stressing that it differs from Wikipedia. Then I will present, equally briefly, the collection which I am connecting with Wikidata, the Croatiae auctores Latini (CroALa). Finally I will demonstrate what can be done: how works, genres, and other bibliographic features of CroALa texts are annotated in Wikidata, and how Wikidata's Lexemes can be connected to lemmata of words from CroALa texts. Then we can start imagining a world in which many individual Neo-Latin texts and smaller collections are linked automatically, by queries, through Wikidata, thanks to its persistent, citable, language-independent codes, which can easily be edited and annotated through further codes of the same kind.

How Wikidata works

Wikidata is a structured description of everything, and a description that everybody can edit. Its structure arises from alphanumerical codes organised in subject-property-object statements.

A sample statement:

Antun Rozanović / languages spoken, written or signed / Latin (2 references: stated in...)
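
To make the subject-property-object structure concrete, here is a minimal Python sketch of the sample statement as a triple of codes. P1412 ("languages spoken, written or signed") and Q397 ("Latin") are the identifiers I take Wikidata to use for these two concepts; the author's item ID is not given on this page, so `Q_ROZANOVIC` is a hypothetical placeholder, and the class and its method are illustrative names only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One Wikidata-style statement: three codes, no natural language."""
    subject: str  # item ID (Q...)
    prop: str     # property ID (P...)
    obj: str      # item ID (Q...)

    def labelled(self, labels):
        """Render the codes with human-readable labels where available."""
        parts = (self.subject, self.prop, self.obj)
        return " / ".join(labels.get(code, code) for code in parts)


# Hypothetical placeholder: the real item ID must be looked up in Wikidata.
AUTHOR = "Q_ROZANOVIC"

stmt = Statement(AUTHOR, "P1412", "Q397")

# Labels are language-specific; the codes themselves are not.
labels_en = {
    AUTHOR: "Antun Rozanović",
    "P1412": "languages spoken, written or signed",
    "Q397": "Latin",
}

print(stmt.labelled(labels_en))
```

Swapping `labels_en` for a dictionary of labels in another language changes only the rendering; the statement stays the same, which is what language independence amounts to in practice.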

CroALa as a repository and as a database

croala/auctores

Deliciae auctorum Croatiae

A sketch for an anthology with Wikidata IDs as bibliographic support

A sample page: Vauzalis sive Occhialinus, Algerii Prorex, Corcyram Melaenam terra marique oppugnat nec expugnat

How was this done

github.com/nevenjovanovic/croatiae-auctores-latini-textus/blob/master/txts/rozan-a-vauz.xml

Wikidata and lexicography

Wikidata describes polysemy

Columna, Wikidata L265215

CroALa (333 occurrences): colunas columna columnam columnas columnis columnarum columnę columnae columnasque
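
The link between attested forms and a Lexeme can be sketched as a simple lookup table. This is only an illustration, assuming a plain whitespace-and-punctuation tokenisation; it is not how CroALa itself is processed, and the function name `tag_lexemes` is hypothetical. The form list is the one given above.

```python
import re

LEXEME_ID = "L265215"  # columna, as cited above

# The nine spellings attested in CroALa, including the variant "colunas",
# the e-caudata form "columnę" and the enclitic form "columnasque".
ATTESTED_FORMS = (
    "colunas columna columnam columnas columnis "
    "columnarum columnę columnae columnasque"
).split()

FORM_TO_LEXEME = {form: LEXEME_ID for form in ATTESTED_FORMS}


def tag_lexemes(text):
    """Return (token, lexeme ID) pairs for every token found in the table."""
    tokens = re.findall(r"\w+", text.lower())
    return [(tok, FORM_TO_LEXEME[tok]) for tok in tokens if tok in FORM_TO_LEXEME]


sample = ("procedendo usque logiam magnam Iadrae et usque ad "
          "secundam columnam ipsius logiae")
hits = tag_lexemes(sample)
print(hits)
```

Because every spelling points to the same code, orthographic variants stop being a retrieval problem once the table exists; building the table is the philological work.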

Columna 1: structural element sustaining the weight of a building

Wikidata L265215-S2

(15) Toma Arhiđakon (ante 1268): uir quidam, Seuerus nomine, cuius domus fuerat iuxta colunas palatii supra mare.

a man by the name of Seuerus, whose house was near the colunas (columnas) of the palace, by the sea

(16) Pavlović, Pavao (1371–1408): procedendo usque logiam magnam Iadrae et usque ad secundam columnam ipsius logiae

moving on all the way to the large loggia of Zadar and all the way to the second columna of that loggia

(17) Ciriaco d'Ancona (1440; Ancona): Vidisti praeterea nostrae huiusce praeclare civitatis ornamenta alia quam plura, sed inter potiora antiqua atque nobilia undique ex cocto latere moenia, maritimum fronte litus, tresque ripales et aereas arces; portas deinde regias, turres innumeras et praecelsas, nec non sacra superis speciosa ornatissimaque delubra; alta quoque magistratum praetoria, civiumque palatia et conspicuas aedes; marmoreos itaque arcus et gestarum rerum trophea; scaenas, columnas, statuarumque fragmenta; bases et epigrammata; quin et harenarum ingentia vetustissimaque numidicae architecturae loca pereminentia urbis amphiteatra, magnum inditium splendoris primaevae tam praeclarae civitatis familiae et verendissimae antiquitatis.

Besides, you have seen the embellishments of our beautiful city; there are many of them, but particularly important are the ancient and noble walls, built from all sides of bricks, the sea-front, three heaven-high citadels on the shore; the royal gates, the countless and soaring towers, as well as the beautiful and adorned temples dedicated to divinities; there are also high residences of city magistrates, the palaces and distinguished houses of the citizens; also the marble arches and trophies of victories; the stages, columnas, and the remains of sculptures; the column-bases and the inscriptions; the grand and immensely ancient amphitheater with the arena, a place of Numidic (?) architecture towering above the city, the large sign of earlier magnificence of the city so glorious, of an age so venerable.

(18) Cipiko, Koriolan (1477; Delos): Imperator imposita in naues omni praeda ad Delon insulam uenit. Haec olim ob insigne Apollinis templum et sacrorum cerimonias conuentibus uniuersae Graeciae celebrata fuit, nunc deserta ac inhabitata est. Extant tamen templi et amphitheatri uestigia albi marmoris, columnarum quoque ac signorum maximus numerus, colossus etiam cubitorum quindecim, cum hac inscriptione: Νάξος Ἀπόλλωνι. Sunt et cisternae multae mirae magnitudinis, etiam nunc aquarum plenae.

The general had all the booty loaded onto ships and sailed to the island of Delos. The island was once famous for its renowned temple of Apollo and for its sacred festivities, in which all Greece took part; now it is deserted and uninhabited. Remains of the temple and of the amphitheater, built of white marble, can still be seen, as well as a great number of columnarum and of sculptures; also a colossus of fifteen cubits with the inscription Νάξος Ἀπόλλωνι. There are also many cisterns of amazing size, still full of water.

(19) Andreis, Franjo Trankvil (1527; Sulla speaks about suffering of tyrants): in agris aediculas et holerum in ocio tranquillissimo cibum longe praestare tessellatis pauimentis et incrustatis marmore parietibus aureisque tectis, quae sexcentis columnis innituntur

small huts in the fields and vegetables for food, enjoyed in peaceful leisure, are far better than floors adorned with mosaics, walls incrusted with marble, and roofs of gold held up by hundreds of columnis

Columna 2: object resembling a pillar

Wikidata L265215-S4

(20) Pavlović, Pavao (1371–1408): apparuerunt in Iadra signa mirabilia in coelo, videlicet apparuit quidam splendor magnus, quasi columna nubis, igne accensa

At Zadar, wonderful signs appeared in the sky: a certain great brightness, like a columna of cloud, kindled with fire

Columna 3: (figuratively) support

Wikidata L265215-S3

(21) Jan Panonije (1447–1472, "De morte Andreolae, Nicolai V. pontificis Romani et Philippi cardinalis Bononiensis matris"): Salve magna parens, sublimem enixa columnam

Hail, great mother, you who gave birth to a high columnam

(22) Marulić, Marko (1499): Basilius etiam episcopus uere columna ignis (ut cuidam uisum fuit) calore ęstuans charitatis

And bishop Basil, the true columna of fire (as somebody thought), burning with the heat of love

(23) Marulić, Marko (1499): Hunc Paulus et Cepham et Ioannem, ueluti reliquorum pręsides, columnas uocat

Paul calls James, Cephas and John columnas, because they are the leaders of the rest (of apostles)

(24) Marulić, Marko (1499; cf. 2 Chronicles 3–4, on the Temple of Jerusalem): Hęc est ergo columna Iachin et columna Booz, id est, firmitatis et fortitudinis, quę epistylia liliorum malogranatorumque sustentant.

So this is the columna Iachin and columna Booz, that is, of strength and of courage, which support the architrave of lilies and of pomegranates.
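
The sense identifiers used in the examples above (L265215-S2, L265215-S4, L265215-S3) are themselves citable. The sketch below splits such an identifier and derives two addresses from it. The URL patterns are my assumption about how Wikidata currently exposes Lexeme senses and should be checked against the live site; the glosses and the ranges of example numbers are those of this page, and all function names are hypothetical.

```python
import re

SENSE_ID = re.compile(r"^(L\d+)-(S\d+)$")

# Sense ID -> (gloss, example numbers on this page).
SENSES = {
    "L265215-S2": ("structural element sustaining the weight of a building",
                   range(15, 20)),
    "L265215-S4": ("object resembling a pillar", range(20, 21)),
    "L265215-S3": ("(figuratively) support", range(21, 25)),
}


def split_sense_id(sense_id):
    """Split 'L265215-S2' into ('L265215', 'S2'); reject anything else."""
    match = SENSE_ID.match(sense_id)
    if match is None:
        raise ValueError(f"not a sense ID: {sense_id!r}")
    return match.group(1), match.group(2)


def sense_uri(sense_id):
    """Entity URI for a sense (assumed pattern)."""
    split_sense_id(sense_id)  # validation only
    return "http://www.wikidata.org/entity/" + sense_id


def sense_page(sense_id):
    """Human-readable page address for a sense (assumed pattern)."""
    lexeme, sense = split_sense_id(sense_id)
    return f"https://www.wikidata.org/wiki/Lexeme:{lexeme}#{sense}"


def sense_of_example(number):
    """Find which sense a numbered example on this page illustrates."""
    for sense_id, (_gloss, examples) in SENSES.items():
        if number in examples:
            return sense_id
    return None


print(sense_page("L265215-S2"))
print(sense_of_example(20))
```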

Ideas, plans, visions...

The (next) rapture: Wikidata: Embedding Project

"The project’s aim is to enhance the search functionality of Wikidata by integrating vector-based semantic search. By leveraging advanced machine learning models and scalable vector databases, the project seeks to support the open-source community in developing innovative AI applications and using Wikidata's multilingual and inclusive knowledge graph, while making its extensive data more accessible and contextually relevant for users across the globe."

Conclusion

Wikidata provides ready infrastructure for any collection of texts. It is an additional, structured layer in which we can express what we know about a text. In Wikidata, this knowledge will become persistent, citable (through URL addresses) and supported by references to scholarly literature.

The downside is that preparing and using information in Wikidata takes a lot of planning and a lot of bibliographic and lexicographical toil. The results can be magical, but the magic happens slowly and takes a lot of preparation.

The return for the labor is better annotation of our little-known Neo-Latin texts; the annotation should improve, and invite, the presentation and exploration of the texts. Because annotations in Wikidata are language-neutral, the problems of multiple name variants and multiple allographs can be overcome; the Wikidata infrastructure is sustainable, because it is maintained by a whole committed organisation (the Wikimedia Foundation), which is also non-commercial; finally, the Wikidata infrastructure is machine-actionable: its structures can be accessed and processed by computer programs.
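
As a small illustration of what "machine-actionable" can mean, the following Python sketch builds a SPARQL query for the public Wikidata Query Service and the address a program would request. It only constructs the strings and sends nothing. The property and item codes (P1412, Q397) are the ones I take to stand for "languages spoken, written or signed" and "Latin"; the function names are hypothetical, and a real client would also need to identify itself and respect the service's usage limits.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(prop, value, limit=10):
    """SPARQL for: items that have the statement <prop> <value>, with labels."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:{prop} wd:{value} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def request_url(query):
    """Address a program would fetch to get the results as JSON."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_query("P1412", "Q397")
print(query)
print(request_url(query))
```

The same few lines, with other codes, would ask for genres, works or dates; this is the sense in which a query can link separate collections once they share Wikidata identifiers.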

The world of computing, including its digital humanities branch, is currently fascinated by Large Language Models. To an extent, these models are diametrically opposed to my manufacturing approach, where information about a text is painstakingly, philologically added, checked and documented. But the Wikidata Embedding Project, which aims to make Wikidata's data accessible to Artificial Intelligence and Machine Learning, promises to connect my small manufacture with the mega-mining of the large language models. I would like Wikidata to be prepared, from the aspect of Latin language and Neo-Latin literature, when that rapture happens.