
Issue: Electronic System «All-Ukrainian Toloka Archival Card Index»: Structure, Tools, Prospects of Development

5th International Conference on Computational Linguistics and Intelligent Systems (COLINS 2021)

Oksana Tyshchenko, Institute of Ukrainian Language of the National Academy of Sciences of Ukraine
Vladyslav Tyshchenko, National Pedagogical Drahomanov University 

Electronic System «All-Ukrainian Toloka Archival Card Index»: Structure, Tools, Prospects of Development. Proceedings of the 5th International Conference on Computational Linguistics and Intelligent Systems (COLINS 2021). Volume I: Main Conference. Lviv, Ukraine, April 22–23, 2021. P. 555–565. (http://ceur-ws.org/Vol-2870/; http://ceur-ws.org/Vol-2870/paper41.pdf)

The article covers the principles and tools for the collective recognition of manuscripts of the Archival Card Index (ACI), the lexical and phraseological materials of the Commission for Compiling the Dictionary of the Living Ukrainian Language of the All-Ukrainian Academy of Sciences. In 2018, the Institute of the Ukrainian Language of the National Academy of Sciences of Ukraine created the electronic system «Archival Card Index» (ESACI), a digital format of the ACI. The ACI (350 thousand units) is of great importance in the context of the cultural and national revival in Ukraine in the early 20th century, as it plays an important role in the development of the Ukrainian language and of the theory and practice of Ukrainian studies in the 20th and early 21st centuries. A fragment of the ACI (3000 units) was recognized manually: the texts were entered into the ESACI according to the fields of the card's microstructure. Such recognition requires considerable effort and time, so the platform «All-Ukrainian Toloka Archival Card Index» (AUTACI) has been created on the ACI website; it allows an unlimited number of volunteers to take part online simultaneously in manual card recognition. Collective access to the collection of transcribed documents is accompanied by instructions and sample entries. The form for filling in a card is simplified compared with the form in the ESACI, as we plan to involve non-specialists in the work. Access to the AUTACI is available after registration and has no time limits. In the future, we plan to use it to create tools for verifying ACI texts that were automatically recognized by the Transkribus software and for distributing linguistic information into the appropriate fields.

Archival Card Index (ACI); Electronic System «Archival Card Index» (ESACI); open platform «All-Ukrainian Toloka Archival Card Index» (AUTACI); Ukrainian Lexicography; Manual Handwritten Text Recognition; Lexicographic Toloka (Crowdsourcing)