Comparable Corpora: Compilation Methods and Areas of Application

Authors

  • Ketevan Mchedlishvili, Ilia State University

DOI:

https://doi.org/10.32859/kadmos/16/237-253

Keywords:

Comparable corpora, compilation methods, application

Abstract

Comparable corpora and their application in research have been an object of interest since the 1990s. Since the annual workshop series “Building and Using Comparable Corpora” (BUCC) was established in 2008, interest in comparable corpora and in their effectiveness for bilingual and multilingual projects has grown. A general comparable corpus of the Georgian language exists, compiled as part of the “Aranea” project, a family of web-crawled comparable corpora, but no specialized comparable corpora are currently available for Georgian. More broadly, the application of comparable corpora to bilingual and multilingual specialized lexicography in Georgia is a research topic that has not been explored before. This review paper therefore analyzes the concept and types of comparable corpora and discusses the advantages of using them and their areas of application. It also examines methods of compiling specialized comparable corpora, in particular the issues of representativeness, balance, corpus size, and comparability criteria.

Published

2024-12-29

How to Cite

Mchedlishvili, K. (2024). Comparable Corpora: Compilation Methods and Areas of Application. Kadmos. A Journal of the Humanities, (16), 237–253. https://doi.org/10.32859/kadmos/16/237-253