Translation corpora in lexicography: new resources
Abstract
Corpus linguistics has revolutionized lexicography: extensive text corpora provide, among other resources, frequency data, usage examples, and translations. How often a word is used and how often users look it up in a dictionary are related; moreover, frequency data can be integrated into the dictionary entry itself. Frequency lists are therefore a useful point of departure for establishing the entry list of a dictionary.
Data obtained from parallel corpora can be used in bilingual lexicography. In recent years, several parallel corpora that include Basque have been created, such as the literary German-Basque corpus. From these parallel corpora we can extract usage examples together with their translations, which are helpful not only in lexicographic work but also in providing information to the user, a service that dictionary websites increasingly offer.
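To make the idea of extracting usage examples and their translations concrete, the following is a minimal sketch, not the method used for any of the corpora mentioned above. It assumes the parallel corpus is already sentence-aligned and available as (German, Basque) pairs; the sentences in `PAIRS`, the function name `find_examples`, and its parameters are hypothetical and chosen only for illustration. Matching is on exact word forms, so inflected forms would need lemmatization in practice.

```python
import re
from typing import List, Tuple

# Hypothetical toy data: sentence-aligned (German, Basque) pairs.
# These sentences are invented for illustration, not taken from a real corpus.
PAIRS: List[Tuple[str, str]] = [
    ("Das Haus ist alt.", "Etxea zaharra da."),
    ("Ich sehe das Haus.", "Etxea ikusten dut."),
    ("Der Hund schläft.", "Txakurra lo dago."),
]


def find_examples(
    headword: str,
    pairs: List[Tuple[str, str]],
    limit: int = 5,
) -> List[Tuple[str, str]]:
    """Return up to `limit` aligned pairs whose source sentence contains `headword`.

    Assumption: exact, case-insensitive word-form match on the source side;
    no lemmatization or morphological analysis is attempted.
    """
    target = headword.lower()
    results: List[Tuple[str, str]] = []
    for source, translation in pairs:
        # Split the source sentence into lowercase word tokens.
        tokens = re.findall(r"\w+", source.lower())
        if target in tokens:
            results.append((source, translation))
            if len(results) >= limit:
                break
    return results


if __name__ == "__main__":
    for source, translation in find_examples("Haus", PAIRS):
        print(f"{source}  ->  {translation}")
```

A dictionary website could call such a lookup for the headword a user is viewing and display the returned sentence pairs beside the entry.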