By Estelle Maryline Delpech
Computer-assisted translation (CAT) has always used translation memories, which require the translator to have a corpus of previous translations that the CAT software can use to generate bilingual lexicons. This can be problematic when the translator does not have such a corpus, for instance when the text belongs to an emerging field. To solve this issue, CAT research has looked into leveraging comparable corpora, i.e. a set of texts, in two or more languages, which deal with the same topic but are not translations of one another.
This work had two primary objectives. The first is to evaluate the contribution of lexicons extracted from comparable corpora in the context of a specialized human translation task. The second is to identify the bilingual-lexicon-extraction methods which best match translators' needs, determining the current limits of these techniques and suggesting improvements. The author focuses, in particular, on the identification of fertile translations, the management of multiple morphological structures, and the ranking of candidate translations.
The experiments are carried out on two language pairs (English–French and English–German) and on specialized texts dealing with breast cancer. This research puts significant emphasis on applicability: methodological choices are guided by the needs of the final users. This book is organized in two parts: the first part presents the applicative and scientific context of the research, and the second part is given over to efforts to improve compositional translation.
The research work presented in this book received the PhD Thesis award 2014 from the French association for natural language processing (ATALA).
Similar software development books
Good selection and organization of topics, made all the more authoritative by the author's credentials as a senior academic in the area. (Prof. David S. Rosenblum, University College London) I find Sommerville inviting and readable and with more appropriate content. (Julian Padget, University of Bath) Sommerville takes case studies from radically different areas of SE.
Abstraction is the most basic principle of software engineering. Abstractions are provided by models. Modeling and model transformation constitute the core of model-driven development. Models can be refined and finally be transformed into a technical implementation, i.e., a software system. The aim of this book is to give an overview of the state of the art in model-driven software development.
Model-Driven Software Development (MDSD) is currently a highly regarded development paradigm among developers and researchers. With the advent of OMG's MDA and Microsoft's Software Factories, the MDSD approach has moved to the centre of the programmer's attention, becoming the focus of conferences such as OOPSLA, JAOO and OOP.
- Professional Application Lifecycle Management with Visual Studio 2013
- Oracle Database 10g Administration Workshop II
- Getting Started with Storm
- Professional issues in software engineering
Additional resources for Comparable Corpora and Computer-assisted Translation
Meteor [BAN 05] – takes into account precision and recall calculated on word unigrams, as well as word order. In addition to identical words, Meteor also considers similar words such as morphological variants or synonyms. One of the objectives of this measure is to allow researchers to carry out an assessment at the sentence level, whereas other measures only work when the entire translation corpus is evaluated.
TER [SNO 06] – calculates the number of edit operations (insertions, deletions and substitutions) necessary to go from the evaluated translation to the reference translation.
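The edit count that TER relies on can be illustrated with a word-level edit distance. The following is a minimal sketch, not the TER implementation of [SNO 06]: it covers only the insertions, deletions and substitutions mentioned above and leaves out the block shifts that the full measure also allows. The function names `edit_operations` and `ter_like_score` are hypothetical, and whitespace tokenization is an assumption.

```python
def edit_operations(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference` (word-level Levenshtein)."""
    hyp, ref = hypothesis.split(), reference.split()
    # previous[j]: edits needed to turn the hypothesis words seen so far
    # (none yet) into the first j reference words.
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]  # turning i hypothesis words into nothing: i deletions
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            current.append(min(
                previous[j] + 1,         # delete hyp_word
                current[j - 1] + 1,      # insert ref_word
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def ter_like_score(hypothesis, reference):
    """Edit count divided by reference length (shifts are NOT modelled)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return edit_operations(hypothesis, reference) / ref_len
```

For instance, `edit_operations("the cat sat on mat", "the cat sat on the mat")` counts a single insertion. Lower scores mean the evaluated translation is closer to the reference.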
For example, during the 2009 edition of the Workshop on Statistical Machine Translation [CAL 09], the measures most strongly correlated with human judgments were those which combined several measures or which were based on the correspondences between semantic and syntactic structures. In the 2010 edition of the same workshop [CAL 10], the best measures were those which used surface information such as letter n-grams. Yet the data sets used in the 2009 and 2010 editions were quasi-identical.
6). In our prototype, the size of the context will thus be three lexical words to the left and three lexical words to the right of the term to be translated, regardless of its frequency. The units to be translated match the terminology entries and generalist entries. Each curve matches a frequency range, and the number of entries in this frequency range is mentioned in brackets.
6. Influence of the frequency of terms to be translated on the size of the optimal contextual window
The number of co-occurrences is standardized with the likelihood ratio [DUN 93].
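The two steps described here, collecting co-occurrences in a window of three lexical words on each side and weighting them with the likelihood ratio, can be sketched as follows. This is an illustration under stated assumptions, not the prototype's code: the stop list `FUNCTION_WORDS` is a hypothetical stand-in for whatever the prototype uses to select lexical words, the function names are invented for the example, and the way the contingency counts are derived from the corpus is left open.

```python
import math
from collections import Counter

# Hypothetical stop list (assumption): used here only to approximate
# the notion of "lexical words".
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "for", "with"}


def context_counts(tokens, term, window=3):
    """Count the lexical words found within `window` lexical words
    to the left and right of each occurrence of `term`."""
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    counts = Counter()
    for i, token in enumerate(lexical):
        if token == term:
            counts.update(lexical[max(0, i - window):i])   # left context
            counts.update(lexical[i + 1:i + 1 + window])   # right context
    return counts


def log_likelihood_ratio(k11, k12, k21, k22):
    """Log-likelihood ratio (G2) of a 2x2 contingency table.
    k11: both items occur together, k12 and k21: only one of them occurs,
    k22: neither occurs."""
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2
```

With the sentence "the treatment of breast cancer with adjuvant chemotherapy is common in early stage disease" and the term "cancer", `context_counts` keeps "treatment", "breast", "adjuvant", "chemotherapy" and "common". A table whose cells match what independence would predict yields a ratio of zero, and the ratio grows as the association strengthens.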
Comparable Corpora and Computer-assisted Translation by Estelle Maryline Delpech