Intra-language differences and translation quality assessment

An exploratory study on a learner corpus of literary translations

By Rudy Loock (Université Lille-Nord de France & UMR "Savoirs, Textes, Langage" du CNRS, France)

Abstract

Keywords: translation quality assessment, corpus-based translation studies, third code

©inTRAlinea & Rudy Loock (2017).
"Intra-language differences and translation quality assessment An exploratory study on a learner corpus of literary translations"
inTRAlinea Special Issue: Corpora and Literary Translation
Edited by: Titika Dimitroulia and Dionysis Goutsos
This article can be freely reproduced under Creative Commons License.
Stable URL: https://www.intralinea.org/specials/article/2258

Introduction

The aim of this article is to consider the question of whether the measurement of intra-language differences between original language and translated language can be used as a tool for translation quality assessment. To ask such a question is to enter the complex debate on the interpretation of intra-language differences: should we consider translated language as variation comparable to dialectal or geographical variation or should we consider that the over-representation or under-representation of a given linguistic feature means that the quality of the translation should be improved? From an even more general perspective, should we consider that translated language is intrinsically different and represents a “third code” (Frawley 1984) or should we consider that linguistic homogenization (or absence of intra-language differences) between original and translated language goes hand in hand with translation quality, alongside the idea that in translation, “the utopian goal is to make it virtually impossible to tell the translation from an original text in that language” (Teubert 1996: 241)? Although the existence of the third code may be accounted for by translation universals and/or source language interference (see below), no consensus exists on the possible link with translation quality. Through the analysis of a learner corpus (translation tasks from English to French performed by advanced students enrolled in a translation master’s programme at the University of Lille) for two linguistic features which are known to be problematic for translation trainees (derived adverbs and existential constructions), our exploratory study aims to examine whether or not some correlation can be found between the observed intra-language differences and the overall quality of the translation tasks. The genre to be considered here is contemporary fiction texts.

The article is organized as follows. In the first part, we discuss the starting point of our study, namely the existence of intra-language differences and the two types of explanation that have been put forward in the literature to account for their existence: translation universals, which in their strictest sense exclude interference (see for example Baker and Olohan 2000, Olohan 2003), and source language interference (see for instance Rabadán, Labrador and Ramon 2009, Cappelle and Loock 2013). The second and third parts elaborate on the aim and methodology of the study. The fourth part is dedicated to the results of our two studies, (i) for inter-language (or cross-linguistic) comparisons between original English and original French in general, (ii) for comparisons between the English original texts and their translations in French by translation students in our learner corpus. The final discussion then concerns intra-language differences between original and translated French, taking into account the linguistic characteristics of the English original texts, with the final aim of checking whether or not a correlation exists between these results and overall translation quality. We also provide suggestions for further research, as this is meant to be a pilot study as part of a larger project on corpus-based translation evaluation.

1. The starting point: the third code

With the advent of corpus-based translation studies, which could be summed up as the use of the tools of corpus linguistics in descriptive translation studies (Laviosa 2002), many studies have provided evidence in support of what Frawley (1984), who sees translation as a recodification, calls the “third code” and defines as “a code in its own right, setting its own standards and structural presuppositions and entailments, though they are necessarily derivative of the matrix information and target parameters” (Frawley 1984: 169), supposedly different from the source and the target language. Such studies have shown that intra-language differences are a reality: translated language is linguistically different from original language. Following Baker’s (1993) seminal paper, translated language has been considered to be an object worthy of research in itself, and many researchers have uncovered the existence of intra-language differences with original language, sometimes independently of the source language(s) of the translations.[1] For instance, studies on translated English include Baker and Olohan (2000) and Olohan (2003), which have respectively shown, through the comparison between samples of original English and of English translated from a variety of languages, that the use of ‘that’ is more frequent after reporting verbs (say, tell, report…) in translated English and that contractions are more frequent in original English, in a range of genres. Using a similar methodology, Laviosa (1996, 1997, 1998, 2002) has shown that translated English texts have a relatively lower percentage of content words as opposed to grammatical words (lower lexical density) as well as a higher rate of repetition for the most frequent words (lower lexical variety), as opposed to what is to be found in original English. Xiao (2010) has found similar results to Laviosa’s for Chinese. Jiménez Crespo (2010) has found that the syntactic subject is more often expressed in translated Spanish than in original Spanish – a pro-drop language where the subject does not need to be overt. Working on translations from only one source language, other researchers have also confirmed the existence of intra-language differences: for instance, working on English and French, Cappelle and Loock (2013) have shown that existential constructions (that is there-constructions in English and il y a-constructions in French, see below) are more frequent in French translated from English than in original French, but also less frequent in English translated from French than in original English. Rabadán, Labrador and Ramon (2009) have shown that adjectives are more often preposed in Spanish translated from English than in original Spanish.

What is shown by all these studies – which only represent a sample of studies on intra-language differences between translated and original texts that have been conducted over the last twenty years (see Laviosa 2002, Olohan 2004, Zanettin 2012, or Loock 2016 for further examples) – is that translated texts, whether from one or a range of different source languages, are linguistically different from original texts written in the same language. There is therefore a general consensus on the existence of intra-language differences. In fact, it seems that these differences are so systematic that translated texts can now be detected automatically by computers (see Baroni and Bernardini 2006 and Kurokawa, Goutte and Isabelle 2009). What researchers seem to disagree on, however, is how to account for such differences, as two types of interpretation can be found in the literature. On the one hand, some researchers claim that the differences can be explained with Translation Universals (TUs) as defined originally by Baker: “features which typically occur in translated text rather than original utterances and which are not the result of interference from specific linguistic systems” (Baker 1993: 243). Baker suggests the following are potential translation universals:

explicitation (“overall tendency to spell things out rather than leave them implicit”),
simplification (“tendency to simplify the language used in translation”),
normalization/conservatism (“tendency to exaggerate features of the target language and to conform to its typical patterns”), and
levelling (“tendency of translated text to gravitate towards the centre of a continuum”).

Other TUs have been added, such as Tirkkonen-Condit’s (2002, 2004) unique items hypothesis, according to which a linguistic item that is specific to the target language is under-represented in translated texts. Other suggested candidates are disambiguation and avoidance of repetition (see House 2008 for a critical list). Examples of studies accounting for intra-language differences with translation universals are Baker and Olohan (2000) and Jiménez Crespo (2010), where the over-representation of ‘that’ after reporting verbs in translated English and of the syntactic subject in translated Spanish are seen as the results of explicitation.

On the other hand, some researchers, generally working on translations from one source language, claim that intra-language differences are due to source language interference (SLI), that is, they claim that the linguistic characteristics of the source language have an influence on the linguistic characteristics of translated texts and must explain the differences that can be observed with original language. For instance, because existential constructions are more frequent in original English than in original French, Cappelle and Loock (2013) claim that the intra-language differences that they observe between translated and original texts for English and French “strongly suggest source-language interference” (Cappelle and Loock 2013: 268). It is important to note however that source language interference has been listed as a potential TU (cf. Toury’s 1995 Law of Interference), but TUs à la Baker clearly exclude it (cf. definition above) and most studies on TUs do not mention interference as one of them (see Mauranen 2004 for a discussion of the unclear status of interference in relation to translation universals). If one considers interference as one of the universal “tendencies”, then the issue raised here can be reformulated as determining the influence of source language interference as opposed to other universals like simplification or normalization.

Some of these researchers (for instance Rabadán, Labrador and Ramon 2009, Loock, Mariaule and Oster 2014) then claim that a correlation may exist between intra-language differences and translation quality, equating linguistic homogenization and quality. For instance, Rabadán, Labrador and Ramon (2009: 323) claim that:

The smaller the disparity between native and translated usage in the use of particular grammatical structures associated with specific meanings, the higher the translation rates for quality.

2. Aim of the article

In this article, our goal is to try and see, through the analysis of a learner corpus containing students’ literary translation tasks, whether such a relationship as posited by Rabadán, Labrador and Ramon (2009) between linguistic homogenization and translation quality can be confirmed. Focusing on two linguistic features, derived adverbs and existential constructions, and on English to French translations by postgraduate students who are native speakers of French, we aim to measure the under-/over-representation of these two linguistic features in the translated texts, which have been the object of a prior, independent evaluation by a professional translation trainer, and see whether there is a correlation between this evaluation and the measurement of intra-language differences.

The two linguistic features that have been chosen for this exploratory study are (i) derived adverbs and (ii) existential constructions, which are both known to be problematic for translation trainees translating from English to French. Each of these will be considered in turn as a tertium comparationis for our comparisons, first between original English and original French in general (inter-language comparison) through the use of comparable corpora, and second between English original texts and their translation in French (also an inter-language comparison, but with the aim of discussing intra-language comparisons) through the use of our parallel learner corpus.[2]

The first linguistic feature, derived adverbs, corresponds to adverbs obtained through the addition of a bound derivational suffix to an adjective, –ly in English (finally, properly, honestly, quickly, slowly) and –ment in French (finalement, proprement, honnêtement, rapidement, lentement). The two sets of derived adverbs are generally presented as a case of translational equivalence (see Bertrand 1986), with the exception of morphological and semantic constraints, that is, cases of adverbs which have no equivalents in the other language (for instance successfully, familialement vs. *successeusement and *familially, which are lexical gaps) and false cognates, partial or full (for instance actually/actuellement; eventually/éventuellement). The second linguistic feature is existential constructions, that is, there-constructions in English and il y a-constructions in French, both structures having the same discourse function, that of introducing new elements into the discourse (see Bergen and Plauché 2005: 23 and Lambrecht 1994: 178 for a cross-linguistic comparison):

(1) a. There is a dog in the garden.

b. Il y a un chien dans le jardin.

This means that English –ly adverbs and French –ment adverbs as well as English and French existential constructions can be considered to “convey the same ideational and interpersonal and textual meanings” (James 1980: 178) and to be translationally equivalent. However, in spite of their translational equivalence, the two linguistic features show a highly significant inter-language/cross-linguistic difference in frequencies between English and French (see section 4.1). In particular, derived –ment adverbs have traditionally been considered to lead to poor style (see for instance Ballard 1994: 100; Chuquet and Paillard 1987: 154-155; Vinay and Darbelnet 1977: 126, but this idea is to be found in many other translation textbooks). Because of significant differences in frequencies between original English and French, derived adverbs and existential constructions are generally seen as translationese-prone phenomena, that is, they represent linguistic features that could lead to some interference between original texts and translated texts with an over-/under-representation of the linguistic features (in this case, an over-representation in translated French and an under-representation in translated English).

3. Methodology

For this exploratory study we used two types of corpora. First, to measure inter-language differences between original English and original French, we used samples of literary texts extracted from the 100-million-word British National Corpus (BNC) via the interface set up by Mark Davies (http://corpus.byu.edu/) and from Frantext (http://www.frantext.fr), an electronic corpus containing ca. 250 million words of French literary, philosophical, technical and scientific texts from the 12th to the 21st centuries (see details of the samples below). To ensure that our two samples were comparable in the technical sense of the term, we focused on post-1980 fiction. The details of these two studies are provided in Cappelle and Loock (2013) for existential constructions and in Loock, Mariaule and Oster (2014) for derived adverbs. In this article, due to lack of space, we only provide the results without any methodological explanations.

Second, we compiled a learner corpus of translations performed by advanced students from English into French, French being their mother tongue. These students were all enrolled in the MéLexTra Translation Master’s programme in the English department at the University of Lille in France. In the first year of the programme, the students’ final assignment is the translation of a short story or a chapter from a novel from English into French. All texts belong to contemporary fiction. We selected 16 translation tasks, which had already been evaluated independently by the students’ translation trainer, who performed a 3-group classification: the best translations (group A), generally correct translations (group B), translations that can be improved (group C). The criteria used for this evaluation were fidelity to the original text as well as fluency of the target language. We collected both the original and translated texts for this learner corpus, whose size amounts to ca. 400,000 words. Table 1 provides a summary of the contents of the learner corpus; Table 2 provides the translation trainer’s 3-group classification.

	Original English texts	Translated French texts
Minimum number of words	7448	8458
Maximum number of words	19,639	22,613
Average number of words	11,271	12,966
Total number of words	180,337	207,466

Table 1. Content of the learner corpus

Group A	Group B	Group C
(good translation)	(correct translation)	(translation that can be improved)
(50,964 words)	(110,738 words)	(45,764 words)

Table 2. Independent 3-group classification by translation trainer

The methodology was as follows. First we checked the inter-language difference between English and French for the two linguistic features that we had selected using the comparable corpus which was set up using the BNC and Frantext. Then we analyzed our learner corpus of English>French translations and measured the frequencies of each feature in the original texts and in the translated texts, first globally (all translation tasks) and then for each student individually. We finally compared the extent of the difference in frequencies between original and translated texts with the 3-group classification established by the students’ supervisor to see whether there was a correlation between the two.
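Because the texts in the learner corpus vary considerably in length (see Table 1), raw counts are not directly comparable, which is why frequencies are reported per million words (pmw). The normalization step can be sketched as follows. This is a minimal illustration: the function name and the example count are hypothetical and are not taken from the study.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalize a raw count to occurrences per million words (pmw)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be a positive number of words")
    return count * 1_000_000 / corpus_size


# Hypothetical example (not from the study): 50 occurrences in a 10,000-word text.
example_pmw = per_million(50, 10_000)
print(example_pmw)  # 5000.0
```

Normalizing in this way allows a short text and a long text to be compared on the same scale, at the cost of amplifying small raw counts in short texts.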

4. Results

4.1. Inter-language differences between English and French

In this section, we briefly provide inter-language results for original English and original French based on a comparison between the British National Corpus (BNC) for original English and Frantext for original French. Table 2 provides a summary of the samples that were analyzed.

Corpus (sample size in words)	Existential constructions	Derived adverbs
BNC (original English)	15,909,312	15,644,928
Frantext (original French)	11,365,626	5,990,369

Table 2. Contents of the comparable corpus for inter-language comparisons[3]

The two linguistic features show a highly significant difference in frequencies: in both cases, the feature is much more frequent in English than in French. Cappelle and Loock’s (2013) and Loock, Mariaule and Oster’s (2014) results are summarized in Figures 1 and 2.

Figure 1. Existential there is vs. il y a constructions in original English and French (per million words), obtained from the BNC and Frantext (p < 0.0001)

Figure 2. Derived –ly adverbs vs. –ment adverbs in original English and French (per million words), obtained from the BNC and Frantext (p < 0.0001)
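The significance levels reported in Figures 1 and 2 are given here without the test that produced them. As an illustration only, one test commonly used in corpus linguistics to compare the frequency of a feature in two corpora of different sizes is the log-likelihood ratio (G2). The sketch below assumes this test; it is not necessarily the one used in the studies cited. The corpus sizes are those of the samples for existential constructions, while the raw counts are hypothetical.

```python
import math


def log_likelihood(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Two-corpus log-likelihood ratio (G2) for one feature's frequency."""
    total_count = count_a + count_b
    total_size = size_a + size_b
    # Expected counts if the feature were equally frequent in both corpora.
    expected_a = size_a * total_count / total_size
    expected_b = size_b * total_count / total_size
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Corpus sizes from the comparable corpus; the counts are HYPOTHETICAL.
g2 = log_likelihood(40_000, 15_909_312, 12_000, 11_365_626)
# With one degree of freedom, G2 above about 15.14 corresponds to p < 0.0001.
print(g2 > 15.14)
```

With corpora of this size, even modest differences in relative frequency yield very large G2 values, so significance alone says little about the size of the difference.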

4.2. Inter-language differences between English original texts and their translations in French by students

In this section, bearing in mind the inter-language differences in frequencies discussed in the previous section, we report on the analysis of our parallel learner corpus containing original English texts and their translations in French by advanced students, first for derived adverbs, then for existential constructions.

4.2.1. Results for derived adverbs

When we compare the frequency of –ly adverbs in the original texts with that of –ment adverbs in students’ translations, we can see that the overall inter-language difference discussed above is also present in our learner corpus. Figure 3 below shows the minimum, maximum and average frequencies per million words (pmw). What the results show is that students use fewer –ment adverbs in their translations than original authors use –ly adverbs. However, such general results are limited: there is a lot of variation, as shown in Figure 3, and they say nothing about the systematicity of the pattern. This is why it is important to consider individual results, that is, to consider each translation task individually. These results are provided in Figure 4 and show that the pattern is systematic: all students’ translation tasks show a frequency for –ment adverbs that is lower than that of –ly adverbs in the original texts. Although we can see some variation concerning the extent of the difference between the frequencies in the original texts and those in the translated texts, it seems that the behaviour of students is homogeneous, reflecting the fact that in original French, derived –ment adverbs are less frequent than derived –ly adverbs in original English. Note however that the mean frequency for –ment adverbs in translated texts is much higher than that which is found in original French fiction texts (7662 vs. 4540 occurrences pmw) and that intra-language differences with original language therefore exist (the frequency in original literary French is indicated with a black dotted line in Figure 4). Only one translation task (number 9) shows a frequency of –ment adverbs that is actually lower than the “norm” to be found in French original texts, but this is probably related to the low frequency of –ly adverbs in the original text in the first place.

Figure 3. General comparison in the learner corpus between the frequencies of –ly adverbs in English original texts (left) and –ment adverbs in the translated texts (right) (per million words)

Figure 4. Individual comparisons in the learner corpus between the frequencies of –ly adverbs in English original texts and –ment adverbs in the translated texts (per million words)

Because of the variation observed in the frequency of –ly adverbs in the original texts, which must have some influence on the linguistic characteristics of the translations, we then decided, before investigating the correlation with the A, B, C classification performed by the students’ translation trainer, to calculate differential results (in per cent) between each original text and its translation in French. For instance, student 1, with a frequency of –ly adverbs of 11,499 occurrences pmw in the original text and a frequency of –ment adverbs of 6538 occurrences pmw in the translation, has a differential result of -43 per cent. However, student 7 has a differential result of only -19 per cent, with a frequency of –ly adverbs of 14,635 occurrences pmw in the original text and a frequency of –ment adverbs of 11,823 occurrences pmw in the translation. It is these differential results that we tested for a possible correlation with the evaluation of the students’ supervisor. Figure 5 provides the differential result (in per cent) for each translation task.
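The differential result can be read as the relative change from the frequency in the original text to the frequency in the translation. The sketch below assumes that definition, which is consistent with the figures given for students 1 and 7; the function name is illustrative.

```python
def differential(original_pmw: float, translated_pmw: float) -> float:
    """Relative change (in per cent) from original to translated frequency."""
    if original_pmw <= 0:
        raise ValueError("original_pmw must be positive")
    return (translated_pmw - original_pmw) / original_pmw * 100


# Frequencies (pmw) reported for students 1 and 7.
student_1 = differential(11_499, 6_538)
student_7 = differential(14_635, 11_823)
print(f"{student_1:.1f}")  # -43.1
print(f"{student_7:.1f}")  # -19.2
```

A negative value means the feature is less frequent in the translation than in its source text; a positive value means the translator used it more often than the original author did.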

Figure 5. Differential results (in per cent) between the frequency of –ly adverbs in the original texts and the frequency of –ment adverbs in the translated texts.

4.2.2. Results for existential constructions

With the same methodology as for derived adverbs, we investigated our learner corpus to find out the frequencies of existential constructions in original and translated texts. We first checked the general results, which are provided in Figure 6, and which show that the general pattern is in line with what one would expect based on the inter-language difference in frequencies between original English and original French (with existential constructions much more frequent in English). There is also a lot of variation in both original and translated texts. However, the pattern is not systematic when one considers results on an individual basis (see Figure 7). Note that on average, as for derived adverbs, the frequency of existential constructions in the students’ translation task is higher than that which can be found in French original texts (1079 occurrences pmw, shown by the black dotted line).

Figure 6. General comparison in the learner corpus between the frequencies of existential constructions in English original texts and in the translated texts (per million words)

Figure 7. Individual comparisons in the learner corpus between the frequencies of existential constructions in English original texts and in the translated texts (per million words)

This examination of individual results shows that the students’ behaviour is very heterogeneous: while some of them do use existential constructions with a lower frequency in their translations than the original authors do in the original texts, others actually use more existential constructions than what can be found in the original texts (students 1, 6, 9, 13, 16). The extensive increase between the number of existential constructions in the original text and in the translated text that can be seen for student 9 needs to be tempered, however, as the results for this student are skewed and need to be removed from our analysis: the over-representation of existential constructions in the translation can be explained by the repetition of Y’a pas le choix (lit. contracted form of ‘There’s no choice’) as a translation for You got no choice (10 occurrences in the original text). Results for student 9 are thus removed for the rest of the analysis.

As a consequence, the differential results are not all negative (note that student 9 has now been removed and the analysis rests on 15 translation tasks), as shown in Figure 8.

Figure 8. Differential results (in per cent) between the frequency of existential constructions in the original texts and in the translated texts.

4.3. Correlation with the independent evaluation

4.3.1. Derived adverbs

When we now divide these results based on the A, B, C categories and calculate the average differential result associated with each of the 3 categories, a clear, regular pattern emerges for derived adverbs: group A is associated with the largest differential result in absolute terms, that is, the translation tasks in which the difference between the frequencies of original –ly adverbs and –ment adverbs in the translations is the biggest on average (-41 per cent), while group C is associated with the smallest differential result, that is, the translation tasks in which the difference between the frequencies of original –ly adverbs and –ment adverbs in the translation is the smallest on average (-32 per cent), with group B lying in the middle (-38 per cent). Figure 9 sums up those results.

Figure 9. Correlation between differential results for derived adverbs and independent 3-group classification

However, when considering the translation tasks in detail, we can see that the pattern is not so homogeneous (see Table 3). This means that in spite of the general correlation observed in Figure 9, one cannot consider that the correlation holds for each translation task. In other words, there is no direct relationship between an “overuse” of –ment adverbs by students in their English-French translation task and the overall quality of this task; an overuse of –ment adverbs is not symptomatic of a poor performance (results range from -27 to -55 per cent for group A; from -19 to -47 per cent for group C, which is actually quite similar).

	Group A	Group B	Group C
	-31.581352	-43.0936766	-24.3457642
	-27.6685392	-39.5467903	-19.2122544
	-55.917241	-25.2110374	-36.2510612
	-49.1199462	-27.2993623	-47.0507374
		-46.8014748
		-48.1846434
		-47.525997
		-24.9691552
Mean	-41.0717696	-37.8290171	-31.7149543

Table 3. Details of the correlation between differential results for derived adverbs and independent 3-group classification
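The group averages and the overlap between groups can be checked directly from the values in Table 3. The sketch below is only an illustration of that check: the variable and function names are ours, and the overlap test is a simple comparison of ranges rather than a statistical test.

```python
from statistics import mean

# Differential results (in per cent) per translation task, from Table 3.
differentials = {
    "A": [-31.581352, -27.6685392, -55.917241, -49.1199462],
    "B": [-43.0936766, -39.5467903, -25.2110374, -27.2993623,
          -46.8014748, -48.1846434, -47.525997, -24.9691552],
    "C": [-24.3457642, -19.2122544, -36.2510612, -47.0507374],
}


def ranges_overlap(first: list[float], second: list[float]) -> bool:
    """True if the [min, max] intervals of the two groups intersect."""
    return max(min(first), min(second)) <= min(max(first), max(second))


for group, values in differentials.items():
    # Group mean, then the range from the smallest to the largest reduction.
    print(group, round(mean(values)), round(max(values)), round(min(values)))

print(ranges_overlap(differentials["A"], differentials["C"]))  # True
```

The means follow the A, B, C ordering, but the ranges of groups A and C overlap, which is why the group of an individual translation task cannot be predicted from its differential result alone.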

4.3.2. Existential constructions

With the same methodology, a correlation was sought between the differential results between the frequencies of existential constructions in original and translated texts and the A, B, C independent classification provided by the students’ translation trainer. As opposed to what was found for derived adverbs, no such general correlation exists: the highest differential results do not correspond to the best translations (group A) and the lowest differential results do not correspond to the translations in group C (see Figure 10). There is no need here to investigate individual results; the general results suffice to invalidate our hypothesis.

Figure 10. Correlation between differential results for existential constructions and independent 3-group classification

4.3.3. Discussion

The observation of the difference between the frequencies of the two selected linguistic features in the English original texts and their translation in French provides very mixed results and brings no concrete evidence of a correlation between linguistic homogenization and translation quality as far as the two linguistic features that we observed are concerned. This seems to suggest that the comparison of intra-language differences cannot be used as evidence for the quality of a translation task. We need to be very cautious here, as we are dealing with an exploratory study, but the analysis of our learner corpus suggests that there is no direct correlation between the frequency of use of a given linguistic feature and translation quality, even when the frequency of use clearly departs from the expected norm for original language.

It also seems that the behaviour of students is not similar and depends on the linguistic feature itself: a student who reduces the number of derived adverbs in the translation process will not necessarily reduce the number of existential constructions. Figure 11 clearly shows that students do not behave similarly for the two translationese-prone linguistic features that we retained for our study.

Figure 11. Differential results for both derived adverbs and existential constructions

Another limitation of our study is that we compare the frequency of a linguistic feature in each original text and in each translated text. But such an analysis actually provides no real information on the students’ translational behaviour: a reduced frequency of derived adverbs does not mean that students have only avoided translating the –ly adverbs in the original text with –ment adverbs in the target text; it is possible that they also added a number of derived –ment adverbs in some cases. The differential results cannot be interpreted in terms of ‘omissions’ only, as they can also partly correspond to ‘additions’ (terms taken from Johansson 2007). This is something that we already noticed (Loock 2013) when we used both a comparable corpus and a parallel corpus to analyze existential constructions in professional translations. Overall results obtained by comparing frequencies in original and translated texts (such as those obtained in Cappelle and Loock 2013) actually hide more complex results; only the analysis of a parallel corpus can provide information on the omissions and additions of a specific linguistic feature by translators, as well as on translation strategies. The same approach could be adopted for students’ translations to analyze their translational behaviour in more detail.

Finally, it is also important to note that the measurement of intra-language differences between original language and translations performed by professional translators shows that frequencies for our two linguistic features also differ significantly from those in original language. In Cappelle and Loock (2013) and Loock, Mariaule and Oster (2014), it is clearly shown that such intra-language differences concern both French and English. We do not provide any data here but an analysis of post-1980 French translations from English fiction by professional translators (use of the CorTEx corpus of translated French from English fiction within the CorTEx project – Corpus, Translation, Exploration – and use of a self-collected parallel corpus of English-French contemporary fiction) shows that professional translators do use both derived –ment adverbs and existential constructions with a significantly higher frequency than that which can be found in original French. Similar results have also been found for French to English translations (based on the analysis of samples extracted from the Translational English Corpus), with lower frequencies this time in English translated from French than in original English for both linguistic features. We refer the reader to Cappelle and Loock (2013) and Loock, Mariaule and Oster (2014) for more information on the analysis of professional translators’ translations. If intra-language differences are to be correlated with translation quality as is claimed by the hypothesis which is formulated in Rabadán, Labrador and Ramon (2009) and which we tested in our exploratory study, then this means that translations performed by professional translators are also problematic, since they do not show linguistic homogenization either. Does this mean that there is room for improvement in these translations as well? Or is a certain range of deviation from the “norm” in original language acceptable?
More generally, this raises the complex question of the third code: is it acceptable to say that translated language differs from original language (cf. translation universals as a natural phenomenon) or should translators aim to achieve linguistic homogenization between translated language and original language, meaning that third code is tantamount to translationese which as such should be eliminated from good quality translations? This complex question lies way beyond the scope of this article and has been the object of debates for a long time (cf. debates on the invisibility of the translator or the opposition between third code and translationese). Although our hypothesis in this article seems to suggest that we promote linguistic homogenization, we should remain aware that this is far from consensual.

5. Conclusions and future research

In this article, we have reported on an exploratory study on a learner corpus which investigated the possible correlation between, on the one hand, the frequencies of two translationese-prone linguistic features for translations between English and French and, on the other hand, translation quality. By comparing the frequencies of derived adverbs and existential constructions, first in a comparable corpus of original English and French, and then in a parallel learner corpus of English-French translation tasks, we have tested the hypothesis put forward by researchers like Rabadán, Labrador and Ramon (2009) according to which linguistic homogenization and translation quality go hand in hand.

In spite of providing interesting results, our exploratory analysis has not managed to completely confirm the hypothesis. Although results for adverbs do provide an interesting insight into the relation between an overuse of –ment adverbs and translation quality, we have seen that such a correlation cannot be systematized for each translation task in our learner corpus. Results for existential constructions have proved to be completely inconsistent: a good translation is not directly linked to the frequency of the construction in the translated text in connection with that in the original text.

These results raise important methodological questions. First, our corpus is not very big and needs to be increased in size for future research. Second, we have used only one independent evaluator to categorize the translations into three groups: although this has the advantage of consistency, a second evaluator could confirm or contradict our evaluator’s classification. Also, the two linguistic features that we selected might not be the most discriminating ones: although they are known to be problematic for English-French students who are native speakers of French (cf. translation textbooks like Chuquet and Paillard 1987, Ballard 1994), other linguistic features are known to lead to source language interference: –ing participle forms (vs. –ant participle forms in French), the use of the passive voice (more widespread in English than in French), the position of the adjective (overuse of prenominal position in French translated from English). By combining the results for a series of linguistic features, one might define a checklist of features to examine in order to determine translation quality.

Finally, we have decided here to focus on the extent of differences in frequencies for two linguistic features in original texts and their translations. There is however another possibility: measuring the extent of the difference between the frequency of a given linguistic feature in a translated text and in original language as a whole. We decided against this option in this article because we wanted to take into consideration the linguistic characteristics of each original text – the texts might not be representative of original/translated language as a whole, as is shown by the variation that we observed. However, this might be another way of looking at the data, which might bring stronger evidence in favour of a correlation between intra-language differences and translation quality. This is currently being done to complement the results of this exploratory study, with translations performed both by students (our learner corpus described here, but with an increase in size) and by professional translators (use of samples from the TEC and of the CorTEx corpus).
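The two ways of measuring differences mentioned above can be contrasted in a short sketch. It assumes frequencies normalized per 1,000 words; the function names and all figures are invented for illustration and are not results from our corpora.

```python
def per_thousand(count, n_words):
    """Normalized frequency: occurrences per 1,000 words."""
    return 1000 * count / n_words

def text_internal_gap(src_count, src_words, tgt_count, tgt_words):
    """Option adopted in this article: a translated text vs. its own source text."""
    return per_thousand(tgt_count, tgt_words) - per_thousand(src_count, src_words)

def reference_gap(tgt_count, tgt_words, ref_freq):
    """Alternative option: a translated text vs. a reference frequency
    (per 1,000 words) for original target language as a whole."""
    return per_thousand(tgt_count, tgt_words) - ref_freq

# Invented figures (hypothetical): 12 adverbs in a 1,000-word source text,
# 9 in its 1,200-word translation, and a reference frequency of 5.0 per
# 1,000 words in original target-language texts.
gap_a = text_internal_gap(src_count=12, src_words=1000, tgt_count=9, tgt_words=1200)
gap_b = reference_gap(tgt_count=9, tgt_words=1200, ref_freq=5.0)
print(gap_a, gap_b)
```

With these invented figures the first measure is negative (the translation uses the feature less than its source text) while the second is positive (the translation still uses it more than original target language), which shows that the two options can lead to different readings of the same translation.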

References

Baker, Mona (1993) “Corpus Linguistics and Translation Studies: Implications and Applications” in Text and Technology, Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds), Amsterdam, John Benjamins: 233-250.

Baker, Mona, and Maeve Olohan (2000) “Reporting that in Translated English: Evidence for Subconscious Processes of Explicitation?”, Across Languages and Cultures 1, no. 2: 141-58.

Ballard, Michel (1994) [1987] La Traduction de l’anglais au français, 2nd edition, Paris, Nathan.

Baroni, Marco, and Silvia Bernardini (2006) “A New Approach to the Study of Translationese: Machine-learning the Difference between Original and Translated Text”, Literary and Linguistic Computing 21, no. 3: 259-74.

Bergen, Benjamin K., and Madelaine C. Plauché (2005) “The Convergent Evolution of Radial Constructions: French and English Deictics and Existentials”, Cognitive Linguistics 16, no. 1: 1-42.

Bertrand, Chantal (1986) “Quelques remarques sur les adverbes français en –ment et leur rapport sur les adverbes anglais en –ly”, Meta : journal des traducteurs 31, no. 2: 179-203.

Cappelle, Bert, and Rudy Loock (2013) “Is there Interference of Usage Constraints? A Frequency Study of Existential there is and its French Equivalent il y a in Translated vs. Non-Translated texts”, Target 25, no. 2: 252-75.

Chuquet, Hélène, and Michel Paillard (1987) Approche linguistique des problèmes de traduction anglais-français, Paris, Ophrys.

Frawley, William (1984) “Prolegomenon to a Theory of Translation” in Translation: Literary, Linguistic and Philosophical Perspectives, William Frawley (ed), Newark, University of Delaware Press: 250-63.

House, Juliane (2008) “Beyond Intervention: Universals in Translation”, Trans-kom 1: 6-19.

James, Carl (1980) Contrastive Analysis, London, Longman.

Jiménez-Crespo, Miguel A. (2010) “The Future of “Universal” Tendencies: a Review of Papers Using Localized Websites”, Talk given at the UCCTS 2010 conference, Edge Hill University, UK, 27-29 July 2010.

Johansson, Stig (2007) Seeing through Multilingual Corpora, Amsterdam/Philadelphia, John Benjamins.

Kurokawa, David, Goutte, Cyril, and Pierre Isabelle (2009) “Automatic Detection of Translated Text and its Impact on Machine Translation”, Proceedings of MT Summit XII, URL: http://www.mt-archive.info/MTS-2009-Kurokawa.pdf (accessed 15 July 2014).

Lambrecht, Knud (1994) Information Structure and Sentence Form, Cambridge, Cambridge University Press.

Laviosa, Sara (2002) Corpus-based Translation Studies: Theory, Findings, Applications, Amsterdam, Rodopi.

Laviosa, Sara (1998) “Core patterns of lexical use in a comparable corpus of English narrative prose”, Meta 43, no. 4: 557-70.

Laviosa-Braithwaite, Sara (1997) “Investigating Simplification in an English Comparable Corpus of Newspaper Articles” in Transferre Necesse Est. Proceedings of the 2nd international conference on current trends in studies of translation and interpreting, 5-7 September 1996, Budapest, Hungary, Kinga Klaudy and Janos Kohn (eds), Budapest, Scholastica: 531-40.

Laviosa-Braithwaite, Sara (1996) The English Comparable Corpus (ECC): A Resource and a Methodology for the Empirical Study of Translation. Unpublished PhD diss, Department of Language Engineering, UMIST, Manchester, UK.

Loock, Rudy (2016) La Traductologie de corpus. Lille: Presses Universitaires du Septentrion.

Loock, Rudy (2013) “Close encounters of the third code: quantitative vs. qualitative analyses in corpus-based translation studies” in Interference and normalisation in genre-controlled multilingual corpora, Belgian Journal of Linguistics 27, Marie-Aude Lefer and Svetlana Vogeleer (eds): 61-86.

Loock, Rudy, Mariaule, Mickaël, and Corinne Oster (2014) “Traductologie de corpus et qualité: étude de cas”, Proceedings of the Tralogy II Conference, URL: http://lodel.irevues.inist.fr/tralogy/index.php?id=243 (accessed 15 July 2014).

Mauranen, Anna (2004) “Corpora, universals and interference” in Translation Universals: Do they exist?, Anna Mauranen and Pekka Kujamäki (eds), Amsterdam/Philadelphia, John Benjamins: 65-82.

Olohan, Maeve (2004) Introducing Corpora in Translation Studies, London/New York, Routledge.

Olohan, Maeve (2003) “How Frequent are the Contractions?”, Target 15, no. 1: 59-89.

Rabadán, Rosa, Labrador, Belén, and Noelia Ramon (2009) “Corpus-based Contrastive Analysis and Translation Universals. A Tool for Translation Quality Assessment English-Spanish”, Babel 55, no. 4: 303-28.

Teubert, Wolfgang (1996) “Comparable or Parallel Corpora?”, International Journal of Lexicography 9: 238-64.

Tirkkonen-Condit, Sonja (2004) “Unique Items – Over- or Under-represented in Translated Language?” in Translation Universals: Do they exist?, Anna Mauranen and Pekka Kujamäki (eds), Amsterdam/Philadelphia, John Benjamins: 177-84.

Tirkkonen-Condit, Sonja (2002) “Translationese – a Myth or an Empirical Fact? A Study into the Linguistic Identifiability of Translated Language”, Target 14, no. 2: 207-20.

Vinay, Jean-Paul, and Jean Darbelnet (1977) [1958] Stylistique comparée du français et de l’anglais, Paris, Didier.

Xiao, Richard (2010) “How Different is Translated Chinese from Native Chinese? A Corpus-based Study of Translation Universals”, International Journal of Corpus Linguistics 15, no. 1: 5-35.

Zanettin, Federico (2012) Translation-driven Corpora, Manchester, St Jerome Publishing.

Notes

[1] In this article we use the term “intra-language differences” to describe linguistic differences between original and translated language. Although this may seem to be synonymous with Frawley’s “third code”, we aim to use “intra-language differences” as an objective term, while “third code” already is an interpretation for such differences (see below).

[2] We here refer to the distinction between parallel and comparable corpora, defined for instance by Teubert (1996) respectively as “a bilingual or multilingual corpus that contains one set of texts in two or more languages” and “corpora in two or more languages with the same or similar composition” (Teubert 1996: 245).

[3] The number of words in our samples differs, for the following reasons. First, the automatic retrieval of adverbs ending in –ment in the Frantext corpus requires the use of the tagged part of the corpus (‘Frantext catégorisé’), which is much smaller than the non-tagged part of the corpus, although for existential constructions the whole corpus was used. Second, a different methodological decision was taken for derived adverbs as opposed to existential constructions: Cappelle and Loock’s (2013) article on existential constructions includes poetry and drama as fiction texts (ca. 250,000 words) while Loock, Mariaule and Oster’s (2014) study does not. This slight difference does not invalidate Cappelle and Loock’s (2013) study, but for derived adverbs we wanted our samples to be fully comparable, and our learner corpus consists only of novels and short stories.

About the author(s)

Rudy Loock is Professor of English Linguistics and Translation Studies in the Applied Languages Department at the University of Lille, France, where he is in charge of the multilingual specialized translation master's programme. His research interests include corpus-based translation studies, the use of corpora as translation tools, the didactics of translation, and translation quality. He is a member of the CNRS research lab "Savoirs, Textes, Langage".
