Real-Time Subtitling in Flanders: Needs and Teaching

By Aline Remael & Bart van der Veer (University of Antwerp, Belgium)

Abstract

English:

Based on the transcription of a talk given in Forlì on 17th November 2006.

Keywords: real time subtitling, flanders, vrt, respeaking, curriculum, master, hivt, simultaneous interpreting, sottotitolazione in diretta, fiandre, rispeakeraggio, interpretazione simultanea

©inTRAlinea & Aline Remael & Bart van der Veer (2006).
"Real-Time Subtitling in Flanders: Needs and Teaching"
inTRAlinea Special Issue: Respeaking
Edited by: Carlo Eugeni & Gabriele Mack
This article can be freely reproduced under Creative Commons License.
Stable URL: https://www.intralinea.org/specials/article/1702

Aline REMAEL: Our title reads “Real-Time Subtitling in Flanders: needs and teaching”. We will be looking at “needs” from the point of view of skills; in other words, what kind of skills do respeakers need and what are the implications for teaching, which is the business we are in. First of all, let me say a few things about SDH (subtitling for the deaf and the hard of hearing) on Flemish television. Currently, VRT is the only public Dutch-speaking channel in Belgium that provides SDH. VRT started as early as the 1980s. The introduction of the service was linked to the introduction of teletext on television. Now, they subtitle about 50% of the Dutch programming. The non-Dutch programmes (and some Dutch programmes in which dialect is spoken) get open subtitling in Flanders, and by 2010, VRT will be covering 100% of Dutch programmes. Again, the rest of the programmes will have open subtitles. This arrangement has been included in the New Management Agreement which exists between the Flemish public channel and the Flemish Government.

As for the commercial channels, so far, they provide SDH only very occasionally, and just for some very popular programmes. The amount of subtitling for the deaf and the hard of hearing provided on an annual basis amounts to next to nothing.

Newcomers on the SDH market are the local channels, that is, channels whose programmes are broadcast only for a particular province. They have to provide SDH for their news programmes. The Government subsidises investments and training. Unfortunately for us, the subtitlers tend to be journalists, that is, people already working for the channels.

As for live subtitling on Flemish television, I have to restrict myself to the public channel because the others do not provide any so far. The programmes that are subtitled live are basically sports programmes and the news. The news is subtitled both live and semi-live, depending on what passage of the news we are talking about, because some of it can be prepared. The type of subtitles is always pop-on, or block, so we do not have scrolling subtitles. This type of subtitling therefore always involves a degree of editing. For this purpose, there is a style-sheet for SDH generally, with a subsection on live subtitling. Some of the differences the style-sheet names are, for example:

• Avoid references to the images.

• Short and rhythmic subtitles are preferable to long ones.

Still, the stated differences between the style-sheet for live subtitling and the one for pre-recorded SDH remain underdeveloped.

To determine what the current needs are and meet them, we therefore work closely with VRT, the public channel, in the field of training. We are currently setting up training programmes in our faculty so as to provide VRT and hopefully other channels with good respeakers in the future. VRT teletext is especially interested in improving speech recognition, reducing the need for correction and thereby the need for the corrector, who now works in a team with the respeaker. The channel is particularly interested in reducing the lag, especially in interviews, where you have the very common phenomenon of the subtitle appearing late and therefore under the head of the following speaker rather than the current one. Besides, VRT wants to improve the performance of respeakers, which is where we come in. Actually, we may also be contributing to improving some of the more technical aspects of live subtitling with speech recognition in future, since we are also preparing a joint research programme with VRT teletext and colleagues from the University of Antwerp.

Up until now, subtitlers for the deaf and the hard of hearing and respeakers at VRT have been translators and people with other language degrees, degrees in communication science, etc. However, VRT is now experimenting with interpreters and is finding that they do a much better job. Since VRT does not really have a specialised in-house training course, it is more or less relying on us and on other translation schools to provide interpreters with the skills required for respeaking. In order to develop such a course, we are now analysing “needs” and “skills”, trying to find answers to these three basic questions:

  • in what sense do the two jobs differ?
  • in what sense are they similar?
  • in what sense do the skills of standard interpreters need to be adapted?

Bart VAN DER VEER: As of this year, real-time subtitling is part of the curriculum of the master in interpreting at our institute, HIVT. To be precise, it is part of the specialised forms of interpreting, because we also offer community interpreting, liaison interpreting and speech-based real-time subtitling. In less official documents, it is often said that respeaking is linked to simultaneous interpreting; that is why we would like to motivate here our institute’s decision to include speech-based real-time subtitling in our master in interpreting curriculum. In the master we usually refer to speech-based real-time subtitling for television, but we also refer to other uses of this technique.
So, the research question is:

• is a good interpreter automatically a good respeaker?

In what follows I will concentrate on similarities and differences between respeaking and simultaneous interpreting in general and other specialised forms of interpreting where there is a clear interface with new technologies, virtual environments, and so on.

As for similarities, both professions involve four main activities, as put forward by Roderick Jones:

  • listening;
  • understanding;
  • analysing;
  • and re-expressing.

In both cases, the source text is a one-time oral input, whereas the target text is an oral output which is generally not repeated. Because of time limits, it is also difficult to correct the target text.

Then there is the real-time aspect or, to quote Seleskovitch, “les paroles sont prononcées dans une langue contemporaine”, so this is a real-time activity.

But there are of course also obvious differences. The first difference concerns the visual aspect. For the respeaker, the speaker and his/her audience are not directly visible, while a conference interpreter for example has a clear view of the speaker and of his/her audience.

Communication aspects: in a traditional conference there are three parties, the source-language speaker, the target-language audience and the interpreter, all in one room. So, there is the possibility of feedback, as is happening here: if there is confusion, a technical problem, or whatever, feedback is immediately provided. The speaker can be asked to speak more slowly. When the respeaker is in a television studio, there is no feedback from the audience.

Then, there are differences as far as the emotional and psychological aspects are concerned. This could be an interesting topic for further research. Both activities involve pressure and emotions. This has to do with the nature of the activity, the pressure of a live activity.

Another big difference is that real-time subtitling has a mixed oral and written dimension: the text is spoken by the original speaker and then respoken by the respeaker (in a kind of hybrid language, since it is not natural language: it is carefully articulated, punctuation is pronounced, etc.); the respoken speech is meant to be written, and the written text, in the end, is meant to be read by the viewers at home.

Then, real-time subtitling also requires subtitling skills like segmentation and punctuation. Let us now compare respeaking to more specialised kinds of interpreting, namely remote interpreting, film interpreting and real-time television interpreting. To start with the first, there are some striking similarities between respeaking and remote interpreting, since the latter calls for the interpreter not to be present in the meeting room. He/she works instead from a screen with a headset and without a direct view of both the meeting room and the speaker.

In both cases the interpreter and the respeaker work from a screen. There could also be identical motivations to use this kind of interpreting (remote interpreting or real-time subtitling), for example because of the low costs involved or of physical building constraints like a shortage of booths, or reluctance to install booths in historical meeting rooms.

Then, there is a similarity in the notion of presence, the feeling of being there, which is common to other activities where the tasks are performed in a virtual environment. If you are virtually present, do you feel that you are there or do you feel that you are not there? Research on this question has been carried out by Mouzourakis, for example.

As for the differences, there is universal agreement that remote interpreters should have a good view of the speaker, at least. For respeakers this is not always the case: real-time subtitling for television is most of the time concerned with live interviews, but also with programmes like sports where there is no speaker visible, and this can be problematic for the respeaker.

It is also necessary to bear in mind that the sound and image transmission is different and with different purposes. In the case of remote interpreting, the sound and image transmission correspond, in an ideal situation, to the needs and expectations of the interpreter, whereas in real-time subtitling sound and image transmission correspond to the television programme which is meant to be viewed by television viewers and not primarily by the real-time subtitler.

Film interpreting: this is an oral mode of screen language transfer often involving professional interpreters for film festivals, especially here in Italy. Russo has written some interesting articles about this. Film interpreting shares with respeaking the following aspects:

  • both interpreters and subtitlers work from a screen;
  • there are rigid time constraints, because there is a need for synchronisation;
  • in both cases, text reduction strategies are used;
  • in both cases, interpreters and subtitlers should pay particular attention to what I call realia, that is to say famous names, geographical references, names of institutions. They have to be prepared for all foreign names and words;
  • in both cases, both interpreters and subtitlers have difficulties in keeping up with rapid dialogues.

As for differences: if you translate a film, then you often have the script available so that you can prepare your translation in advance. Obviously, in a live transmission or broadcast this is impossible. As for the audience, there is a difference in that for film interpreting you are working for a live audience.

There are also differences in presentation skills: film interpreting requires a pleasant voice. Research has shown that a pleasant voice is much appreciated by the audience, whereas in real-time subtitling presentation is subordinated to articulation skills.

Finally, I’d like to compare respeaking to live television interpreting. Here again there are some striking similarities:

  • as far as participants are concerned, there is an on-screen cast (an interviewer and an interviewee) and an off-screen cast (the initiator, a TV channel, and TV viewers[1]);
  • there is little or no preparation because it is live;
  • there can be difficulties in turn taking, dialogues;
  • there can be difficulties because there is no coordination nor eye contact with the on-line speakers (you have to look at the screen and there is no interference between both participants);
  • the décalage has to be kept as short as possible because often there is a rapid pace of utterances;
  • and in both cases, both the interpreter and the real-time subtitler are dependent on screen images (adequacy, sound quality, etc.). For this purpose emergency strategies have to be prepared in case it is difficult or not possible to understand what is said.

As far as the differences are concerned, a big difference is to be found in the setting: for example, the interpreter can be visible on television, maybe asked to sit near the interviewee or the interviewer, and of course there is a total lack of privacy because he/she is live on television. The difference in presentation skills is the same as for the film interpreter (pleasant voice vs. articulation skills).

To wind up this section, it is probably necessary to refer to Corinne den Boer, who claimed, some years ago, that the ideal person for the job is a qualified interpreter and a professional subtitler. In addition, we think that it is extremely helpful if this person also has experience with virtual environments. Therefore special techniques are required.

Aline REMAEL: Considering all this, we have been looking into what we have at our department for translation and interpreting, and what we need in addition. We already have an interpreting section, where students acquire a number of skills: listening and comprehension skills, memory training, oral skills, acquisition of specialised vocabulary, organisation of your documentation and databases, specific reproductive interpreting skills. We believe that all these interpreting skills will be useful for future respeakers as well. But more is needed.

In subtitling and particularly in subtitling for the deaf and the hard of hearing, segmentation of speech flows on the basis of semantic-syntactic criteria is very important. However, there is a significant difference between block subtitling of speech and scrolling subtitling that keeps rolling on to the screen, following speech much more closely. Since Flemish television is making use of edited block subtitles, that is what our course will be focusing on in the first place. This means insight into subtitling concepts such as reading speed and spotting, but also exercises in oral summarizing and reformulation (rewriting in the case of prepared subtitles) are very important, and must later be combined with interpreting skills (e.g. memory training, allowing the respeakers to adapt the subtitling skills to a different environment).

The students will also be required to have some knowledge of the current national style-sheet, in our case the style-sheet provided by VRT, since we will mainly be working with them, to have a general knowledge of translation norms in the subbranch of (open) subtitling, and to get hands-on experience of the currently used software (we are now testing Softel technology).

But we are not out of the woods yet; there is more to respeaking still. First of all, the way students segment speech will have to be adapted to “Dragon Speak” (we are going to use Dragon NaturallySpeaking, because it is the only respeaking program that supports Dutch at this time). Indeed, rewriting for subtitling and rewriting/reformulating for reading/speaking into speech recognition programmes may be different. The training of an intuitive sense of reading speed for segmentation may also be important. Respeakers do not have the time to do the spotting beforehand, nor do they have the time to check how many seconds a subtitle remains on the screen, but I know from experience that my student-subtitlers eventually acquire a good intuitive sense of how long five or six seconds last. This sense of timing (whatever the norm is) will allow respeakers to determine when to pause, and enable them to speak in short stretches of text (subtitles), taking the demands of the respeaking software into account.

As far as respeaking itself is concerned, students will need a lot of reading exercises in order to get used to the speech recognition software. Presentation may be less important than articulation (cf. above), and reading rhythm and anticipation of mistakes (the software will never be perfect, so if you can anticipate mistakes, so much the better) will be crucial. To conclude, the input of vocabulary into the system and preparing the software for efficient use are also core skills, i.e. students must know how to make use of settings that can anticipate and prevent recognition/production errors.

Eventually, all the skills have to be combined. First the students may have to focus on reformulation and segmentation: viewing the programme, preparing the segmentation, and rewriting the text on paper or on their PC, then re-reading it with punctuation. Then comes the respeaking stage: viewing part of the programme several times, followed by a respeaking exercise of that programme. Finally the students should move on to respeaking “pure and simple”, with short passages to be subtitled live, and conclude, hopefully, with practical training sessions at the TV channel.

Another important question is what kind of course material must be used. We should obviously use material comparable to what our future respeakers will be working with later. Some sports programmes are relatively easy because you do not have to provide comments all the time, only now and again. The respeaker can therefore pause once in a while and would use transcriptions at first, then work without or with little preparation. An alternative source of material is what we call the “youth news service”, which is a news programme for young people, using fairly simple vocabulary and simple or coordinated sentences that lend themselves to segmentation. We have been watching and listening to such broadcasts and they really pronounce very neat short sentences, and the pace is much slower. In other words, it is perfect material to start with, perfect material for training. Finally, clips from “real” news programmes will have to be tackled, because that is what is subtitled live most of the time.

This is where we are. The course starts officially in October 2007, when our new master programme in translation and interpreting is launched, but we are now doing some tests with a number of volunteers (we had too many, so we have had to limit them to four). It is thanks to the reworking of our curriculum for the master that we have been able to introduce this new variant in our interpreting programme. Meanwhile, we are also continuing the research with VRT and with colleagues from the University of Antwerp into the efficiency of block subtitles vs. scrolling subtitles in the production of subtitles and in preventing mistakes. This is being done with the help of logging software developed by Luuk Van Waes and Marielle Leijten at UA. The software records all the mistakes that crop up in the subtitles and all the differences in delay, yielding objective material that will hopefully allow us to evaluate the two systems.

Notes

[1] Gabriele Mack has written about live television interpreting so I can refer to her work about this.