Users’ expectations of zarzuela audio description:

Results from a focus group

By Irene Hermosa-Ramírez and Miquel Edo (Autonomous University of Barcelona)


Audio description (AD) for the scenic arts is no longer assumed to be an add-on service to the production, detached from its users and the creative team. Following recent user-centred proposals in the field such as participatory accessibility (Di Giovanni 2018) and poietic design (Greco 2019), this article aims to assess users’ preferences for zarzuela AD through a two-session focus group. Zarzuela is a lyric theatre genre characterised by the alternation of sung and spoken dialogue, Spanish costumbrist themes, and popular settings. Although zarzuelas are audio described at the Teatro de la Zarzuela and Teatro Real in Madrid, and at the Gran Teatre del Liceu in Barcelona, accessibility is lacking in touring productions. Our focus group has been conducted with older users from Valladolid: experienced theatregoers who have nonetheless never attended an audio described zarzuela. The focus group is divided into two parts. First, participants share their cultural habits, general AD consumption, technological usage, and experience with participatory accessibility. Therein, users make a clear distinction in expectations between pre-recorded/non-social vs. live/social AD and activities, expressing a strong preference for the latter. Second, participants assess different stimuli for zarzuela audio introductions and AD, rejecting in-depth scripts in favour of more minimalist assistance. Unlike in opera (Orero et al. 2020), most participants do not see the need for providing audio subtitles for this genre, as the sung numbers can be mostly followed or their lyrics are not that relevant, after all.

Keywords: scenic arts, accessibility, audio description, reception, zarzuela, focus group

©inTRAlinea & Irene Hermosa-Ramírez and Miquel Edo (2022).
"Users’ expectations of zarzuela audio description: Results from a focus group"
inTRAlinea Special Issue: Inclusive Theatre: Translation, Accessibility and Beyond
Edited by: Elena Di Giovanni and Francesca Raffi
This article can be freely reproduced under Creative Commons License.
Stable URL:

1. Introduction

Academic research on media accessibility services has shifted towards reception and user-centred studies (Greco 2019). This has also been the case in audio description (AD), the verbal rendering of the visual elements in a given audiovisual production or cultural product. Reception studies in the context of AD for the scenic arts have found a special interest in a) creative teams collaborating in the creation of accessible services (see Udo and Fels 2009; Udo, Acevedo and Fels 2010; Cavallo 2015; Fryer 2018); and b) the users’ preferences (Udo, Acevedo and Fels 2010) and participation in the creation of these services (Di Giovanni 2018). The involvement of both the creative team and the users of the accessibility services has been coined “poietic design” (Greco 2019).

With this background in mind, the present study first takes a special interest in seeking an updated account of older users’ opinions on state-of-the-art aspects of AD such as the use of synthetic voices (Fernández-Torné and Matamala 2015; Walczak and Fryer 2018), experience with participatory accessibility initiatives (Di Giovanni 2018), and general cultural habits. As its main objective, however, this article aims to assess AD users’ preferences for the genre of zarzuela – a traditional Spanish form of operetta in one to three acts intermingling spoken dialogue with sung numbers – a largely unexplored genre in the context of AD. To do so, a two-session focus group is conducted to gather preferences and good practices for this AD modality.

Previous preference studies on AD employing qualitative methods have successfully applied focus groups, sometimes as a standalone method and others in combination with surveys. Focus groups have been conducted in the field of Media Accessibility to discuss participants’ preferences with regard to novel services such as AD for 360º videos (Fidyka and Matamala 2018), subtitles in 360º videos (Agulló and Matamala 2019), and mobile apps integrating accessibility services (Jankowska 2019); to assess personalisation in AD (Lopez, Kearney and Hofstädter 2018); to gather user feedback about integrated AD (Fryer and Cavallo 2018); and to assess the development of an easy-to-understand AD modality (Arias-Badia and Matamala 2020), to name just a few. Mostly, previous focus groups have involved accessibility services users and professionals, sometimes separately and sometimes together. In our case, the rationale for choosing the focus group methodology is to bring to the fore a particular set of users that often go unnoticed: users with a more limited range of access services because of their geographical location who are also of an older age.

While there have been studies testing the potential of AD for elderly audiences without sensory impairments (see, for instance, Jankowska 2019), a certain lack of focus on older audiences with visual impairment can be observed in the existing literature on AD, and on genres (audiovisual or otherwise) preferred by them. We agree with Chmiel and Mazur (2016: 273) that it is “necessary to include elderly respondents, who may be less mobile than their younger counterparts, but who constitute a significant portion of the AD target audience”. We take one step in this direction by involving only older adults with varying degrees of visual impairment in this study.

The present article begins with an introduction to the zarzuela genre, followed by a detailed account of the methodology applied. Section 4 presents and discusses the results of the study. The fifth and last section is devoted to conclusions and future research avenues.

2. Zarzuela

Zarzuela is a lyric drama genre, either in one act (género chico) or several (zarzuela grande). Unlike opera, it alternates spoken and sung parts (Plaza 1990: 23; Temes 2014: 16-17). Zarzuela is genuinely Spanish in the sense that it has been mostly nurtured in Hispanic countries, the libretto is written in Spanish, and the settings, the language and the literary forms usually showcase the so-called “costumbrism” (Herrero 1978: 343), namely the literary genre that dwells on the collective rituals, ways of life and social habits of a Spanish or Spanish-speaking community. However, despite its popular and almost folkloric dimension, zarzuela is considered to be a cultivated music genre (Alier 2002: 24) and is not devoid of a “utopian impulse” towards social transformation (Ferrera 2015). Perhaps the most accurate definition is the one attributed to Nietzsche: “Simply cannot be emulated […] one would have to be a rogue and the devil of an instinctive fellow – and serious at the same time” (Webber 2019: 109).

Zarzuela was highly successful in the first third of the twentieth century, a period which most music historians close with the 1936 premiere of La tabernera del puerto, with music by composer Pablo Sorozábal and a libretto by Federico Romero and Guillermo Fernández Shaw, which is precisely the play that features in the stimuli for the present study. Sorozábal’s innovations such as the inclusion of Caribbean rhythms show “what could have been a new horizon for zarzuela” (Temes 2014: 241). Nevertheless, after the Spanish Civil War, the composition of new zarzuelas decreased abruptly. The genre “entered a period of malaise” due to the changes in the social structure that had favoured its rise, not being able “to stage an adequate number of new works” (Marco 1993: 91). Only the old plays attracted the audience (Alier 2002: 127) and only this stagnant repertoire has continued to be performed to the present day. The genre still has a great number of aficionados, particularly among the older generations.

In terms of accessibility, zarzuelas are regularly audio described, and accompanied by touch tours, at the theatre devoted to the genre in Madrid: the Teatro de la Zarzuela. Some zarzuelas with AD are sporadically programmed at the Teatro Real, also in Madrid, as well as at the Gran Teatre del Liceu in Barcelona. Touring zarzuela productions with AD, however, are considerably less frequent, as evidenced by the current and previous programmes of Teatro Accesible, the project that provides most of the touring ADs in Spain[1].

3. Methodology

A qualitative research approach was chosen to (1) assess current accessibility topics with older end-users, and (2) gather their preferences, expectations and needs regarding zarzuela AD. An online focus group methodology was applied, replacing the originally planned in‑person focus group to adapt to the COVID-19 pandemic. The focus group was thus held online in two separate sessions. The first one took place on 29 September 2021 and the second on 1 October 2021. The first session was intended to contextualise users’ cultural habits and needs and provide an account of their experience with burning topics in the field. The second session was devoted to discussing different stimuli of AD for La tabernera del puerto. Each session lasted approximately one hour. The focus group sessions were organised through Microsoft Teams and, while some participants had issues logging in to the platform as they were unfamiliar with it, they were ultimately able to attend with some guidance via telephone calls. For the analysis of the qualitative data, the transcribed focus group was taken as the main data source. The qualitative software tool ATLAS.ti was utilised to group the responses around themes or codes, following the terminology of the software. The themes from the first session were, in order of appearance, 1) cultural habits, 2) general experience with accessibility, 3) use of technology, 4) preferences regarding AD outside of the scenic arts, 5) preferences regarding AD for the scenic arts (including content selection criteria and overlapping issues with the dialogue), and 6) previous experience with participatory accessibility. For the second session, the questions were grouped around 1) the participants’ reaction to a zarzuela audio introduction (in terms of content selection and language use), and 2) the participants’ preferences regarding four different stimuli for the same zarzuela scene, from a music-centric AD to a logo-centric AD.

The choice of this method was motivated by the distinguishing characteristic of focus groups (as opposed to group or individual interviews), which, according to Kitzinger (2004: 269), lies in the groups’ interaction. The aim of this article is not only to focus on the interaction of the group, but also to continue the co-construction of meaning with participants after the sessions were over. Specifically, we sent our conclusions back to the participants and invited those who were interested in continuing the conversation to validate them. The intent is to give back some of the agency that is inevitably dominated by the researcher. This is in line with the transformative-emancipatory paradigm (Shannon-Baker 2016), where researchers engage in cyclical reviews of results (Creswell and Plano Clark 2018: 38). It must be noted that we cannot claim to fully adhere to said paradigm because users did not participate fully in the design of the study as co-researchers (for instance, by developing questions or selecting the most appropriate research method), but we believe that participatory approaches are worth exploring in the field of Media Accessibility (Hermosa-Ramírez forthcoming).

A facilitator led both focus group sessions, accompanied by a note-taker who gathered the responses and later summed up the conclusions in order for the participants to validate them. The focus groups were conducted in the scope of a collaboration between the Castilla y León branch of the Spanish National Organisation of the Blind (ONCE) and the Universitat Autònoma de Barcelona.

3.1. Participants

A purposive sampling strategy was applied, given that the selection of participants was directly linked to the research aims (Bryman 2012: 416). The sampling criteria required participants to be of an older age[2] and to live outside the cultural hubs of Madrid and Barcelona. Participants were recruited from the ONCE branch in Castilla y León. Additionally, all of them are amateur actors in the theatre group organised by the local ONCE branch. As such, the results reported throughout this article are not generalisable. Rather, they seek to paint a picture outside the capital cities’ cultural hubs.

Four participants took part in the first session of the focus group and two more joined for the second (overall two females and four males). They all had varying degrees of visual impairment and were well acquainted with each other from their theatre group. Although the number of participants was admittedly rather limited, it was deemed acceptable as smaller groups are recommended when participants are expected to have much to say on the subject or display an emotional attachment to it (Morgan 1998). This was certainly the case, as was apparent from conversations held before the focus group with the coordinator of cultural activities at the ONCE branch, and because of their active involvement in theatre.

Before the study commenced, detailed information about the project was shared with the participants and they orally gave their consent to taking part in the focus group and to the sessions being recorded. The study had previously been approved by the Ethics Committee of the Universitat Autònoma de Barcelona and ethical guidelines were followed to ensure the anonymity and privacy of the participants. For anonymisation purposes, we have coded the participants’ names with colours whenever verbatim interactions are quoted. For brevity, these interactions – originally in Spanish – are presented directly in their English translation by the authors, as are the quoted fragments from the stimuli.

3.2. Stimuli

In order to test different content and style approaches to zarzuela AD, a video recording of the 2018 production of La tabernera del puerto was utilised. The recording is hosted in the free digital library Teatroteca[3] and explicit permission was requested and granted by the Centro de Documentación de las Artes Escénicas y de la Música to show these fragments, with our added AD, for research purposes. The first stimulus was an audio introduction to the entire play. Audio introductions are described by Fryer and Romero-Fresco (2014: 11) as “pieces of continuous prose, spoken by a single voice or a combination of voices lasting between five and 15 minutes” typically provided right before a show begins. The authors prepared an extended audio introduction for participants to pinpoint the most relevant content of La tabernera del puerto. Following Di Giovanni (2014), audio introductions will include an “overall […] presentation, genre and structure; synopsis; information about the visual style; characters; locations and, last but not least, cast and production details”. Along with these contents, our proposal introduced historical and musicological information about the author and the play. More details on this audio introduction can be found in section 4.2.

Subsequently, the last scene of the first act of La tabernera del puerto was selected to test four different approaches to zarzuela AD: from a less intrusive music-centred version to a more intrusive overlapping approach (see Table 1 for more information on each version). For context, in the fishing village where the action takes place, a new tavern has opened, run by Marola, a stranger whose beauty has enthralled the local men. The first act closes with a number featuring the female choir and Marola. This interaction is described in four versions: a minimal AD, a musical AD, a theatrical AD, and an AD with audio subtitles.


AD versions


Minimal AD

A foreshadowing AD between scenes that allows for uninterrupted listening of the musical number. The AD overlaps with the (spoken) ending of the previous scene. The assumption is that the sung dialogue will be understandable for the audience with minimal aid. For illustration purposes, the minimal AD reads as follows: “The boy Abel leaves, outraged. Next, the female choir storms into the scene to intimidate Marola. There are about twenty women, led by Antigua. They accuse Marola of seducing their husbands and shaking up the village. She defends herself just as vehemently, going as far as giving them advice on how to treat their husbands.”

Musical AD

A version of the AD still prioritising the music yet sometimes overlapping with the sung dialogue. Instead of foreshadowing, this description is synchronised with the action, all while making an effort to avoid the peak musical moments. The assumption is also that the sung dialogue will be understandable for the audience.

Theatrical AD

A version of the AD focusing on the theatrical signs of the scene. This version somewhat neglects the music in favour of a more detailed description of the characters’ appearance and actions. Audio subtitles are not read aloud, but they are paraphrased, as the assumption here is that the sung dialogue will not be understandable enough, even if the audience is made up of native Spanish speakers.

AD with audio subtitles

After some brief introductory sentences (“Marola is left alone in the harbour” and “A mob of about 15 women from the village approach Marola”), the AD reads aloud the audio subtitles. Two different voices (one male, one female) not only distinguish between the AD and the audio subtitles (Braun and Orero 2010), but also each take on one of the roles (Marola vs. the choir). Thus, the voiceover constantly overlaps with the sung dialogue, as the assumption is that it is not understandable at all, and that comprehension should be prioritised at the expense of the musical elements.

Table 1: Stimuli description

4. Results and Discussion

4.1. Results regarding cultural habits, use of technologies and participatory accessibility

Faced with the general question of describing their preferred cultural habits, most participants preferred activities outside the home, such as performing theatre themselves, going on trips, and attending theatre plays and film screenings. However, they also highlighted listening to the radio and to music, reading books, etc. Screen readers, accessible mobile applications and DAISY players were mentioned as valuable aids to access said activities. Specifically, Participant White mentioned Google Maps as an accessible navigation tool to get around when travelling: “I say: ‘I want to go to that place’ and Maps [orally] explains to me how to arrive: ‘You have to turn right, over the second corner, X street’ […]. It takes me a while, but I get there”. ONCE’s own lending system for films and books is also routinely utilised by most participants. Perhaps more unexpectedly, users also alluded to the fact that they were eager to take pictures with their phones: they utilise the app Seeing AI, which automatically describes the image when the camera application is open. For instance: one small face, two faces. Seeing AI similarly describes photos sent to them by others and works as an OCR tool, among other features (see Aafaq et al. 2019, for technical information on automatic video and image description). Interestingly, they reported that the application makes it possible for users to manually indicate who appears in the photo, and the app consequently recognises this person in other photos. This leads us to suggest that the technological gap between younger and older users is narrower than one might expect.

Aside from this, users made a very clear distinction between technological applications that facilitate everyday life – which are based on text-to-speech features, working on the basis of artificial intelligence or both – and accessible services intended for cultural activities. That is, quality, in their view, necessarily incorporates a human element. They vehemently opposed the application of synthetic voices to cultural products, thus contradicting previous studies on AD that deemed text-to-speech AD acceptable for film (Fernández-Torné and Matamala 2015), and especially certain genres such as documentary (Walczak and Fryer 2018). Professional (human) voice acting was deemed an utmost priority for AD of any genre, and even more so in the case of audiobooks:

[Participant Blue]: I despise synthetic voices [...]. And not only them but also sometimes certain human voices. [...] We have had some great voice actors who, in general, belong to the staff of Radio Nacional de España and the radio station Cadena Ser. They were great readers. However, I have come across some books [...] that were appalling, as if you were listening to a catechism class.

[Participant Red]: I feel the same way. Synthetic voices are not natural or normal and I don’t like them. The same goes for audiobooks. Sometimes, just by listening to who’s reading you feel like giving up. Other times you fall asleep listening to it, it’s that bad.

As a final note regarding general technology, Participant Blue reported that he was a braille user and preferred to read with his braille display, for example, rather than listen to audiobooks: “I’m like those readers that prefer to read on paper [rather than on an e-reader], I’m the same with braille”. He felt that braille users are being underserved, as audio-based assistive solutions have gained popularity.

Moving on to the subject of AD in general, participants organically made yet another distinction in consumption and expectations between higher and lower culture activities. Most showed disdain towards television and its AD. One of the users even quoted Groucho Marx, as he would rather his television remained off. Among their shared complaints about AD for television and film, participants highlighted a dissonance that can be cognitive (“it seems that we are attending to a film, on the one hand, and the narration of the AD, on the other hand”) or related to the tone of the audiovisual production: the AD register is too formal, polite, or pedantic. Users did clarify that the quality of AD for television and film has nonetheless improved in the last few years.

By contrast, participants praised the quality of AD for theatre, where they know and trust the audio describer of the theatre they most commonly attend. Importantly to them, this person is an insider from the theatre world: she is an actress herself, which, according to the users, provides her with a special sensitivity to select the most relevant information for the AD, time the AD fragments with precision and avoid disturbing the play. Involving an insider in the creation of an AD is in line with Fryer’s (2018) integrated AD proposal and Romero-Fresco’s proposal of accessible filmmaking (2019), although admittedly this actress does not perform in the productions she describes. However, the connection between quality and the involvement of creators and members of the production (Greco and Jankowska 2019: 3) is, at least partially, remarked on by the users.

Moreover, the antithesis between television/film – and even bad audiobooks – and theatre in terms of AD quality can also be traced to the difference in expectations for at-home activities vs. outside-the-home activities. Outside-the-home activities are met with a positive predisposition, while at-home activities are subject to harsher criticism. In a nutshell, there is a strong preference for social activities. Participants try to never miss any of the collective activities organised by ONCE in their town. They sometimes attend plays even when there are no accessible services on offer, they are members of a theatre group, and they demonstrate true enthusiasm when the facilitator suggests the idea of organising an accessible scenic arts trip to Madrid. Even when discussing their use of mobile applications, they highlight their exchange and communication possibilities, such as content and photo sharing, and they joke about their almost transhuman powers, i.e., taking photos or spoiling the action for their (sighted) partners when listening to an audio introduction.

For this group of users, this bias towards social activities also determines the aforementioned rejection of the least human side of technology: synthetic voices. And theatre differs from television and film not only in the presence of our companions but also of the actors: “there is an equally valid sense which shows movies to be the mediated art and theatre the unmediated one. We see what happens on the stage with our own eyes. We see on the screen what the camera sees” (Sontag 1966: 30). The fact that the blind and visually impaired audience is able to listen to the characters live, just a few metres away, is not a minor detail.

Moving on to their previous experience with audio described sung theatre (opera, musical theatre, and zarzuela), participants reported that they had no experience with AD for those genres. Conversely, some had attended plays of this nature without AD, and one participant owned several DVDs of recorded zarzuelas and was a great fan of the genre. Nonetheless, participants were eager to attend such audio described performances if they were available to them.

Regarding the participants’ consumption of recorded scenic arts outside the physical theatre, they reported not being aware of audio described theatre plays, operas or zarzuelas on TV, on streaming platforms, or through any other medium (e.g., the ONCE lending system). This leads us to believe that greater efforts could be made in the advertising and dissemination of such services so that they can reach their target audience, as participants did, yet again, report an interest in accessing such recordings. For instance, at the time of writing this article, the digital public library Teatroteca includes 34 audio described theatre plays, and the Liceu opera house in Barcelona provided a streamed version of Don Giovanni with AD as an alternative to in-person attendance at the height of the COVID-19 crisis. We anticipate, however, that removing both the social and the immersive aspect of theatre, opera, and zarzuela could take a toll on the enjoyment of these recordings.

Finally, participants were asked about their involvement in participatory accessibility activities. They reported never having taken part in any such activities and reiterated their absolute trust in their theatre audio describer. This person often asks for their feedback and consequently incorporates it. They deem this to be enough. The fact that the audio describer is always the same person – and a well-known local actress on top of that – has contributed to the sense of reliance and familiarity. Nonetheless, participants expressed an interest in a future participatory project, for instance, in creating a user-led AD for a performance of their own theatre group.

4.2. Results regarding content selection in audio introductions

The second session, devoted to assessing different stimuli for zarzuela AD, began with an extended audio introduction with a double purpose: to get feedback from users regarding content selection and language use in audio introductions, and to get acquainted with the play itself. We called this audio introduction extended because it intentionally gathered a great deal of details for users to categorise as fundamental, secondary, or irrelevant. Content-wise, the audio introduction was organised as follows:

  1. Background information about the composer, the librettists, and the genesis and historical significance of the play
  2. Synopsis of the three acts
  3. Scene and location information for each act
  4. Description of characters’ appearance and costumes
  5. Description of what we see immediately prior to and after the curtain rising (overlapping with the overture)

The audio introduction was 11’ 38’’ long, and participants deemed it long and burdensome, which sometimes made them switch off. Participant Blue drew an enlightening comparison: “To me it feels like listening to a classical music radio station, where you’d get a presentation of the piece with all of those details”. Instead, they would put the emphasis first and foremost on the scene and location (3), the characters’ appearance and costume (4), and the initial description overlapping with the overture (5). Conversely, they would keep the historic-musicological contextualisation of the play (1) to a minimum, and, most notably, eliminate the synopsis of the three acts (2). Hence, in the scope of the audio introduction functions[4] proposed by Reviers, Roofthooft and Remael (2021: 75), participants mostly prioritised the foreshadowing function of the audio introduction: a “description of the set, lighting, the characters, their physical characteristics and costumes”. They only appreciated to some extent parts of the informative function of the audio introduction (for instance, knowing that the character Juan de Eguía is a baritone and that Leandro is a tenor will later help them identify them), and blatantly disregarded its narrative aspect (i.e., the plot disclosure). The explanatory or expressive function was met with neither enthusiasm nor criticism. Because of this gap, we suggest it would be advisable to tackle this precise theme in future research. There were also no comments regarding the instructive function of the audio introduction.

The participants’ disapproval of the synopsis (2) reveals a desire to keep the suspense in the plot development. Participant Orange believes that having this sort of information on the [theatre] website would be enough: “That way one can check what the play is about and decide whether it may be interesting to attend”. This suggests that the participants are somewhat conditioned by the practices most widespread in the theatrical and film AD modalities. That is, they do not expect the plot to be spoiled before the AD. As most of them do not usually attend operas or zarzuelas, we may point to a certain lack of awareness of the fact that there is a general expectation for audiences of these genres to be acquainted with the plot. In contrast, participants mostly appreciated the last segment of the audio introduction (5), that is, the explanation for the (recorded) audience applauding as the orchestra conductor enters and the description of the images that are projected on the curtain as the overture plays. Curiously, they preferred the audio introduction fragment that resembled AD the most. To them, the work of an audio describer focuses on the hic et nunc, i.e., on descriptions closely synchronised with the action and relative to it.

Alternatively, users expressed their interest in a possible extended audio introduction truly outside (Di Giovanni 2014) the performance or the physical theatre. This format of audio introduction could well be more thorough, following the format of a print programme, and include, for instance, the input of creative teams. One of the participants made the following connection:

[Participant Blue]: The introduction itself is very good because it provides you with a lot of information, but it reminds me of listening to Radio Clásica, where they usually offer a presentation including all these details. [...] It’s interesting if you want to learn more about the play, but I believe that when you are attending a show, that information should not be provided. It can be provided in a television or radio programme where they present the play before broadcasting its recording.

Hence, we proposed a scenario where an extended audio introduction would be published on the theatre’s website or be sent to users on CD in advance (Cabeza and Matamala 2007; Fryer and Romero-Fresco 2014). This practice is currently applied by VocalEyes in the UK and has recently been adopted by the Liceu opera house[5]. Meanwhile, the zarzuela audio introduction in situ would be more minimal and focused on the foreshadowing function (closer to the hic et nunc). Thus, users may choose to consult the extended version at home to be informed about the show. This dual possibility did meet the approval of participants.

4.3. Results regarding feedback to the audio description stimuli

Regarding the four zarzuela stimuli, a general comment is made suggesting that the AD be lower in volume so as not to disrupt the original sung dialogue. We acknowledge that this would only be a problem in pre-recorded productions, since users physically attending a zarzuela can generally adjust the volume themselves. For scenic arts recordings, however, it is worth hiring a user to perform a quality control check on technical issues.

The feedback regarding the four AD stimuli (see section 3.2) corroborates what has been pointed out when it comes to the general AD and the audio introduction stimuli: participants prefer the more concise over the more verbose stimuli. The fourth AD (including the verbal rendering of the sung dialogue) was disregarded, as shown below.

Stimuli one to three are preferred by users, although the second and third stimuli should be more condensed. In particular, historic-musicological content should have no place in zarzuela AD: it is contextual information that audiences can consult themselves before the performance. Participants also deem it unnecessary to be reminded of (and thus repeat) information about the scenography or costume design that has already been provided in the audio introduction: once is enough. It would only be relevant to succinctly discuss costumes, scenography, or lighting if a change occurs, or use brief audio introductions in between acts to supply this information. To them, the action reigns supreme.

Action is understood by some participants in a broader sense: “what is happening in each scene, [...] what happens to the character […], why he’s like this”. That is to say, they are open to instances of interpretative AD, “when a describer explicitly explains or draws a conclusion from an action, basing the output on visual or aural evidence from the scene” (Ramos and Rojo 2020: 218). Most participants, however, understand action as what the characters are factually doing: entering or leaving the stage from the right or the left, lifting a stone, hugging each other, turning around, and gesturing resignation.

4.3.1. Discussion of Stimuli

Assessing each stimulus separately, we can assume the first stimulus (see section 3.2) would be acceptable in its entirety for the proponents of action in a broader sense but would require deleting the summary of the dialogue (“They accuse her […] their husbands”) for those who understand action in a narrower sense, and for participants who believe that the sung dialogue is comprehensible enough. In the second stimulus, as reproduced below, the crossed-out segments would be redundant for all participants because of their elements of repetition regarding the audio introduction (costumes and scenography) or due to their musicological nature. Those who support a narrower definition of action would likely also leave out the underlined segments in the third, sixth and seventh interventions:

  1. 00:00:04-00:00:21 Marola is left alone in the harbour. Her red knitted cardigan stands out against the grey street and the wooden tables and chairs from the tavern. The tables are covered with plaid tablecloths. A furious mob of about 15 women, led by Antigua, jump on the innkeeper impetuously. They wear aprons, long skirts, and blouses in muted tones, grey and blue, their hair covered with scarves.
  2. 00:00:34-00:00:44 The fifth musical number from the first act, inspired by the operetta genre, entails a heated argument between the female choir and Marola: “¡Aquí está la culpable!” (Here’s the guilty one!).
  3. 00:01:05-00:01:16 The women complain that Marola gets their husbands drunk and seduces them, but she defends herself arguing that she just treats them well. It’s actually their fault: they are shabby and unpleasant.
  4. 00:01:33-00:01:36 Some women raise their fists menacingly.
  5. 00:02:05-00:02:10 Marola remarks on their tattered clothes. They all stare at Marola with aversion and disdain.
  6. 00:02:19-00:02:26 The heated argument now turns into a waltz where Marola advises the group on how to please their husbands.
  7. 00:03:34-00:03:41 The women smile smugly, as if Marola had nothing to teach them. Insults and threats are thrown around.
  8. 00:03:44-00:03:53 Juan de Eguía interrupts them.

The third version (coined theatrical AD), which is not reproduced here due to space constraints, would require even more omissions. The participants’ message is therefore clear: the priority when attending a zarzuela is to enjoy the music. Participant Green stated the following: “I would go to the zarzuela to listen to the singers and the choir, and if you can only hear the AD…”, to which Participant Blue responded that “the second and third AD versions should be synthesised”. In fact, a constant leitmotif throughout both sessions is the participants’ opposition to excessive or disruptive AD. They dislike wordiness and ask for the overlapping of AD with dialogues or music to always be avoided. That is, our results closely match those of the ADLAB PRO project (2017: 28) in terms of quality of information[6]. On a final note, the word “support” is highlighted many times throughout the focus group. This term alludes to the collateral role of AD, which, also in the context of zarzuela, should not overshadow the production itself.

4.4. Results regarding audio subtitles

The last subsection of the results is devoted to the discussion of the fourth AD stimulus, in particular: 1) the comprehensibility of zarzuela’s sung numbers for native speakers and 2) the possibility of adding audio subtitles to zarzuela AD, following current practice in operatic plays (Orero et al. 2020). The fourth stimulus, combining AD with audio subtitles in two voices, sparked great criticism. A music-centric sentiment was shared by the majority of participants, and most deemed the sung dialogue understandable enough not to need audio subtitles:

[Participant Green]: I flat out discard the fourth version.

[Facilitator]: Participant Green is categorical. “We don’t want to hear about the dialogue”. This makes me question: Did you understand what the choir and Marola were singing?

[Participant Blue]: Yes, it is understandable.

[Participant Green]: Yes, one can understand quite a lot.

[Participant Orange]: Yes.

[Participant Black]: Look, I liked all four ADs. [...] I even like the one you are complaining about, the one where you read aloud the dialogues.

[Participant Green]: Regarding the last one, I believe that everyone would say to you: “Well, I go to a zarzuela to listen to the singers, the choir… and if I can only hear the AD…”

As seen in the interaction, only one participant contradicted the others, taking a logocentric position and being open to the varying AD possibilities. Another user disagreed with the others in a different way, i.e., at the comprehensibility level:

[Participant White]: One cannot understand the [sung dialogue in the] second and third versions, at least I could not understand what they were singing very well, but I don’t mind. In the fourth one, as you were narrating, I could understand the singers’ vocalisation.

On balance, they all agreed that zarzuela has different requirements from opera (mostly sung in Italian or German), for which they all concurred on the need for audio subtitles to overcome the language barrier.

5. Conclusions

This article has presented the results of an online focus group elaborating on the accessibility expectations of a particular set of users, older adults, and a genre so far unexplored in the field of AD, zarzuela.

The general conclusions regarding end users’ cultural habits have pointed to a discrepancy in expectations about at-home activities (namely, watching audio described TV and films, and reading audiobooks) and outside-the-home, social activities (namely, going to the theatre, performing theatre, and travelling). Users display a much more positive predisposition towards the latter and are not hesitant to criticise bad practices in the non-social AD modalities (generally, pre-recorded audiovisual products). To them, there is a clear correspondence between the human side and quality, as they are satisfied with the use of synthetic voices for daily applications, such as screen readers and mobile applications, but strongly reject their application to cultural activities. AD outside the theatre, i.e., television and film AD, is critiqued because of its excessive or overbearing nature: users disapprove of overlapping effects, the loud volume of AD, and a general dissonance between AD and the original production. These practices are thus to be avoided in zarzuela AD as well.

Regarding good practices, audio introductions can be subject to personalisation (Lopez, Kearney and Hofstädter 2018), one of the principles of poietic design: as one size does not fit all, “we need to design artefacts that can respond to the specificities of each individual” (Greco 2019: 25). In our case, two audio introductions can be prepared for a given play, a short one in person right before the play, highlighting the foreshadowing function, and another published online some weeks prior to the performance.

Along with the general demand of fostering the human factor in all accessible services and the possibilities of personalisation, the participants’ preferences regarding zarzuela AD practices are the main contribution of this study. In terms of the zarzuela AD stimuli, participants’ choices insist on terseness and non-overlapping strategies. When asked about the content that should be prioritised, the emphasis is put on the action. Participants clarify that they do not need to be reminded of information already provided in the audio introduction, or any kind of musicological information for that matter. Conversely, they are mostly interested in scenography, costume and lighting changes, and, above all, in the characters’ acting and movement. Interestingly, audio subtitles are disregarded by all users but one. Most deem audio subtitles too disruptive and the singing understandable enough not to need them. This feedback conflicts somewhat with the fact that zarzuelas in Spain are often accompanied by intralingual surtitles. Mateo (2007: 138) associates this surtitling practice with the assumption that, even though the singers’ voice quality is good, their diction is weaker; consequently, some audiences are unable to follow less popular zarzuelas. Against this idea, the focus on the hic et nunc perspective and the participants’ reluctance towards audio subtitles would be the defining factors of the zarzuela modality, as opposed to opera AD.

Among the limitations of the study is its small sample size, particularly in the first session of the focus group. Paradoxically, one of the strengths of this study is precisely the focus on an idiosyncratic group of participants, engaged in the practice of theatre and of an older age. Romero-Fresco (2021: 293) has argued for such an approach in the scope of Creative Media Accessibility: “placing the focus back on the individual as a necessary complement to the currently prevailing emphasis on quantitative studies set by experimental research in this field”.

With these conclusions in mind, the most immediate avenue of research opened up by this article includes taking the very same rationale and applying it to other groups of participants to test how generalisable the obtained results are. An in-person activity would also better foster ecological validity, as it must be acknowledged that the zarzuela stimuli were pre-recorded and displayed online. Moreover, the cognitive and linguistic register dissonances of AD with the original production merit further inquiry, as end-users reported that ADs sometimes require too much effort and cause fatigue, thus contradicting principle 6 of universal design: “the design can be used efficiently and comfortably and with a minimum of fatigue” (Connell et al. 1997). This effort is exemplified by users stating that they lose track of the plot or become uninterested whenever there are synchronisation errors, the tone of the AD does not match that of the original, or the AD becomes interruptive. Said cognitive and linguistic dissonances could be further explored through experimental research with physiological instruments.


Aafaq, Nayyer, Ajmal Mian, Wei Liu, Syed Zulqarnain Gilani, and Mubarak Shah (2019) “Video Description: A Survey of Methods, Datasets and Evaluation Metrics”, ACM Computing Surveys 52, no. 6: 1–28.

ADLAB PRO (2017) Report on IO2: Audio Description Professional: Profile Definition, URL: (accessed 8 December 2021).

Agulló, Belén, and Anna Matamala (2019) “Subtitling for the Deaf and Hard-of-Hearing in Immersive Environments: Results from a Focus Group”, The Journal of Specialised Translation 32: 217–35.

Alier, Roger (2002) La Zarzuela. Barcelona, Robinbook.

Arias-Badia, Blanca, and Anna Matamala (2020) “Audio Description Meets Easy-to-Read and Plain Language: Results from a Questionnaire and a Focus Group in Catalonia”, Zeitschrift für Katalanistik 33: 251–70. 

Braun, Sabine, and Pilar Orero (2010) “Audio Description with Audio Subtitling – An Emergent Modality of Audiovisual Localisation”, Perspectives: Studies in Translatology 18, no. 3: 173–88.

Bryman, Alan (2012) Social Research Methods. 4th ed. Oxford, Oxford University Press.

Cavallo, Amelia (2015) “Seeing the Word, Hearing the Image: The Artistic Possibilities of Audio Description in Theatrical Performance”, Research in Drama Education: The Journal of Applied Theatre and Performance 20, no. 1: 125–34.

Chmiel, Agnieszka, and Iwona Mazur (2016) “Researching Preferences of Audio Description Users – Limitations and Solutions”, Across Languages and Cultures 17, no. 2: 271–88.

Connell, Bettye Rose, Mike Jones, Ron Mace, Abir Mullick, Elaine Ostroff, Jon Sanford, Ed Steinfeld, Molly Story, and Gregg Vanderheiden (1997) The Principles of Universal Design, URL: (accessed 18 April 2022).

Creswell, John W., and Vicki L. Plano Clark (2017) Designing and Conducting Mixed Methods Research. 3rd ed. Thousand Oaks, Sage.

Di Giovanni, Elena (2014) “Audio Introduction Meets Audio Description”, InTRAlinea, Special Issue, URL: (accessed 8 December 2021).

Di Giovanni, Elena (2018) “Participatory Accessibility: Creating Audio Description with Blind and Non-Blind Children”, Journal of Audiovisual Translation 1, no. 1: 155–69.

Fernández-Torné, Anna, and Anna Matamala (2015) “Text-to-Speech vs. Human Voiced Audio Descriptions: A Reception Study in Films Dubbed into Catalan”, The Journal of Specialised Translation 24: 61–88.

Ferrera, Carlos (2015) “Utopian Views of Spanish Zarzuela”, Utopian Studies 26, no. 2: 366–82.

Fidyka, Anita, and Anna Matamala (2018) “Audio Description in 360° Videos: Results from Focus Groups in Barcelona and Kraków”, Translation Spaces 7: 285–303.

Fryer, Louise (2018) “Staging the Audio Describer: An Exploration of Integrated Audio Description”, Disability Studies Quarterly 38, no. 3, URL: (accessed 8 December 2021).

Fryer, Louise, and Amelia Cavallo (2018) Integrated Access Inquiry 2017-18 Report. London, Extant, URL: (accessed 8 December 2021).

Fryer, Louise, and Pablo Romero-Fresco (2014) “Audiointroductions” in Audio Description: New Perspectives Illustrated, Anna Maszerowska, Anna Matamala, and Pilar Orero (eds), Amsterdam, John Benjamins: 11–28.

Greco, Gian Maria (2019) “Accessibility Studies: Abuses, Misuses and the Method of Poietic Design” in 21st International Conference on Human-Computer Interaction. HCI International 2019 – Late Breaking Papers, Constantine Stephanidis (ed), Cham, Springer: 15–27.

Greco, Gian Maria, and Anna Jankowska (2019) “Framing Media Accessibility Quality”, Journal of Audiovisual Translation 2, no. 2: 1–10.

Hermosa-Ramírez, Irene (forthcoming) “Leading by Example: Embracing Community-Based Participatory Research in Media Accessibility”, The Journal of Specialised Translation 39.

Herrero, Javier (1978) “El Naranjo Romántico: Esencia Del Costumbrismo”, Hispanic Review 46, no. 3: 343–54.

Jankowska, Anna (2019) “Accessibility Mainstreaming and Beyond – Senior Citizens as Secondary Users of Audio Subtitles in Cinemas”, International Journal of Language, Translation and Intercultural Communication 8: 28–47.

Kitzinger, Jenny (2004) “The Methodology of Focus Groups: The Importance of Interaction between Research Participants” in Social Research Methods: A Reader, Clive Seale (ed), London and New York, Routledge: 269–72.

Lopez, Mariana, Gavin Kearney, and Krisztián Hofstädter (2018) “Audio Description in the UK: What Works, What Doesn’t, and Understanding the Need for Personalising Access”, British Journal of Visual Impairment 36, no. 3: 274–91.

Marco, Tomás (1993) Spanish Music in the Twentieth Century. Cambridge, MA, and London, Harvard University Press.

Mateo, Marta (2007) “Surtitling Nowadays: New Uses, Attitudes and Developments”, Linguistica Antverpiensia, New Series 6: 135–54.

Morgan, David L. (1998) Planning Focus Groups. Thousand Oaks, Sage.

Orero, Pilar, Mario Montagud, Jordi Mata, Enric Torres, and Anna Matamala (2020) “Audio Subtitles or Spoken Subtitles/Captions: An Ecological Media Accessibility Service” in Translation Studies and Information Technology – New Pathways for Researchers, Teachers and Professionals, Daniel Dejica, Carlo Eugeni, and Anca Dejica-Carţiş (eds), Timişoara, Editura Politehnica: 149–61.

Plaza, Sixto (1990) “La Zarzuela, Género Olvidado o Malentendido”, Hispania 73, no. 1: 22–31.

Ramos, Marina, and Ana Rojo (2020) “Analysing the AD Process: Creativity, Accuracy and Experience”, The Journal of Specialised Translation 33: 212–32.

Reviers, Nina, Hanne Roofthooft, and Aline Remael (2021) “Translating Multisemiotic Texts: The Case of Audio Introductions for the Performing Arts”, The Journal of Specialised Translation 35: 69–95.

Romero-Fresco, Pablo (2021) “Creative Media Accessibility: Placing the Focus Back on the Individual” in International Conference on Human-Computer Interaction, Margherita Antona, and Constantine Stephanidis (eds), Cham, Springer: 291–307.

Shannon-Baker, Peggy (2016) “Making Paradigms Meaningful in Mixed Methods Research”, Journal of Mixed Methods Research 10, no. 4: 319–34.

Sontag, Susan (1966) “Film and Theatre”, The Tulane Drama Review 11, no. 1: 24–37.

Temes, José Luis (2014) El Siglo de la Zarzuela. 1850-1950. Madrid, Siruela.

Udo, John Patrick, and Deborah I. Fels (2009) “‘Suit the Action to the Word, the Word to the Action’: An Unconventional Approach to Describing Shakespeare’s Hamlet”, Journal of Visual Impairment & Blindness 103, no. 3: 178–83.

Udo, John Patrick, Bertha Acevedo, and Deborah I. Fels (2010) “Horatio Audio-Describes Shakespeare’s Hamlet: Blind and Low-Vision Theatre-Goers Evaluate an Unconventional Audio Description Strategy”, British Journal of Visual Impairment 28, no. 2: 139–56.

Walczak, Agnieszka, and Louise Fryer (2018) “Vocal Delivery of Audio Description by Genre: Measuring Users’ Presence”, Perspectives: Studies in Translatology 26, no. 1: 69–83.

Webber, Christopher (2019) “Spain and Zarzuela” in The Cambridge Companion to Operetta, Anastasia Belina and Derek B. Scott (eds), Cambridge, Cambridge University Press: 103–19.


[1] See (accessed 15 April 2022).

[2] The 2018-2019 Survey of Cultural Habits and Practices in Spain reports that most zarzuela-goers fall into the 55 to 64 and 65 to 74 age ranges. (accessed 19 November 2021).

[3] See (accessed 19 November 2021).

[4] Informative (contents included in the booklet), narrative (plot development), explanatory and expressive (complex theatrical illusions), foreshadowing (visual elements of the performance, such as the set and the characters’ physical appearance), and instructive (how to use the AD transmitter, for instance).

[5] La Prèvia is a recent podcast provided by the Liceu opera house that closely resembles audio introduction, all while being targeted at the general audience: (accessed 17 December 2021).

[6] The participants surveyed in the ADLAB PRO project pointed to synchronisation with the dialogue, sound effects and images, the audio describer talking over the dialogue or critical sound effects, along with issues of coherence as the aspects they most disliked in AD.

About the author(s)

Irene Hermosa-Ramírez holds a BA in Translation and Interpreting from the University of the Basque Country and a MA in Audiovisual Translation from the Universitat Autònoma de Barcelona (UAB). Since being awarded a PhD grant by the Catalan Government (2019FI_B 00327), she has joined the TransMedia Catalonia research group where she currently collaborates in the project Researching Audio Description: Translation, Delivery and New Scenarios (RAD). Her thesis focuses on opera audio description. She is also the secretary of the Catalan Association for the Promotion of Accessibility (ACPA).

Miquel Edo, PhD in Romance Philology, has been lecturing at the Faculty of Translation and Interpreting, Universitat Autònoma de Barcelona, since 1992. He is currently Vice Dean for Professional Development. His research interests are mainly focused on the reception of 19th and 20th-century Italian poetry in Spanish speaking countries, including translation of librettos. Alongside his academic work, he has provided opera subtitles for the Barcelona Grec Festival as well as translations and audio descriptions for the Gran Teatre del Liceu. He is a member of TransMedia Catalonia research group.


©inTRAlinea & Irene Hermosa-Ramírez and Miquel Edo (2022).
"Users’ expectations of zarzuela audio description: Results from a focus group"
inTRAlinea Special Issue: Inclusive Theatre: Translation, Accessibility and Beyond
Edited by: Elena Di Giovanni and Francesca Raffi
This article can be freely reproduced under Creative Commons License.
