Category - Technology

Testing AppTek’s AVT specialized MT vs. Google’s general MT when subtitling a Mexican soap opera into English

[Estimated reading time: 10 minutes, 36 seconds. (2,121 words)]

As machine translation usage becomes more and more popular for subtitling, with leading companies now using MT in projects for the most important streaming services out there, I felt the need to test it to see how helpful it can be for freelance AVT professionals. As I’ve been collaborating with AppTek in recent months, I decided to test their engine against Google’s to see if there are any differences between working with an MT service specialized in subtitling (AppTek) and a more general one (Google). Both services will soon be available on different translation platforms that freelance subtitlers use on a daily basis, so I think we need to be prepared to choose the right one for our projects.

For this test, I’ve chosen an episode of the Mexican soap opera “Te doy la vida,” which is currently being aired in Argentina with great success. My experiment was quite simple: I used an SRT transcript in Spanish of one episode (814 subtitles) and translated it with Google Translate using Subtitle Edit. Additionally, I asked AppTek to machine translate the file for me. Then, I proceeded to randomize the translations and put them together with the source text in an Excel file that looked like this:

Then, I watched the whole episode with my Excel file and went through every single subtitle trying to choose which one would be better for post-editing. I will share below some of the most relevant findings (and the final head-to-head results!), but first, let me tell you the basis for my decisions on each subtitle.
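The blinding step described above can also be scripted instead of done by hand. What follows is a minimal Python sketch, not the procedure actually used for this test: it assumes three SRT files with the same number of subtitles (source, MT A, MT B), and the function names `parse_srt`, `build_blind_rows`, and `rows_to_csv`, the column names, and the CSV output are all illustrative choices.

```python
import csv
import io
import random
import re

# Hypothetical column layout for the comparison sheet.
FIELDS = ["number", "source", "option_1", "option_2", "key"]


def parse_srt(text):
    """Return the subtitle texts from SRT content, in file order."""
    subtitles = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        # lines[0] is the cue number, lines[1] the timecode, the rest is text.
        if len(lines) >= 2:
            subtitles.append("/".join(lines[2:]))
    return subtitles


def build_blind_rows(source, mt_a, mt_b, seed=0):
    """Pair each source subtitle with both translations in random order."""
    if not (len(source) == len(mt_a) == len(mt_b)):
        raise ValueError("all three files need the same number of subtitles")
    rng = random.Random(seed)  # fixed seed so the sheet can be rebuilt
    rows = []
    for number, (src, a, b) in enumerate(zip(source, mt_a, mt_b), start=1):
        swapped = rng.random() < 0.5
        first, second = (b, a) if swapped else (a, b)
        rows.append({
            "number": number,
            "source": src,
            "option_1": first,
            "option_2": second,
            "key": "B,A" if swapped else "A,B",  # hide this column while rating
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet app can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The `key` column records which engine ended up in which position, so it should stay hidden (or live in a separate file) until every subtitle has been rated.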

When you’re post-editing, the most important thing is that the method actually saves time compared with translating from scratch. We might believe there’s only one way of saving time with post-editing (choosing the best translation, right?), but I focused not only on which service did better at grasping the meaning of the original, but also on which service rendered better-segmented subtitles. Getting help with segmentation is useful, and that’s something that only a specialized service like AppTek can give you, right? Well, that’s what I wanted to find out. Let’s dive into my findings and results.

  1. Segmentation

Since we’re talking about subtitling, let’s see some of the differences in segmentation that I was able to find in the comparison.

Example 1:

Although we could say it’s not a crime to have two lines in example 1, we can see that Google doesn’t keep the sense units together and splits “tells/me” across two lines. It’s particularly helpful to have the MT do something clever here and merge two lines into one for the translation, as AppTek did.

Example 2:

Just like in example 1, the sense unit “to watch/over…” is broken in Google’s case. Additionally, here we notice something that occurred several times in the comparison: Google does not use a contraction in some subtitles, while AppTek does. We would probably have to dig deeper to understand why this happens, but suffice it to say that in a very informal audiovisual text like this soap opera, it’s always better to use contractions.

Example 3:


Here we have an example of segmentation across subtitles, and we can see that Google’s version is wrongly segmented, while we could perfectly use AppTek’s in a professional subtitle.

Throughout the file, there were some cases in which Google did a better job with line breaks within subtitles and across subtitles, but AppTek was the winner here.

  2. Translation of names and places

Another interesting point in my comparison was to see how both MT services managed the (unnecessary) translation of people’s names and places. In a project like this, in which we’re dealing with a drama, there’s no need to be translating the characters’ names, or the names of the places where they eat, for instance. However, I was able to find that some names were translated.

Names like Elena, Nico, Nelson, Gabriela, Isabel, Samuel, and Gina, which don’t have a clear English equivalent, were never translated. Also, the name Ernesto, which could have been translated as Ernest, was always left in Spanish. On the other hand, the name Pedro did cause some trouble: for some reason, AppTek’s MT always translated it as Peter (10 occurrences), while Google kept the original Pedro. Something similar happened with the name Agustín, which AppTek translated as Augustine, and with Domingo, which it translated as Sunday. An interesting case was the word doctor, which was never capitalized by Google but was properly capitalized by AppTek when it was used in direct address.

The funniest of all translations was provided by AppTek, for this subtitle: – Oye, ¿ya llegó Catita?/– ¿Quién es Catita? (Catita is an affectionate version of Catalina), which was translated as: – Hey, is catty here yet?/– Who’s pussy?. Google came closer: Hey, is Catita here?/– Who is Catita?

As for the names of places that shouldn’t be translated (mostly names of restaurants in “Te doy la vida”), we had the case of Cazuela de Lola, which was translated by Google as Lola’s Cazuela and by AppTek as Lola’s casserole, which I thought was pretty funny. There was also Pierangelo, which appeared in this sentence: ¿Qué tal le parece el Pierangelo?, and was translated by AppTek as How’s that for the pears?, and more properly by Google as How do you like the Pierangelo?

  3. Formal and informal treatment

As a general analysis of the usage of MT in subtitling, we could conclude that dealing with formal language is easier for the machine, which means that these services could be more helpful when working with a documentary, for instance, than very informal audiovisual texts. In the case of “Te doy la vida,” we have some serious characters (yeah, you guessed it, the villains) and some goofy characters, which provided nearly impossible-to-solve puzzles both for Google and AppTek.

Let’s see some examples of useful translations in formal contexts:

Example 4:


Both translations are useful, but AppTek’s has better segmentation.

Example 5:

Same as before, both translations are useful, but AppTek’s has better segmentation.

Example 6:


AppTek’s “under the name” conveys the meaning better, but both subtitles are useful.

Example 7:


I would prefer “have left” instead of “remains” here.

Struggling with informal texts:

Example 8:


The “rendida a tus pies” could be translated as “she’ll be at your feet” or “you’ll have her at your feet”. Also, the word “ánimo” is properly translated to “cheer up” by Google here.

Example 9:


The phrase “metele enjundia a la labia” is a very Mexican expression which in this case means “work on your words”. A possible translation here would be: Just work on your words/because you’re ignorant, okay? Also, donkey is awfully out of context here.

Example 10:


Funnily enough, the Spanish source uses a common English expression, which should have been translated as: Because you’re ugly with a capital U. Neither Google nor AppTek recognized the word efe as the letter F, and that’s why they left it in Spanish.

Example 11:


In this last example, the screwed proposed by AppTek does work as a translation of the informal fregados.

  4. Miscellaneous

There are many other things that we could analyze in detail, but I’m going to mention here some specific cases that caught my attention.

Example 12:

Sadly, there were many cases in which the MT could not translate properly without understanding the context. In this example, the source is one character’s answer to a dinner invitation. AppTek was able to produce a good translation with no context in Nice to meet you, but Google failed catastrophically with Haunted. The correct translation here would have been I’d love to.

Example 13:

Gender was also a big problem throughout the translation, with several cases of using the male pronoun him when it should have been the female pronoun her. Here the character is talking about a woman, so the proper translation would be we started sending her messages. Although this problem appeared throughout the whole translation, I found that AppTek did a better job in grasping these situations and using the correct pronoun.

Example 14:

In some cases, the original dialogues were not so good, which led to confusion during the translation. In this example, the original should have been: Pero tú tenías/que habérselo dicho,*y juntos hubieran buscado una solución. It’s just one word that’s wrong, but that leads to incorrect translations from both services. A possible version could be: But you should have told her*and together,/you could have found a solution.

Example 15:


In this case, Google wrongly used the context to translate She’s only…, while AppTek did well in the second subtitle, but not on the second line of the first subtitle. A proper translation would have been: At this point, even if she knows it,/it won’t do any good.*It’s just gonna destroy Gina’s life.

  5. Results and conclusions

So, you wanted to know the results? Well, here they are: Out of 814 subtitles, I chose AppTek on 464 subtitles (57.0%), Google on 211 subtitles (25.9%), and both AppTek/Google (mostly cases in which both translations were the same, but also cases in which both translations were useless) on 139 subtitles (17.1%).
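For anyone who wants to double-check the percentages, the tally is easy to reproduce. This is just a small Python sketch built on the three counts reported above; the function name `shares` and the dictionary labels are illustrative.

```python
# Counts of preferred subtitles, as reported in the results above.
counts = {"AppTek": 464, "Google": 211, "Both": 139}


def shares(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


print(sum(counts.values()))  # 814
print(shares(counts))        # {'AppTek': 57.0, 'Google': 25.9, 'Both': 17.1}
```

The three counts add up to the 814 subtitles in the episode, so no subtitle was left unrated.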

The most important conclusion I drew from this experiment is that MT can be helpful for the translation of soap operas unless they have an extremely large amount of informal language, which was not the case in this series (maybe next time we’ll analyze a different type of content). Also, I find that an integral part of the post-editing process is for the translator to make the translation less artificial; I noticed that it is difficult for both MTs to move away from the original text, so they render translations that stay close to the source in both vocabulary and syntax.

As for my choice of MT service, I think it’s obvious that AppTek did a better job, and I believe that’s mostly because it is an MT specialized in subtitling, so I’m looking forward to seeing it integrated into subtitling software platforms for us freelance translators to use on our projects.


So, do you want to know why I love Ooona so much?

[Estimated reading time: 4 minutes, 18 seconds. (862 words)]

Technology and translation are my two main professional addictions, so it’s no wonder I’m also known as Tradugeek. And when it comes to translation technology and subtitling, you can say I’m sort of a software sommelier. I have been playing and working with subtitling software in my everyday life for over a decade now. I went from open-source tools to my clients’ commercial software to what is now my subtitling paradise, Ooona. So, if you want to know why I looove Ooona so much, and why I think you should try it if you’re a freelance subtitler like myself, please continue reading.

Flexibility

Working from a Mac as a translator has always been a challenge, as many of the leading software packages for both technical and audiovisual translation don’t offer a macOS version. Ooona solves that problem, as it is an online subtitling tool that you can access from Google Chrome on a Mac or a PC. I think that’s already a great benefit, particularly because you can work online in the cloud and have your projects at hand wherever you go.

Translate and Create

Another thing I like about Ooona is that all the tools you need for different tasks are separated, so you don’t have to face a UI with a thousand options when the only thing you’re trying to do is translate or create subtitles. And of course, in the set of tools Ooona offers, Translate and Create are the most important ones for freelance translators. Translate allows you to translate from a timed master template, while Create allows you to work from scratch when you’re subtitling directly from audio. In these two tools, you can choose between the Standard and Pro versions. Both are great, but the Pro version includes the audio waveform and shot-change detection, so that’s the one I like the most.

Ooona’s Create

Of all the tools I’ve tried (and believe me, I’ve tried a lot of tools), this is the most intuitive one, with an in-your-face interface that makes our work extremely easy while still offering an unmatched set of tools for the translation process, like comprehensive import/export options, hotkey customization, short forms and autocorrect, personal dictionaries, a myriad of timecoding options for easy spotting, and easily accessible text editing options for positioning, text format, splitting and merging subtitles, and more. Additionally, Ooona offers a robust QC and Spellcheck option (which you can customize for every client), and you can see the errors in your subtitles on the fly.

Ooona’s hotkeys window

QCing in Ooona

Burning subtitles

Have you ever needed to burn subtitles into a video, struggled with free software or even with professional video editing tools, and still couldn’t manage? Don’t worry, the Burn & Encode tool has you covered. You can load subtitles and media, customize the look of your subtitles, preview your changes, and even trim the output video.

When I discovered this tool, I went crazy just knowing I wasn’t going to need two or even three different apps to do one simple task. I love it, and it has made my life so much easier.

Burning subtitles with Ooona is incredibly easy

Teamwork: reviewing subtitles

Ooona also offers a Review tool which you can use in case you work in pairs with a colleague and commonly proofread their translations and vice versa—which is something I highly recommend doing for high quality translations. Review allows you to share a project with another user inside Ooona and track the changes—it includes tags and error codes. There is even a Compare tool to see all changes made to an original subtitle by comparing it with another.

Transcribing made easy

If transcribing is something you do in your everyday life as a professional translator, you can use Ooona’s Transcribe tool to create dialogue lists and production scripts. The tool lets you easily set the characters of your video, customize the layout to add any other information you deem necessary, and export the final document to Word or Excel.

You should probably dive into The Poool

In 2020, Ooona created a directory for professionals working in the audiovisual localization industry. The Poool is an online platform for LSPs and individuals trying to find professional audiovisual translators specializing in different fields and language pairs. I believe it’s the first open directory of its kind in the world that every professional can join. I signed up as soon as it launched, and you should probably register too.

Convenient payment options

If you’re interested in one of the tools I’ve mentioned, you should know that they’ve made it easy for you. You can subscribe to the tools they offer for a week, a month, a semester, or a year.

Subscribing for a week helps you deal with your projects, if you only work from time to time with videos

Looking ahead

To conclude, I would like to say that Ooona is an ever-evolving tool, so if you are like me and think that subtitling tools could add more features, hang on, as the team is already preparing integration with CAT tools, speech recognition, machine translation expansion, and there’s even an AVT Pro Certification Program in the works. So, the future looks promising for them, and for us.


Behind the scenes of “The Translation Show”

[Estimated reading time: 1.50 min]

About two months ago, with the start of a 2017 full of new projects, I began exchanging emails with my friend Rafa López, who turned out to be a great adviser. In the midst of all that back-and-forth, and while preparing some projects we had already lined up to work on together during the year, we started talking about our wish to venture into the most influential social platform of the moment, YouTube.

Although Rafa and I have many things in common that we could build a YouTube channel around (I can’t even begin to tell you), we agreed that our professional profiles are so similar that it would be worth taking on a project in which we made a show about translation, focused on technology, audiovisual translation, training, and news from the translation world.

It took only a couple of email exchanges for us to decide to go ahead with the project, and The Translation Show was born.

For everything to work properly, the first point to consider was the possibility of diving into something to which we could both contribute more or less the same amount of time. Rafa is the design and video genius, so he got the hardest part. All the logos, effects, transitions, sounds, and so on are his creation. Without a doubt, his job is to bring the show to life. I must confess that I wouldn’t have dared to do this alone. Had I done so, the technical result wouldn’t have been as good. (Although, of course, there is still a lot to improve; be patient, we’re just getting started.)

My role in The Translation Show is to research and contribute content for the show, share all the videos on the different social networks (if you write to us, know that I will most likely be the one behind Instagram and Facebook), and contact the potential guests who will surely grace the channel.

With everything ready and in order, we launched with a first episode on translation news and resources, in which we talked about fansubs, the translation of movie titles, taxes on technology in Argentina, and more; we also shared web resources and recommended a book and two postgraduate programs. The video was very well received, and we’re already above 1,500 views.

Last Monday, the channel’s second video arrived. This time, it was an interview. Naturally, we kicked things off with a great friend, Xosé Castro Roig. In an intimate interview, Xosé told us how he divides his time between the different occupations of his daily life and translation, how he prepares for his successful talks, and how he faces the criticism that comes up from time to time. He also talked about the current state of audiovisual translation and whether it’s possible to make a living exclusively from that specialization.

So now YouTube has a new channel, and the translation world has a space created specifically to bring professionals from Argentina, Latin America, and Spain even closer together, to promote the profession, and, why not, to laugh a little at our day-to-day. I hope you enjoy it.


UBA’s Actualización en Nuevas Tecnologías de la Traducción (Update Program in New Translation Technologies)

[Estimated reading time: 2.58 min]

A few years ago, thanks to Santiago Murias, who put me in touch with Gabriela Urthiague, secretary of the Traductorado Público (the sworn translation degree program) at UBA, a complex, many-sided dream began, one that we were finally able to realize under the program’s new director, sworn translator Beatriz Rodriguez. The dream was nothing less than creating the first postgraduate program in new translation technologies in all of Latin America.

The idea for this program arose from an in-depth analysis of how sworn translation and scientific, technical, and literary translation are currently taught in Argentina, in both private and public institutions, which found that one of the students’ most significant shortcomings was their command of technology. Graduates who have some experience in computing applied to translation have it because of their own interest, not because of knowledge acquired in the classroom. To solve that problem, and to open the doors to other areas of translation for graduate translators, the Actualización en Nuevas Tecnologías de la Traducción of the Universidad de Buenos Aires was created through Resolution 4825/16 of the Consejo Directivo of UBA’s Facultad de Derecho. The postgraduate program opens its doors on May 4, 2017, and the admission process closes on April 20, 2017.

Read more


How much do I need to earn per hour as a freelancer to be profitable?

[Estimated reading time: 3.13 min]

In translator and interpreter groups on social media, we regularly see questions about the (miserable) rates offered by a large share of translation agencies, and a heated debate tends to arise between those who accept them because they have minimal expenses and those of us who believe that, regardless of our expenses, rates should have a general average that dignifies the profession.

I’m not going to get into the rates discussion now, but an important question comes up here, related to a big doubt all freelance translators have: with all the fixed expenses I have each month, how much do I have to earn per hour to be profitable? Those who have these numbers clear know that you can never be profitable translating at ARS 0.15 per word, and that you can never be a good translator if you have to translate 10,000 words a day to be profitable at that rate.
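The arithmetic behind that last claim is easy to check. The Python sketch below is only an illustration, not the method from the full article: `daily_income` uses the two figures quoted above, while `break_even_hourly_rate` and its inputs (monthly fixed costs and billable hours per month) are a hypothetical simplification that ignores taxes, savings, and unpaid time.

```python
from decimal import Decimal


def daily_income(words_per_day, rate_per_word):
    """Gross income for one day at a flat per-word rate."""
    return words_per_day * rate_per_word


def break_even_hourly_rate(monthly_fixed_costs, billable_hours_per_month):
    """Hypothetical simplification: the hourly rate that just covers fixed costs."""
    if billable_hours_per_month <= 0:
        raise ValueError("billable hours must be positive")
    return monthly_fixed_costs / billable_hours_per_month


# Figures quoted in the post: 10,000 words a day at ARS 0.15 per word.
print(daily_income(10_000, Decimal("0.15")))  # 1500.00
```

To use the second function, plug in your own monthly costs and the hours you can realistically bill; any rate below the result means working at a loss before even counting taxes.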

Read more


Format Factory, the conversion machine for audio, image, and video files

[Estimated reading time: 1.28 min]

There are as many needs for an excellent file conversion tool as there are uses we can give our computers.

In the translators’ universe, this type of program is a must for all of us who work in, for example, audiovisual translation, since our favorite subtitling program often doesn’t support the video or audio format our client sent us to work with.

Although there is a great deal of software for this kind of task (as well as several websites that perform conversions), my favorite is Format Factory. This small, completely free program lets us convert almost all of the most popular audio, image, and video formats.

It works in a very simple way. The first thing we have to do is go to the downloads section of its website and, after downloading it, follow the simple installation steps.

Read more


What is the limit of technology’s importance?

[Estimated reading time: 2.34 min]

After a very long time, and several failed attempts, I have finally decided to give a third and final chance to this project, which has been delayed several times for lack of time.
Tradugeek is a blog mainly about computing applied to translation, but, like any good geek, I’m also going to write about technology in general. And, like any good translator, I’ll also post articles on other translation-related topics, since all of us language professionals are very restless and have very diverse interests at the same time.

As the first article of this year and of this new stage, I’m going to start with a (very brief) analysis of a British television miniseries written by Charlie Brooker, the same writer behind Dead Set, whose common thread is, precisely, technology. It’s Black Mirror.

Read more
