Stabilizing Live Speech Translation in Google Translate


GOOGLEBLOG


ai.googleblog.com


The transcription feature in the Google Translate app may be used to create a live, translated transcription for events like meetings and speeches, or for a story at the dinner table. In such settings, it is useful for the translated text to be displayed promptly to help keep the reader engaged.

The new version of the Google Translate app significantly reduces translation revisions and improves the user experience. The research enabling this is presented in two papers. The first formulates an evaluation framework tailored to live transl...

Erasure: Measures the additional reading burden on the user due to instability. It is the number of words that are erased and replaced for every word in the final translation.

Lag: Measures the average time that has passed between when a user utters a word ...
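The erasure definition above can be made concrete with a small sketch. It assumes that each revision is a whitespace-tokenized string, and that the words "erased" by an update are those in the previous output that fall outside the longest common prefix shared with the new output. The function names `common_prefix_len` and `normalized_erasure` are illustrative, not taken from the papers.

```python
def common_prefix_len(a, b):
    """Number of leading tokens that two token lists share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def normalized_erasure(revisions):
    """Erased words per word of the final translation.

    `revisions` is the sequence of translations shown to the user,
    in order, with the final translation last. Tokenization by
    whitespace is an assumption made for this sketch.
    """
    if not revisions:
        return 0.0
    tokenized = [r.split() for r in revisions]
    final_len = len(tokenized[-1])
    if final_len == 0:
        return 0.0
    erased = 0
    for prev, curr in zip(tokenized, tokenized[1:]):
        # Words of the previous output that the update overwrote.
        erased += len(prev) - common_prefix_len(prev, curr)
    return erased / final_len


revisions = ["I", "I am", "I was going", "I was going home"]
print(normalized_erasure(revisions))  # 0.25: one word ("am") erased, four final words
```

A value of 0 means the displayed text only ever grew; larger values mean the reader had to re-read more replaced words.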

It is important to recognize the inherent trade-offs between these different aspects of quality. Transcribe enables live translation by stacking machine translation on top of real-time automatic speech recognition. For each update to the recognized transcript, a fresh translation is gene...

One straightforward solution to reduce erasure is to decrease the frequency with which translations are updated. Along this line, “streaming translation” models (for example, STACL and MILk) intelligentl...
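As a rough illustration of updating less often, the sketch below re-translates only after the recognized source has grown by at least `stride` words, and always on the last prefix. The `translate` callable, the `stride` parameter, and the function name are hypothetical stand-ins and do not describe the app's actual update policy or the STACL and MILk models.

```python
def throttled_updates(source_prefixes, translate, stride=2):
    """Return the translations actually shown to the user.

    `source_prefixes` are successive recognized transcripts, and
    `translate` is any callable mapping a source string to a
    translation (a stand-in for a real model). A new translation is
    shown only when at least `stride` new source words have arrived,
    or when the last prefix is reached.
    """
    shown = []
    last_len = 0
    for i, prefix in enumerate(source_prefixes):
        n = len(prefix.split())
        is_last = i == len(source_prefixes) - 1
        if is_last or n - last_len >= stride:
            shown.append(translate(prefix))
            last_len = n
    return shown


prefixes = ["a", "a b", "a b c", "a b c d", "a b c d e"]
print(throttled_updates(prefixes, str.upper, stride=2))
# ['A B', 'A B C D', 'A B C D E']
```

Fewer updates leave fewer chances for shown text to be overwritten, but each word waits longer before it appears, so a larger `stride` trades erasure for lag.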

In our paper, “Re-translation versus Streaming for Simultaneous Translation”, we show that our original “re-translation” approach to live translation can be fine-tuned to reduce erasure and achieve a more favourable erasure/lag/BLEU trade-off. Withou...

The end of an ongoing translation tends to flicker because it is more likely to have dependencies on source words that have yet to arrive. We reduce this by truncating some number of words from the translation until the end of the source sentence has been observed. This masking process ...
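The truncation step described above can be sketched in a few lines. This assumes whitespace tokenization and a fixed number of hidden words `k`; the name `mask_translation` and the `source_complete` flag are illustrative choices, not identifiers from the paper.

```python
def mask_translation(translation, k, source_complete):
    """Hide the last `k` words of a partial translation.

    While the source sentence is still arriving, the final `k` words
    are withheld because they are the most likely to change. Once the
    end of the source sentence has been observed, the full translation
    is shown.
    """
    words = translation.split()
    if source_complete or k <= 0:
        return " ".join(words)
    keep = max(0, len(words) - k)
    return " ".join(words[:keep])


print(mask_translation("the cat sat on", 2, source_complete=False))  # the cat
print(mask_translation("the cat sat on the mat", 2, source_complete=True))  # the cat sat on the mat
```

The withheld words still appear once the sentence is finished, so masking delays the unstable tail without dropping it.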

The combination of masking and biasing produces a re-translation system with high quality and low latency while virtually eliminating erasure. The table below shows how the metrics react to the heuristics we introduced and how they compare to the other systems discussed above. The graph demonst...

The solution outlined above returns a decent translation very quickly, while allowing it to be revised as more of the source sentence is spoken. The simple structure of re-translation enables the application of our best speech and translation models with minimal effort. However, reducing erasure ...
