Global AMTA Summit 2021: Tarjama AI Experts Present Their Data-Centric Approach to Tailored NMT for Arabic

Dubai, UAE – 23 August, 2021 – Tarjama AI experts presented their latest research at the international MT Summit 2021, hosted by the Association for Machine Translation in the Americas (AMTA), alongside other leading global MT providers. In their presentation, Rebecca Jonsson (PhD), Head of AI Products at Tarjama, and Ruba Jaikat, AI Lead at Tarjama, described the company’s approach to building tailor-made neural machine translation (NMT) to fit the needs of businesses seeking to scale in the Arabic-speaking world.

The 18th biennial conference took place from August 16 to 20, 2021, and brought together MT researchers, developers, providers, and users from around the world to discuss today’s most pressing issues and research in MT technology. Tarjama was one of the 50 global participants at the event, which hosted practitioners from top-tier universities, leading MT providers, and global enterprises.

Dr. Rebecca explained how Tarjama built its NMT engine using high-quality in-house data translated by expert linguists over the 13 years since the company was founded. The NMT engine is specialized in various business domains, including legal, consultancy, e-commerce, and health, among others. For English-to-Arabic translation, Tarjama’s NMT engine achieved a higher BLEU score than the MT engines of leading global tech companies.

“We take the customization process to a whole new level. Not only is our clients’ data partially used to train the NMT engine, but we also have a defined process, which we call ‘a data-centric approach,’” Ruba explained. “1. We select only the gold nuggets of the customer data. 2. We take into consideration the translation guidelines of each client. 3. We make sure it’s generalizing well on other datasets … and 4. Our LQA experts are kept in the loop to deliver the highest-quality NMT engine that performs best in class on the customer data.”

Tarjama showcased the quality and impact of its tailored NMT model for one of the leading e-commerce companies in the MENA region. To build the e-commerce NMT model, Tarjama’s AI team used high-quality data from 3 million bilingual (EN-AR) segments.  

“We measured the impact of the tailored NMT for our e-commerce client and found that they were delivering triple the volume of translations, translation costs were down by 50% and they could see improved consistency and quality of the translations delivered,” Dr. Rebecca said.  

According to Ruba, the quality of the data matters more than the quantity. Using the feedback of the company’s expert linguists, internal translators, and QA specialists, Tarjama is constantly improving and fine-tuning its models and methodology.  

At the end of the session, attendees were eager to ask questions about the challenges around Arabic data and Tarjama’s methodology in building its NMT models. 

About Tarjama: Tarjama is a smart language solutions company helping businesses grow their global presence with seamless, enterprise-grade content. Tarjama’s unique combination of language services and technologies helps companies overcome their content challenges of volume, speed, quality, and security. Tarjama’s capabilities cover a full suite of AI-enabled language solutions, from translation and localization to content creation, subtitling, and content advisory, as well as a range of innovative language technologies, including a translation management system, machine translation, a client portal, and many more.