Unbabel says its new AI model has dethroned OpenAI’s GPT-4 as the tech industry’s best language translator

Jeremy Kahn

June 6, 2024 at 7:00 a.m.·4 min read

Unbabel, a tech company that provides both machine and human-based translation services for businesses, has created a new AI model that it says beats OpenAI’s GPT-4o and other commercially available AI systems on translation between English and six commonly spoken European and Asian languages.

Translation has been one of the more attractive business use cases for large language models (LLMs), the kind of AI systems that underpin chatbots like OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude. And to date, GPT-4o, the latest version of OpenAI's most powerful AI model, has outperformed all competitors at translating languages for which large amounts of digital text exist. (GPT-4’s performance on “low resource languages,” which have far fewer digital documents to train from, has never been as good.)

Unbabel tested its AI model, which it calls TowerLLM, against GPT-4o and the original GPT-4, as well as OpenAI's GPT-3.5 and competing models from Google and the language translation company DeepL. It looked at translation from English to Spanish, French, German, Portuguese, Italian, and Korean. In almost every case, TowerLLM narrowly edged out GPT-4o and GPT-4. TowerLLM's highest accuracy came in English-Korean translations, where it beat OpenAI's best models by about 1.5%. On English-German translations, GPT-4 and GPT-4o were a fraction of a percentage point better.

Unbabel also tested its model on translations of documents for specific professional domains, such as finance, medicine, law, and technical writing. Here again, TowerLLM performed between 1% and 2% better than OpenAI's best models.

Unbabel's results have not been independently verified. If confirmed, they would mean that GPT-4 has been bested at translation, which may indicate that the model is now vulnerable to newer AI systems trained with different methods. GPT-4 has remained the top-performing LLM on most language benchmarks despite having debuted 15 months ago, an eternity in the fast-paced world of AI development. OpenAI is reportedly training a newer, more powerful LLM, although its release date remains uncertain.

Unbabel, which has headquarters in both San Francisco and Lisbon, said TowerLLM was trained to be multilingual on a large public dataset of multilingual text. This means the model also performs better on reasoning tasks in multiple languages than some competing open-source AI models of a similar size created by companies such as Meta and French AI startup Mistral.

TowerLLM was then fine-tuned with a carefully curated dataset of high-quality translations between language pairs. To help curate this fine-tuning dataset, Unbabel used COMETKiwi, another AI model that it had trained to assess translation quality.
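The curation step described above can be pictured as a simple filter: score each candidate translation with a quality estimator and keep only the pairs that clear a threshold. The sketch below is a minimal illustration under stated assumptions, not Unbabel's actual pipeline. The names `TranslationPair`, `curate`, and `toy_score`, the sample sentences, and the 0.5 threshold are all hypothetical, and `toy_score` is a crude length-ratio heuristic standing in for a learned model such as COMETKiwi.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TranslationPair:
    source: str       # English source sentence
    translation: str  # candidate translation


def curate(
    pairs: Iterable[TranslationPair],
    score: Callable[[TranslationPair], float],
    threshold: float,
) -> List[TranslationPair]:
    """Keep only the pairs whose quality score is at or above the threshold."""
    return [pair for pair in pairs if score(pair) >= threshold]


def toy_score(pair: TranslationPair) -> float:
    """Hypothetical stand-in for a learned quality estimator.

    Returns the ratio of the shorter text's length to the longer one's,
    so empty or badly truncated translations score low. A real system
    would call a trained model here instead.
    """
    longer = max(len(pair.source), len(pair.translation))
    shorter = min(len(pair.source), len(pair.translation))
    return shorter / longer if longer else 0.0


# Made-up candidates: one plausible translation, one truncated, one empty.
candidates = [
    TranslationPair("Good morning, how are you?", "Bom dia, como está?"),
    TranslationPair("The contract expires next year.", "O contrato"),
    TranslationPair("Thank you very much.", ""),
]

curated = curate(candidates, toy_score, threshold=0.5)
for pair in curated:
    print(pair.source, "->", pair.translation)
```

With these made-up inputs, only the first pair survives the filter, because the truncated and empty translations fall below the threshold. Passing the scorer in as a function keeps the filtering logic independent of whichever quality model is used.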

João Graça, Unbabel’s chief technology officer, told Fortune that most other LLMs have a higher proportion of English-language text in their initial training set and only pick up the ability to translate incidentally. But TowerLLM was trained on a dataset that was specifically designed to include a large amount of multilingual text. He also said that fine-tuning on the smaller, curated dataset of high-quality translations was key to the resulting model’s superior performance.

It is one of several recent examples in which smaller AI models have equalled or exceeded the performance of much larger ones when trained on better quality datasets. For instance, Microsoft created a small language model called Phi 3, with just 3.8 billion parameters (the tunable variables in the model), that outperforms models more than double its size after being trained on what Microsoft called a “textbook-quality” dataset. “The insight from Phi is that people should focus on the quality of the data,” Graça said. He noted that all AI companies are now using the same basic algorithmic design with some subtle variations. What differentiates the models is data. “It’s all about the data and the training curriculum, which is how you give the data to the model,” he said.

TowerLLM is currently available in two sizes, one with seven billion parameters and one with 13 billion. An earlier version of the model, which debuted in January, came close to GPT-4’s performance, but didn’t quite exceed it. That model also only worked for 10 language pairs. The new model edges past GPT-4 and supports 18 language pairs.

The model has only been tested against GPT-4o for translation, meaning that GPT-4 may still have an advantage at other tasks such as reasoning, coding, writing, and summarization.

Graça said that Unbabel plans to expand the number of languages TowerLLM supports, adding 10 more soon. The model is also being fine-tuned to work on the very specific translation tasks that businesses often care most about, such as translating complex legal documents or patent and copyright information. It has been trained to get better at “transcreation,” the skill of translating a piece of content not word-for-word, but so that it captures very subtle cultural nuances, such as using colloquial expressions or slang that a native from a certain generation would use, Graça said.

This story was originally featured on Fortune.com
