Unbabel uses AI to determine translation quality. Trained on millions of lines of data across multiple domains and languages, our Quality Estimation technology represents the state of the art in automated translation evaluation, closing industry-wide gaps in quality measurement while giving our customers a unique, comprehensive, and accurate view of translation quality.
Because we can accurately evaluate every completed translation request, we give customers access to real-time, fully AI-generated reports that provide full visibility into translation activity. We can ensure:
- Unprecedented Visibility into Translation Quality: Customers have full transparency with automatic quality scores on all translations, providing a comprehensive, in-depth view of translation performance with no more blind spots.
- Time and Cost Efficiency: By automating quality evaluation, we eliminate the need for time-consuming and expensive manual reviews, letting business users consume and understand translation scores in real time as part of an Unbabel subscription.
- Enhanced Quality Control & Continuous Improvement: Users can easily identify recurring issues, anticipate potential quality challenges, and proactively address them for precise improvements, all before customer feedback or negative outcomes arise.
How does it work?
As the current industry leader in Quality Estimation, Unbabel uses large amounts of proprietary data to train AI systems to mimic human reviewers. The systems annotate and compare the translation against the original text and accurately estimate the MQM score the segment would receive were it sent for human evaluation. We apply this technology to every single translation, which lets us understand and communicate translation quality at a scale where relying on human feedback would be impractical.
What is an MQM score?
Unbabel uses MQM (Multidimensional Quality Metrics), a human-driven, industry-standard metric for measuring translation quality. The MQM score is derived from (1) the word count of the translation, (2) the number of minor errors, (3) the number of major errors, and (4) the number of critical errors, with each error weighted according to its severity.
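To make the calculation concrete, here is a minimal sketch of a severity-weighted MQM-style score. The weights used (minor = 1, major = 5, critical = 10) are common MQM defaults and are an assumption for illustration, not Unbabel's exact configuration:

```python
def mqm_score(word_count, minor, major, critical, weights=(1, 5, 10)):
    """Compute an MQM-style quality score on a 0-100 scale.

    Each error contributes a penalty weighted by its severity; the
    total penalty is normalized by the translation's word count.
    Severity weights are illustrative defaults, not Unbabel's exact
    configuration.
    """
    w_minor, w_major, w_critical = weights
    penalty = minor * w_minor + major * w_major + critical * w_critical
    # Normalize the penalty per word, map onto 0-100, and clamp at 0.
    score = 100 * (1 - penalty / word_count)
    return max(0.0, score)

# A 200-word translation with 2 minor errors and 1 major error:
print(mqm_score(200, minor=2, major=1, critical=0))  # → 96.5
```

Note how a single critical error is penalized as heavily as ten minor ones, so short segments with a critical error drop sharply in score.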
Each translation is scored and grouped under a quality bucket according to our Customer Utility Analysis (CUA) framework, which you can read more about here.
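The bucketing step can be sketched as a simple threshold lookup. The bucket labels and score thresholds below are hypothetical placeholders for illustration only, not the actual CUA definitions:

```python
# Illustrative only: bucket labels and thresholds are hypothetical
# placeholders, not Unbabel's actual CUA framework definitions.
CUA_BUCKETS = [
    (95, "excellent"),
    (85, "good"),
    (70, "moderate"),
    (0,  "weak"),
]

def quality_bucket(mqm_score):
    """Map an MQM score (0-100) to the first bucket whose lower
    bound the score meets, scanning from highest to lowest."""
    for lower_bound, label in CUA_BUCKETS:
        if mqm_score >= lower_bound:
            return label
    return CUA_BUCKETS[-1][1]

print(quality_bucket(91.3))  # → good
```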
Are humans still involved?
Until recently, quality reporting in the Unbabel Portal depended exclusively on annotations from our Community members, which cover only a fraction of translations and are available only once a specific volume threshold is reached.
While the process is automated by AI, our estimation models are constantly improved and calibrated through human intervention. Professional linguists step in at different stages of the process, such as model retraining and manual annotation. Their work continuously strengthens our AI estimation, creating a positive feedback loop.