SYSTRAN has been wholeheartedly involved in open source development over the past few years via the OpenNMT initiative, whose goal is to build a ready-to-use, fully inclusive development framework for Neural Machine Translation (NMT) that serves both industry and research. OpenNMT ensures that state-of-the-art systems are integrated into SYSTRAN products and motivates us to continuously innovate.
In 2017, we published OpenNMT-tf, an open source toolkit for neural machine translation. This project is integrated into SYSTRAN’s model training architecture and plays a key role in the production of the 2nd generation of NMT engines.
OpenNMT-tf is based on TensorFlow, a machine learning platform that powers the artificial intelligence systems of many companies. On September 30, 2019, the TensorFlow team announced the release of TensorFlow 2.0, the first major update of the framework:
“TensorFlow 2.0 is driven by the community telling us they want an easy-to-use platform that is both flexible and powerful, and which supports deployment to any platform. TensorFlow 2.0 provides a comprehensive ecosystem of tools for developers, enterprises, and researchers who want to push the state-of-the-art in machine learning and build scalable ML-powered applications.”
The very next day, OpenNMT developers released OpenNMT-tf 2.0. Below are some highlights of this update and what it means for SYSTRAN products.
The changes introduced by OpenNMT-tf 2.0 are closely tied to TensorFlow 2.0, which simplifies the definition and training of machine learning models. OpenNMT-tf 2.0 features:
- A more modular implementation that allows training new model architectures to improve translation quality and speed
- Quicker delivery of new models thanks to faster training on multiple GPUs, which makes better use of available hardware resources
- A more user-friendly experience, thanks to a revisited and simplified execution flow that makes model training easier to debug and understand
To release an update at the same time as TensorFlow 2.0, the OpenNMT developers started planning this transition at the beginning of 2019. They participated in defining and developing TensorFlow 2.0 by fixing bugs and contributing to TensorFlow Addons, a repository of advanced components for machine learning applications.
“This update is a new milestone for OpenNMT. Started in 2016, the project has played a central role in defining today’s machine translation open source landscape. We continuously improve the OpenNMT ecosystem to make the technology more accessible and powerful. OpenNMT-tf 2.0 follows this trend by integrating the latest best practices in model training and deployment.”
Guillaume Klein, OpenNMT lead developer
For SYSTRAN, contributing to OpenNMT-tf and keeping it up to date is a good investment in this fast-moving field: it makes it easier to support new hardware for faster training and ensures that the project benefits from the latest improvements in this domain.
Why Open Source
In 2016, after 50 years of closed-source engineering, we decided to jump over to the other side and get involved in open-source development. We started with OpenNMT, which has been by far our most successful project. It was followed by several additional modules from our Research Team. The move to open source development was not easy. Our development teams had to adopt open-source dynamics, and we had to convince the sales team that we wouldn’t be losing our competitive edge.
Three years down the road, the benefits for us and our customers clearly dwarf the obvious disadvantages and fears, such as seeing other companies adopt our technology to build competing solutions. Among the invaluable benefits are higher code quality, greater visibility, a faster pace set by the community, and the satisfaction of contributing to the larger field of deep learning applications that are progressively changing the world in which we live.
Building on OpenNMT and on the experience of thousands of models trained in-house with OpenNMT-tf and deployed on https://translate.systran.net, we are now going further in sharing our know-how through a simplified environment and a complete suite that lets language experts train and deploy their own translation models.