SYSTRAN, the pioneer of neural machine translation solutions and technology, recently launched SYSTRAN Model Studio to help language experts build powerful and robust domain-specific translation models. By converging SYSTRAN’s world-class neural machine translation technologies with a global network of talented language and translation experts, SYSTRAN Model Studio unlocks higher translation quality and in-domain specialization for niche industries and businesses and allows LSPs to profit further from their data.
Currently, SYSTRAN Model Studio offers more than 256 production-ready domain-specialized translation models and thousands of seed models, with new models being added every day. To facilitate this growth, SYSTRAN partners with a global community of talented trainers and data providers to cover some of the most complex, demanding, and intricate languages, terminologies, and dialects.
The Need for More Nuanced Translation Services
Despite rapid growth in the neural machine translation vertical, quality translation tools for niche industries and specialized business terminology remain hard to access. The average business does not have the resources to train neural engines accurately. This often leaves wide gaps in training data, forcing the engine to fill in the blanks on niche language requirements. While training a specialized engine is entirely possible using an industry-leading neural translation framework like OpenNMT, it is a time-consuming and data-science-intensive process that can create immediate friction despite the long-term payoff.
SYSTRAN Model Studio aims to eliminate that gap. With language experts sourced from around the globe, SYSTRAN acts as a bridge between hyper-technical language requirements, end-users, and language trainers. As many businesses lack the resources and expertise to train their language engines, SYSTRAN Model Studio provides opportunities for both business users looking for immediate translation value and translation experts keen on flexing their language expertise. Beyond this trainer/user element, the models that experts create may be offered to the global community for consumption, with the data or language expert setting the price and keeping 70 percent of royalties. The business model is straightforward: SYSTRAN provides the structure and platform for experts to grow and enhance their business.
The Value of SYSTRAN Model Studio for Translators and LSPs
SYSTRAN Model Studio seeks to rally business users behind valuable solutions while providing LSPs and translation experts a viable business model to grow their services. SYSTRAN places the highest value on data ownership and intellectual property rights. Ownership of all data stays with the model trainers, and corpora used to build models are traceable throughout the data lifecycle. SYSTRAN does not lock expertise into the platform. The models are reversible, meaning that a trainer who wants to take their models outside the platform can export an OpenNMT-compatible version of the model for use elsewhere.
In addition to maintaining data control and integrity, SYSTRAN allows trainers and providers to set their own prices in the SYSTRAN Marketplace. By acting as both the developer of the neural translation technology and the liaison between trainers and business users, SYSTRAN helps LSPs build sustainable business models and reap the rewards of their hard-earned work on translation specialization.
Built on the Powerful OpenNMT Platform
SYSTRAN has been committed to developing and delivering state-of-the-art translation services for over 50 years. In 2016, SYSTRAN partnered with Harvard NLP to create OpenNMT, the world’s leading open-source neural machine translation framework. Available in both PyTorch-based and TensorFlow-based implementations, OpenNMT ranked first across metrics on the WNGT 2020 Efficiency Shared Task.
Unlike many other players in the neural machine translation space, SYSTRAN both maintains OpenNMT and provides secure, proven B2B translation solutions built on its custom OpenNMT-based platform. Currently, OpenNMT has over 500 publications, 3,000 GitHub stars, and several major awards, making it a popular and powerful framework in the NMT industry.
This OpenNMT core allows SYSTRAN to deliver best-of-breed base model quality. Users can work in a variety of environments, and SYSTRAN provides the API, interfaces, plug-ins, and tools necessary to support them. By layering its solution upon an open-source core, SYSTRAN offers creators and end-users nearly unlimited customizability and flexibility. This has earned it a strong market position and a healthy connection to businesses and LSPs, paving the way for SYSTRAN Model Studio’s business model.
How the Model Studio Works
SYSTRAN partnered with OVH, a global cloud provider, to provide a state-of-the-art responsible solution while eliminating wasteful compute cycles. All training begins from pre-built SYSTRAN models, either generic or domain-specific. There is no need to build a translation model from scratch. Rather, you are incrementally enhancing existing models on the platform which have already been built and perfected by language experts.
SYSTRAN has done much of the upfront work to get you started, but even optimizing generic and pre-existing models is much easier thanks to SYSTRAN’s game-changing features.
Upload your bilingual or monolingual in-domain corpus (Spanish-English for example) into the system’s data repository to prepare the model for training. The data will remain completely secure during the training process and will not be used for purposes outside your own model training. SYSTRAN’s proprietary technologies are used to clean and prepare the data for neural model training.
Building a translation model from scratch is an arduous task. SYSTRAN Model Studio instead lets you select a starting-point model from SYSTRAN’s large translation model catalog, then enhance it with your own domain-specific data to specialize it for your translation needs.
By specializing an already trained SYSTRAN model, you will benefit from SYSTRAN’s proprietary technologies, such as embedded UD Sampling, Augmentation, Filtering, Noising and Tokenization.
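To make the data preparation step more concrete, here is a minimal Python sketch of the kind of basic cleaning a bilingual corpus typically goes through before training: whitespace normalization, removal of empty or duplicate pairs, and a length-ratio filter for likely misalignments. This is an illustration only and does not represent SYSTRAN's proprietary cleaning, filtering, or tokenization; the function name `clean_parallel_corpus` and the thresholds `max_ratio` and `max_tokens` are assumptions made for this example.

```python
def clean_parallel_corpus(pairs, max_ratio=3.0, max_tokens=200):
    """Return cleaned (source, target) pairs.

    Hypothetical helper for illustration; thresholds are arbitrary defaults.
    """
    seen = set()
    kept = []
    for src, tgt in pairs:
        # Collapse runs of whitespace and trim both sides.
        src = " ".join(src.split())
        tgt = " ".join(tgt.split())
        # Drop pairs where either side is empty.
        if not src or not tgt:
            continue
        src_len = len(src.split())
        tgt_len = len(tgt.split())
        # Drop overly long segments.
        if src_len > max_tokens or tgt_len > max_tokens:
            continue
        # Drop pairs whose lengths differ so much that they are
        # probably misaligned.
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_ratio:
            continue
        # Drop exact duplicates (after normalization).
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept


if __name__ == "__main__":
    corpus = [
        ("El paciente  recibió la vacuna.", "The patient received the vaccine."),
        ("El paciente recibió la vacuna.", "The patient received the vaccine."),
        ("Hola", ""),
    ]
    for src, tgt in clean_parallel_corpus(corpus):
        print(src, "|||", tgt)
```

Whitespace-based token counts are a rough proxy; a production pipeline would normally use a proper tokenizer and language-specific thresholds.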
Evaluation and Publication
Evaluate your specialized model’s evolution at each training iteration with SYSTRAN Model Studio’s scoring module. Within SYSTRAN Model Studio, it is easy to compare the BLEU score evolution of your models on more than 50 gold test sets curated by SYSTRAN’s data scientists and categorized by domains. You can also add your own test set to check the model’s progress on your very specific domain.
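For readers unfamiliar with BLEU, the sketch below shows a simplified corpus-level BLEU computation in plain Python and how it could be used to compare two training iterations against the same test set. It assumes one reference per sentence and whitespace tokenization, with no smoothing, so its numbers will not match those of SYSTRAN Model Studio's scoring module or of standard tools; the function names and sample sentences are made up for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU on a 0-100 scale (single reference)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp_tokens, n)
            ref_ngrams = ngram_counts(ref_tokens, n)
            # Clipped matches: each n-gram counts at most as often
            # as it appears in the reference.
            matches[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            totals[n - 1] += sum(hyp_ngrams.values())
    # Without smoothing, any order with zero matches gives a score of 0.
    if min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for output shorter than the reference.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)


if __name__ == "__main__":
    test_set = ["the patient was given two doses of the vaccine"]
    iteration_1 = ["the patient got two dose of vaccine"]
    iteration_2 = ["the patient was given two doses of vaccine"]
    print("iteration 1:", round(corpus_bleu(iteration_1, test_set), 2))
    print("iteration 2:", round(corpus_bleu(iteration_2, test_set), 2))
```

Scores are only comparable when computed on the same test set with the same tokenization, which is why a fixed set of gold test sets is useful for tracking a model across iterations.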
Founded in 1968, SYSTRAN is at the cutting edge of the neural machine translation industry. With offices in Paris (France), San Diego (USA), Seoul (South Korea), Tokyo (Japan), and Mexico City (Mexico), SYSTRAN brings its OpenNMT-based Pure Neural™ Machine Translation (PNMT™) engine to businesses, governments, and public institutions around the globe. SYSTRAN technology is used worldwide to maximize communication, fuel collaboration, and break down language barriers that prevent meaningful interactions.
By combining best-in-class machine learning models with powerful artificial neural networks, SYSTRAN continues to deliver hyper-customizable neural machine translation platforms that bridge communication gaps. SYSTRAN currently offers more than 140 direct language engines and thousands of language combinations.