Papers and publications

About Systran

With more than 50 years of experience in translation technologies, SYSTRAN has pioneered the greatest innovations in the field, including the first web-based translation portals and the first neural translation engines combining artificial intelligence and neural networks for businesses and public organizations.

SYSTRAN provides business users with advanced and secure automated translation solutions in areas such as global collaboration, multilingual content production, customer support, electronic investigation, Big Data analysis, and e-commerce. SYSTRAN offers a tailor-made solution with an open and scalable architecture that enables seamless integration into existing third-party applications and IT infrastructures.

Fast Approximate String Matching with Suffix Arrays and A* Parsing [PDF]

We present a novel exact solution to the approximate string matching problem in the context of translation memories, where a text segment has to be matched against a large corpus, while allowing for errors. We use suffix arrays to detect exact n-gram matches, A* search heuristics to discard matches and A* parsing to validate candidate … Continued

Philipp Koehn, Jean Senellart

AMTA, October 2010.

SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems [PDF]

This report describes both of SYSTRAN’s Chinese-English and English-Chinese machine translation systems that participated in the CWMT2009 machine translation evaluation tasks. The base systems are SYSTRAN rule-based machine translation systems, augmented with various statistical techniques. Based on the translations of the rule-based systems, we perform statistical post-editing with the provided bilingual and monolingual training corpora. … Continued

Jin Yang, Satoshi Enoue, Jean Senellart, Tristan Croiset

CWMT, November 2009.

Selective addition of corpus-extracted phrasal lexical rules to a rule-based machine translation system [PDF]

In this work, we show how an existing rule-based, general-purpose machine translation system may be improved and adapted automatically to a given domain, whenever parallel corpora are available. We perform this adaptation by extracting dictionary entries from the parallel data. From this initial set, the application of these rules is tested against the baseline performance. … Continued

Loïc Dugast, Jean Senellart, Philipp Koehn

MT Summit, August 2009.

SMT and SPE Machine Translation Systems for WMT'09 [PDF]

This paper describes the development of several machine translation systems for the 2009 WMT shared task evaluation. We only consider the translation between French and English. We describe a statistical system based on the Moses decoder and a statistical post-editing system using SYSTRAN’s rule-based system. We also investigated techniques to automatically extract additional bilingual texts … Continued

Holger Schwenk, Sadaf Abdul Rauf, Loïc Barrault, Jean Senellart

March 2009

Statistical Post Editing and Dictionary Extraction: SYSTRAN/Edinburgh submissions for ACL-WMT2009 [PDF]

Abstract: We describe here the two Systran/University of Edinburgh submissions for WMT2009. They involve a statistical post-editing model with a particular handling of named entities (English to French and German to English) and the extraction of phrasal rules (English to French).

Loïc Dugast, Jean Senellart, Philipp Koehn

March 2009

Can we Relearn an RBMT System? [PDF]

This paper describes SYSTRAN's submissions for the shared task of the third Workshop on Statistical Machine Translation at ACL. Our main contribution is a French-English statistical model trained without the use of any human-translated parallel corpus. Instead, we translated a monolingual corpus with the SYSTRAN rule-based translation engine to produce the parallel corpus. The … Continued

Loïc Dugast, Jean Senellart, Philipp Koehn

June 2008.

SYSTRAN Purely Neural MT Engines for WMT2017

This paper describes SYSTRAN’s systems submitted to the WMT 2017 shared news translation task for English-German, in both translation directions. Our systems are built using OpenNMT, an open-source neural machine translation system, implementing sequence-to-sequence models with LSTM encoder/decoders and attention. We experimented with automatically back-translated monolingual data. Our resulting models are further hyperspecialised with an … Continued

Yongchao Deng, Jungi Kim, Guillaume Klein, Catherine Kobus, Natalia Segal, Christophe Servan, Bo Wang, Dakun Zhang, Josep Crego, Jean Senellart

Published in "Proceedings of the Second Conference on Machine Translation", pages 265--270, Association for Computational Linguistics, 2017, Copenhagen, Denmark

SYSTRAN Translation Stylesheets: Machine Translation driven by XSLT [PDF]

XSL Transformation stylesheets are usually used to transform a document described in one XML formalism into another XML formalism, to modify an XML document, or to publish content stored in an XML document to a publishing format (XSL-FO, (X)HTML…). SYSTRAN Translation Stylesheets (STS) use XSLT to drive and control the machine translation of XML documents … Continued

Pierre Senellart, Jean Senellart

September 2005

SYSTRAN Intuitive Coding Technology [PDF]

Customizing a general-purpose MT system is an effective way to improve machine translation quality for specific usages. Building a user-specific dictionary is the first and most important step in the customization process. An intuitive dictionary-coding tool was developed and is now utilized to allow the user to build user dictionaries easily and intelligently. SYSTRAN’s innovative … Continued

Jean Senellart, Jin Yang, Anabel Rebollo

MT Summit IX; September 22-26, 2003

SYSTRAN Review Manager [PDF]

The SYSTRAN Review Manager (SRM) is one of the components that comprise the SYSTRAN Linguistics Platform (SLP), a comprehensive enterprise solution for managing MT customization and localization projects. The SRM is a productivity tool used for the review, quality assessment and maintenance of linguistic resources combined with a SYSTRAN solution. The SRM is used in-house … Continued

Jean-Cédric Costa, Christiane Panissod

MT Summit IX; September 22-26, 2003.