Translating Sensitive Data in a Regulated Context

With digital access available to everyone, the secure translation of sensitive data is at the heart of business challenges, even more so given the many regulations that govern its use. But how do we guarantee compliance? What are the solutions for secure translation?

Our experts provide in-depth answers to these questions in a white paper.

Download our E-Book!

What are the solutions for secure translation?

The use of sensitive data is supervised and subject to strict regulations.

Under the GDPR, the European Union prohibits the processing of sensitive data unless:
– the person concerned has given their consent,
– the information has been made public by the person concerned,
– the use of the data in question is authorized by the CNIL.
These three cases are not the only exceptions to the ban on processing sensitive data. For a comprehensive view, see the full GDPR.

What are the challenges of translating your sensitive data?

The risks associated with data translation

It is difficult to secure the translation of data when the risks associated with it are underestimated or even ignored by professionals.

Shadow IT. This often unsuspected risk comes from inside the company. It is the practice of using unsecured software without the approval of the IT department. The risks: cyberattacks, cybercrime, data leaks, theft or loss, non-compliance with the GDPR, etc.

Cybercrime. It is a criminal activity carried out on a computer or any other device connected to a computer network. The translated data can thus be stolen by cybercriminals and misused: identity theft, extortion of money, sale of private data, etc.

Data leakage. Free online translation software is not always reliable, and its use can sometimes lead to sensitive data being recovered and later used by third parties.

Solutions to secure your data translation

Although the translation of sensitive data is difficult, regulated and even risky, it is sometimes necessary, for example for:

  • communication between international branches within a group;
  • translation of client documents;
  • international after-sales solutions with automatic translation into the operator’s language.

To better manage your sensitive data, you must therefore treat it differently and secure it in accordance with the regulations in force.

Several solutions are possible: machine translation, cloud solutions, on-premises solutions… It’s up to you to find the one best suited to your needs.

Ensuring the confidentiality of sensitive and personal data and maintaining company compliance with the GDPR is at the heart of business challenges. There are various options available to you for this. Regulations, risks, solutions…

Discover all the issues and solutions relating to secure translation, explained by our experts in our E-Book:

SYSTRAN x OVHcloud: book your seat at Open Trusted Cloud Day!

On October 12th, SYSTRAN will participate in the second edition of the Open Trusted Cloud Day, organized by OVH, which brings together many digital players. What is it? Why? For whom?

What are some of the key issues at stake in this upcoming event?

Increased digitalization means that digital players need alternative software solutions and GDPR-compliant cloud hosting.

This is what the Open Trusted Cloud program is all about: 

Co-building a common platform and an ecosystem of solutions that comply with European laws and values and are hosted in an open and reliable cloud.

The second edition of the Open Trusted Cloud Day brings together many software publishers and SaaS and PaaS solution providers. It aims to demonstrate the strength of a common platform and the performance of a co-built model.

What are the objectives of this event?

The objective of the event is to create an Open Trusted Cloud label in order to reassure users of cloud and SaaS solutions that data sovereignty standards are respected.

This label includes the provision of an Open Trusted Cloud communication kit, the establishment of a service catalog to ensure the clear use and security of OVH servers, and the promotion of the solution to European public stakeholders.

Who are the guests?

The Open Trusted Cloud event brings together more than 200 French software vendors, SaaS and PaaS providers, government representatives and consulting companies (including Capgemini).

Who is OVH?

OVHcloud is a global player and Europe’s leading cloud service provider, operating more than 400,000 servers in its 33 data centers on 4 continents. It offers a secure, compliant and sustainable cloud environment, providing high-performance, cost-effective services for managing, protecting and scaling data.

It is this unique approach that enables OVHcloud to cover all the use cases of its 1.5 million customers in more than 130 countries.

As a French company, OVHcloud is committed to the European digital ecosystem and thus offers its customers an excellent level of security and data sovereignty. OVHcloud has become a partner to SYSTRAN, not just a provider. It is with OVHcloud that SYSTRAN was able to develop its machine translation service and the SYSTRAN Marketplace, which offers 40 languages and 400 translation models.

As a long-standing, privileged partner, SYSTRAN is taking part in this event on October 12th.

The goal? To communicate about data sovereignty while offering cloud service solutions (e.g. for translation).

Making efficient neural machine translation available to everyone with OpenNMT

OpenNMT is an open-source ecosystem for neural machine translation started in 2016 by SYSTRAN and the Harvard NLP group. The project has been used in numerous research and industry applications, including SYSTRAN Translate and SYSTRAN Model Studio.

OpenNMT’s main goal is to make neural machine translation accessible to everyone. However, neural machine translation is notoriously expensive to run as MT models often require a lot of memory and compute power. Early in this project, SYSTRAN engineers focused on improving the efficiency of OpenNMT inference to reduce cost and improve productivity.

The computational challenge of neural machine translation

Neural machine translation models are usually based on the Transformer architecture which powers many recent advances in natural language processing. A common variant known as “big Transformer” contains about 300 million parameters that are tuned during a training phase. Since the parameters are stored using 32-bit floating-point numbers, the model alone takes at least 1.2 GB on disk and in memory.
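The 1.2 GB figure follows directly from the parameter count and the number format. A minimal sketch of that arithmetic, where the function name and the use of decimal gigabytes (10^9 bytes) are assumptions made for illustration:

```python
# Back-of-the-envelope size of the model weights alone.
# The parameter count and the 32-bit format come from the text above;
# the helper name and decimal GB (10**9 bytes) are illustrative assumptions.
NUM_PARAMETERS = 300_000_000  # approximate size of a "big Transformer"


def weights_size_gb(num_parameters: int, bits_per_parameter: int) -> float:
    """Return the storage needed for the weights, in GB (10**9 bytes)."""
    bytes_per_parameter = bits_per_parameter / 8
    return num_parameters * bytes_per_parameter / 1e9


print(weights_size_gb(NUM_PARAMETERS, 32))
```

The estimate covers the stored parameters only, which is why the text says "at least": anything else a running model needs in memory comes on top of it.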

Continue reading

Real-time Speech Translation

An increasing number of live events such as conferences, meetings, lectures, debates, radio and TV shows, etc. are nowadays being live streamed on video channels and social networks. These events are transmitted in real time to a large audience, on all types of devices and anywhere in the world.

Captioning and live translation are seen as essential to ensure that these events reach a growing international audience. Making such events comfortable and understandable for so large an audience raises the issue of multilingualism, which we discuss in this post.

In the context of the upcoming French Presidency of the European Union in January 2022, SYSTRAN has developed a tool called Speech Translator for real-time captioning and translation of single-speaker speeches or multi-speaker meetings. Starting with French or English as the source spoken language, Speech Translator:

  1. transcribes the original speech, partnering for this task with Vocapia Automatic Speech Recognition,
  2. punctuates and segments the automatic speech recognition (ASR) output, making this automatically formatted and corrected transcription available to human reviewers and the audience (speech transcription/captioning),
  3. simultaneously runs machine translation (MT) powered by our best quality translation models towards European Union languages (speech translation/subtitling),

all of this with the lowest latency and in a dedicated and user-friendly interface. The task closely resembles simultaneous interpreting, which performs real-time multilingual translations. The next figure shows a screenshot of our live ST system interface where captions (left) as well as the corresponding English translations (right) are displayed. 

SpeechTranslator: Live speech translation system
Continue reading

Exploit your TMs to boost NMT performance

For decades, the translation industry has offered “similar” translations in CAT tools, allowing human translators to view one or several matches retrieved from a translation memory (TM) when translating new documents. A TM is a database that stores segments of text and their corresponding translations. Segments can be sentences, paragraphs or sentence-like units (headings, titles, elements in a list, etc.). While the ideal situation is to find perfect matches, these are not always available. In such cases, translators resort to matches that share sufficient content with the document to be translated. These partial matches are then slightly “repaired” to achieve correct translations.

The use of TM matches relies on the idea that repairing a given TM match requires less effort than producing a translation from scratch, thus leading to higher productivity and consistency rates. The following figure illustrates human translation via repairing a TM match. The English sentence “How long does the flight last?” is translated into French starting from the TM match “How long does a flu last?”, whose stored translation is “Quelle est la durée d’une grippe?”
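Retrieving a partial match can be sketched in a few lines. The example below is an illustration under stated assumptions, not how any particular CAT tool scores matches: it uses the character-level similarity ratio from Python's standard `difflib` module as a stand-in for a fuzzy-match score, the `best_match` helper and the 0.7 threshold are hypothetical, and the second TM entry is made up for the example.

```python
from difflib import SequenceMatcher

# Toy translation memory: source segment -> stored translation.
# The first entry comes from the text above; the second is invented.
TRANSLATION_MEMORY = {
    "How long does a flu last?": "Quelle est la durée d’une grippe?",
    "Where is the boarding gate?": "Où se trouve la porte d’embarquement ?",
}


def best_match(segment, memory, threshold=0.7):
    """Return (source, target, score) for the most similar TM entry,
    or None when no entry reaches the threshold."""
    best = None
    best_score = 0.0
    for source, target in memory.items():
        # ratio() is a similarity in [0, 1]; it stands in for a real
        # fuzzy-match score here.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best, best_score = (source, target), score
    if best is None or best_score < threshold:
        return None
    return best[0], best[1], best_score


match = best_match("How long does the flight last?", TRANSLATION_MEMORY)
if match is not None:
    source, target, score = match
    print(f"TM match: {source!r} -> {target!r} (similarity {score:.2f})")
```

The retrieved translation is only a starting point: the translator still has to repair the part that differs, here the word for "flu" versus "flight".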

Continue reading

10 Ways to Improve Your Translation Output


As Globalization 4.0 rears its head and the convergence of Industry 4.0 and remote work becomes commonplace in the business ecosystem, translation is an increasingly important component of productivity, engagement, and communication.

But how do you iron out the kinks? You need to communicate effectively with team members, colleagues, and customers across physical and linguistic borders. Unfortunately, there’s a bump in the road: language.

Translation engines allow you to seamlessly communicate across language barriers. But creating a well-oiled, hyper-engaging translation solution isn’t always easy. Obviously, the source of your engine is important. Modern Neural Machine Translation (NMT) uses intelligent neural networks to instantly contextualize, digest, and output translations in micro-seconds.  

Continue reading

Introducing SYSTRAN Model Studio: Improving Value for Businesses and LSPs

SYSTRAN Model Studio

SYSTRAN, the pioneer of neural machine translation solutions and technology, recently launched SYSTRAN Model Studio to help language experts build powerful and robust domain-specific translation models. By converging SYSTRAN’s world-class neural machine translation technologies with a global network of talented language and translation experts, SYSTRAN Model Studio unlocks higher translation quality and in-domain specialization for niche industries and businesses and allows LSPs to profit further from their data. 

Language experts, grow your business with SYSTRAN!

Continue reading

The Part Language Plays in Monitoring Activity and Maintaining Global Compliance


Fifty-seven percent of executives list risk and compliance as their two largest barriers to success, and a mere six percent of board members feel their company is adequately prepared to manage risk. In today’s hyper-complex risk landscape, compliance is the single greatest threat to productivity and liquidity.  

Even though noncompliance costs twice as much as building compliance frameworks, most organizations have difficulty integrating compliance into their day-to-day business model.  

It’s not easy. 

Continue reading

NMT Tech Enables Diversity and Inclusion Efforts


Diversity & Inclusion (D&I) have quickly become staples of HR playbooks, yet organizations still struggle with how to fully integrate the practice across both people culture and enablement tools (technology).

The data supporting D&I practice is clear. Sales teams with the highest levels of racial diversity see 15x greater sales. Companies with diverse employee pools (across age, race, gender, culture, religion, etc.) see 2.3x higher bottom-line revenue.

Studies have shown that diverse teams are 87 percent better at decision-making — a largely intangible asset. In addition to being a moral imperative, D&I is a legitimate competitive advantage with echoing consequences and rewards across the business hierarchy. Although we’ve seen a focus in D&I HR strategies around hiring and leadership growth, businesses often fail to invest in technology frameworks that perpetuate D&I strategy internally, and as a customer engagement proposition.

Continue reading

Terminology & Neural Machine Translation: Our User Dictionary feature Explained!

Glossaries usually prove helpful when welcoming a new colleague to your team. What if they were also one of the best entry points to your domain for our models?

In many workplaces, a lot of knowledge is accumulated in lexicons, which cover a wide variety of usages, from defining specialized terms to introducing brand names and business concepts.

Building on more than 50 years of dedicated experience, our research team presented at COLING 2020 the technique behind the User Dictionary feature, which is designed to polish machine translation and give it the appropriate flavor through word choice. This presentation has been recorded and is available here.

Continue reading