Let's take a quick look at what language translation is, and at what proper language translation means for a website or, more importantly, an internet-accessible, cloud-based platform in 2020.
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way—in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only.
Opening text from A Tale of Two Cities by Charles Dickens
About Google Translate
Google had a free translation tool known as a widget. That widget was deprecated as of April 2019. It still works for most installations but will be going away entirely before the start of 2020. The service remains available in a number of other ways through Google APIs as a paid service. You can find more information here. While this tool was a simple and easy way to translate the content on a single page, it did not take advantage of the SEO and metadata requirements needed to have the content properly distributed and indexed.
Understanding proper language translation and content best practices requires a mental paradigm shift. Too often, developers and content creators view the task of creating content from their own perspective. If you want an effective strategy, you must view the effort from the user's perspective and recognize how and where the user will come across the content. This brings us to search engines like Google, Bing, Yahoo, Baidu, and Yandex. In order to take full advantage of content with language translations, you must consider the following:
URL - when a piece of content is properly translated the URL is adjusted to reflect the change. For example:
Your original piece of content will show up at the original URL in a site's default language:
Once translated, a two-letter language designation is added to the URL:
In this case, 'es' stands for the Spanish translation and directs the browser to load it.
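As a rough illustration of the URL adjustment described above, here is a minimal Python sketch. It assumes the two-letter designation is added as the first segment of the path, which is one common convention, and it uses hypothetical example.com URLs; the function name localized_url is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def localized_url(url: str, lang: str, default_lang: str = "en") -> str:
    """Return the URL for a translation, assuming a path-prefix convention."""
    # The default language keeps the original URL untouched.
    if lang == default_lang:
        return url
    parts = urlsplit(url)
    path = parts.path if parts.path.startswith("/") else "/" + parts.path
    # Insert the two-letter code as the first path segment.
    return urlunsplit(
        (parts.scheme, parts.netloc, f"/{lang}{path}", parts.query, parts.fragment)
    )


print(localized_url("https://www.example.com/about-us", "es"))
```

Some sites use a subdomain or a separate domain per language instead; the idea is the same as long as each translation gets its own stable URL.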
hreflang metadata tag - websites and platforms look one way to the human eye but altogether different to other computers, and the [ro]bots that index those sites and deliver the content to users need slightly different information. In the case of language translation, the original content will be in the site's default language and the original URL will serve as the content's canonical [origin]. Each subsequent translation will have its own unique URL and will pass relevant hreflang tags in the metadata.
To view the source code of any piece of content or webpage, simply add 'view-source:' to the front of its URL:
Another browser tab will open and display the page source, including all of the metadata for the URL:
The metadata can pass a considerable amount of data, all of it directly relevant to how effectively a site is indexed and delivered to users. With proper hreflang tags, the content can be indexed and served to users in their chosen language based on their country of origin, device settings, and previous search history. As a contrary example: if a site has a piece of content in a language other than its default, but it does not pass the proper tags and metadata, then that content will be miscategorized and indexed incorrectly, with negative effects on its search result relevancy.
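To make the hreflang tags discussed above a little more concrete, here is a small Python sketch that builds the alternate link tags for one page and its translations. The helper name hreflang_links and the example.com URLs are hypothetical; the "x-default" value is the fallback that search engines such as Google recognize for users whose language matches none of the listed versions.

```python
from html import escape


def hreflang_links(urls_by_lang: dict[str, str], default_lang: str = "en") -> list[str]:
    """Build <link rel="alternate"> tags for every language version of a page."""
    tags = [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in urls_by_lang.items()
    ]
    # x-default points crawlers at the version to use when no language matches.
    fallback = escape(urls_by_lang[default_lang])
    tags.append(f'<link rel="alternate" hreflang="x-default" href="{fallback}" />')
    return tags


# Hypothetical page with an English original and a Spanish translation.
versions = {
    "en": "https://www.example.com/about-us",
    "es": "https://www.example.com/es/about-us",
}
for tag in hreflang_links(versions):
    print(tag)
```

The same full set of tags would normally appear in the head of every language version, so that each page points at all of its alternates, itself included.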
Sitemap - for sites with considerable content that should be fully and properly indexed by the search engines, an XML Sitemap of the entire site, including all content and translations, must be submitted through each engine's interface [e.g. Google Search Console].
This process delivers the entire site's content list, with its hierarchy and its language alternates, in a link format.
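A sitemap that carries language alternates might be generated along these lines. This is only a sketch using Python's standard library: the function name build_sitemap and the example.com URLs are assumptions, while the two namespaces are the ones used by the sitemaps.org protocol and by the xhtml:link extension for alternates.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(pages: list[dict[str, str]]) -> str:
    """Build a sitemap where each page is a {language: url} mapping."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for versions in pages:
        # Every language version gets its own <url> entry...
        for url in versions.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # ...and each entry lists all alternates, itself included.
            for alt_lang, alt_url in versions.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": alt_lang, "href": alt_url},
                )
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    {"en": "https://www.example.com/about-us",
     "es": "https://www.example.com/es/about-us"},
]))
```

The resulting XML is what would be saved as the sitemap file and submitted through the search engine's console.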
Sites and platforms have a number of ways that they can automatically [programmatically] determine the proper language to display. The simplest and probably most obvious is for a site to look up a user's IP address and serve a language setting based on the country of origin. While it may not seem obvious, English can differ between the UK, the US, and Australia in the same way that Spanish can differ between Spain, Mexico, and Argentina; so, context must be considered.
Equally, web browsers themselves now have language and location based settings that may request or require that a user sign in, and that will display options for translation when content is detected in a language other than the usual one for a given user.
Search engine results themselves can be in any number of languages; and, if a user selects a result in a given language that directs them to a site/platform, the destination can set a 'cookie' for the remainder of that user's interaction that will translate all other content to that language by default.
Many Web 2.0 or later sites [those that allow a user to sign in to an account] can offer language settings that the user designates and that will follow that user as they navigate the site/platform.
Finally, in a way similar to the search engine results cookie, if a user chooses a given language on a site for a particular piece of content - that choice will follow them through the remainder of the visit.
It is important to note: none of these solutions or methods of determination provide any increased access or relieve a developer of the best practices required to properly distribute the content.
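The detection methods above can be combined into a single decision. The Python sketch below is one possible arrangement, not a prescribed one: the priority order (explicit choice stored in a cookie, then account setting, then the browser's Accept-Language header, then a country guess from the IP address) is an assumption, as are the SUPPORTED list, the COUNTRY_LANGUAGE table, and both function names.

```python
from typing import List, Optional

SUPPORTED = ("en", "es", "fr")  # hypothetical list of translated languages
COUNTRY_LANGUAGE = {"ES": "es", "MX": "es", "AR": "es", "US": "en", "GB": "en", "AU": "en"}


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        if not tag:
            continue
        quality = 1.0  # a tag without a q-value has full weight
        if params.strip().startswith("q="):
            try:
                quality = float(params.strip()[2:])
            except ValueError:
                quality = 0.0
        weighted.append((quality, tag.strip().lower()))
    # The sort is stable, so equal weights keep their header order.
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for _, tag in weighted]


def resolve_language(
    cookie_lang: Optional[str] = None,
    account_lang: Optional[str] = None,
    accept_language: str = "",
    ip_country: Optional[str] = None,
    default: str = "en",
) -> str:
    """Pick the first supported language, trying explicit choices first."""
    candidates = [
        cookie_lang,
        account_lang,
        *parse_accept_language(accept_language),
        COUNTRY_LANGUAGE.get(ip_country or ""),
    ]
    for candidate in candidates:
        if not candidate:
            continue
        # Regional variants (es-MX, en-GB) are collapsed here to keep the
        # sketch short; a real site may want to keep them distinct.
        base = candidate.lower().split("-")[0]
        if base in SUPPORTED:
            return base
    return default


print(resolve_language(accept_language="es-MX,es;q=0.9,en;q=0.8", ip_country="US"))
```

Putting the user's explicit choice first reflects the idea that a deliberate selection should outweigh anything the site merely infers.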
Accuracy within language translation is always an issue. Recently, Google launched an updated translation tool that utilizes sophisticated artificial intelligence to produce startlingly accurate language translations. The new system, which uses deep machine learning to mimic the functioning of a human brain, is called the Google Neural Machine Translation [GNMT] system.
To test the system, Google had human raters evaluate translations on an Interagency Language Roundtable [ILR] scale from 0 to 6. The ILR scale is a set of descriptions of abilities to communicate in a language. It is the standard grading scale for language proficiency in United States federal-level service and is used by the United States Foreign Service Institute.
Translating from English to Spanish, the new Google tool's translation was rated an average of 5.43; human translators earned an average of 5.5. For Chinese to English, the only public-facing option that currently utilizes the new system, Google Translate was rated an average of 4.3 while human translators got 4.6. At the end of the day, automation tools are very good, but human involvement still provides greater context.
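To put those ratings side by side, the short Python snippet below expresses each machine score as a share of the corresponding human score. It uses only the four averages quoted above; the helper name machine_share is made up for this example.

```python
# Average ratings quoted above: (machine, human) per language pair.
scores = {
    "English to Spanish": (5.43, 5.5),
    "Chinese to English": (4.3, 4.6),
}


def machine_share(machine: float, human: float) -> float:
    """Machine rating as a percentage of the human rating."""
    return machine / human * 100


for pair, (machine, human) in scores.items():
    share = machine_share(machine, human)
    print(f"{pair}: machine reaches {share:.1f}% of the human score")
```

By this rough measure the machine translation lands at about 98.7% of the human rating for English to Spanish and about 93.5% for Chinese to English.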
About Drupal CMS
Drupal, specifically Core version 8, is currently one of the largest and most robust open-source Content Management Systems [CMS] available. In Drupal 8, language and translation support is included automatically and references over 100 language libraries as provided by Google. Both Drupal and the Google libraries are being utilized and improved constantly. Not only does Drupal 8 allow for translation of content, but it also translates the interface and configuration, so that native speakers of multiple languages can all work in a single platform without any barrier to entry and at 'cloud' scale.