Releasing software in several languages simultaneously is a common requirement today, so tools for building software need to assume an international audience.
How internationalized software and localized products are built depends on whether an American user base is assumed, as was common in the past, before lower-cost computers and the internet.
Source code with embedded language strings is "internationalized" by extracting the "hard coded" text strings and replacing them with references to a resource file external to the source code. One advantage is Unicode compliance. This separation of logic and data also makes it easier to distribute the work of translating strings into other languages.
Internationalization is the process of designing applications so that they can be localized (adapted) to various languages and regions without engineering changes.
This word is commonly abbreviated to I18N by replacing the 18 characters between the first character (I) and the last character (n) of the word with the number 18.
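The counting behind the abbreviation can be checked with a few lines of Python. This is only an illustrative sketch; the function name numeronym is made up for this example, and it returns lower case rather than the upper-case I18N spelling.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + count of middle letters + last letter."""
    if len(word) <= 3:
        return word  # too short for the abbreviation to save anything
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))  # i18n
print(numeronym("localization"))          # l10n
```

The same rule gives the companion abbreviation L10N for localization.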
Instead of hard coding application text intermingled with logic code, internationalized code obtains translated strings and objects using a key that appears in every localized resource file. Each resource file holds the text that goes with each key. Applications obtain text in a different language simply by looking in a different resource file.
In .NET, the actual string or object that gets loaded depends on the current UI culture of each thread, which is the Thread.CurrentThread.CurrentUICulture property (a CultureInfo object).
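The key-based lookup described above might look roughly like this in Python. The RESOURCES dictionary is a hypothetical in-memory stand-in for real resource files, and get_text is an illustrative helper, not part of any particular framework.

```python
# Hypothetical stand-ins for per-language resource files:
# every language defines the same keys, each with its own text.
RESOURCES = {
    "en": {"greeting": "hello", "planet": "world"},
    "it": {"greeting": "ciao", "planet": "mondo"},
}


def get_text(key: str, language: str) -> str:
    """Return the text stored under `key` in the resource set for `language`."""
    return RESOURCES[language][key]


# The calling code never changes; only the resource set it looks in does.
print(get_text("greeting", "en"))  # hello
print(get_text("greeting", "it"))  # ciao
```

Because the application only ever refers to keys, adding a language means adding one more resource set rather than touching the logic.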
So, unlike Java, .NET applications need to start again after every switch in culture. Bummer, I know.
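The idea of a per-thread UI culture choosing which resources get loaded can be imitated in Python. This is an analogy only, not the .NET API: set_ui_culture, get_ui_culture, fallback_chain, and the RESOURCES table are all names invented for this sketch. It also assumes a fallback from a specific culture ("it-CH") to its parent ("it") and then to default strings.

```python
import threading

# Hypothetical resource tables; "" holds the default (fallback) strings.
RESOURCES = {
    "": {"greeting": "hello", "farewell": "goodbye"},
    "it": {"greeting": "ciao"},
}

_state = threading.local()  # each thread sees its own ui_culture value


def set_ui_culture(name: str) -> None:
    _state.ui_culture = name


def get_ui_culture() -> str:
    return getattr(_state, "ui_culture", "")


def fallback_chain(culture: str) -> list:
    """'it-CH' -> ['it-CH', 'it', '']: most specific culture first."""
    chain = []
    while culture:
        chain.append(culture)
        culture = culture.rpartition("-")[0]
    chain.append("")
    return chain


def get_text(key: str) -> str:
    """Look up `key` using the calling thread's culture, with fallback."""
    for culture in fallback_chain(get_ui_culture()):
        table = RESOURCES.get(culture, {})
        if key in table:
            return table[key]
    raise KeyError(key)


results = {}


def worker(culture: str) -> None:
    set_ui_culture(culture)  # affects this thread only
    results[culture] = (get_text("greeting"), get_text("farewell"))


threads = [threading.Thread(target=worker, args=(c,)) for c in ("it-CH", "en-US")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results["it-CH"])  # ('ciao', 'goodbye')
print(results["en-US"])  # ('hello', 'goodbye')
```

Each thread resolves the same keys differently because the culture is stored per thread rather than globally.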
How to Internationalize presents a good overview.
Localization may include creating graphics files containing localized images (such as flags) as well as the translation of text in message resource properties files.
For application code that has not been internationalized, the localization process may also include adapting or adding software components for a specific locale. Some software tools scan programs for hard-coded application text strings.
An important part of localization is testing, to detect when localized text (such as long German words) does not fit on a screen designed for shorter English words.
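As a rough idea of how a tool might look for text strings in a program, the sketch below uses Python's standard ast module to list every string constant in a piece of Python source. It is an assumption-laden toy, not a description of any commercial tool: find_string_literals and SAMPLE are invented here, and a real scanner would also need to skip docstrings, log messages, and other strings that users never see.

```python
import ast


def find_string_literals(source: str) -> list:
    """Return (line number, text) for each string constant in Python source."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            found.append((node.lineno, node.value))
    return sorted(found)


# Hypothetical program text with two hard-coded strings.
SAMPLE = '''
def greet(name):
    print("Hello, " + name)
    return "Goodbye"
'''

for lineno, text in find_string_literals(SAMPLE):
    print(lineno, repr(text))
```

Each reported string is a candidate for extraction into a resource file under a key.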
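One cheap automated check for this kind of problem is to compare each translated string against a character limit for the field it appears in. The sketch below assumes such limits exist; FIELD_LIMITS, GERMAN, and find_overflows are hypothetical names and values. Character counts are only a proxy, since real fit depends on fonts and layout, so this complements on-screen testing rather than replacing it.

```python
# Hypothetical maximum character counts for two UI fields.
FIELD_LIMITS = {"save_button": 10, "title": 20}

# Hypothetical German resource strings for the same keys.
GERMAN = {"save_button": "Speichern", "title": "Benutzereinstellungen"}


def find_overflows(texts: dict, limits: dict) -> dict:
    """Return {key: characters over the limit} for texts that are too long."""
    return {
        key: len(text) - limits[key]
        for key, text in texts.items()
        if key in limits and len(text) > limits[key]
    }


for key, excess in find_overflows(GERMAN, FIELD_LIMITS).items():
    print(f"{key}: {excess} character(s) too long")
```

With these sample values only the title is flagged; the button label still fits.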
A translation memory (TM) is a database of translation assets, usually spanning several projects of an organization. TMs are created using CAT (Computer Aided Translation) software and localization software. The best known offerings include enterprise-priced Trados and the less expensive Wordfast.
Trados set up the translationzone.com portal.
Each early translation software vendor developed its own proprietary file format. But most translation software now supports the TMX (Translation Memory eXchange) industry-wide open standard developed by lisa.org/tmx, a project of the OSCAR (Open Standards for Container/Content Allowing Re-use) Special Interest Group of LISA (Localization Industry Standards Association).
TuMatXa is a web server application (running on Zope under Linux) for managing a TM repository.
translate.google.com enables the simple sharing of TMX files on the internet while improving its own automated translation.
The TMX XML format consists of a header and a body. An example of the header, where <tmx> takes the place of the more familiar <html>:
<?xml version="1.0" ?>
<tmx version="1.4">
  <header
    creationtool="XYZTool"
    creationtoolversion="1.01-023"
    datatype="PlainText"
    segtype="sentence"
    adminlang="en-us"
    srclang="EN"
    o-tmf="ABCTransMem">
  </header>
Within the <body> element are <tu> (translation unit) tags, each identified by a tuid attribute. Translated text sits between <seg> and </seg> (segment) tags within a <tuv> (translation unit variant) tag, whose attribute such as xml:lang="en" identifies its language.
  <body>
    <tu tuid="hello" datatype="plaintext">
      <tuv xml:lang="en"> <seg>hello</seg> </tuv>
      <tuv xml:lang="it"> <seg>ciao</seg> </tuv>
    </tu>
    <tu tuid="world" datatype="plaintext">
      <tuv xml:lang="en"> <seg>world</seg> </tuv>
      <tuv xml:lang="it"> <seg>mondo</seg> </tuv>
    </tu>
  </body>
</tmx>

Nicola Asuni implemented Masaki Itagaki's initial proposal with code on SourceForge that extends the Java ResourceBundle class so that it reads (usually large) TMX XML text files directly.
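To show how a TMX file can serve directly as a keyed resource source, here is a minimal Python sketch using the standard xml.etree.ElementTree module. It is not the SourceForge code mentioned above, and load_tmx is a name invented for this example. It treats each tuid as the lookup key and builds one dictionary per language; the sample document repeats the header and body shown earlier.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

TMX_SAMPLE = """<?xml version="1.0" ?>
<tmx version="1.4">
  <header creationtool="XYZTool" creationtoolversion="1.01-023"
          datatype="PlainText" segtype="sentence" adminlang="en-us"
          srclang="EN" o-tmf="ABCTransMem">
  </header>
  <body>
    <tu tuid="hello" datatype="plaintext">
      <tuv xml:lang="en"> <seg>hello</seg> </tuv>
      <tuv xml:lang="it"> <seg>ciao</seg> </tuv>
    </tu>
    <tu tuid="world" datatype="plaintext">
      <tuv xml:lang="en"> <seg>world</seg> </tuv>
      <tuv xml:lang="it"> <seg>mondo</seg> </tuv>
    </tu>
  </body>
</tmx>"""


def load_tmx(text: str) -> dict:
    """Parse TMX text into {language: {tuid: segment text}}."""
    root = ET.fromstring(text)
    bundles = {}
    for tu in root.iter("tu"):
        key = tu.get("tuid")
        for tuv in tu.findall("tuv"):
            lang = tuv.get(XML_LANG)
            seg = tuv.find("seg")
            if key is None or lang is None or seg is None:
                continue  # skip units that cannot be used as keyed resources
            bundles.setdefault(lang, {})[key] = "".join(seg.itertext())
    return bundles


bundles = load_tmx(TMX_SAMPLE)
print(bundles["it"]["hello"])  # ciao
print(bundles["en"]["world"])  # world
```

This sketch loads the whole file into memory, which is fine for a small sample; for the large TMX files mentioned above, an incremental parser such as ET.iterparse would probably be a better fit.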