Tuesday, May 24, 2005

Spellcheckers

These are an important part of creating a complete language-specific view of
the operating system. Even English speakers prefer to see correct
spellcheckers for their locale. Certain languages by their nature do not need
or cannot use the wordlist-type spellcheckers found on Linux.
There are three main spellcheckers on Linux:
* Ispell
* Aspell
* MySpell
Ispell is the original and includes affix compression. Aspell is described as a
replacement for Ispell and has better algorithms for guessing missing words.
MySpell is used by OpenOffice.org and Mozilla and works on both Windows and
Linux. It uses the affix compression found in Ispell.

Resources
http://speling.org/
Scandinavian spell checkers website with some useful tools.

Web based corpus building
Or finding new words in my language by scanning the web. You can make use of
corpusbuilder and text_cat (TODO: how to use these).
Once you have a list of potential words you can use the new-words script in
src/wordlist to identify words that are not in your language. Review these
words and add them to your master wordlist.
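
As a rough illustration of the review step, the sketch below lists candidate words that are not yet in a master wordlist. It is not the project's new-words script: the function name and the sample words are invented for the example, and it assumes that words are compared case-insensitively, which may not suit every language.

```python
def find_new_words(candidates, master):
    """Return candidate words that are missing from the master wordlist.

    Both arguments are iterables of words. Blank entries are skipped and
    comparison is case-insensitive (an assumption made for this sketch).
    """
    known = {word.strip().lower() for word in master if word.strip()}
    seen = {word.strip().lower() for word in candidates if word.strip()}
    return sorted(seen - known)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    master = ["huis", "boom", "kat"]
    candidates = ["Huis", "hond", "kat", "voël", ""]
    for word in find_new_words(candidates, master):
        print(word)
```

The output is only a list of words to review by hand; nothing is added to the master wordlist automatically.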

Letter Frequencies
The Translate project has a simple Python script that creates letter frequencies
that can be used in the MySpell TRY line.
See translate/src/wordlist/letter-frequency.py
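
To show the idea behind such a script, here is a minimal sketch that counts letters in a wordlist and orders them from most to least frequent, which is the order a TRY line expects. It is not the project's letter-frequency.py; the function name is made up for the example, and lowercasing every word is a simplification you may not want for your language.

```python
from collections import Counter


def try_line(words):
    """Build a TRY line with letters ordered from most to least frequent."""
    counts = Counter()
    for word in words:
        # Count alphabetic characters only; lowercasing is a simplification.
        counts.update(ch for ch in word.strip().lower() if ch.isalpha())
    # Sort by descending count, then alphabetically so ties are stable.
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "TRY " + "".join(ordered)


if __name__ == "__main__":
    # Hypothetical three-word wordlist, for illustration only.
    print(try_line(["banana", "band", "cab"]))
```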

Building
The easiest way to build your spellcheckers is to use our project's spellchecker
build framework. This will build MySpell and Aspell (Ispell temporarily disabled)
spellcheckers from a common wordlist or wordlists. Look at the Afrikaans and
Zulu dictionaries for a template of the process.

Making it work
Make sure that your language is included in:
http://cvs.gnome.org/viewcvs/gnome-spell/gnome-spell/dictionary.c
so that Gnome applications such as Evolution can make use of your Aspell
spellchecker.

Publishing
* OpenOffice.org
To get the spellchecker onto the OpenOffice.org pages, and thus downloadable
from within OpenOffice.org, you will need to submit a bug report. Here is an
example issue:
http://www.openoffice.org/issues/show_bug.cgi?id=23201
* Aspell
TODO
* Mozilla
TODO: have tried requesting updates on the Mozilla dictionary site but no
response.

References
Debian Spellchecker packaging policies
http://dict-common.alioth.debian.org/dsdt-policy.txt
OpenOffice.org Lingucomponent

Online Translation Tools

There are a number of online translation tools that you might want to use to
assist your translation projects. Being online, they allow you to work from
anywhere and allow many people to participate without having to install
software or configure a PO editing tool.
Obviously the writers of this document are biased in that the Translate
Toolkit also has a web-based translation tool: Pootle.

Web-based Tools
* Pootle - http://pootle.wordforge.org
A PO file based translation and translation project management system.
Actively being developed and used by the Translate.org.za project as part of
the Translate Toolkit.
* Kartouche - http://i18n.kde.org/tools/kartouche/
Developed for the Welsh localisation team. See also Omnivore, their online
compendium tool.
The Welsh team have a live site here:
http://www.kyfieithu.co.uk/
* IRMA - http://info.linspire.com/irma/
Developed by Linspire, aiming to make things as simple as possible.
See also: http://www.linspire.com/lindows_michaelsminutes_archives.php?id=147&all=1
* Rosetta - http://launchpad.ubuntu.com/rosetta
Developed by Ubuntu Linux. Aims to make it easy to translate software.
* WeBabel
A PHP & MySQL translation portal
http://kazit.berlios.de/webabel/
* An OpenOffice.org specific tool
Used by the Danish team and developed by Søren Thing Pedersen
http://www.things.dk/webtranslation

POEdit

POEdit is a PO editing tool that runs on Linux or Windows. It has a clean,
simple interface as well as quite a good translation memory. Its catalog
manager does not seem logical at first but is powerful in that it allows you to
manage multiple projects.
Visit POEdit's home page for more details:
http://poedit.sourceforge.net/

Download
http://poedit.sourceforge.net/download.php

Setup
When you first run POEdit you will be asked for some information. These items
are stored in the header of the PO files and include your name, email address,
etc. The tabbed dialogue that appears needs the following information:
* Personalize: Supply your name and email address
* Editor: you can safely leave these unchanged
* Translation memory: leave unchanged for now
* Parser: unchanged
Click OK and you are presented with the PO editing interface.

The first time with POEdit
Once POEdit is started you will see the translation interface.
Click File -> Open to open an existing PO file. If you have no PO file, or only
a POT file, simply copy the POT file so that its name ends in .po.
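
If you have many templates to copy, the step can be scripted. The sketch below copies a POT file to a PO file with the same base name; the function name is invented for the example, and refusing to overwrite an existing PO file is a design choice of this sketch, not something POEdit requires.

```python
import shutil
from pathlib import Path


def pot_to_po(pot_path):
    """Copy a .pot template to a .po file with the same base name."""
    pot_path = Path(pot_path)
    po_path = pot_path.with_suffix(".po")
    # Never clobber an existing translation.
    if po_path.exists():
        raise FileExistsError(f"{po_path} already exists; not overwriting")
    shutil.copyfile(pot_path, po_path)
    return po_path
```

For example, `pot_to_po("gedit.pot")` would create `gedit.po` next to the template, ready to open in POEdit.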
With a PO file open you will see all strings in the top part of the interface.
Strings are shown in different colours:
Cyan: untranslated
Yellow: fuzzy - needs to be examined and might need to be retranslated
White: translated
You translate in the lower half of the interface. The original (English)
string appears on the left. You type your translation in the bottom
right-hand section. It seems that keyboard navigation does not work properly,
so you will have to select each new string with the mouse.
Don't forget to save regularly (Ctrl-S).
Press F1 at any time for the POEdit user manual. We recommend that you browse
it now.

Using POEdit
TODO

Using the Catalogue Manager
The catalogue manager allows you to manage projects with multiple PO files.
File -> Catalog manager
Click on create new translation project. Then add each directory that has PO
files to the project. Unfortunately it will not automatically recurse
through directories.
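
Because the catalogue manager does not recurse, it can help to list the directories you need to add by hand. This is a small sketch, not part of POEdit; the function name is made up for the example.

```python
import os


def po_directories(root):
    """Return every directory under root that directly contains a .po file."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any(name.endswith(".po") for name in filenames):
            found.append(dirpath)
    return sorted(found)


if __name__ == "__main__":
    # Print one directory per line, starting from the current directory.
    for directory in po_directories("."):
        print(directory)
```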

Using Translation Memory
TODO

Using PO editing tools

You need a PO editor to edit PO files. PO is the main format used by Free Software, and it is also the format that the Translate Toolkit uses when converting Mozilla and OpenOffice.org files for editing.
There are a few PO editing tools available. Your choice really depends on which platform you are running:
* I use Windows
  Install POEdit or IniTranslator
* I use Linux
  * I use KDE
    Install or use KBabel
  * I use Gnome
    Install or use GTranslator
* I am not allowed to install software on my computer
  If possible use a web-based translation tool. Not all translation projects have a web interface for translation. Possible options are Pootle, Rosetta and Kartouche.
* I am not allowed to use the web
  If you cannot install software or use the web then you might want to look at using CSV files edited in a spreadsheet. You will need access to email though.

Other alternatives
For the masochists:
CSV - You can use a spreadsheet to edit PO files that have been converted into CSV. It works quite well but is not recommended if you have access to one of the PO editors listed above.
vi - You can use the vi editor. Make sure you install the updated PO syntax highlighter available here: http://www.vim.org/scripts/script.php?script_id=913
XLIFF - This is an emerging translation interchange standard. We will see much more available in this format: convertors are being created to move PO files to XLIFF, and a number of editors, both GPL and commercial, are being made available.

Gnome

Gnome is a Linux graphical desktop environment. The project has quite an
established localisation initiative.

Reading
Please familiarise yourself with these before proceeding:
* Gnome localisation style guide (These are actually Sun Solaris style guides)
http://developer.gnome.org/projects/gtp/style-guides/
* Localisation Guide
http://developer.gnome.org/projects/gtp/l10n-guide/
* Localisation section in the developers website
http://developer.gnome.org/arch/i18n/l10n.html
* Gnome Translation Project
http://developer.gnome.org/projects/gtp/

Useful URLs
* Status Page
http://l10n-status.gnome.org/
* Resources Page
http://developer.gnome.org/projects/gtp/resources.html

Translating
This is a roughly sequential outline of the steps you need to take to translate
Gnome into your language.

Joining the mailing list
Subscribe to the gnome-i18n mailing list:
http://mail.gnome.org/mailman/listinfo/gnome-i18n/

Gnome Glossary
Start with this to create consistency across future Gnome localisations.
http://developer.gnome.org/projects/gtp/glossary/
Latest versions of the CSV glossary:
http://cvs.gnome.org/viewcvs/gnome-i18n/glossary/GnomeGlossary.csv?view=log
Use the csv-to-pot.sh script to convert the .csv to a POT file. The layout is
slightly different from the layout created by po2csv, so you cannot convert the
created POT file to a Translate Toolkit style CSV file.
If you need to translate using CSV, edit GnomeGlossary.csv directly and
manipulate it into a PO file later.

What files first
Advice from Christian Rose on the Gnome team.
gtk+                  1126  (toolkit, very largish, but many messages are
                            developer-oriented and can safely be ignored to
                            begin with, but some few messages here are very
                            visible to the end user)
libgnomeui             305  (many user-visible menus and stuff)
gnome-mime-data        350  (user-visible file type desc. etc)
libbonoboui             96
gnome-vfs               80  (file size formatting etc)
yelp                    71  (help browser)
gedit                  640  (text editor)
nautilus              1449  (file manager, also very largish, but not all
                            messages here are immediately visible to the end
                            user, even though many are)
gnome-desktop           88
gnome-panel            587
gnome-session          103
gnome-control-center   649
gdm2                   629  (login manager)
eog                    170  (image viewer)
... and then the rest in desktop + developer-libs.
TODO: use podebug to get a better targeting on these files.

Getting a CVS account
You will need a CVS account to commit translations:
http://developer.gnome.org/doc/policies/accounts/requesting.html
Most probably you will only get one if you have supplied translations already
or are a new team leader.
You will need to read this to ensure that you commit correctly to CVS:
http://developer.gnome.org/doc/tutorials/gnome-i18n/translator.html
The steps are well laid out and very clear; you can't go wrong if you follow
them carefully. Even if you are an old-time CVS user, you must read it.
Your translated PO files are placed within each package in GNU style, i.e. in
the po/ directory, unlike the KDE system of all languages in one module. This
means that you will have to check out and add files to the various modules that
you use. For example, to add translations of gnome-mime-data you will need to
check out the module by that name.

Targeting a release
Gnome follows a regular six-monthly development cycle with even-numbered stable
releases and odd-numbered development releases.
You can see the release schedule here:
http://www.gnome.org/start/unstable/
If your team is moving quickly it might be good to target a stable minor
release. This will also be the platform that most users will be on. It also
presents the chance to have multiple releases as you move through each minor
release.
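
The even/odd rule is easy to check in a script. This sketch assumes the rule applies to the second (minor) component of the version number, for example 2.10 being stable and 2.11 a development release; the function name is invented for the example.

```python
def is_stable(version):
    """Return True when the minor number of 'major.minor[.micro]' is even."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected 'major.minor', got {version!r}")
    # Assumption: stability is signalled by the second version component.
    return int(parts[1]) % 2 == 0


if __name__ == "__main__":
    for version in ["2.10", "2.11.3"]:
        label = "stable" if is_stable(version) else "development"
        print(version, label)
```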

Translation Status Page
Your language's translation status page at http://l10n-status.gnome.org/ will be
updated as soon as your first file is committed to CVS.

Setting up your Bugzilla Component
You need a Bugzilla component so that users of your language can report errors.
Instructions for setting one up are here:
http://developer.gnome.org/projects/bugsquad/maintainers.html
This information is courtesy of Christian Rose. You should return these details
to him at: menthos at gnome org.
You need to supply:
* language code
* language name (in English)
* language name (spelled in the language itself. We actually don't
use this info in Bugzilla but on the http://www.gnome.org/i18n/
page. Please replace non-ASCII characters with proper HTML escape
sequences. See the HTML source code of that page for examples)
* default owner (must be a valid bugzilla account)
The default owner is the person who should be assigned the bugs by
default. If he or she doesn't have a bugzilla account, he or she can
create one at http://bugzilla.gnome.org/createaccount.cgi.
* default qa contact (must be a valid bugzilla account)
The default QA contact is usually the person who should make sure
the bug was fixed properly by the assignee. If the qa contact person
doesn't yet have a bugzilla account, he or she can create one at
http://bugzilla.gnome.org/createaccount.cgi. This field is optional,
you don't need to decide on a default qa contact if you don't want
to.
* component description
Usually of the form "Here you can place your bugs about
$LANGUAGENAME [$LANGUAGECODE] translations". Example: "Here you can
place your bugs about Swedish [sv] translations".
If you have the possibility, try also to translate this into
ASCII-only English, and we'll use the translation as well.
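
Two of these details can be prepared with a few lines of Python. The sketch below escapes non-ASCII characters in the native language name and fills in the usual component description. The function names are invented for the example, and it assumes that numeric character references (such as &#241;) are acceptable as "proper HTML escape sequences"; check the HTML source of the page mentioned above if in doubt.

```python
def html_escape_non_ascii(text):
    """Replace non-ASCII characters with numeric HTML character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def component_description(language_name, language_code):
    """Build the usual component description in the form quoted above."""
    return (
        "Here you can place your bugs about "
        f"{language_name} [{language_code}] translations"
    )


if __name__ == "__main__":
    print(html_escape_non_ascii("Español"))
    print(component_description("Swedish", "sv"))
```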
You have the option of assigning the bug reports to a mailing list:
If you want, there's also the possibility to use a mailing list instead
of an individual for the default owner and/or default qa contact fields.
It's a bit more complicated; among other things you need access to the
mailing list configuration. Here is what you should do if you want a
mailing list in one or both of the fields above:
1) Create a bugzilla account for your mailing list, i.e. a Bugzilla
account with your list's address as account name.
2) Subscribe the bugzilla daemon address
(bugzilla-daemon@widget.gnome.org) to your mailing list, but also
disable *ALL* mail from the mailing list to this address (If it's a
Mailman mailing list you can change bugzilla-daemon@widget.gnome.org's
mailing list options to NOMAIL).

Application Specific
Some applications need specific treatment:
* gdm2
The login manager needs patches to gui/gdmlanguages.c and config/locale.alias to
add your languages. Email your patch to "George"
Suggested bug report and related email for adding English (Canadian), use as a reference:
http://mail.gnome.org/archives/gnome-i18n/2004-February/msg00256.html
http://bugzilla.gnome.org/show_bug.cgi?id=135053
An Arabic issue also highlights how it all fits together:
http://mail.gnome.org/archives/gnome-i18n/2004-March/msg00177.html
Actual CVS diffs to add Afrikaans, Northern Sotho and South African English:
http://cvs.gnome.org/viewcvs/gdm2/config/locale.alias?r1=1.38&r2=1.39
http://cvs.gnome.org/viewcvs/gdm2/gui/gdmlanguages.c?r1=1.41&r2=1.42

Translating Documentation
On the gnome-i18n mailing list Christian Rose says, "At the moment,
we don't translate documentation the same way we translate the user
interfaces (i.e. with "po" files). However, we hope
to do so at some point, since po files provide several essential
advantages compared to maintaining translations of plain XML. One such
advantage is that it divides documents into smaller pieces (messages),
allowing you to see exactly what parts have an inconsistent translation
and need updating."
"For the moment, what you may want to do is to use the "xml2po" utility
in the "gnome-doc-utils" package/module. This will allow you to
transform the XML/DocBook source of a document into a pot file that you
can translate and maintain. Also, it allows you to reverse the process
and create a translated XML file out of the po file later on."
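
The round trip described in the quote can be wrapped in a small script. This is a sketch only: it assumes xml2po is installed and that it accepts `-o` to write a POT file and `-p` to merge a PO file back into the XML, printing the result to standard output. Check `xml2po --help` on your system before relying on it. The function names and file names are invented for the example.

```python
import subprocess


def extract_command(xml_file, pot_file):
    """Command line that extracts translatable messages into a POT file."""
    return ["xml2po", "-o", pot_file, xml_file]


def merge_command(po_file, xml_file):
    """Command line that merges a PO file back into the XML source."""
    return ["xml2po", "-p", po_file, xml_file]


def run_merge(po_file, xml_file, translated_xml):
    """Run the merge and save the translated XML (requires xml2po)."""
    with open(translated_xml, "w", encoding="utf-8") as out:
        subprocess.run(merge_command(po_file, xml_file), stdout=out, check=True)


if __name__ == "__main__":
    # Only print the commands here; nothing is executed.
    print(" ".join(extract_command("gedit.xml", "gedit.pot")))
    print(" ".join(merge_command("af.po", "gedit.xml")))
```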

The GNOME Translation Project

About the GTP
GNOME has support for internationalization (I18N) and localization (L10N), so the goal of the GNOME Translation Project is to translate GNOME applications and documentation into every language in existence.
We are mainly volunteers working in our spare time and are thus always searching for new contributors. We could use help translating applications and their manuals and would like you to join us. See our Joining the GTP and Tasks pages for more information.