incubator-ooo-dev mailing list archives

From Andre Fischer ...@a-w-f.de>
Subject Re: Pootle Data
Date Thu, 09 Feb 2012 15:09:57 GMT
On 09.02.2012 15:23, Huaidong Qiu wrote:
> Where can we get the data? I can help to check and understand the toolset
> and process.

That sounds great.  Please have a look at

http://people.apache.org/~af/index.html

-Andre

>
> On Thu, Feb 9, 2012 at 1:04 AM, Louis Suárez-Potts
> <lsuarezpotts@gmail.com> wrote:
>
>> Hi
>>
>> On 8 February 2012 11:49, Andre Fischer <af@a-w-f.de> wrote:
>>> On 08.02.2012 17:31, Stuart Swales wrote:
>>>>
>>>> On 07/02/2012 14:02, Andre Fischer wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I recently had a little time to look at the pootle data. Here is what I
>>>>> have found out so far. Please keep in mind that this is new for me and
>>>>> that my interpretations may be wrong.
>>>>>
>>>>> For context I will start with a short description of the directory
>>>>> structure of the 80 GB of the backup disk:
>>>>>
>>>>> In the top-level podirectory/ there is a sub-directory openoffice_org/
>>>>> that probably is the translation data of OpenOffice.org. It contains
>>>>> sub-directories for most languages (more on the exact set below.)
>>>>> The content of podirectory is available at [1].
>>>>>
>>>>> Below the top-level backup/ there are two directories DEV_m103/ and
>>>>> DEV_94/ for two milestones. Below these you can find directories like
>>>>> backconvert-110326/ that probably contain backups for certain dates
>>>>> (March 26 2011 in this example). The most recent is
>>>>> DEV_m103/backconvert-110401 from April 1st of last year.
>>>>>
>>>>> After comparing time stamps I now think that we can disregard the whole
>>>>> backup/ directory. There are .po files under podirectory/ that are from
>>>>> later than April 1st. Some files are from May.
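
The time stamp check described above could be sketched roughly as follows. This is only an illustration, not the procedure that was actually used: the function name `po_files_newer_than` and the cutoff value are assumptions, and the result is only meaningful if file modification times survived the copy to the backup disk.

```python
import os
from datetime import datetime


def po_files_newer_than(root, cutoff):
    """Return the sorted paths of .po files under root modified after cutoff.

    Hypothetical helper: it relies on file modification times only.
    """
    newer = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".po"):
                continue
            path = os.path.join(dirpath, name)
            modified = datetime.fromtimestamp(os.path.getmtime(path))
            if modified > cutoff:
                newer.append(path)
    return sorted(newer)
```

A call such as `po_files_newer_than("podirectory", datetime(2011, 4, 1))` would then list the .po files that are newer than the most recent backconvert directory; a non-empty result supports disregarding backup/.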
>>>>>
>>>>> I then tried to find out whether the pootle data are older or newer than
>>>>> the data in the extras/l10n module in our SVN repository. The timestamps
>>>>> in the .sdf files are useless: our tools set them all to 2002-02-02. The
>>>>> file time stamps cannot be used directly because of the differing
>>>>> directory structures.
>>>>>
>>>>> Comparing the set of languages of the pootle server and that in
>>>>> extras/l10n/ was also inconclusive:
>>>>> The set of languages that are present in both data sets is
>>>>> af ar as ast bg bn bo bs ca cs cy da dz es et fa fr fur ga gd gl gu he
>>>>> hi hu id is it ja jbo ka kab kn ko ku lt lv ml mr my nb nl nn nr nso ny
>>>>> oc om or pap pl ps pt ru sc si sk so sq ss st sv ta te th tn tr ts ug
>>>>> uk uz ve vi xh zu
>>>>>
>>>>> Languages only in extras/l10n/ are:
>>>>> be-BY br brx de dgo el eo eu fi hr kid kk km kok ks ky mai mk mn mni ne
>>>>> pa-IN ro rw sa-IN sat sd sh sl sr sw-TZ tg
>>>>>
>>>>> Languages only on the pootle server are:
>>>>> pyg son tk tlh
>>>>>
>>>>> See [2] for a list of language ids. (tlh for example is klingon)
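
The three lists above are a plain set comparison. A minimal sketch of it, assuming the language ids have already been collected as strings (for example from the sub-directory names of openoffice_org/ and extras/l10n/); the function name and the dictionary keys are made up for illustration:

```python
def compare_language_sets(pootle_ids, l10n_ids):
    """Split two collections of language ids into shared and exclusive sets."""
    pootle = set(pootle_ids)
    l10n = set(l10n_ids)
    return {
        "both": sorted(pootle & l10n),         # present in both data sets
        "only_l10n": sorted(l10n - pootle),    # only in extras/l10n/
        "only_pootle": sorted(pootle - l10n),  # only on the pootle server
    }


# Tiny sample taken from the lists above, not the full data.
result = compare_language_sets(
    ["af", "ar", "pyg", "tlh"],
    ["af", "ar", "de", "fi"],
)
print(result)
```

Note that this compares the ids literally, so two spellings of the same language count as two different entries.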
>>>>>
>>>>>
>>>>> So, we probably have to merge both data sets and hope for the best.
>>>>> Any information from people who know the localization process better is
>>>>> welcome.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Andre
>>>>>
>>>>>
>>>>> [1] http://people.apache.org/~af/index.html
>>>>> [2] http://www.loc.gov/standards/iso639-2/php/code_list.php
>>>>
>>>>
>>>>
>>>> And what has happened to en-GB and en-ZA ?
>>>
>>>
>>> Ah, at least one person who reads my mails :-)
>>>
>>> I forgot to add the following languages as being present in both locations:
>>>     ca-XV en-GB en-ZA pt-BR zh-CN zh-TW
>>>
>>> Reason: These six language ids are written slightly differently on the
>>> pootle server (with a '_' (underscore) in the middle) and in l10n/ (with a
>>> '-' (dash)).  I sorted them differently and then forgot about them. Sorry.
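
One way to avoid this kind of mismatch is to bring both spellings into one form before comparing. The sketch below assumes that the separator is the only difference between the two spellings; `normalize_language_id` is a hypothetical helper, not part of any existing tool.

```python
def normalize_language_id(lang_id):
    """Rewrite the pootle spelling (en_GB) to the l10n/ spelling (en-GB)."""
    return lang_id.replace("_", "-")


# Sample ids only: three of the six affected ids plus one unaffected id.
pootle_ids = {"af", "en_GB", "pt_BR", "zh_CN"}
l10n_ids = {"af", "en-GB", "pt-BR", "zh-CN"}

# Literal comparison misses every id that contains a separator.
literal_match = pootle_ids & l10n_ids

# Comparison after normalizing the pootle spelling finds all of them.
normalized_match = {normalize_language_id(i) for i in pootle_ids} & l10n_ids

print(sorted(literal_match))
print(sorted(normalized_match))
```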
>>
>> Thanks. And I too actually read your mail messages :-) and deeply
>> appreciate the work.
>>
>> ciao
>> louis
>>>
>>> -Andre
>>
>
