incubator-ooo-dev mailing list archives

From Dave Fisher <dave2w...@comcast.net>
Subject Re: Pootle Data
Date Thu, 09 Feb 2012 15:45:43 GMT
Hi Andre,

On Feb 9, 2012, at 7:09 AM, Andre Fischer wrote:

> On 09.02.2012 15:23, Huaidong Qiu wrote:
>> Where can we get the data? I can help to check and understand the toolset
>> and process.
> 
> That sounds great.  Please have a look at
> 
> http://people.apache.org/~af/index.html

It would be a good idea to add an INFRA issue to JIRA to track loading of this data to the
Apache Pootle Server.

Regards,
Dave


> 
> -Andre
> 
>> 
>> On Thu, Feb 9, 2012 at 1:04 AM, Louis Suárez-Potts
>> <lsuarezpotts@gmail.com> wrote:
>> 
>>> Hi
>>> 
>>> On 8 February 2012 11:49, Andre Fischer<af@a-w-f.de>  wrote:
>>>> On 08.02.2012 17:31, Stuart Swales wrote:
>>>>> 
>>>>> On 07/02/2012 14:02, Andre Fischer wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I recently had a little time to look at the pootle data. Here is what I
>>>>>> have found out so far. Please keep in mind that this is new for me and
>>>>>> that my interpretations may be wrong.
>>>>>> 
>>>>>> For context I will start with a short description of the directory
>>>>>> structure of the 80 GB of the backup disk:
>>>>>> 
>>>>>> In the top-level podirectory/ there is a sub-directory openoffice_org/
>>>>>> that probably is the translation data of OpenOffice.org. It contains
>>>>>> sub-directories for most languages (more on the exact set below.)
>>>>>> The content of podirectory is available at [1].
>>>>>> 
>>>>>> Below the top-level backup/ there are two directories DEV_m103/ and
>>>>>> DEV_94/ for two milestones. Below these you can find directories like
>>>>>> backconvert-110326/ that probably contain backups for certain dates
>>>>>> (March 26 2011 in this example). The most recent is
>>>>>> DEV_m103/backconvert-110401 from April 1st of last year.
>>>>>> 
>>>>>> After comparing time stamps I now think that we can disregard the whole
>>>>>> backup/ directory. There are .po files under podirectory/ that are from
>>>>>> later than April 1st. Some files are from May.
>>>>>> 
>>>>>> I then tried to find out whether the pootle data are older or newer than
>>>>>> the data in the extras/l10n module in our SVN repository. The timestamps
>>>>>> in the .sdf files are useless, our tools set them all to 2002-02-02. The
>>>>>> file time stamps can not be used directly because of the differing
>>>>>> directory structures.
>>>>>> 
>>>>>> Comparing the set of languages of the pootle server and that in
>>>>>> extras/l10n/ was also inconclusive:
>>>>>> The set of languages that are present in both data sets is
>>>>>> af ar as ast bg bn bo bs ca cs cy da dz es et fa fr fur ga gd gl gu he
>>>>>> hi hu id is it ja jbo ka kab kn ko ku lt lv ml mr my nb nl nn nr nso ny
>>>>>> oc om or pap pl ps pt ru sc si sk so sq ss st sv ta te th tn tr ts ug
>>>>>> uk uz ve vi xh zu
>>>>>> 
>>>>>> Languages only in extras/l10n/ are:
>>>>>> be-BY br brx de dgo el eo eu fi hr kid kk km kok ks ky mai mk mn mni ne
>>>>>> pa-IN ro rw sa-IN sat sd sh sl sr sw-TZ tg
>>>>>> 
>>>>>> Languages only on the pootle server are:
>>>>>> pyg son tk tlh
>>>>>> 
>>>>>> See [2] for a list of language ids. (tlh, for example, is Klingon.)
>>>>>> 
>>>>>> 
>>>>>> So, we probably have to merge both data sets and hope for the best.
>>>>>> Any information from people who know the localization process better is
>>>>>> welcome.
>>>>>> 
>>>>>> 
>>>>>> Regards,
>>>>>> Andre
>>>>>> 
>>>>>> 
>>>>>> [1] http://people.apache.org/~af/index.html
>>>>>> [2] http://www.loc.gov/standards/iso639-2/php/code_list.php
>>>>> 
>>>>> 
>>>>> 
>>>>> And what has happened to en-GB and en-ZA ?
>>>> 
>>>> 
>>>> Ah, at least one person who reads my mails :-)
>>>> 
>>>> I forgot to add the following languages as being present in both locations:
>>>>    ca-XV en-GB en-ZA pt-BR zh-CN zh-TW
>>>> 
>>>> Reason: These six language ids are written slightly differently on the
>>>> pootle server (with a '_' (underline) in the middle) and in l10n/ (with a
>>>> '-' (dash)).  I sorted them differently and then forgot about them. Sorry.
>>> 
>>> Thanks. And I too actually read your mail messages :-) and deeply
>>> appreciate the work.
>>> 
>>> ciao
>>> louis
>>>> 
>>>> -Andre
>>> 
>> 
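
The language-set comparison discussed in the thread, including the '_' versus '-' mismatch that hid en-GB and friends, could be sketched roughly as below. This is only an illustration: the function names and the two short sample lists are made up for the example and are not the full lists or the tooling used on the actual data.

```python
def normalize(lang_id):
    """Map Pootle-style ids like 'en_GB' to l10n-style ids like 'en-GB'."""
    return lang_id.replace("_", "-")


def compare_language_sets(pootle_ids, l10n_ids):
    """Return the ids present in both sets and those unique to each side."""
    pootle = {normalize(i) for i in pootle_ids}
    l10n = {normalize(i) for i in l10n_ids}
    return {
        "both": sorted(pootle & l10n),
        "only_l10n": sorted(l10n - pootle),
        "only_pootle": sorted(pootle - l10n),
    }


# Hypothetical sample data, chosen only to show the effect of normalizing.
pootle_sample = ["af", "en_GB", "pt_BR", "tlh", "zh_CN"]
l10n_sample = ["af", "de", "en-GB", "pt-BR", "zh-CN"]

result = compare_language_sets(pootle_sample, l10n_sample)
for key, ids in result.items():
    print(key + ":", " ".join(ids))
```

Normalizing before building the sets keeps ids such as en_GB and en-GB from being reported as two different languages.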

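The time stamp check mentioned in the thread (finding .po files under podirectory/ that are newer than the latest backup) might look something like the following sketch. The function name is hypothetical, and the "podirectory" path in the usage lines is only assumed from the directory name given in the mail; adjust it to wherever the data actually lives.

```python
import os
from datetime import datetime, timezone


def newest_po_file(root):
    """Walk root and return (path, mtime) of the most recently modified
    .po file, or None if no .po file is found."""
    newest_path, newest_mtime = None, float("-inf")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".po"):
                continue
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            if mtime > newest_mtime:
                newest_path, newest_mtime = path, mtime
    if newest_path is None:
        return None
    return newest_path, datetime.fromtimestamp(newest_mtime, tz=timezone.utc)


# Assumed location of the Pootle data; a missing directory simply yields None.
found = newest_po_file("podirectory")
if found is None:
    print("no .po files found")
else:
    print("newest .po file:", found[0], "modified", found[1].isoformat())
```

File modification times only reflect when a file was last written on that disk, so a copy or restore can change them; treat the result as a hint rather than proof of which data set is newer.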
