incubator-ooo-dev mailing list archives

From Dave Fisher <dave2w...@comcast.net>
Subject Re: [RELEASE][3.4.1]: Include only one en-US dictionary extension
Date Tue, 19 Jun 2012 14:01:10 GMT
But none of you are native English speakers. English varies from country to country around
the world. Why not disable or rename the second version when the name collides?

Regards,
Dave

Sent from my iPhone

On Jun 19, 2012, at 7:07 AM, Jürgen Schmidt <jogischmidt@googlemail.com> wrote:

> On 6/19/12 9:17 AM, Andre Fischer wrote:
>> On 19.06.2012 05:07, Ariel Constenla-Haile wrote:
>>> Hi there,
>>> 
>>> there have been some reports of users complaining that the Thesaurus
>>> does not work.
>>> 
>>> The root of the issue is in the dictionary extensions we are shipping:
>>> two of them collide due to lack of uniqueness in the configuration node
>>> name, namely dict-en.oxt (the generic EN dictionary) and
>>> dict-en-nz-2008-12-03.oxt. The conflict happens on the Thesaurus node:
>>> 
>>> * dict-en.oxt:
>>> 
>>> <node oor:name="ThesDic_en-US" oor:op="fuse">
>>>     <prop oor:name="Locations" oor:type="oor:string-list">
>>>         <value>%origin%/th_en_US_v2.dat</value>
>>>     </prop>
>>>     <prop oor:name="Format" oor:type="xs:string">
>>>         <value>DICT_THES</value>
>>>     </prop>
>>>     <prop oor:name="Locales" oor:type="oor:string-list">
>>>         <value>en-GB en-US en-ZA en-AU en-CA</value>
>>>     </prop>
>>> </node>
>>> 
>>> * dict-en-nz-2008-12-03.oxt:
>>> 
>>> <node oor:name="ThesDic_en-US" oor:op="fuse">
>>>     <prop oor:name="Locations" oor:type="oor:string-list">
>>>         <value>%origin%/th_en_US_v2.dat</value>
>>>     </prop>
>>>     <prop oor:name="Format" oor:type="xs:string">
>>>         <value>DICT_THES</value>
>>>     </prop>
>>>     <prop oor:name="Locales" oor:type="oor:string-list">
>>>         <value>en-NZ</value>
>>>     </prop>
>>> </node>
>>> 
>>> As you see, they have the same name, "ThesDic_en-US", despite the fact
>>> that the official documentation states clearly that dictionary extension
>>> developers should use a unique node name, see
>>> http://wiki.services.openoffice.org/wiki/Extension_Dictionaries#Dictionary_entries_.28must_be_provided.29
>>> 
>>> especially "About node names for the dictionaries".
>> 
>> The dict-en-au-2008-12-15 extension did rename its thesaurus file to
>> th_en_AU_v2.dat.  That avoids the conflict but still wastes 18MB
>> of disk space.
>> 
>>> 
>>> I didn't research what the fuse operation is *supposed* to do there
>>> (it's applied to the node, not to the properties), but the documentation
>>> is clear in stating that the node name must be unique. The result is
>>> that the properties are not fused but replaced, so installing the
>>> en-NZ dictionary disables the thesaurus for en-US.
>>> 
>>> As this bug has its root in the dictionary extensions, the only thing we
>>> can do to fix it is to ship just one extension, in this case
>>> dict-en.oxt.
>> 
>> Dropping the other English dictionaries is a good idea for other
>> reasons, too.  Issue 119272
>> (https://issues.apache.org/ooo/show_bug.cgi?id=119272) describes the
>> problem of all dictionaries together using more than 160MB, most of
>> which is the large thesaurus files.  Including only one English
>> dictionary would reduce this number considerably.  Besides, it contains
>> support for most variants of English anyway.
>> 
> 
> +1 for reducing the number of "en" dictionaries. Please file a
> related issue for 3.4.1 so that we can track it.
> 
> Please no commits on AOO34 without a 3.4.1 issue!
> 
> Juergen
> 
> 
>> -Andre
>> 
>> 
>>> 
>>> Note that I only discovered this bug in the English dictionary
>>> extensions; I didn't check other languages, but we should do so in the
>>> cases where we're providing more than one dictionary extension.
>>> 
>>> 
>>> Regards
>>> 
>> 
> 
> 
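
The node-name collision described in the quoted message could be checked for mechanically before a release. The following is only a minimal sketch in Python: the helper names (`dictionary_node_names`, `find_collisions`) are hypothetical, and the `oor:component-data` wrapper around the `<node>` fragments is an assumption about the usual layout of a dictionaries.xcu file, not something shown in the thread. It reports every node name that more than one extension uses.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace bound to the "oor:" prefix (assumed; the thread shows only fragments).
OOR_NS = "http://openoffice.org/2001/registry"
OOR_NAME = f"{{{OOR_NS}}}name"

# Assumed wrapper around the fragment quoted in the thread.
TEMPLATE = """<oor:component-data xmlns:oor="http://openoffice.org/2001/registry"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    oor:name="Linguistic" oor:package="org.openoffice.Office">
  <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
      <node oor:name="{node_name}" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
          <value>%origin%/th_en_US_v2.dat</value>
        </prop>
        <prop oor:name="Locales" oor:type="oor:string-list">
          <value>{locales}</value>
        </prop>
      </node>
    </node>
  </node>
</oor:component-data>"""


def dictionary_node_names(xcu_text):
    """Return the oor:name of every <node> that has a 'Locations' property."""
    root = ET.fromstring(xcu_text)
    names = []
    for node in root.iter("node"):
        # Only direct <prop> children count, so container nodes are skipped.
        props = {prop.get(OOR_NAME) for prop in node.findall("prop")}
        if "Locations" in props:
            names.append(node.get(OOR_NAME))
    return names


def find_collisions(extensions):
    """Map each node name used by several extensions to those extensions."""
    owners = defaultdict(list)
    for ext_name, xcu_text in extensions.items():
        for node_name in dictionary_node_names(xcu_text):
            owners[node_name].append(ext_name)
    return {name: exts for name, exts in owners.items() if len(exts) > 1}


# The two extensions from the thread, both using the node name "ThesDic_en-US".
collisions = find_collisions({
    "dict-en.oxt": TEMPLATE.format(
        node_name="ThesDic_en-US", locales="en-GB en-US en-ZA en-AU en-CA"),
    "dict-en-nz-2008-12-03.oxt": TEMPLATE.format(
        node_name="ThesDic_en-US", locales="en-NZ"),
})
print(collisions)
```

Giving the en-NZ node a name of its own (for example `ThesDic_en-NZ`, a hypothetical choice) makes the reported collision disappear, which is the rename option raised at the top of this reply.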
