incubator-ooo-dev mailing list archives

From Jürgen Schmidt <jogischm...@googlemail.com>
Subject Re: [RELEASE][3.4.1]: Include only one en-US dictionary extension
Date Tue, 19 Jun 2012 11:07:38 GMT
On 6/19/12 9:17 AM, Andre Fischer wrote:
> On 19.06.2012 05:07, Ariel Constenla-Haile wrote:
>> Hi there,
>>
>> there have been some reports of users complaining that the Thesaurus
>> does not work.
>>
>> The root of the issue is in the dictionary extensions we are shipping:
>> two of them collide due to lack of uniqueness in the configuration node
>> name, namely dict-en.oxt (the generic EN dictionary) and
>> dict-en-nz-2008-12-03.oxt. The conflict happens on the Thesaurus node:
>>
>> * dict-en.oxt:
>>
>> <node oor:name="ThesDic_en-US" oor:op="fuse">
>>      <prop oor:name="Locations" oor:type="oor:string-list">
>>          <value>%origin%/th_en_US_v2.dat</value>
>>      </prop>
>>      <prop oor:name="Format" oor:type="xs:string">
>>          <value>DICT_THES</value>
>>      </prop>
>>      <prop oor:name="Locales" oor:type="oor:string-list">
>>          <value>en-GB en-US en-ZA en-AU en-CA</value>
>>      </prop>
>> </node>
>>
>> * dict-en-nz-2008-12-03.oxt:
>>
>> <node oor:name="ThesDic_en-US" oor:op="fuse">
>>      <prop oor:name="Locations" oor:type="oor:string-list">
>>          <value>%origin%/th_en_US_v2.dat</value>
>>      </prop>
>>      <prop oor:name="Format" oor:type="xs:string">
>>          <value>DICT_THES</value>
>>      </prop>
>>      <prop oor:name="Locales" oor:type="oor:string-list">
>>          <value>en-NZ</value>
>>      </prop>
>> </node>
>>
>> As you see, they have the same name, "ThesDic_en-US", despite the fact
>> that the official documentation states clearly that dictionary extension
>> developers should use a unique node name, see
>> http://wiki.services.openoffice.org/wiki/Extension_Dictionaries#Dictionary_entries_.28must_be_provided.29
>>
>> especially "About node names for the dictionaries".
> 
> dict-en-au-2008-12-15 renamed its thesaurus file to th_en_AU_v2.dat.
> That avoids the conflict but still wastes 18MB of disk space.
> 
>>
>> I didn't research what the fuse operation is *supposed* to do there
>> (it's applied to the node, not to the properties), but the documentation
>> is clear in stating that the node name must be unique. The result is
>> that the properties are not fused but replaced, with the effect that
>> installing the en-NZ dictionary disables the thesaurus for en-US.
>>
>> As this bug has its root in the dictionary extensions, the only thing we
>> can do to fix it is to ship just one extension, in this case
>> dict-en.oxt.
> 
> Dropping the other English dictionaries is a good idea for other
> reasons, too.  Issue 119272
> (https://issues.apache.org/ooo/show_bug.cgi?id=119272) describes the
> problem of all dictionaries together using more than 160MB, most of it
> for the large thesaurus files.  Including only one English dictionary
> would reduce this number considerably.  Besides, that one dictionary
> supports most variants of English anyway.
> 

+1 for reducing the number of "en" dictionaries. Please file a
related issue for 3.4.1 so that we can track it.

Please no commits on AOO34 without a 3.4.1 issue!

Juergen


> -Andre
> 
> 
>>
>> Note that I only discovered this bug in the English dictionary
>> extensions. I didn't check other languages, but we should do so in the
>> cases where we're providing more than one dictionary extension.
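A check like the one suggested above could be scripted. The following is a minimal sketch, assuming each extension's dictionaries.xcu text has already been extracted from its .oxt and that dictionary entries sit directly under a node named "Dictionaries". The `find_duplicate_nodes` and `sample_xcu` helpers, the simplified sample documents, and the "ThesDic_en-AU" node name are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

OOR_NS = "http://openoffice.org/2001/registry"
OOR_NAME = f"{{{OOR_NS}}}name"  # ElementTree key for the oor:name attribute


def find_duplicate_nodes(xcu_by_extension):
    """Return {node name: [extensions]} for names defined more than once."""
    owners = defaultdict(list)
    for ext, xml_text in xcu_by_extension.items():
        root = ET.fromstring(xml_text)
        for parent in root.iter("node"):
            if parent.get(OOR_NAME) != "Dictionaries":
                continue
            # Direct children of "Dictionaries" are the dictionary entries.
            for child in parent.findall("node"):
                owners[child.get(OOR_NAME)].append(ext)
    return {name: exts for name, exts in owners.items() if len(exts) > 1}


def sample_xcu(node_name, locales):
    """Build a simplified dictionaries.xcu document for the demo."""
    return f"""<oor:component-data xmlns:oor="{OOR_NS}"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    oor:name="Linguistic" oor:package="org.openoffice.Office">
  <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
      <node oor:name="{node_name}" oor:op="fuse">
        <prop oor:name="Locales" oor:type="oor:string-list">
          <value>{locales}</value>
        </prop>
      </node>
    </node>
  </node>
</oor:component-data>"""


extensions = {
    "dict-en.oxt": sample_xcu("ThesDic_en-US", "en-GB en-US en-ZA en-AU en-CA"),
    "dict-en-nz-2008-12-03.oxt": sample_xcu("ThesDic_en-US", "en-NZ"),
    # Hypothetical extension with a unique node name, for contrast.
    "dict-en-au-example.oxt": sample_xcu("ThesDic_en-AU", "en-AU"),
}

for name, exts in find_duplicate_nodes(extensions).items():
    print(f"{name} is defined by: {', '.join(exts)}")
```

Running the same function over the files of every bundled language would flag any other pair of extensions that share a node name.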
>>
>>
>> Regards
>>
> 


