incubator-ooo-dev mailing list archives

From Andre Fischer ...@a-w-f.de>
Subject Re: [RELEASE][3.4.1]: Include only one en-US dictionary extension
Date Tue, 19 Jun 2012 07:17:06 GMT
On 19.06.2012 05:07, Ariel Constenla-Haile wrote:
> Hi there,
>
> there have been some reports of users complaining that the Thesaurus
> does not work.
>
> The root of the issue is in the dictionary extensions we are shipping:
> two of them collide due to lack of uniqueness in the configuration node
> name, namely dict-en.oxt (the generic EN dictionary) and
> dict-en-nz-2008-12-03.oxt. The conflict happens on the Thesaurus node:
>
> * dict-en.oxt:
>
> <node oor:name="ThesDic_en-US" oor:op="fuse">
>      <prop oor:name="Locations" oor:type="oor:string-list">
>          <value>%origin%/th_en_US_v2.dat</value>
>      </prop>
>      <prop oor:name="Format" oor:type="xs:string">
>          <value>DICT_THES</value>
>      </prop>
>      <prop oor:name="Locales" oor:type="oor:string-list">
>          <value>en-GB en-US en-ZA en-AU en-CA</value>
>      </prop>
> </node>
>
> * dict-en-nz-2008-12-03.oxt:
>
> <node oor:name="ThesDic_en-US" oor:op="fuse">
>      <prop oor:name="Locations" oor:type="oor:string-list">
>          <value>%origin%/th_en_US_v2.dat</value>
>      </prop>
>      <prop oor:name="Format" oor:type="xs:string">
>          <value>DICT_THES</value>
>      </prop>
>      <prop oor:name="Locales" oor:type="oor:string-list">
>          <value>en-NZ</value>
>      </prop>
> </node>
>
> As you see, they have the same name, "ThesDic_en-US", despite the fact
> that the official documentation states clearly that dictionary extension
> developers should use a unique node name, see
> http://wiki.services.openoffice.org/wiki/Extension_Dictionaries#Dictionary_entries_.28must_be_provided.29
> especially "About node names for the dictionaries".

The dict-en-au-2008-12-15 extension did rename its thesaurus file, to 
th_en_AU_v2.dat.  That avoids the conflict but still wastes 18MB 
of disk space.
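
For illustration only, here is a sketch of what a uniquely named thesaurus 
node for the en-NZ extension might look like if it followed the wiki's 
advice on node names.  The node name "ThesDic_en-NZ" is an assumption made 
up for this example and is not taken from any shipped extension; the 
remaining values are copied from the entry quoted above.

```xml
<!-- Hypothetical node name: "ThesDic_en-NZ" is assumed for illustration,
     chosen so that it no longer collides with "ThesDic_en-US" in dict-en.oxt -->
<node oor:name="ThesDic_en-NZ" oor:op="fuse">
     <prop oor:name="Locations" oor:type="oor:string-list">
         <value>%origin%/th_en_US_v2.dat</value>
     </prop>
     <prop oor:name="Format" oor:type="xs:string">
         <value>DICT_THES</value>
     </prop>
     <prop oor:name="Locales" oor:type="oor:string-list">
         <value>en-NZ</value>
     </prop>
</node>
```

Such a rename would only address the node name collision; the duplicated 
thesaurus data would still be shipped.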

>
> I didn't research what the fuse operation is *supposed* to do there
> (it's applied to the node, not to the properties), but the documentation
> is clear in stating that the node name must be unique. And the result is
> that the properties are not fused but replaced, with the effect that
> installing the en-NZ dictionary disables the thesaurus for en-US.
>
> As this bug has its root in the dictionary extensions, the only thing we
> can do to fix it is to ship just one extension, in this case
> dict-en.oxt.

Dropping the other English dictionaries is a good idea for other 
reasons, too.  Issue 119272 
(https://issues.apache.org/ooo/show_bug.cgi?id=119272) describes the 
problem that all dictionaries together use more than 160MB, most of 
which is taken up by the large thesaurus files.  Including only one 
English dictionary would reduce this number considerably.  Besides, that 
one dictionary already supports most variants of English anyway.

-Andre


>
> Note that I only discovered this bug in the English dictionary
> extensions. I didn't check other languages, but we should do so in the
> cases where we're providing more than one dictionary extension.
>
>
> Regards
>

