db-derby-user mailing list archives

From Sylvain Leroux <s...@wanadoo.fr>
Subject Re: Accent-insensitive searches
Date Fri, 11 Sep 2009 08:33:37 GMT
Hi Josu,

And sorry for the late reply.


I tried exactly the same thing as you yesterday, except that I used
Locale.FRENCH as the /base/ locale:
 >         Collator c=Collator.getInstance(Locale.FRENCH);

It works like a charm (both with = and LIKE).
I am using Apache Derby 10.5.1.1 and Sun JDK 1.6.0_12.
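
For anyone who wants to reproduce this outside Derby, here is a minimal
sketch of the collator on its own. It assumes the same PRIMARY-strength
setup as in the provider quoted below, with Locale.FRENCH swapped in as
the base locale. The class and method names are made up for
illustration, and it only exercises java.text.Collator, not Derby.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper class, not part of the original thread.
final class FrenchBaseCollatorCheck {

    // Same idea as the quoted getInstance() body, but based on Locale.FRENCH.
    static Collator accentInsensitive() {
        Collator c = Collator.getInstance(Locale.FRENCH);
        // PRIMARY strength compares base letters only, so accent
        // (and also case) differences are ignored.
        c.setStrength(Collator.PRIMARY);
        return c;
    }

    public static void main(String[] args) {
        Collator c = accentInsensitive();
        // "\u00e9" is the accented e, escaped to avoid source-encoding issues.
        System.out.println("compare = " + c.compare("electrico", "el\u00e9ctrico"));
        System.out.println("equals  = " + c.equals("electrico", "el\u00e9ctrico"));
    }
}
```

If the comparison is 0 here but the SQL search still misses the accented
rows, the problem is more likely in how the database was created than in
the collator.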


Maybe you should try with a more recent version of Derby (?)


Otherwise, check that you specified your custom collator at DB
creation time, not when booting the DB: even if you specify another
collator when you boot an existing DB, the collator specified at
creation time is still the one that is used.
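
As a rough sketch of what "at creation time" means in practice, the
attributes go on the connection URL that carries create=true. The
database name "accentDb" and the class below are hypothetical. The
territory value is the one from your message, and I am assuming the
embedded driver with derby.jar on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class, not part of the original thread.
final class CreateCollatedDb {

    // Builds an embedded-Derby URL that fixes territory and collation
    // at creation; both attributes sit next to create=true.
    static String creationUrl(String dbName, String territory) {
        return "jdbc:derby:" + dbName
                + ";create=true"
                + ";territory=" + territory
                + ";collation=TERRITORY_BASED";
    }

    public static void main(String[] args) throws SQLException {
        String url = creationUrl("accentDb", "es_ES_accentinsensitive");
        System.out.println(url);
        // Only touch the database when explicitly asked to, since this
        // step needs the Derby embedded driver on the classpath.
        if (args.length > 0 && "--connect".equals(args[0])) {
            Connection conn = DriverManager.getConnection(url);
            conn.close();
        }
    }
}
```

To see what an existing database was really created with, you can
query the stored property:
VALUES SYSCS_UTIL.SYSCS_GET_DATABASE_PROPERTY('derby.database.collation')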


Finally, the least likely explanation: an accent difference might be a
PRIMARY difference for the default es_ES collator (??). But that would
be really surprising...
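
This last hypothesis can be checked without Derby. The sketch below
(class and method names are made up for illustration) looks for the
weakest strength at which the JVM's es_ES collator tells two strings
apart. If it reports PRIMARY for the accent pair, setting the strength
to PRIMARY cannot make the search accent-insensitive. I would expect a
weaker level, but the result depends on the JVM's collation data.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical diagnostic class, not part of the original thread.
final class StrengthProbe {

    // Returns the weakest Collator strength at which a and b compare as
    // different for the given locale, or -1 if they never differ.
    static int firstDistinguishingStrength(Locale locale, String a, String b) {
        int[] strengths = {
            Collator.PRIMARY, Collator.SECONDARY,
            Collator.TERTIARY, Collator.IDENTICAL
        };
        for (int strength : strengths) {
            Collator c = Collator.getInstance(locale);
            c.setStrength(strength);
            if (c.compare(a, b) != 0) {
                return strength;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Locale esES = new Locale("es", "ES");
        int s = firstDistinguishingStrength(esES, "electrico", "el\u00e9ctrico");
        System.out.println("accent pair first differs at strength " + s
                + " (PRIMARY=" + Collator.PRIMARY
                + ", SECONDARY=" + Collator.SECONDARY + ")");
    }
}
```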


Let us know if you find the answer!

Best regards,
Sylvain


josu wrote:
> 
> I'm working on a database application. Items in the database are all in
> Spanish. It's mandatory that searches are accent-insensitive,
> meaning that, for example, a search for the word 'electrico' (no accent)
> must return entries containing 'eléctrico' (with accent).
> 
> Searching the web for a solution, I find I must set these two properties
> when creating the database: 
> 
> territory=es_ES
> collation=TERRITORY_BASED
> 
> But it still doesn't work this way. Looks like the default collation for
> es_ES is still accent-sensitive.
> 
> So I tried to use a custom collator that behaves as I need. I found some
> instructions for this in the following blog:
> 
>   http://blogs.sun.com/kah/entry/user_defined_collation_in_apache
> 
>  In brief, I define a new CollatorProvider and register it with the JVM.
> Here's the code for this class:
> 
> 
> import java.text.Collator;
> import java.util.Locale;
> 
> public class IgnoraAcentosCollatorProvider extends java.text.spi.CollatorProvider {
> 
>     @Override
>     public Collator getInstance(Locale locale) {
>         // Serve only the custom es_ES_accentinsensitive locale.
>         if (!locale.equals(new Locale("es","ES","accentinsensitive"))){
>             throw new IllegalArgumentException("Solo acepta es_ES_accentinsensitive");
>         }
>         // PRIMARY strength ignores accent (and case) differences.
>         Collator c=Collator.getInstance(new Locale("es","ES"));
>         c.setStrength(Collator.PRIMARY);
>         return c;
>     }
> 
>     @Override
>     public Locale[] getAvailableLocales() {
>         return new Locale[]{
>             new Locale("es","ES","accentinsensitive")
>         };
>     }
> 
> }
> 
> 
>  It simply takes the default es_ES Collator and changes strength to PRIMARY.
> This makes the collator return 0 when comparing 'electrico' and 'eléctrico'.
> 
> After making sure this new Collator is available for the JVM, I re-start
> Derby and make a new database, now setting territory=es_ES_accentinsensitive
> 
> The database is created without errors (meaning Derby reaches my Collator),
> but searches are still accent-sensitive (no matter if I use = or LIKE
> operators).
> 
> Any clue? I searched extensively for this issue but found no
> solution. I could avoid the problem simply by using MySQL (the default
> Spanish configuration already has the desired behaviour), but I would
> like to keep on using Derby if possible.
> 
> I'm using JavaDB-Derby 10.4.2.1
> 
> Thanks. 


-- 
Website: http://www.chicoree.fr


