db-derby-user mailing list archives

From "dev@xx" <...@proxiflex.fr>
Subject Re: Accent-insensitive searches
Date Fri, 11 Sep 2009 16:57:15 GMT
Good evening Sylvain,

I was just starting to look for a way to implement case-insensitive and
accent-insensitive SELECTs in Derby when I received this mail.

I tried what you suggest but I can't get it to work. I know this is not
purely a Derby question, but I guess that those who are not familiar with
the SPI mechanism may hit the same problem, so I am posting it on this
mailing list.

I have built my jar with the META-INF/services file, and a call to
ServiceLoader.load( java.text.spi.CollatorProvider.class ) returns my class
correctly. However, Derby throws ERROR XBM04 because it can't find my
provider.
I searched the Java API source and found that
ServiceLoader.loadInstalled() is called by
sun.util.LocaleServiceProviderPool (Locale.getAvailableLocales()).
The javadoc says: "This method is intended for use when only installed
providers are desired. The resulting service will only find and load
providers that have been installed into the current Java virtual machine;
providers on the application's class path will be ignored."

Does that mean I have to deploy my CollatorProvider jar somewhere other
than the application class path? That would be a deployment problem for
me...
What does "installed into the current Java virtual machine" mean?
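
The difference between the two lookups described above can be observed
directly. The following is a minimal sketch, not a fix: the class name
ProviderLookup is hypothetical, and it only reports how many
CollatorProvider implementations each loader can see. On the Sun JDK 6
discussed in this thread, loadInstalled() goes through the extension class
loader, so "installed" would presumably mean a jar placed in an extension
directory (for example jre/lib/ext); treat that as an assumption to verify
for your JVM.

```java
import java.text.spi.CollatorProvider;
import java.util.ServiceLoader;

public class ProviderLookup {

    // Counts the providers that a given loader can see.
    private static int count(ServiceLoader<CollatorProvider> loader) {
        int n = 0;
        for (CollatorProvider p : loader) {
            System.out.println("  found " + p.getClass().getName());
            n++;
        }
        return n;
    }

    // Providers visible through the thread context class loader,
    // which normally includes jars on the application class path.
    public static int countOnClassPath() {
        return count(ServiceLoader.load(CollatorProvider.class));
    }

    // Providers "installed" in the JVM; class-path jars are ignored.
    public static int countInstalled() {
        return count(ServiceLoader.loadInstalled(CollatorProvider.class));
    }

    public static void main(String[] args) {
        int onClassPath = countOnClassPath();
        int installed = countInstalled();
        System.out.println("load():          " + onClassPath);
        System.out.println("loadInstalled(): " + installed);
    }
}
```

If the provider jar is only on the class path, the first count should
include it while the second should not, which would match the XBM04
symptom described above.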

It is a shame that Derby doesn't offer an easier way to do something so
common. A case-insensitive, accent-insensitive option **should be** a
standard Derby feature (SQL Server has one; Oracle does not)!
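
For reference, the creation-time attributes discussed in this thread
(territory and collation) are passed on the JDBC connection URL. Here is a
minimal sketch, assuming the embedded Derby driver is on the class path;
the class name CreateCollatedDb and the database name demoDB are
hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class CreateCollatedDb {

    // Builds a Derby embedded URL that sets the collation attributes
    // at database creation time.
    public static String creationUrl(String dbName, String territory) {
        return "jdbc:derby:" + dbName
                + ";create=true"
                + ";territory=" + territory
                + ";collation=TERRITORY_BASED";
    }

    public static void main(String[] args) throws SQLException {
        // "demoDB" is a hypothetical database name.
        String url = creationUrl("demoDB", "es_ES");
        System.out.println(url);
        // Requires derby.jar on the class path; without it,
        // DriverManager throws SQLException ("No suitable driver").
        try (Connection c = DriverManager.getConnection(url)) {
            System.out.println("connected: " + !c.isClosed());
        }
    }
}
```

Derby releases later than this thread (10.6 onward, to the best of my
knowledge) also accept a strength suffix such as
collation=TERRITORY_BASED:PRIMARY. Nothing in this thread confirms that, so
check the reference manual for your version before relying on it.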



----- Original Message ----- 
From: "Sylvain Leroux" <sl20@wanadoo.fr>
To: "Derby Discussion" <derby-user@db.apache.org>
Sent: Friday, September 11, 2009 10:33 AM
Subject: Re: Accent-insensitive searches

> Hi Josu,
> And sorry for the late reply.
> I tried exactly the same thing as you yesterday, except that I used
> Locale.FRENCH as the /base/ locale:
>         Collator c = Collator.getInstance(Locale.FRENCH);
> It works like a charm (both with = and LIKE).
> I am using Apache Derby and Sun JDK 1.6.0_12.
> Maybe you should try with a more recent version of Derby (?)
> Otherwise, check that you specified your custom collator at DB creation
> time, not when booting the DB: even if you specify another collator when
> you boot an existing DB, the collator specified at creation time is still
> the one that is used.
> Finally, the most doubtful one: accent difference might be a PRIMARY 
> difference for the default es_ES collator (??). But, that would be really 
> surprising...
> Let us know if you find the answer!
> Best regards,
> Sylvain
> josu wrote:
>> I'm working on a database application. Items in the database are all in
>> Spanish. It's mandatory that searches are accent-insensitive, meaning
>> that, for example, a search for the word 'electrico' (no accent) must
>> return entries containing 'eléctrico' (with accent).
>> Searching the web for a solution, I found that I must set these two
>> properties when creating the database:
>>   territory=es_ES
>>   collation=TERRITORY_BASED
>> But it still doesn't work this way. It looks like the default collation
>> for es_ES is still accent-sensitive.
>> So I tried to use a custom collator that behaves as I need. I found
>> some instructions for this in the following blog:
>>   http://blogs.sun.com/kah/entry/user_defined_collation_in_apache
>> In brief, I define a new CollatorProvider and register it with the JVM.
>> Here's the code for this class:
>> import java.text.Collator;
>> import java.text.spi.CollatorProvider;
>> import java.util.Locale;
>>
>> public class IgnoraAcentosCollatorProvider extends CollatorProvider {
>>     // The only locale served by this provider.
>>     private static final Locale LOCALE =
>>             new Locale("es", "ES", "accentinsensitive");
>>
>>     @Override
>>     public Collator getInstance(Locale locale) {
>>         if (!locale.equals(LOCALE)) {
>>             throw new IllegalArgumentException(
>>                     "Solo acepta es_ES_accentinsensitive");
>>         }
>>         // Start from the stock es_ES collator; PRIMARY strength
>>         // ignores accent and case differences.
>>         Collator c = Collator.getInstance(new Locale("es", "ES"));
>>         c.setStrength(Collator.PRIMARY);
>>         return c;
>>     }
>>
>>     @Override
>>     public Locale[] getAvailableLocales() {
>>         return new Locale[] { LOCALE };
>>     }
>> }
>> It simply takes the default es_ES Collator and changes its strength to
>> PRIMARY. This makes the collator return 0 when comparing 'electrico' and
>> 'eléctrico'.
>> After making sure this new Collator is available to the JVM, I restart
>> Derby and create a new database, now setting
>> territory=es_ES_accentinsensitive
>> The database is created without errors (meaning Derby reaches my
>> Collator), but searches are still accent-sensitive (no matter whether I
>> use the = or LIKE operator).
>> Any clue? I have searched intensively about this issue but found no
>> solution. I could avoid the problem simply by using MySQL (the default
>> Spanish configuration already has the desired behaviour), but I would
>> like to keep using Derby if possible.
>> I'm using JavaDB-Derby
>> Thanks.
> -- 
> Website: http://www.chicoree.fr
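
The strength setting used in the quoted CollatorProvider can be checked
outside Derby, which helps separate a collator problem from a Derby
configuration problem. This is a minimal sketch, assuming the JDK's stock
es-ES collator; the class name StrengthDemo is hypothetical, and the
explicit decomposition mode is an addition that is not in the original
code.

```java
import java.text.Collator;
import java.util.Locale;

public class StrengthDemo {

    // Compares two strings with the es-ES collator at the given strength.
    public static int compare(String a, String b, int strength) {
        Collator c = Collator.getInstance(Locale.forLanguageTag("es-ES"));
        c.setStrength(strength);
        // Decompose so precomposed and combining accents are treated alike.
        c.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return c.compare(a, b);
    }

    public static void main(String[] args) {
        String plain = "electrico";
        String accented = "el\u00e9ctrico"; // 'eléctrico'
        // PRIMARY ignores accents, so this is expected to print 0.
        System.out.println(compare(plain, accented, Collator.PRIMARY));
        // TERTIARY keeps accent differences, so this should be non-zero.
        System.out.println(compare(plain, accented, Collator.TERTIARY));
    }
}
```

If the PRIMARY comparison returns 0 here but Derby searches stay
accent-sensitive, the collator itself is probably fine and the issue lies
in how the database picks it up.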
