db-derby-user mailing list archives

From josu <josu.eh.mo...@gmail.com>
Subject Accent-insensitive searches
Date Wed, 09 Sep 2009 10:29:22 GMT

I'm working on a database application. Items in the database are all in
Spanish. It's mandatory that searches are accent-insensitive, meaning that,
for example, a search for the word 'electrico' (no accent) must return
entries containing 'eléctrico' (with accent).

Searching the web for a solution, I find I must set these two properties
when creating the database: 

territory=es_ES
collation=TERRITORY_BASED
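
For reference, these attributes go on the JDBC connection URL used to create the database. A minimal sketch, assuming an embedded database; the database name `itemsdb` and the helper `creationUrl` are made up for illustration:

```java
public class CreateUrlDemo {

    // Builds a Derby connection URL that creates a database with
    // territory-based collation for the given territory (e.g. "es_ES").
    public static String creationUrl(String dbName, String territory) {
        return "jdbc:derby:" + dbName
                + ";create=true"
                + ";territory=" + territory
                + ";collation=TERRITORY_BASED";
    }

    public static void main(String[] args) {
        // "itemsdb" is a hypothetical database name.
        String url = creationUrl("itemsdb", "es_ES");
        System.out.println(url);
        // With derby.jar on the classpath, this URL would be passed to
        // java.sql.DriverManager.getConnection(url).
    }
}
```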

But it still doesn't work this way. Looks like the default collation for
es_ES is still accent-sensitive.

So I try to use a custom collator that will behave as I need to. I find some
instructions for this in the following blog:

  http://blogs.sun.com/kah/entry/user_defined_collation_in_apache

In brief, I define a new CollatorProvider and register it with the JVM.
Here's the code for this class:


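import java.text.Collator;
import java.text.spi.CollatorProvider;
import java.util.Locale;

public class IgnoraAcentosCollatorProvider extends CollatorProvider {

    // The only locale this provider serves: es_ES with a custom variant.
    private static final Locale SUPPORTED =
            new Locale("es", "ES", "accentinsensitive");

    @Override
    public Collator getInstance(Locale locale) {
        if (!SUPPORTED.equals(locale)) {
            throw new IllegalArgumentException(
                    "Only es_ES_accentinsensitive is accepted");
        }
        // Start from the standard es_ES collator; PRIMARY strength ignores
        // accent (and case) differences.
        Collator c = Collator.getInstance(new Locale("es", "ES"));
        c.setStrength(Collator.PRIMARY);
        return c;
    }

    @Override
    public Locale[] getAvailableLocales() {
        return new Locale[] { SUPPORTED };
    }

}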


It simply takes the default es_ES Collator and sets its strength to PRIMARY.
This makes the collator return 0 when comparing 'electrico' and 'eléctrico'.
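
That behaviour can be checked outside Derby with a small standalone program. This is only a sketch; the class name `AccentStrengthDemo` and the helper `compareAt` are illustrative and not part of my provider:

```java
import java.text.Collator;
import java.util.Locale;

public class AccentStrengthDemo {

    // Compares two strings with the es_ES collator at the given strength.
    public static int compareAt(int strength, String a, String b) {
        Collator c = Collator.getInstance(Locale.forLanguageTag("es-ES"));
        c.setStrength(strength);
        return c.compare(a, b);
    }

    public static void main(String[] args) {
        String plain = "electrico";
        String accented = "el\u00e9ctrico"; // 'eléctrico'

        // PRIMARY ignores accent differences, so this prints 0.
        System.out.println(compareAt(Collator.PRIMARY, plain, accented));

        // TERTIARY (the default strength) treats the accent as a
        // difference, so this prints a non-zero value.
        System.out.println(compareAt(Collator.TERTIARY, plain, accented));
    }
}
```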

After making sure this new Collator is available to the JVM, I restart
Derby and create a new database, now setting territory=es_ES_accentinsensitive

The database is created without errors (meaning Derby reaches my Collator),
but searches are still accent-sensitive (no matter whether I use the = or
LIKE operator).

Any clue? I have searched extensively for this issue but found no
solution. I could avoid the problem simply by using MySQL (its default Spanish
configuration already has the desired behaviour), but I would like to keep
using Derby if possible.

I'm using JavaDB-Derby 10.4.2.1

Thanks. 