lucene-dev mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: problem with mapping-iso accents
Date Wed, 06 Jun 2012 14:51:36 GMT
First, please post usage/configuration questions over on the user's list, see:
http://lucene.apache.org/solr/discussion.html. The dev list is intended for
discussing development issues/bugs/etc.

You're probably being fooled by setting stored="true". When you return
the value of a field in a document (by the "fl" parameter or similar) you're
getting the original, unanalyzed value. To see what's actually indexed in
the document itself, try using the admin/schema browser page or the
TermsComponent (see: http://wiki.apache.org/solr/TermsComponent).
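
To make the stored/indexed distinction concrete, here is a toy sketch in
Python. It is not Solr code: ToyIndex and fold_accents are hypothetical names,
and stripping Unicode combining marks is only a rough stand-in for what
mapping-ISOLatin1Accent.txt does. It just shows that the analyzed terms and
the stored value are kept separately.

```python
import unicodedata


def fold_accents(text):
    """Rough stand-in for an accent-mapping char filter: drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


class ToyIndex:
    """Hypothetical miniature index, not Solr's implementation."""

    def __init__(self):
        self.stored = {}  # doc id -> original value, returned verbatim
        self.terms = {}   # analyzed term -> set of doc ids

    def add(self, doc_id, value):
        # The stored copy never passes through the analysis chain.
        self.stored[doc_id] = value
        # Only the indexed terms are accent-folded and lowercased.
        for token in fold_accents(value).lower().split():
            self.terms.setdefault(token, set()).add(doc_id)

    def search(self, query):
        # The query goes through the same analysis as the indexed text.
        hits = set()
        for token in fold_accents(query).lower().split():
            hits |= self.terms.get(token, set())
        return [self.stored[doc_id] for doc_id in sorted(hits)]


index = ToyIndex()
index.add(1, "sarà")
print(index.search("sara"))  # the hit comes back with its stored, accented value
print(sorted(index.terms))   # the indexed term itself has no accent
```

In this sketch the unaccented query matches, yet the value handed back still
carries the accent, which is the same effect as reading a stored field via "fl".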

A quick test would be to search for the unaccented version and see if the
document is found.
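
A minimal sketch of the two checks as request URLs, built in Python. The base
URL, the field name "title", and a request handler registered at /terms are
all assumptions about your setup, so adjust them to match your solrconfig.xml
and schema.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed host, port and path
FIELD = "title"                           # hypothetical field name


def select_url(term):
    """URL for a plain search on the unaccented term."""
    params = {"q": f"{FIELD}:{term}", "fl": FIELD}
    return SOLR_BASE + "/select?" + urlencode(params)


def terms_url(prefix):
    """URL listing indexed terms by prefix (assumes a /terms handler exists)."""
    params = {"terms.fl": FIELD, "terms.prefix": prefix}
    return SOLR_BASE + "/terms?" + urlencode(params)


print(select_url("sara"))
print(terms_url("sar"))
```

If the first request finds the document and the second lists "sara" rather
than the accented form, the char filter is being applied at index time.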

Best
Erick

On Wed, Jun 6, 2012 at 10:43 AM, Gastone Penzo <gastone.penzo@gmail.com> wrote:
> Hi,
> i have a problem with the ISOaccent tokenize filter.
>
> i have a field in my schema with this filter:
>
> <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping-ISOLatin1Accent.txt"/>
>
> if i try this filter with the analysis tool in the solr admin panel it works.
>
> for example:
>
> sarà => sara.
>
> but when i create indexes it doesn't work. in the index the field is "sarà"
> with accent. why?
>
> i use a mysql connector to create indexes directly from the mysql db
>
> the mysql db is in utf-8, the connector charset is utf-8, solr is in utf-8
> by default.
>
> recently i changed my java from openjdk to sun-jdk. can that be the reason?
>
> thanx
>
>
>
> --
> Gastone Penzo
>
>
