lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: can solr automatically search for different punctuation of a word
Date Tue, 31 Jan 2012 14:36:53 GMT
Take a look at the <lib..../> directives in solrconfig.xml. Either add
a (relative) path there or just drop the jar into one of the directories
already specified.
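
As a rough sketch of what such directives can look like (the paths and the
jar-name pattern here are assumptions based on a stock example layout, not
taken from your setup, so adjust them to wherever the contrib and dist
directories sit relative to your core's instance dir):

```xml
<!-- Goes inside <config> in solrconfig.xml. All paths below are assumed. -->

<!-- Load every jar found in the analysis-extras contrib directories
     (the ICU library from the README lives in lib/). -->
<lib dir="../../contrib/analysis-extras/lib" />
<lib dir="../../contrib/analysis-extras/lucene-libs" />

<!-- Load only the jars in dist/ whose file name matches the regex;
     the jar name pattern is an assumption. -->
<lib dir="../../dist/" regex="apache-solr-analysis-extras-\d.*\.jar" />
```

Restart Solr afterwards so the jars get picked up.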

Best
Erick

On Mon, Jan 30, 2012 at 10:38 PM,  <alxsss@aim.com> wrote:
>
>  Hi Chantal,
>
> In the readme file at solr/contrib/analysis-extras/README.txt it says to add the ICU
> library (in lib/)
>
> Do I also need to add <dependency>... and where?
>
> Thanks.
> Alex.
>
>
>
>
>
> -----Original Message-----
> From: Chantal Ackermann <chantal.ackermann@btelligent.de>
> To: solr-user <solr-user@lucene.apache.org>
> Sent: Fri, Jan 13, 2012 1:52 am
> Subject: Re: can solr automatically search for different punctuation of a word
>
>
> Hi Alex,
>
> for me, ICUFoldingFilterFactory works very well. It does lowercasing and
> removes diacritics (this is what umlauts and accents on letters are
> called; punctuation means commas, periods, etc.). It will work for any
> language, not only German. And it will also handle apostrophes as in
> "C'est bien".
>
> ICU requires additional libraries in the classpath. For a built-in Solr
> solution have a look at ASCIIFoldingFilterFactory.
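>
> A minimal sketch of that built-in alternative, assuming a hypothetical
> field type name (text_folded) and a standard tokenizer, neither of which
> comes from this thread. ASCIIFoldingFilterFactory only folds characters
> to their ASCII equivalents and does not lowercase, so a
> LowerCaseFilterFactory is added in front of it:
>
> ```xml
> <!-- "text_folded" is an illustrative name, not from the thread -->
> <fieldType name="text_folded" class="solr.TextField"
>        positionIncrementGap="100">
>        <analyzer>
>                <tokenizer class="solr.StandardTokenizerFactory" />
>                <!-- lowercase first, then fold accented letters to ASCII -->
>                <filter class="solr.LowerCaseFilterFactory" />
>                <filter class="solr.ASCIIFoldingFilterFactory" />
>        </analyzer>
> </fieldType>
> ```
>
> With the same analyzer applied at index and query time, Über and Uber
> should end up as the same token.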
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ASCIIFoldingFilterFactory
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUFoldingFilterFactory
>
> Example configuration:
>
> <fieldType name="text_sort" class="solr.TextField"
>        positionIncrementGap="100">
>        <analyzer>
>                <tokenizer class="solr.KeywordTokenizerFactory" />
>                <filter class="solr.ICUFoldingFilterFactory" />
>        </analyzer>
> </fieldType>
>
> And dependencies (example for Maven) in addition to solr-core:
>
> <dependency>
>        <groupId>org.apache.lucene</groupId>
>        <artifactId>lucene-icu</artifactId>
>        <version>${solr.version}</version>
>        <scope>runtime</scope>
> </dependency>
> <dependency>
>        <groupId>org.apache.solr</groupId>
>        <artifactId>solr-analysis-extras</artifactId>
>        <version>${solr.version}</version>
>        <scope>runtime</scope>
> </dependency>
>
> Cheers,
> Chantal
>
> On Fri, 2012-01-13 at 00:09 +0100, alxsss@aim.com wrote:
>> Hello,
>>
>> I would like to know if Solr has functionality to automatically search for
>> different punctuation of a word.
>> For example, if a user searches for the word Uber and the stemmer is set to
>> German, then Solr looks for both Uber and Über, like with synonyms.
>>
>> Is it possible to give Solr a file with a list of possible letter
>> substitutions and have it search for all possible punctuations?
>>
>> Thanks.
>> Alex.
