lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: space between search terms
Date Sat, 19 Apr 2014 00:07:08 GMT
Ahmet:

Yeah, the index-time vs. query-time bit is a pain. Often what people will
do is take their best shot at index time, then accumulate omissions
and use that list for query time. Then whenever they can/need to
re-index, merge the query-time list into the index time list and start
over.

Not an ideal solution by any means, but one that people have made to work.
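
As a rough illustration of that workflow, here is a minimal schema sketch with one synonym list applied at index time and a separate, smaller list applied at query time. The field type name title_syn and the file names synonyms_index.txt and synonyms_query.txt are made up for this example, and the analyzer chain is trimmed down from the one quoted below, so treat it as a starting point rather than a drop-in config.

```xml
<!-- Sketch only: title_syn, synonyms_index.txt and synonyms_query.txt are hypothetical names -->
<fieldType name="title_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Main list, applied when documents are indexed.
         Example entry: indira nagar,indiranagar -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Interim list: omissions collected since the last re-index -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_query.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```

At re-index time the entries from the query-time file would be moved into the index-time file and the query-time file emptied. Multi-word entries may not behave as expected in the query-time list, which is the reason Jack suggests keeping those at index time.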

Best,
Erick

On Fri, Apr 18, 2014 at 4:38 PM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
> Hi Jack,
>
> I am planning to extract and publish such words for the Turkish language. But I am not sure
> how to utilize them.
>
> I wonder if there is a more flexible solution that works at query time only. That would
> not require reindexing every time a new item is added.
>
> Ahmet
>
>
> On Friday, April 18, 2014 1:47 PM, Jack Krupansky <jack@basetechnology.com> wrote:
> Use an index-time synonym filter with a synonym entry:
>
> indira nagar,indiranagar
>
> But do not use that same filter at query time.
>
> But, that may mess up some exact phrase queries, such as:
>
> q="indiranagar xyz"
>
> since the following term is actually positioned after the longest synonym.
>
> To resolve that, use a sloppy phrase:
>
> q="indiranagar xyz"~1
>
> Or, set qs=1 for the edismax query parser.
>
> -- Jack Krupansky
>
>
> -----Original Message-----
> From: kumar
> Sent: Friday, April 18, 2014 6:34 AM
> To: solr-user@lucene.apache.org
> Subject: space between search terms
>
> Hi,
>
> I have a field called "title". It contains values such as "indira nagar"
> as well as "indiranagar".
>
> If I type either of the keywords, it has to display both results.
>
> Can anybody help with how we can do this?
>
>
> I am using the title field in the following way:
>
> <fieldType name="title" class="solr.TextField" positionIncrementGap="100">
> <analyzer type="index">
> <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping-ISOLatin1Accent.txt" />
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1"
> generateNumberParts="1"
> catenateWords="1"
> catenateNumbers="1"
> catenateAll="1"
> splitOnCaseChange="1"
> splitOnNumerics="1"
> preserveOriginal="1" />
> <filter class="solr.LowerCaseFilterFactory" />
> <filter class="solr.PatternReplaceFilterFactory"
> pattern="([^\w\d\*æøåÆØÅ ])" replacement=" " replace="all" />
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
>
> </analyzer>
> <analyzer type="query">
> <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping-ISOLatin1Accent.txt" />
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1"
> generateNumberParts="1"
> catenateWords="1"
> catenateNumbers="1"
> catenateAll="1"
> splitOnCaseChange="1"
> splitOnNumerics="1"
> preserveOriginal="1"/>
> <filter class="solr.LowerCaseFilterFactory" />
> <filter class="solr.PatternReplaceFilterFactory"
> pattern="([^\w\d\*æøåÆØÅ ])" replacement=" " replace="all" />
> <filter class="solr.SynonymFilterFactory" ignoreCase="true"
> synonyms="synonyms_tf.txt" expand="true" />
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
> <filter class="solr.KeywordMarkerFilterFactory"
> protected="protwords.txt" />
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> </analyzer>
> </fieldType>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/space-between-search-terms-tp4131967.html
> Sent from the Solr - User mailing list archive at Nabble.com.
