lucene-solr-user mailing list archives

From Prathik Puthran <prathik.puthra...@gmail.com>
Subject Help in resolving the below retrieval issue
Date Tue, 10 Sep 2013 12:10:48 GMT
Hi,

I am facing an issue where Solr does not retrieve an indexed word in some
cases.

This happens whenever the indexed word contains the string " - " (quotes for
clarity) as a substring, i.e. a prefix followed by a space, then '-', then
another space, then the rest of the word.
When I search using that exact string as the query, Solr returns no
results.

Example:
Indexed word --> "Rahul - kumar"  (quotes for clarity)
If I search with the query below, Solr returns no results:
Search query --> "Rahul - kumar"  (quotes for clarity)

However, the query below does return results:
Search query --> "Rahul kumar"

Can you please let me know what I am doing wrong here and what I should do
to ensure that the first query, i.e. "Rahul - kumar", returns the documents
indexed with it?

Below are the analyzers I am using:
Index time analyzer components:
1) <charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="([^A-Za-z0-9 ])" replacement=""/>
 2) <tokenizer class="solr.KeywordTokenizerFactory"/>
 3) <filter class="solr.LowerCaseFilterFactory"/>
 4) <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
preserveOriginal="1"/>
 5) <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
maxGramSize="50" side="front"/>
 6) <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
maxGramSize="50" side="back"/>
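
For reference, here is a minimal Python sketch of what front and back edge
n-grams with minGramSize=2 and maxGramSize=50 look like for a single token.
The function name edge_ngrams is hypothetical and this only approximates the
idea behind EdgeNGramFilterFactory; it is not Solr's implementation, and it
does not model how the two filters in steps 5 and 6 chain together.

```python
def edge_ngrams(token, min_size=2, max_size=50, side="front"):
    """Return the edge n-grams of a single token (illustrative only)."""
    # Never ask for a gram longer than the token itself.
    upper = min(max_size, len(token))
    if side == "front":
        # Grams anchored at the start of the token.
        return [token[:n] for n in range(min_size, upper + 1)]
    # Grams anchored at the end of the token.
    return [token[-n:] for n in range(min_size, upper + 1)]


print(edge_ngrams("rahul", side="front"))
print(edge_ngrams("rahul", side="back"))
```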

Query time analyzer components:
 1) <charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="([^A-Za-z0-9 ])" replacement=""/>
 2) <tokenizer class="solr.KeywordTokenizerFactory"/>
 3) <filter class="solr.LowerCaseFilterFactory"/>
 4) <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
preserveOriginal="1"/>
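
To make the example concrete, here is a minimal Python sketch of the first
stages shared by both chains: the pattern replacement from step 1 followed by
lowercasing from step 3, applied to the two example strings. The helper name
char_filter_and_lowercase is hypothetical. It only imitates the regex and the
lowercasing; it does not model WordDelimiterFilterFactory, the n-gram filters,
or anything the query parser may do with whitespace or a bare '-' before
analysis runs.

```python
import re

# Same character class as the pattern in step 1: drop everything that is
# not an ASCII letter, a digit, or a space.
STRIP = re.compile(r"[^A-Za-z0-9 ]")


def char_filter_and_lowercase(text):
    """Apply the pattern replacement, then lowercase (illustrative only)."""
    return STRIP.sub("", text).lower()


for s in ("Rahul - kumar", "Rahul kumar"):
    # repr() makes the number of spaces visible.
    print(repr(char_filter_and_lowercase(s)))
```

Note that removing '-' leaves both surrounding spaces in place, so the first
string keeps two consecutive spaces while the second has only one.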


Can you please let me know how I can fix this?

Thanks,
Prathik
