lucene-solr-user mailing list archives

From marotosg <marot...@gmail.com>
Subject Query exact match with ASCIIFoldingFilterFactory
Date Wed, 08 Jun 2016 16:04:25 GMT
Hi all,

I am trying to query and match on a collection of documents with a field
that is basically text extracted from PDFs. It can contain any type of text.

Field type:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" generateNumberParts="0" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
            preserveOriginal="1"/>
    <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


It works well in general, but I have one use case that is not working and I
don't know how to solve it: an exact (phrase) match like the one below.
q=docContent:"dq/ex report"

It can't find the match because the WordDelimiterFilter splits "dq/ex" into
separate positions at index time but not at query time (I disabled splitting
in the query analyzer because I don't want to retrieve false positives), so
the term positions in the index no longer line up with those in the phrase
query.

Result from the analyser:
Index: dq/ex dq ex dqex report
Query: dq/ex                report

Is it possible to keep the same functionality but still get an exact match?
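
A minimal sketch of one possible direction, not a confirmed fix: index the same
text a second time into a field whose analyzer does no word splitting, so that
"dq/ex" and "report" stay at adjacent positions, and send phrase queries to that
field. The names text_exact and docContent_exact are hypothetical, and the
sketch assumes that docContent is an existing field using text_general.

```xml
<!-- Hypothetical field type: same folding and lowercasing as text_general,
     but without WordDelimiterFilterFactory, so "dq/ex" stays a single token
     and "report" directly follows it. -->
<fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field: receives a copy of docContent for phrase queries. -->
<field name="docContent_exact" type="text_exact" indexed="true" stored="false"/>
<copyField source="docContent" dest="docContent_exact"/>
```

Under these assumptions the phrase query would target the new field, for
example q=docContent_exact:"dq/ex report", while docContent keeps the existing
word-splitting behaviour for general search. The trade-off is a second indexed
copy of the text and a full reindex.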

Thanks
Sergio

--
View this message in context: http://lucene.472066.n3.nabble.com/Query-exact-match-with-ASCIIFoldingFilterFactory-tp4281256.html