lucene-dev mailing list archives

From "Okke Klein (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3390) Highlighting issue with multi-word synonyms causes to highlight the wrong terms
Date Tue, 05 Jun 2012 09:09:23 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289270#comment-13289270 ]

Okke Klein commented on SOLR-3390:
----------------------------------

Multi-word synonyms work a lot better with LUCENE_33 because of the way SlowSynonymFilter
handles them. Is there a way to get the same behavior with the new filter?
                
> Highlighting issue with multi-word synonyms causes to highlight the wrong terms
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-3390
>                 URL: https://issues.apache.org/jira/browse/SOLR-3390
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter, query parsers
>    Affects Versions: 3.6
>         Environment: Windows 7. (Development machine, not the server) 
>            Reporter: Rahul Babulal
>              Labels: highlighter, multi-word, solr, synonyms
>
> I am using Solr 3.6, and when I have multi-word synonyms the highlighting results have
> the wrong word highlighted.
> If I have the below entry in the synonyms file:
> dns, domain name system
> If I index something like: "A sample dns entry explaining the details".
> Searching for "name" (without quotes), in the highlight results/snippets I get: "A sample
> dns <em>entry</em> explaining the details". (The token "entry" overlaps with the
> token "name" in analysis.jsp.)
> Searching for "system" (without quotes), in the highlight results/snippets I get: "A
> sample dns entry <em>explaining</em> the details". (The token "explaining" overlaps
> with the token "system" in analysis.jsp.)
> Here is my schema field Type:
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <charFilter class="solr.HTMLStripCharFilterFactory"/>
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.PorterStemFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.PorterStemFilterFactory"/>
>       </analyzer>
>     </fieldType>
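
The overlap described in the report can be reproduced with a toy model. The sketch below is not Solr or Lucene code: `tokenize`, `expand_synonyms`, and `highlight` are hypothetical helpers, and the sketch assumes that each extra synonym token takes the position and character offsets of the original token it overlaps, which is what the reported snippets suggest.

```python
def tokenize(text):
    """Split on whitespace and record (term, position, start, end) per token."""
    tokens, cursor = [], 0
    for position, word in enumerate(text.split()):
        start = text.index(word, cursor)
        end = start + len(word)
        tokens.append((word.lower(), position, start, end))
        cursor = end
    return tokens


def expand_synonyms(tokens, synonyms):
    """Add synonym tokens at index time.

    ASSUMPTION (toy model): the k-th word of a multi-word synonym is stacked
    on the k-th original token counted from the match, and inherits that
    token's position and offsets.
    """
    expanded = list(tokens)
    for i, (term, _, _, _) in enumerate(tokens):
        for k, synonym in enumerate(synonyms.get(term, [])):
            j = min(i + k, len(tokens) - 1)  # clamp at the end of the stream
            _, position, start, end = tokens[j]
            expanded.append((synonym, position, start, end))
    return expanded


def highlight(text, tokens, query):
    """Wrap the character span of every token whose term equals the query."""
    spans = sorted({(s, e) for term, _, s, e in tokens if term == query},
                   reverse=True)  # right to left keeps earlier offsets valid
    for start, end in spans:
        text = text[:start] + "<em>" + text[start:end] + "</em>" + text[end:]
    return text


TEXT = "A sample dns entry explaining the details"
SYNONYMS = {"dns": ["domain", "name", "system"]}
INDEXED = expand_synonyms(tokenize(TEXT), SYNONYMS)

print(highlight(TEXT, INDEXED, "name"))
# A sample dns <em>entry</em> explaining the details
print(highlight(TEXT, INDEXED, "system"))
# A sample dns entry <em>explaining</em> the details
```

In this model a query for "name" matches the synonym token stacked on "entry", so the highlighter marks the offsets of "entry" rather than those of "dns".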

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

