lucene-dev mailing list archives

From "Jack Krupansky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3455) WordDelimiterFilterFactory split word on hyphen though generateWordParts="0"
Date Thu, 17 May 2012 15:02:10 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277878#comment-13277878 ]

Jack Krupansky commented on SOLR-3455:
--------------------------------------

Given this configuration, a query for 34333 should match only documents containing terms that
are exactly 34333 or that start with 34333. It should not match terms that have 34333 embedded
within them, including after a hyphen. So 34333 should not match RET-34333 or WAT-34333, yet
your original description indicates that it matches both RET-34333 and 34333, and later you
say it does not match 34333. Your description and comments are inconsistent. Until you
resolve these inconsistencies, the problem (if any) will not be clear.
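
To illustrate the reasoning above, here is a rough Python sketch of the index-time chain from the
quoted configuration (whitespace tokenizer, front edge n-grams of size 1 to 15, lowercasing). It is
not Solr code: the function name index_terms is hypothetical, the stop filter and the query-time
analyzer are ignored, and duplicate removal is approximated per document rather than per position.

```python
def index_terms(text, min_gram=1, max_gram=15):
    """Approximate the index-time chain: whitespace tokens -> front edge n-grams -> lowercase."""
    terms = []
    for token in text.split():  # stand-in for WhitespaceTokenizerFactory
        longest = min(max_gram, len(token))
        for size in range(min_gram, longest + 1):
            gram = token[:size].lower()  # front edge n-gram, then lowercased
            if gram not in terms:  # simplified duplicate removal (assumption)
                terms.append(gram)
    return terms


# The four sample values from the issue description.
docs = ["RET-34333", "WAT-34333", "RET 35555", "34333"]

# Which documents hold the literal term "34333" among their indexed terms?
matches = [doc for doc in docs if "34333" in index_terms(doc)]
print(matches)
```

Under these assumptions, the whitespace tokenizer keeps RET-34333 as a single token, so every
indexed gram for it starts with "r" and the bare term 34333 is never produced for that document.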

                
> WordDelimiterFilterFactory split word on hyphen though generateWordParts="0"
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-3455
>                 URL: https://issues.apache.org/jira/browse/SOLR-3455
>             Project: Solr
>          Issue Type: Bug
>            Reporter: phatak.prachi
>            Priority: Blocker
>
> •	RET-34333
> •	WAT-34333
> •	RET 35555
> •	34333
> When I search for RET => RET-34333, RET 35555
> When I search for RET- => RET-34333
> When I search for 34333 => RET-34333, WAT-34333, 34333
> When I search for RET-3 => RET-34333
> When I search for RET-34333 => RET-34333
> When I search for T-3 => nothing is returned
> When I search for T 3 => nothing is returned
> Configuration:
> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>         <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>         <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
> </fieldType>


       
