lucene-dev mailing list archives

From "phatak.prachi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3455) WordDelimiterFilterFactory split word on hyphen though generateWordParts="0"
Date Thu, 17 May 2012 16:10:09 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277923#comment-13277923 ]

phatak.prachi commented on SOLR-3455:
-------------------------------------

Jack,
Sorry for the confusion.
This is my new configuration:
<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>


Previously it was not matching 34333 because I was not using WDFF (WordDelimiterFilterFactory).
Now I am using it, and the analysis page (http://localhost:8983/solr/admin/analysis.jsp) shows
34333 being produced as a separate token, but my actual application still returns no results.
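
To make it easier to reason about which terms the index-side chain above would emit, here is
a minimal Python sketch. It is not Solr code and only approximates the index analyzer:
whitespace tokenization, front edge n-grams (minGramSize=1, maxGramSize=15), and lowercasing.
The stop filter is left out and RemoveDuplicatesTokenFilterFactory is modeled with a set; the
function names are made up for illustration.

```python
def front_edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the front (prefix) n-grams of a token, shortest first."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=1, max_gram=15):
    """Approximate the index analyzer: whitespace split, edge n-grams, lowercase.

    Assumption: stop words are ignored here, and duplicates are dropped
    by collecting the terms into a set.
    """
    terms = set()
    for token in text.split():
        for gram in front_edge_ngrams(token, min_gram, max_gram):
            terms.add(gram.lower())
    return terms
```

Under these assumptions, index_terms("RET-34333") contains "ret-3" but not "t-3" or "34333",
because only prefixes of the whole whitespace token are indexed, while index_terms("34333")
does contain "34333".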


                
> WordDelimiterFilterFactory split word on hyphen though generateWordParts="0"
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-3455
>                 URL: https://issues.apache.org/jira/browse/SOLR-3455
>             Project: Solr
>          Issue Type: Bug
>            Reporter: phatak.prachi
>            Priority: Blocker
>
> •	RET-34333
> •	WAT-34333
> •	RET 35555
> •	34333
> When I search for RET => RET-34333, RET 35555
> When I search for RET- => RET-34333
> When I search for 34333 => RET-34333, WAT-34333, 34333
> When I search for RET-3 => RET-34333
> When I search for RET-34333 => RET-34333
> When I search for T-3 => nothing returns 
> When I search for T 3 => nothing returns 
> Configuration:
> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>         <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>         <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15" side="front"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
> </fieldType>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       


