lucene-dev mailing list archives

From "Dalius (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SOLR-3106) Wildcard ? issue
Date Tue, 07 Feb 2012 17:12:59 GMT
Wildcard ? issue
----------------

                 Key: SOLR-3106
                 URL: https://issues.apache.org/jira/browse/SOLR-3106
             Project: Solr
          Issue Type: Bug
    Affects Versions: 3.5
         Environment: Tomcat 7.0.25 (request encoding UTF-8)
                      Solr 3.5.0
                      Oracle Java 7
                      Ubuntu 11.10
            Reporter: Dalius


Sorry for the inaccurate title.
I have three fields containing the same value:
{code}
<title xmlns="http://www.tei-c.org/ns/1.0">cal- lígraf</title>
{code}
These fields are configured with the following field types:
{code}
    <fieldType name="xml" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ICUFoldingFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ICUFoldingFilterFactory"/>
      </analyzer>
    </fieldType>
    
    <fieldType name="xml_unicode" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>
    
    <fieldType name="xml_unicode_full" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>
{code}
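
To make the difference between the three index-time analyzers easier to see, here is a rough Python sketch of the terms each field type would produce for the value above. It is only an approximation: `standard_tokens`, `whitespace_tokens` and `fold` are hypothetical stand-ins for StandardTokenizer, WhitespaceTokenizer and ICUFoldingFilter, not the real Lucene implementations, and the input is assumed to be the text left after HTMLStripCharFilter removes the markup.

```python
import re
import unicodedata

# Field value from the report with the <title> markup already stripped.
# The accented letter is written as an escape so the literal is precomposed.
TITLE = "cal- l\u00edgraf"


def standard_tokens(text):
    # Rough stand-in for StandardTokenizer: keep runs of word characters,
    # which drops punctuation such as the trailing hyphen in "cal-".
    return re.findall(r"\w+", text)


def whitespace_tokens(text):
    # Rough stand-in for WhitespaceTokenizer: split on whitespace only.
    return text.split()


def fold(token):
    # Rough stand-in for ICUFoldingFilter: case-fold, then strip diacritics.
    decomposed = unicodedata.normalize("NFKD", token.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Approximate index-time terms, keyed by field type name from the config.
index_terms = {
    "xml": [fold(t) for t in standard_tokens(TITLE)],
    "xml_unicode": standard_tokens(TITLE),
    "xml_unicode_full": whitespace_tokens(TITLE),
}

for field_type, terms in index_terms.items():
    print(field_type, terms)
```

Under these assumptions only the `xml` type indexes an unaccented form of the word, and only `xml_unicode_full` keeps the hyphen attached to the first token.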

And finally my search configuration:
{code}
    <requestHandler name="dictionary" class="solr.SearchHandler">
        <lst name="defaults">
            <str name="echoParams">all</str>
            <str name="defType">edismax</str>
            <str name="mm">2&lt;-25%</str>
            <str name="qf">dc_title_unicode_full^2 dc_title_unicode^2 dc_title</str>
            <int name="rows">10</int>
            <str name="spellcheck.onlyMorePopular">true</str>
            <str name="spellcheck.extendedResults">false</str>
            <str name="spellcheck.count">1</str>
        </lst>
        <arr name="last-components">
            <str>spellcheck</str>
        </arr>
    </requestHandler>
{code}

I am trying to match the field with various valid search phrases. Here are the results:
|| # || search phrase || match? ||
| 1 | cal- lígra? | (/) |
| 2 | cal- ligra? | (x) |
| 3 | cal- ligraf | (/) |
| 4 | calligra? | (/) |

The problem is attempt #2, which fails to match the data. Attempt #3, which replaces ? with f, works.

One more thing: if * is used instead of ?, other data such as cal- lígrafia is matched, but not cal- lígraf.
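
For anyone who wants to reproduce the four attempts, the sketch below builds the request URLs. The base URL (host, port and the `/solr/select` path) is an assumption about the deployment and is not taken from the report. The handler is selected with `qt=dictionary` because it is registered without a leading slash, and the phrases are percent-encoded as UTF-8 to match the request encoding listed under Environment.

```python
from urllib.parse import urlencode

# Assumed location of the Solr instance; adjust to the actual deployment.
BASE_URL = "http://localhost:8080/solr/select"

# The four search phrases from the results table, in the same order.
PHRASES = [
    "cal- l\u00edgra?",
    "cal- ligra?",
    "cal- ligraf",
    "calligra?",
]


def build_url(phrase, base_url=BASE_URL):
    # urlencode percent-encodes the values as UTF-8 by default.
    params = {"qt": "dictionary", "q": phrase}
    return base_url + "?" + urlencode(params)


urls = [build_url(phrase) for phrase in PHRASES]

for number, url in enumerate(urls, start=1):
    print(number, url)
```

Because `echoParams` is set to `all` in the handler defaults, each response header should echo the effective parameters, which helps confirm that the phrase arrived intact.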
