lucene-solr-user mailing list archives

From "Ribeaud, Christian (Ext)" <christian.ribe...@novartis.com>
Subject RE: Wildcard search not working
Date Fri, 12 Aug 2016 07:36:27 GMT
Hi Ahmet, Hi Upayavira,

OK, it seems that I have to dive a bit deeper into the Solr filters and tokenizers. I've just
realized that my knowledge there is too limited.
Thanks a lot for your help so far. Cheers and have a nice day,

christian

-----Original Message-----
From: Ahmet Arslan [mailto:iorixxx@yahoo.com] 
Sent: Friday, 12 August 2016 07:41
To: solr-user@lucene.apache.org; Ribeaud, Christian (Ext)
Subject: Re: Wildcard search not working

Hi Christian,

Please use the following filter before/above the stemmer.
<filter class="solr.KeywordRepeatFilterFactory"/>

Plus, you may want to add :

<analyzer type="multiterm">
  <tokenizer class="solr.KeywordTokenizerFactory" />
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.GermanNormalizationFilterFactory"/>
</analyzer>
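
For reference, here is a rough sketch of how the two suggestions above might be combined in the text_de field type quoted further down the thread. It is only an illustration, not a tested configuration. The RemoveDuplicatesTokenFilterFactory line is an assumption that does not come from this thread: it is commonly placed after the stemmer so that tokens left unchanged by stemming are not indexed twice.

```xml
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <!-- Used at index time and for ordinary (non-wildcard) queries -->
    <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball"/>
        <filter class="solr.GermanNormalizationFilterFactory"/>
        <!-- Emits each token twice: one copy is marked as a keyword and skips stemming -->
        <filter class="solr.KeywordRepeatFilterFactory"/>
        <filter class="solr.GermanLightStemFilterFactory"/>
        <!-- Assumption, not from the thread: drops the second copy when stemming changed nothing -->
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <!-- Applied to wildcard, prefix and other multi-term queries instead of the chain above -->
    <analyzer type="multiterm">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.GermanNormalizationFilterFactory"/>
    </analyzer>
</fieldType>
```

With a chain like this, the unstemmed form (for example roche) would be indexed alongside the stemmed one, so a pattern such as r?che has a term to match. Since the index-time analysis changes, existing documents would need to be reindexed before the effect is visible.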

Ahmet



On Thursday, August 11, 2016 9:31 PM, "Ribeaud, Christian (Ext)" <christian.ribeaud@novartis.com> wrote:
Hi Ahmet,

Many thanks for your reply. I had a look at the URL you pointed out but, honestly, I have
to admit that I did not fully understand you.
Let's be a bit more concrete. Here is the schema snippet for the corresponding field:

...
<field name="title" type="text_de" indexed="true" stored="true" required="false" multiValued="false" />

<!-- German -->
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer> 
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" />
        <filter class="solr.GermanNormalizationFilterFactory"/>
        <filter class="solr.GermanLightStemFilterFactory"/>
        <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> -->
        <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> -->
    </analyzer>
</fieldType>
...

What is wrong with this schema? Or rather, what should I change to be able to do wildcard
searches correctly?

Many thanks for your time. Cheers,

christian
--
Christian Ribeaud
Software Engineer (External)
NIBR / WSJ-310.5.17
Novartis Campus
CH-4056 Basel



-----Original Message-----
From: Ahmet Arslan [mailto:iorixxx@yahoo.com] 
Sent: Thursday, 11 August 2016 16:00
To: solr-user@lucene.apache.org; Ribeaud, Christian (Ext)
Subject: Re: Wildcard search not working

Hi Christian,

Depending on your analysis chain, the query r?che may return fewer matches than roche.
The difference is that roche is analyzed but r?che is not. Wildcard queries are executed
against the indexed/analyzed terms.
For example, if roche is indexed/analyzed as roch, the query r?che won't match it.

Please see : https://wiki.apache.org/solr/MultitermQueryAnalysis

Ahmet



On Thursday, August 11, 2016 4:42 PM, "Ribeaud, Christian (Ext)" <christian.ribeaud@novartis.com> wrote:
Hi,

What could cause wildcard searches with the Lucene query parser to NOT work?

We are using Solr 5.4.1. From the admin console, I run searches in a specific core, for
instance with the term 'roche'. That works fine and returns two matches.
I would expect at least the same number of matches for the term 'r?che'. However, this does NOT
happen: I get zero matches. The same problem occurs with 'r*che'. 'roch?' does not work
either, but 'roch*' does.

Switching on debug mode produces the following output:

"debug": {
    "rawquerystring": "roch?",
    "querystring": "roch?",
    "parsedquery": "text:roch?",
    "parsedquery_toString": "text:roch?",
    "explain": {},
    "QParser": "LuceneQParser",
...

Any idea? Thanks and cheers,

christian