lucene-solr-user mailing list archives

From Mohamed Parvez <par...@gmail.com>
Subject Re: Wild card search does not return any result
Date Wed, 05 Aug 2009 15:58:00 GMT
It looks like the schema.xml in my earlier message had a typo.
Below is the correct schema.xml.

3] schema.xml
......
......
<field name="ID" type="float" indexed="true" stored="true" />
<field name="BUS" type="text" indexed="true" stored="true"/>
<field name="ROLE" type="text" indexed="true" stored="true" />
<field name="SPELL" type="textSpell" indexed="true" stored="true"
multiValued="true"/>
<copyField source="BUS" dest="SPELL" />
......
......
    <fieldType name="text" class="solr.TextField"
positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
preserveOriginal="1" />
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
preserveOriginal="1" />
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
....
....

Thanks/Regards,
Parvez



On Wed, Aug 5, 2009 at 10:53 AM, Mohamed Parvez <parvez@gmail.com> wrote:

> Thanks Otis and Avlesh,
>
> Below is the configuration I have
>
> 1] solrconfig.xml
> ....
> .....
>   <requestHandler name="standard" class="solr.SearchHandler"
> default="true">
>      <lst name="defaults">
>        <str name="echoParams">explicit</str>
>       <str name="spellcheck.onlyMorePopular">false</str>
>       <str name="spellcheck.extendedResults">false</str>
>       <str name="spellcheck.count">1</str>
>     </lst>
>      <arr name="last-components">
>       <str>spellcheck</str>
>     </arr>
>   </requestHandler>
> .....
> .....
>   <requestHandler name="/dataimport"
> class="org.apache.solr.handler.dataimport.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">data-import.xml</str>
>     </lst>
>   </requestHandler>
> ......
> ......
>   <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
>     <str name="queryAnalyzerFieldType">textSpell</str>
>     <lst name="spellchecker">
>       <str name="name">default</str>
>       <str name="field">SPELL</str>
>       <str name="spellcheckIndexDir">./spellcheckerIndex</str>
>       <str name="buildOnCommit">true</str>
>       <str name="buildOnOptimize">true</str>
>     </lst>
>   </searchComponent>
>
> 2] data-import.xml
>
> .....
> ......
>     <document name="doc">
>         <entity name="user" pk="ID"
>                   query="select * from user">
>     <field column="ROLE" name="ROLE" />
>     <field column="ID" name="ID" />
>     <field column="BUS" name="BUS" />
> .....
> .....
>
> 3] schema.xml
> ......
> ......
> <field name="ID" type="float" indexed="true" stored="true" />
> <field name="BUS" type="text" indexed="true" stored="true"/>
> <field name="ROLE" type="text" indexed="true" stored="true" />
>  ......
> ......
> <field name="ID" type="float" indexed="true" stored="true" />
> <field name="BUS" type="text" indexed="true" stored="true"/>
> <field name="ROLE" type="text" indexed="true" stored="true" />
> <field name="SPELL" type="textSpell" indexed="true" stored="true"
> multiValued="true"/>
> <copyField source="BUS" dest="SPELL" />
> ......
> ......
>     <fieldType name="text" class="solr.TextField"
> positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true" />
>         <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
> preserveOriginal="1" />
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
>         <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
> preserveOriginal="1" />
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>     </fieldType>
> ....
> ....
>
>
> To keep it simple, I have only one record in the table:
> ID=1
> BUS=ICS
> ROLE=SSE
>
>
> As I said before,
> I don't get any match if I search for q=ics*
> I get the match, which is the correct result, if I search for q=sse*
>
> I have not done any query rewriting; I am just using the default
> configuration that comes with Solr.
>
> Otis, let me know if you need any more information.
>
> Avlesh, the above setup is just a stripped-down version to figure out what
> the issue is. In my real application, I have hundreds of columns in the table
> that I use for building the search index. I don't think it's a good option to
> copy over all the fields and create another 100-odd fields with just a
> lowercase filter applied.
>
> ----
> Parvez
>
>
> From: Otis Gospodnetic <otis_gospodnetic@yahoo.com>
> Date: Tue, Aug 4, 2009 at 8:25 PM
> Subject: Re: Wild card search does not return any result
> To: solr-user@lucene.apache.org
>
>
> Hi,
>
> I doubt it's a bug.  It's probably working correctly based on the config,
> etc. I just don't have enough details about the configuration, your request
> handler, query rewriting, the data in your index, etc. to tell you what
> exactly is happening.
>
>  Otis
>
>
> On Tue, Aug 4, 2009 at 11:13 PM, Avlesh Singh <avlesh@gmail.com> wrote:
>
>> You read it incorrectly, Parvez.
>> The "bug" that Bill seems to have found is in the analysis tool and NOT in
>> the search handler itself. The results in your case are as expected. Wildcard
>> queries are not analyzed, hence the inconsistency.
>> A workaround is suggested in the same thread, here:
>>
>> http://markmail.org/message/ts65a6jok3ii6nva#query:+page:1+mid:i5zxdbnvspgek2bp+state:results
>>
>> Cheers
>> Avlesh
>>
>> On Wed, Aug 5, 2009 at 12:52 AM, Mohamed Parvez <parvez@gmail.com> wrote:
>>
>> > Thanks Otis. The thread suggests that this is a bug:
>> >
>> >
>> >
>> http://markmail.org/message/ts65a6jok3ii6nva#query:+page:1+mid:qinymqdn6mkocv4k
>> >
>> > Both SSE and ICS are three-letter words and neither is part of the English
>> > language.
>> > SSE* works fine and ICS* does not, so this is surely a bug.
>> >
>> > Any idea when this bug will be fixed, or whether there is a workaround?
>> >
>> > ----
>> > Thanks/Regards,
>> > Parvez
>> > GV : 786-693-2228
>> >
>> >
>> > On Tue, Aug 4, 2009 at 11:48 AM, Otis Gospodnetic <
>> > otis_gospodnetic@yahoo.com> wrote:
>> >
>> > > Could it be the same reason as described here:
>> > >
>> > > http://markmail.org/message/ts65a6jok3ii6nva
>> > >
>> > > Otis
>> > > --
>> > > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> > > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>> > >
>> > >
>> > >
>> > > ----- Original Message ----
>> > > > From: Mohamed Parvez <parvez@gmail.com>
>> > > > To: solr-user@lucene.apache.org
>> > > > Sent: Tuesday, August 4, 2009 11:26:45 AM
>> > > > Subject: Wild card search does not return any result
>> > > >
>> > > > Hello All,
>> > > >
>> > > >        I have two fields.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > I have a document (which has been indexed) that has a value of "ICS" for
>> > > > the BUS field and "SSE" for the ROLE field.
>> > > >
>> > > > When I search for q=BUS:ics I get the result, but if I search for
>> > > > q=BUS:ics* I don't get any match (or result).
>> > > >
>> > > > When I search for q=ROLE:sse or q=ROLE:sse*, I get the result both
>> > > > times.
>> > > >
>> > > > Why does BUS:ics* not return any result?
>> > > >
>> > > >
>> > > > I have the default configuration for the text field, see below.
>> > > >
>> > > >
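>> > > > <fieldType name="text" class="solr.TextField"
>> > > > positionIncrementGap="100">
>> > > >   <analyzer type="index">
>> > > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > > >     <filter class="solr.StopFilterFactory"
>> > > >                 ignoreCase="true"
>> > > >                 words="stopwords.txt"
>> > > >                 enablePositionIncrements="true"
>> > > >                 />
>> > > >     <filter class="solr.WordDelimiterFilterFactory"
>> > > > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> > > > catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>> > > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > > >     <filter class="solr.EnglishPorterFilterFactory"
>> > > > protected="protwords.txt"/>
>> > > >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> > > >   </analyzer>
>> > > >   <analyzer type="query">
>> > > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > > >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> > > > ignoreCase="true" expand="true"/>
>> > > >     <filter class="solr.StopFilterFactory" ignoreCase="true"
>> > > > words="stopwords.txt"/>
>> > > >     <filter class="solr.WordDelimiterFilterFactory"
>> > > > generateWordParts="1" generateNumberParts="1" catenateWords="0"
>> > > > catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>> > > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > > >     <filter class="solr.EnglishPorterFilterFactory"
>> > > > protected="protwords.txt"/>
>> > > >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> > > >   </analyzer>
>> > > > </fieldType>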
>> > > >
>> > > >
>> > > > ----
>> > > > Thanks/Regards,
>> > > > Parvez
>> > > >
>> > > > Note: This is a re-post. It looks like something went wrong the first
>> > > > time around.
>> > >
>> > >
>> >
>>
>
>
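
For readers following the thread: the workaround mentioned above (a copy of
the field with only a lowercase filter applied, to be used for wildcard
queries) might look roughly like the schema.xml fragment below. This is a
minimal sketch, not the configuration anyone in the thread actually used. The
names "text_lc" and "BUS_LC" are hypothetical; only the BUS field and the
factory classes come from the schema quoted in this thread. The reasoning is
that wildcard terms are not analyzed, so they are compared directly against
whatever the index-time analyzer stored. With the stemmer in the "text" type,
"ICS" is probably indexed as "ic", which the prefix "ics" can never match; a
field without stemming should keep "ics" intact.

```xml
<!-- Hypothetical field type: whitespace tokenizing plus lowercasing only,
     with no stemming and no word-delimiter splitting, so indexed terms stay
     close to the original text. The name "text_lc" is an assumption. -->
<fieldType name="text_lc" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical companion field for wildcard queries against BUS.
     It is search-only, so it does not need to be stored. -->
<field name="BUS_LC" type="text_lc" indexed="true" stored="false"/>
<copyField source="BUS" dest="BUS_LC"/>
```

After reindexing, a query such as q=BUS_LC:ics* would be expected to match
the ICS record. Because the wildcard term is not analyzed, the client still
has to lowercase it before sending the query.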
