lucene-solr-user mailing list archives

From Hasnain <hasn...@hotmail.com>
Subject Re: Alphanumeric wildcard search problem
Date Mon, 06 Sep 2010 16:11:53 GMT

Hi Erick,
         So I took your advice and started fresh with Solr: I got myself the
latest copy of Solr and started adding things gradually to the configuration
files. Unfortunately, this still does not work. But I realized that
searching for q=r-1* didn't return any results, whereas querying with
q=r-(space)1* returned perfect results.
Whenever a run of digits starts, I have to put a space before the first digit
and then it works perfectly: dcp123* does not return any results, but
dcp(space)123* works perfectly. I'm using the standard request handler and the
same type, i.e. textShoaib, and I'm absolutely sure I'm using the same schema.
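
For reference, here is a minimal sketch of the two requests being compared, so it is clear what actually goes over the wire. The base URL is an assumption copied from the examples in the quoted reply below, and build_query_url is a hypothetical helper name, not part of Solr. It only builds the URLs; it does not contact a server.

```python
from urllib.parse import urlencode

# Assumed base URL, copied from the examples in the quoted reply;
# adjust host, port and path for your own install.
SOLR_SELECT = "http://localhost:8983/solr/select/"


def build_query_url(q, rows=10):
    """Return a select URL with every parameter URL-encoded (hypothetical helper)."""
    params = {
        "q": q,
        "start": 0,
        "rows": rows,
        "indent": "on",
        "debugQuery": "true",  # asks Solr to include the parsed query in the response
    }
    return SOLR_SELECT + "?" + urlencode(params)


# The query that returns nothing and the one that works.
no_space = build_query_url("r-1*")
with_space = build_query_url("r- 1*")

print(no_space)
print(with_space)
```

Comparing the parsed query that debugQuery=true reports for the two requests should show how the version with the space is split into separate clauses.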

I sense I'm very close to accomplishing my task. Any idea why Solr is
behaving like that?
Is there anything I can do to get rid of this (space) issue before digits?

thanks


Erick Erickson wrote:
> 
> All I can say is that I just tried it with the following definitions
> and it works as expected. That is:
> http://localhost:8983/solr/select/?q=eoe:R-1*&version=2.2&start=0&rows=10&indent=on
> returns nothing (casing issue) and
> http://localhost:8983/solr/select/?q=eoe:r-1*&version=2.2&start=0&rows=10&indent=on
> returned 3 documents
> 
> <fieldType name="textShoaib" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/> <!-- just playing -->
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
> 
> and
> <field name="eoe" type="textShoaib" indexed="true" />
> 
> where the relevant documents had the following:
> 
> <field name="eoe">R-1 and some stuff</field>
> <field name="eoe">r-1 really now</field>
> <field name="eoe">R-1</field>
> 
> searching for r* and r-* also returned three documents.
> Note that using StandardAnalyzer and WhitespaceAnalyzer
> didn't make any difference.
> 
> browsing the schema from the admin page shows the tokens
> r-1
> really
> some
> stuff
> now
> 
> Are you totally and absolutely sure that you're using the schema you think
> you are?
> And you restarted SOLR after you made schema changes?
> And you re-indexed your data?
> 
> 
> I also tried it with the dismax query, and it works there too:
> http://localhost:8983/solr/select/?qt=standard2&q=r-1*&version=2.2&start=0&rows=10&indent=on&debugQuery=true
> 
> returns 3 results, where standard2 is defined like this:
> 
> <requestHandler name="standard2" class="solr.SearchHandler">
>   <!-- default values for query parameters -->
>   <lst name="defaults">
>     <str name="defType">dismax</str>
>     <str name="echoParams">explicit</str>
>     <str name="tie">0.6</str>
>     <str name="qf">eoe^2.3 mat_nr^0.4</str>
>     <str name="mm">0%</str>
>     <!--
>       <int name="rows">10</int>
>       <str name="fl">*</str>
>       <str name="version">2.1</str>
>     -->
>   </lst>
> </requestHandler>
> 
> 
> So in sum, I suspect that you've made some innocuous-seeming
> change somewhere that you've forgotten about (easy to do when
> you're trying a zillion things and frustrated). So, here's what I'd
> recommend:
> 
> back up and forget dismax for a while, just get it all working
> with the normal query parser and gradually add things back
> in until you either fail or succeed all the way back to dismax.
> 
> BTW, you probably want to specify dismax with the qf parameter;
> pf stands for "phrase fields" whereas qf stands for "query fields".
> pf worked for me, but I doubt that's what you're really after.
> 
> This is all on SOLR 1.4.1 BTW.
> 
> Best
> Erick
> 
> 
> On Thu, Sep 2, 2010 at 6:12 AM, Hasnain <hasn_36@hotmail.com> wrote:
> 
>>
>> Erick,
>>       I have checked with lowercasing, and yes, there are items by this
>> name.
>> I'm not getting anywhere with this. I've tried many things and I'm really
>> perplexed.
>>
>> any other suggestion?
>>
>>
>>
>>
>>
>> Oh dear. Wildcard queries aren't analyzed, so I suspect it's a casing
>> issue.
>>
>> Try two things:
>> 1> search for r-1*
>> 2> look in your index and be sure the actual terms are there as you
>> expect.
>>
>> HTH
>> Erick
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Alphanumeric-wildcard-search-problem-tp1393332p1405431.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 
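
The point in the quoted reply that wildcard queries aren't analyzed can be illustrated with a toy model. This is only a sketch in plain Python, not Solr code: the term lists are the tokens reported from the admin page above, on the assumption that "and" was dropped by the stop filter, and count_prefix_matches is a hypothetical helper name.

```python
# Indexed terms per document, as listed from the admin page in the quoted
# reply. Assumption: "and" was removed by the stop filter and all terms
# were lowercased at index time.
INDEXED_DOCS = [
    ["r-1", "some", "stuff"],   # "R-1 and some stuff"
    ["r-1", "really", "now"],   # "r-1 really now"
    ["r-1"],                    # "R-1"
]


def count_prefix_matches(prefix, docs=INDEXED_DOCS):
    """Count documents with at least one indexed term starting with prefix.

    The prefix is compared exactly as typed, with no lowercasing, to mimic
    a wildcard term that bypasses the query analyzer.
    """
    return sum(any(term.startswith(prefix) for term in terms) for terms in docs)


# The four wildcard queries mentioned in the quoted reply.
for query in ["R-1*", "r-1*", "r*", "r-*"]:
    print(query, count_prefix_matches(query.rstrip("*")))
```

In this model only the uppercase prefix finds nothing, which agrees with the counts reported in the quoted reply. It says nothing about why a space before the digits changes the result.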

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Alphanumeric-wildcard-search-problem-tp1393332p1427565.html
Sent from the Solr - User mailing list archive at Nabble.com.
