lucene-java-user mailing list archives

From Erik Fäßler <erik.faess...@uni-jena.de>
Subject Re: Solr 1.4.1: Weird query results
Date Wed, 20 Apr 2011 08:40:36 GMT
  Oops, I have to take something back: the index *has* been created 
with Lucene 2.9.3! Sorry for the confusion; I am using two different 
index versions, the older one in production and the newer one for what 
I am currently developing. I just checked back with Luke, and it 
confirms that the index used by my Solr instance has format -9 
(Lucene 2.9).
So that's not the problem, I guess... any other ideas?! ;)

On 20.04.2011 10:17, Erik Fäßler wrote:
>  Thank you very much for your answers :-) First of all, I just noticed 
> I sent the question unintentionally to the Lucene list while it's more 
> of a Solr issue. I will answer here all the same to not confuse 
> things. My apologies ;)
>
> First, to Erick's suggestions. The default field has been "text" for 
> quite a while, so I did not change it yesterday; it was already set 
> to this field.
> By "not created by Solr" I mean exactly that it was created using 
> Lucene directly. This could indeed be an issue, as Lucene 2.3.1 was 
> used to create the index, whereas Solr 1.4.1 uses Lucene 2.9.3. But 
> it has seemed to work fine so far (perhaps that's just wrong, I don't 
> know yet).
>
> I tried your hint of appending "&debugQuery=on". Guess what: with 
> that appended I get my hits. No kidding, appending the debug option 
> gives me 30 document hits, and deleting it from my browser's address 
> bar leaves me with 0 hits (?!).
> The address bar strings are:
> No hits:
> http://localhost:8983/solr/select/?q=marine&version=2.2&start=0&rows=10&indent=on
>
> 30 hits:
> http://localhost:8983/solr/select/?q=marine&version=2.2&start=0&rows=10&indent=on&debugQuery=on
>
>
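For anyone who wants to reproduce the comparison outside the browser, here is a minimal Python sketch. It rebuilds the two address-bar requests quoted above, shows that they differ only in debugQuery, and reads numFound from a response body. The helper names (select_url, param_diff, num_found, fetch_num_found) are made up for illustration, and the sketch assumes the default XML response writer, which reports the hit count in the numFound attribute of the <result name="response"> element.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Base URL taken from the address-bar strings quoted above.
BASE = "http://localhost:8983/solr/select/"


def select_url(query, debug=False):
    """Build the select URL with the same parameters as in the address bar."""
    params = [("q", query), ("version", "2.2"), ("start", "0"),
              ("rows", "10"), ("indent", "on")]
    if debug:
        params.append(("debugQuery", "on"))
    return BASE + "?" + urlencode(params)


def param_diff(url_a, url_b):
    """Return the query parameters present in only one of the two URLs."""
    a = set(parse_qsl(urlsplit(url_a).query))
    b = set(parse_qsl(urlsplit(url_b).query))
    return sorted(a ^ b)


def num_found(response_xml):
    """Read the hit count from a Solr XML response body."""
    root = ET.fromstring(response_xml)
    result = root.find("result[@name='response']")
    return int(result.get("numFound"))


def fetch_num_found(url):
    """Fetch a URL from a running Solr instance and return its hit count."""
    with urlopen(url) as response:
        return num_found(response.read())
```

With Solr running locally, comparing fetch_num_found(select_url("marine")) with fetch_num_found(select_url("marine", debug=True)) takes the browser out of the picture.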
> Here's the debug output concerning the query:
>
> <str name="rawquerystring">marine</str>
> <str name="querystring">marine</str>
> <str name="parsedquery">text:marine</str>
> <str name="parsedquery_toString">text:marine</str>
>
> Seems fine. This is expected because I already used the analysis 
> interface to check whether the correct terms are searched for.
> Here are my schema snippets:
>
> FieldType "text_ws":
> <fieldType name="text_ws" class="solr.TextField" 
> positionIncrementGap="100">
> <analyzer>
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> </analyzer>
> </fieldType>
> (Solr 1.4.1 default)
>
> Field "text":
> <field name="text" type="text_ws" indexed="true" stored="true" 
> termVectors="true" termPositions="true" />
>
> Default search field:
> <defaultSearchField>text</defaultSearchField>
>
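One way to double-check these three snippets against each other is to resolve the default search field down to its analysis chain. The following Python sketch does that for a schema fragment. The function name default_field_analysis is hypothetical, and the sample wraps the snippets in a bare <schema> element only to make them parseable; it is not a complete schema.xml.

```python
import xml.etree.ElementTree as ET

# The three snippets quoted above, wrapped in a root element for parsing.
SAMPLE_SCHEMA = """
<schema>
  <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    </analyzer>
  </fieldType>
  <field name="text" type="text_ws" indexed="true" stored="true"
         termVectors="true" termPositions="true" />
  <defaultSearchField>text</defaultSearchField>
</schema>
"""


def default_field_analysis(schema_xml):
    """Return (default field, its field type, tokenizer/filter classes)."""
    root = ET.fromstring(schema_xml)
    default_field = root.findtext(".//defaultSearchField").strip()

    # Look up the type declared for the default search field.
    field_type = None
    for field in root.iter("field"):
        if field.get("name") == default_field:
            field_type = field.get("type")

    # Collect the tokenizer and filter classes of that type, in order.
    chain = []
    for ftype in root.iter("fieldType"):
        if ftype.get("name") == field_type:
            for node in ftype.iter():
                if node.tag in ("tokenizer", "filter"):
                    chain.append(node.get("class"))
    return default_field, field_type, chain


print(default_field_analysis(SAMPLE_SCHEMA))
```

For the fragment above, the chain contains only the whitespace tokenizer, so no stemming or lowercasing is configured for "text" in this schema.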
> I guess this also answers the hints given by Lance. Writing this down, 
> I get the feeling the first thing I should do is update my index to 
> match the Lucene version used by Solr. This seems to be the most 
> obvious hint (but as Luke can handle all versions, I thought using 
> this index with Solr should be fine, too). Still, it's really quite 
> strange that appending the debug option changes my search results. Oh 
> my, I probably just missed something basic about how to use Solr ;)
> Your opinion? Changing the index to another Lucene version isn't 
> exactly the fastest and easiest thing, so I'd like to rule out all 
> other possibilities first :)
>
> Best regards,
>
>     Erik
>
>
> On 20.04.2011 01:07, Lance Norskog wrote:
>> Look at the "text" definition stack. Does it have the same analyzer
>> and filter that you used to make the index, and in the same order?
>>
>> The specific problem is that the "text" field includes a stemmer, and
>> your code probably did not. And so "marine" is stored as, maybe
>> 'marin'.  To check this out, look at the 'schema browser' page off the
>> admin page. This will show you all of the indexed terms in each field.
>> Also look at the Analysis page: this lets you see how text is parsed
>> and changed in the analysis stack.
>>
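To illustrate the kind of mismatch Lance describes, here is a toy Python sketch of an index built with one analyzer and queried with another. It is not Lucene or Solr code: the inverted index is a plain dict, and toy_stem_analyze is a hypothetical stand-in for a stemming analyzer chain that merely lowercases and drops a trailing "e".

```python
def whitespace_analyze(text):
    """Split on whitespace only, as a whitespace tokenizer would."""
    return text.split()


def toy_stem_analyze(text):
    """Hypothetical stemming chain: lowercase, then drop a trailing 'e'."""
    terms = []
    for token in text.split():
        token = token.lower()
        if len(token) > 3 and token.endswith("e"):
            token = token[:-1]
        terms.append(token)
    return terms


def build_index(docs, analyze):
    """Map each analyzed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, analyze):
    """Return ids of documents containing every analyzed query term."""
    hits = None
    for term in analyze(query):
        postings = index.get(term, set())
        hits = set(postings) if hits is None else hits & postings
    return hits or set()


docs = {1: "In marine mussels of the genus"}
index = build_index(docs, whitespace_analyze)

print(search(index, "marine", whitespace_analyze))  # same analyzer on both sides
print(search(index, "marine", toy_stem_analyze))    # query term becomes "marin"
```

The second search comes back empty because the query-side analyzer produces a term that was never indexed, which is why the index-time and query-time analysis have to match.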
>> On Tue, Apr 19, 2011 at 2:56 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>>> Hmmmm, I don't see the problem either. It *sounds* like you don't
>>> really have the default search field defined the way you think you
>>> do. Did you restart Solr after making that change?
>>>
>>> I'm assuming that when you say "not created by Solr" you mean that
>>> it's created by Lucene. What version of Lucene and Solr are you
>>> using if that's true?
>>>
>>> You can test this by appending "&debugQuery=on" to your query or
>>> checking the "debug enable" checkbox in the full query interface
>>> from the admin page. That should show you exactly what is being
>>> searched. You might also want to look at the analysis page for your
>>> field and see how your query is tokenized.
>>>
>>> But, like I said, this looks like it should work. If you can post
>>> the results of adding &debugQuery=on, your actual <fieldType>
>>> definition for "text_ws", your <field> declaration for "text", and
>>> the <defaultSearchField> from your schema, that would help. I can't
>>> tell you how many times something that's eluded me for hours is
>>> obvious to someone else :)
>>>
>>> Best
>>> Erick
>>>
>>>
>>>
>>> On Tue, Apr 19, 2011 at 3:59 PM, Erik Fäßler <erik.faessler@uni-jena.de> wrote:
>>>>   Hello there,
>>>>
>>>> my issue qualifies as a newbie question, I guess, but I'm really a
>>>> bit confused. I have an index which has not been created by Solr.
>>>> Perhaps that's already the point, although I fail to see why this
>>>> should have anything to do with my problem.
>>>>
>>>> I use the admin interface to check which results particular queries
>>>> bring in. My index documents have a field "text" which holds the
>>>> document text. This text has only been whitespace tokenized. So in
>>>> my schema, the type for this field is "text_ws". My schema says
>>>> "<defaultSearchField>text</defaultSearchField>".
>>>>
>>>> When I now search for, say, 'marine' (without quotes), I don't get any
>>>> search results. But when I search for '"marine"' (that is, enclosed
>>>> in double quotes) I get my document hits. Alternatively, I can
>>>> prepend the field name: 'text:marine' and will also get my results.
>>>>
>>>> It is similar with this phrase query: "marine mussels", where "In
>>>> marine mussels of the genus" is a text snippet of a document. The
>>>> phrase "marine mussels" won't give any hits. Searching for
>>>> 'text:"marine mussels"' will give me the exact document containing
>>>> this text snippet.
>>>>
>>>> I'm sure this has quite a simple explanation but I'm unable to find
>>>> it right now ;-) Perhaps you can help with that.
>>>>
>>>> Thanks a lot!
>>>>
>>>> Best regards,
>>>>
>>>>     Erik
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>



