lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr 1.4.1: Weird query results
Date Wed, 20 Apr 2011 11:47:29 GMT
This is all very strange. I guess I only have a few suggestions:

> It might be worth getting a copy of Luke. Under the "tools" menu
   there's a "checkindex" option that may show you something. You
   can also use Luke to query your index and examine it. That said,
   Luke uses Lucene, not Solr so I rather expect it to show you that
   everything's fine...

> Is there any chance at all you have some older jars mixed in?
   Frankly, this is just me resorting to mysticism, because mixed jars
   don't seem very likely to make appending &debugQuery=on
   change the search results! I'd expect much worse
   problems.

> As you said, re-indexing should fix things up. But there's no reason
   that I know of that this should be necessary given what you describe.
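
For anyone who wants to repeat the comparison outside the browser, here is a
minimal sketch, not a definitive tool. It rebuilds the two select URLs quoted
below and reads the hit count from a response body. The helper names
(select_url, num_found) are made up for illustration, and it assumes the
standard XML response format, where the hit count is the numFound attribute
of the <result> element.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Host, port and path are taken from the URLs quoted in this thread.
BASE_URL = "http://localhost:8983/solr/select/"


def select_url(query, debug=False, base=BASE_URL):
    """Build a select URL like the quoted ones, optionally with debugQuery=on."""
    params = [
        ("q", query),
        ("version", "2.2"),
        ("start", "0"),
        ("rows", "10"),
        ("indent", "on"),
    ]
    if debug:
        params.append(("debugQuery", "on"))
    return base + "?" + urlencode(params)


def num_found(response_xml):
    """Return the numFound attribute of <result> in a Solr XML response body."""
    root = ET.fromstring(response_xml)
    result = root.find("result")
    if result is None:
        raise ValueError("no <result> element in response")
    return int(result.get("numFound"))
```

Fetching each URL (for example with urllib.request.urlopen) and passing the
body to num_found should make it easy to see whether the 0 versus 30
difference also shows up without the browser in the loop.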

Good luck, because I'm stumped!
Erick

On Wed, Apr 20, 2011 at 4:40 AM, Erik Fäßler <erik.faessler@uni-jena.de> wrote:
> Oops, I have to take something back: the index *has* been created with
> Lucene 2.9.3! Sorry for the confusion; I am using two different index
> versions, the older one in production and the newer one for what I am
> currently developing. I just checked back with Luke, and it confirms that
> the index used with my Solr instance has format -9 (Lucene 2.9).
> So that's not the problem, I guess... any other ideas?! ;)
>
> On 20.04.2011 10:17, Erik Fäßler wrote:
>>
>>  Thank you very much for your answers :-) First of all, I just noticed I
>> sent the question unintentionally to the Lucene list while it's more of a
>> Solr issue. I will answer here all the same to not confuse things. My
>> apologies ;)
>>
>> First, to Erick's suggestions. The default field has been "text" for a
>> long time, so I did not change it yesterday; it was already set to this
>> field before.
>> By "not created by Solr" I mean exactly that it was created using Lucene
>> directly. This could indeed be an issue, as Lucene 2.3.1 was used to
>> create the index, whereas Solr 1.4.1 uses Lucene 2.9.3. But it seemed to
>> work fine so far (though perhaps that's just wrong, I don't know yet).
>>
>> I tried your hint of appending "&debugQuery=on". Guess what: with that
>> appended I get my hits. No kidding, appending the debug option gives me my
>> 30 document hits, and deleting it from my browser's address bar leaves me
>> with 0 hits (?!).
>> The address bar strings are:
>> No hits:
>>
>> http://localhost:8983/solr/select/?q=marine&version=2.2&start=0&rows=10&indent=on
>> 30 hits:
>>
>> http://localhost:8983/solr/select/?q=marine&version=2.2&start=0&rows=10&indent=on&debugQuery=on
>>
>> Here's the debug output concerning the query:
>>
>> <str name="rawquerystring">marine</str>
>> <str name="querystring">marine</str>
>> <str name="parsedquery">text:marine</str>
>> <str name="parsedquery_toString">text:marine</str>
>>
>> Seems fine. This is expected, because I had already used the analysis
>> interface to check whether the correct terms are searched for.
>> Here are my schema snippets:
>>
>> FieldType "text_ws":
>> <fieldType name="text_ws" class="solr.TextField"
>> positionIncrementGap="100">
>> <analyzer>
>> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> </analyzer>
>> </fieldType>
>> (Solr 1.4.1 default)
>>
>> Field "text":
>> <field name="text" type="text_ws" indexed="true" stored="true"
>> termVectors="true" termPositions="true" />
>>
>> Default search field:
>> <defaultSearchField>text</defaultSearchField>
>>
>> I guess this also answers the hints given by Lance. Writing this down, I
>> get the feeling the first thing I should do is update my index to match
>> the Lucene version used by Solr. This seems to be the most obvious step (but
>> as Luke can handle all versions, I thought using this index with Solr should
>> be fine, too). Still, it's really quite strange that appending the debug
>> option changes my search results. Oh my, I have probably just missed some
>> basics about how to use Solr ;)
>> What is your opinion? Converting the index to another Lucene version isn't
>> exactly the fastest and easiest thing, so I'd like to rule out all other
>> possibilities first :)
>>
>> Best regards,
>>
>>    Erik
>>
>>
>> On 20.04.2011 01:07, Lance Norskog wrote:
>>>
>>> Look at the "text" definition stack. Does it have the same analyzer
>>> and filter that you used to make the index, and in the same order?
>>>
>>> The specific problem is that the "text" field includes a stemmer, and
>>> your code probably did not. And so "marine" is stored as, maybe
>>> 'marin'.  To check this out, look at the 'schema browser' page off the
>>> admin page. This will show you all of the indexed terms in each field.
>>> Also look at the Analysis page: this lets you see how text is parsed
>>> and changed in the analysis stack.
>>>
>>> On Tue, Apr 19, 2011 at 2:56 PM, Erick Erickson <erickerickson@gmail.com>
>>> wrote:
>>>>
>>>> Hmmmm, I don't see the problem either. It *sounds* like you don't really
>>>> have the default search field defined the way you think you do. Did you
>>>> restart Solr after making that change?
>>>>
>>>> I'm assuming that when you say "not created by Solr" you mean that it's
>>>> created by Lucene. What version of Lucene and Solr are you using if
>>>> that's true?
>>>>
>>>> You can test this by appending "&debugQuery=on" to your query or
>>>> checking the "debug enable" checkbox in the full query interface from
>>>> the admin page. That should show you exactly what is being searched.
>>>> You might also want to look at the analysis page for your field and see
>>>> how your query is tokenized.
>>>>
>>>> But, like I said, this looks like it should work. If you can post the
>>>> results of adding &debugQuery=on, your actual <fieldType> definition
>>>> for "text_ws", your <field> declaration for "text" and the
>>>> <defaultSearchField> from your schema, that would help. I can't tell
>>>> you how many times something that's eluded me for hours is obvious to
>>>> someone else :)
>>>>
>>>> Best
>>>> Erick
>>>>
>>>>
>>>>
>>>> On Tue, Apr 19, 2011 at 3:59 PM, Erik Fäßler <erik.faessler@uni-jena.de>
>>>> wrote:
>>>>>
>>>>> Hello there,
>>>>>
>>>>> my issue qualifies as a newbie question I guess, but I'm really a bit
>>>>> confused. I have an index which has not been created by Solr. Perhaps
>>>>> that's already the point, although I fail to see why this should
>>>>> cause my problem.
>>>>>
>>>>> I use the admin interface to check which results particular queries
>>>>> return. My index documents have a field "text" which holds the
>>>>> document text. This text has only been whitespace-tokenized, so in my
>>>>> schema the type for this field is "text_ws". My schema says
>>>>> "<defaultSearchField>text</defaultSearchField>".
>>>>>
>>>>> When I now search for, say, 'marine' (without quotes), I don't get any
>>>>> search results. But when I search for '"marine"' (that is, enclosed in
>>>>> double quotes) I get my document hits. Alternatively, I can prepend the
>>>>> field name, 'text:marine', and will also get my results.
>>>>>
>>>>> It is similar with this phrase query: "marine mussels", where "In
>>>>> marine mussels of the genus" is a text snippet of a document. The
>>>>> phrase "marine mussels" won't give any hits. Searching for
>>>>> 'text:"marine mussels"' will give me the exact document containing this
>>>>> text snippet.
>>>>>
>>>>> I'm sure this has quite a simple explanation but I'm unable to find it
>>>>> right now ;-) Perhaps you can help with that.
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> Best regards,
>>>>>
>>>>>    Erik
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org