lucene-solr-user mailing list archives

From Nawab Zada Asad Iqbal <khi...@gmail.com>
Subject Re: trivia question: why q=*:* doesn't return same result as q.alt=*:*
Date Wed, 17 Jan 2018 20:50:28 GMT
Chris / Hoss

Thanks for the detailed explanation. Erick Erickson's explanation made
sense to me, but it didn't explain why the fields are different for
'hello' vs '*:*'.

I had never paid much attention to the parser part of query handling and
had so far focused only on the field definitions. I had to re-read parts of
this thread to understand the whole picture.

I had dropped an apparently unnecessary question but this thread has
provided a lot of necessary learning.


Thanks
Nawab

On Fri, Jan 12, 2018 at 10:38 AM, Chris Hostetter <hossman_lucene@fucit.org>
wrote:

>
> : defType=dismax does NOT do anything special with *:* other than treat it
>         ...
> : > As Chris explained, this is special:
>         ...
>
> I'm interpreting your followup question differently than Erick & Erik
> did.  I'm going to assume both E & E misunderstood your question, and i'm
> going to assume you completely understood my response to your original
> question.
>
> I'm going to assume that a way to reword/expand your followup question is
> something like this...
>
> "I understand now that defType=dismax doesn't support special syntax like
> '*:*' and treats that input as just another 3 character string to search
> against the qf & pf fields -- but now what i don't understand is why the
> list of fields in the debug query output is different for 'q=*:*' compared
> to something like 'q=hello'"
>
> (If i have not understood your followup question correctly, please
> clarify)
>
> Let's look at those outputs you mentioned...
>
> : >> http://localhost:8983/solr/filesearch/select?fq=id:1193&
> : >> q=*:*&debugQuery=true
> : >>
> : >>
> : >>   - parsedquery: "+DisjunctionMaxQuery((user_email:*:* |
> user_name:*:* |
> : >>   tags:*:* | (name_shingle_zh-cn:, , name_shingle_zh-cn:, ,) |
> : >> id:*:*)~0.01)
> : >>   DisjunctionMaxQuery(((name_shingle_zh-cn:", , , ,"~100)^100.0 |
> : >>   tags:*:*)~0.01)",
> ...
> : >> e.g. following query uses the my expected set of pf and qf.
> ...
> : >> http://localhost:8983/solr/filesearch/select?fq=id:1193&
> : >> q=hello&debugQuery=true
> : >>
> : >>
> : >>
> : >>   - parsedquery: "+DisjunctionMaxQuery(((name_token:hello)^60.0 |
> : >>   user_email:hello | (name_combined:hello)^10.0 |
> (name_zh-cn:hello)^10.0
> : >> |
> : >>   name_shingle:hello | comments:hello | user_name:hello |
> : >> description:hello |
> : >>   file_content_zh-cn:hello | file_content_de:hello | tags:hello |
> : >>   file_content_it:hell | file_content_fr:hello | file_content_es:hell
> |
> : >>   file_content_en:hello | id:hello)~0.01)
> : >> DisjunctionMaxQuery((description:hello
> : >>   | (name_shingle:hello)^100.0 | comments:hello | tags:hello)~0.01)",
>
>
> The answer has to do with the list of qf & pf fields you have configured
> -- you didn't provide us with concrete specifics of what qf/pf you
> have configured in your requestHandler -- but you did mention in your
> second example that "following query uses the my expected set of pf and
> qf"
>
> By comparing the 2 examples at a glance, it appears that the fields in the
> first example (q=*:* ... again, searching for the literal 3 character
> string '*:*') are (mostly) a subset of the fields you "expected" (from the
> 2nd example)
>
> I'm fairly certain that what's happening here is that in both examples the
> literal string input is being given to the analyzer for all of your fields
> -- but in the case of the (literal) string '*:*' many of the analyzers are
> producing no terms at all -- ie: they are completely stripping out
> punctuation -- so those fields don't appear in the final query.
>
> IIUC it looks like one other oddity here is that the reverse also
> seems to be true in some cases -- i suspect that
> although "name_shingle_zh-cn" doesn't appear in your 2nd example, it
> probably *is* in your pf param but whatever analyzer you have configured
> for it produces no tokens for the latin characters "hello" but does
> produce tokens for the pure-punctuation characters "*:*"
>
>
> (If i'm correct about your question, but wrong about your qf/pf then
> please provide us with a lot more details -- notably your full
> schema/solrconfig used when executing those queries.)
>
>
> -Hoss
> http://www.lucidworks.com/
>
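
The behavior Hoss describes can be illustrated with a rough sketch. This is not Solr code: the three analyzer functions, the FIELD_ANALYZERS mapping, and fields_in_query are hypothetical stand-ins, and the field-to-analyzer assignments are assumptions made purely for illustration (the real qf/pf and schema were never posted in this thread). The only idea it demonstrates is that the same literal input is handed to every field's analyzer, and a field whose analyzer yields no tokens contributes no clause to the parsed query.

```python
import re

# Hypothetical stand-ins for Solr field analyzers. Real analysis chains are
# defined in the schema and behave differently in detail.

def word_analyzer(text):
    """Keep only runs of letters/digits, lowercased; punctuation is dropped."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def string_analyzer(text):
    """Treat the whole input as a single token, like an untokenized field."""
    return [text] if text else []

def punctuation_only_analyzer(text):
    """Illustrative analyzer that emits only non-alphanumeric characters."""
    return re.findall(r"[^A-Za-z0-9\s]", text)

# Assumed field -> analyzer mapping (not taken from the poster's schema).
FIELD_ANALYZERS = {
    "name_token": word_analyzer,
    "description": word_analyzer,
    "id": string_analyzer,
    "tags": string_analyzer,
    "name_shingle_zh-cn": punctuation_only_analyzer,
}

def fields_in_query(user_input):
    """Return the fields whose analyzer produced at least one token."""
    return [
        field
        for field, analyze in FIELD_ANALYZERS.items()
        if analyze(user_input)
    ]

print(fields_in_query("*:*"))
print(fields_in_query("hello"))
```

Under these assumptions, '*:*' survives only in the untokenized fields and the punctuation-emitting field, while 'hello' survives in the word-based and untokenized fields, which mirrors the two different field lists in the debug output above.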
