lucene-dev mailing list archives

From "Hoss Man (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2996) make "q=*" not suck in the lucene and edismax parsers
Date Fri, 30 Dec 2011 22:46:31 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177799#comment-13177799 ]

Hoss Man commented on SOLR-2996:
--------------------------------

Recent example of this type of confusion and the problems it can cause from the mailing list...

https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3Calpine.DEB.2.00.1112131115550.16571@bester%3E

Another recent discussion about this type of problem from IRC...

{noformat}
13:25 < mikeliss:#solr> Hi, I'm running into an error with maxBooleanClauses when I try to do a range query with highlighting...is there any workaround for this? Would really appreciate some direction, if anybody knows.
13:26 < mikeliss:#solr> This is the query that dies: 
http://localhost:8983/solr/select/?q=*&version=2.2&start=0&rows=20&indent=on&hl=true&hl.fl=text,caseName,westCite,docketNumber,lexisCite,court_citation_string&hl.snippets=5&f.text.hl.alternateField=text&f.text.hl.maxAlternateFieldLength=500

13:28 < hoss:#solr> that query doesn't make sense ... for a couple of reasons ... what are you *trying* to do?
13:29 < hoss:#solr> i mean ... for starters ... there is no range query there.  second, q=* is a big red flag: it's a prefix query on the default field using the prefix "" (ie: the empty string)

14:23 < mikeliss:#solr> hoss, yeah, I assumed that highlighting would just do nothing if a prefix query were given on an empty string.
14:24 < mikeliss:#solr> hoss, I added a check in my code that will only enable highlighting if the query isn't '*'.
14:24 < mikeliss:#solr> hoss, Seems naive, but it's working at least for the moment.

14:27 < hoss:#solr> i think you're missing my point: q=* is a fairly nonsensical query ... you shouldn't just prevent highlighting on that query, you should stop doing that query in the first place
14:28 < hoss:#solr> as a query solr can handle it, and optimize it to be efficient
14:28 < hoss:#solr> (even though it's silly)
14:28 < mikeliss:#solr> hoss, I'm using that query on my homepage to show the latest documents in the index. It should just return everything, right?
14:28 < hoss:#solr> but for highlighting, the highlighter actually needs to know all the terms it matches
14:28 < hoss:#solr> and to know all the terms it matches, it needs to look at *ALL* the terms in the default field
14:29 < hoss:#solr> mikeliss: no, no, NO ... i'm not sure where people started getting the misconception that "q=*" matches all docs, but that is *NOT* what it does
14:29 < hoss:#solr> one second...
14:30 < hoss:#solr> mikeliss: 
https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3CCAL69qOn1XeMNz6JYdWj_o7rH_=O3i-NiqdO6rorvN48bywU+nA@mail.gmail.com%3E
14:30 < hoss:#solr> ...and...
14:30 < hoss:#solr> https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3Calpine.DEB.2.00.1112131115550.16571@bester%3E

14:32 < mikeliss:#solr> hoss, ah, that makes sense. I guess * is just too tempting, since it is something users can easily remember.
14:34 < mikeliss:#solr> hoss, back to my original issue, now I'm confused why hl fails on a search for *. Shouldn't it just highlight nothing, and return results? I wasn't able to get debugging to work for the query, so I'm a bit confused..

14:35 < hoss:#solr> see my other comment above: the highlighter is trying to find all the terms used in the query to highlight them -- a query for "*" matches all terms in the default field, which is way more than the highlighter can handle (hence the exception)

14:38 < hoss:#solr> i'm filing a bug to change the behavior of "q=*" ... do you mind if i cut/paste this dialog into the jira issue as an example of user confusion?
14:39 < mikeliss:#solr> Not at all. I was wondering if that was potentially a bug...figured I'd leave it to the experts.
{noformat}
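
For client code in the same situation as the IRC example above, one option is to rewrite a bare "*" to "*:*" before the request is sent, and to skip highlighting for that query. The following is a minimal Python sketch under stated assumptions: build_select_url is a hypothetical helper (not part of Solr or of any Solr client library), and the base URL and the hl / hl.fl parameters are the ones from the query quoted above.

```python
from urllib.parse import urlencode

MATCH_ALL = "*:*"  # Solr syntax for "match all docs"


def build_select_url(base_url, user_query, highlight_fields=()):
    """Build a select URL, sending *:* instead of a bare "*".

    Hypothetical helper for illustration only.
    """
    q = user_query.strip()
    is_match_all = q in ("*", MATCH_ALL)
    params = {"q": MATCH_ALL if is_match_all else q, "rows": 20}
    # Assumption: a match-all query carries no terms worth highlighting,
    # so highlighting is only requested for other queries.
    if highlight_fields and not is_match_all:
        params["hl"] = "true"
        params["hl.fl"] = ",".join(highlight_fields)
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("http://localhost:8983/solr/select/", "*",
                           ["text", "caseName"]))
```

This keeps "*" available as something users can easily remember while never sending the empty-prefix query to Solr.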
                
> make "q=*" not suck in the lucene and edismax parsers
> -----------------------------------------------------
>
>                 Key: SOLR-2996
>                 URL: https://issues.apache.org/jira/browse/SOLR-2996
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>
> More than a few users have gotten burned by thinking that "*" is the appropriate syntax for "match all docs" when what it really does (unless i'm mistaken) is create a prefix query on the default search field using a blank string as the prefix.
> Since it seems very unlikely that anyone has a genuine use case for making a prefix query with a blank prefix, we should change the default behavior of the LuceneQParser and EDismaxQParsers (and any other QParsers that respect *:* if i'm forgetting them) to treat this situation the same as *:*.  We can offer a (local)param to force the old behavior if someone really wants it.
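
The proposed default could look roughly like the following. This is only an illustrative Python sketch, not the LuceneQParser or EDismaxQParser code: normalize_query and the legacy_star flag are hypothetical names standing in for the parser hook and for the (local)param mentioned above, which the issue does not name.

```python
def normalize_query(q, legacy_star=False):
    """Return the query string a parser would act on under the proposal.

    legacy_star is a hypothetical opt-out that keeps the old behavior
    (a prefix query with a blank prefix on the default field).
    """
    if q.strip() == "*" and not legacy_star:
        return "*:*"  # treat a bare "*" the same as match-all-docs
    return q  # everything else is left untouched, including "foo*"
```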

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


