lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count
Date Sun, 23 May 2010 18:25:51 GMT
So ... after a long IRC chat on this, I think the issue has just been worded
incorrectly. As I understand it, there are two issues here:
1) QP loses the phrase information for fields -- the queries f:"abcd" and
f:abcd are parsed the same, or handled the same. There is no way for someone
extending QP to tell whether quotes were used.
2) QP has a default impl for f:abcd which is not international-friendly.
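
To make (2) concrete, here is a minimal sketch of a decision that depends only on term count. It is not Lucene code: the class, the bigrams helper and the string "queries" are all hypothetical stand-ins, and the bigram analysis is just an assumption about how text without whitespace might be tokenized.

```java
import java.util.ArrayList;
import java.util.List;

public class TermCountSketch {

    // Hypothetical stand-in for an analyzer that emits overlapping bigrams,
    // one possible way to tokenize text that has no whitespace word breaks.
    static List<String> bigrams(String text) {
        List<String> out = new ArrayList<>();
        if (text.length() < 2) {
            out.add(text);
            return out;
        }
        for (int i = 0; i + 2 <= text.length(); i++) {
            out.add(text.substring(i, i + 2));
        }
        return out;
    }

    // Mimics the behavior under discussion: the only input to the decision is
    // how many terms analysis produced. Whether the user typed quotes is not
    // available here at all.
    static String fieldQueryByTermCount(String field, String text) {
        List<String> terms = bigrams(text);
        if (terms.size() == 1) {
            return "TermQuery(" + field + ":" + terms.get(0) + ")";
        }
        return "PhraseQuery(" + field + ":\"" + String.join(" ", terms) + "\")";
    }

    public static void main(String[] args) {
        // The user typed f:abcd without quotes, yet a phrase comes out.
        System.out.println(fieldQueryByTermCount("f", "abcd"));
    }
}
```

With such an analyzer every unquoted word of three or more characters turns into a phrase, which is the non-international-friendly default.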

I agree (1) should be fixed, and I apologize if I missed that previously.
Version is the right way to go with this.

About (2), I think that if f:abcd is submitted, then a PQ should not be
created. The user hasn't asked for it. But if f:"abcd" was submitted, then
it is ok to create a PQ by default. And we're only talking about defaults
here. Anyone should be able to extend QP and override the relevant
getFieldQuery variant and do whatever he wants.
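
A rough sketch of what I mean, again as a toy and not the real QueryParser: the ToyParser class, the extra quoted parameter and the string "queries" are assumptions for illustration. The source only says an unquoted f:abcd should not become a PQ; falling back to a boolean query of the individual terms is my assumption for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

public class QuotedFlagSketch {

    static class ToyParser {

        // Hypothetical analysis step: overlapping bigrams (an assumption).
        List<String> analyze(String text) {
            List<String> out = new ArrayList<>();
            if (text.length() < 2) {
                out.add(text);
                return out;
            }
            for (int i = 0; i + 2 <= text.length(); i++) {
                out.add(text.substring(i, i + 2));
            }
            return out;
        }

        static String phrase(String field, List<String> terms) {
            return "PhraseQuery(" + field + ":\"" + String.join(" ", terms) + "\")";
        }

        // Overridable hook. The quoted flag carries the syntax information
        // that would otherwise be lost: did the user write f:"..." or not?
        protected String getFieldQuery(String field, String text, boolean quoted) {
            List<String> terms = analyze(text);
            if (terms.size() == 1) {
                return "TermQuery(" + field + ":" + terms.get(0) + ")";
            }
            if (quoted) {
                return phrase(field, terms);
            }
            // Assumed default for unquoted multi-term input.
            StringJoiner sj = new StringJoiner(" ", "BooleanQuery(", ")");
            for (String t : terms) {
                sj.add(field + ":" + t);
            }
            return sj.toString();
        }

        // Parses a single clause of the form field:value or field:"value".
        String parseClause(String clause) {
            int colon = clause.indexOf(':');
            String field = clause.substring(0, colon);
            String value = clause.substring(colon + 1);
            boolean quoted = value.length() >= 2
                    && value.startsWith("\"") && value.endsWith("\"");
            if (quoted) {
                value = value.substring(1, value.length() - 1);
            }
            return getFieldQuery(field, value, quoted);
        }
    }

    // An extender who wants phrases regardless of quotes overrides the hook.
    static class AlwaysPhraseParser extends ToyParser {
        @Override
        protected String getFieldQuery(String field, String text, boolean quoted) {
            List<String> terms = analyze(text);
            return terms.size() == 1
                    ? super.getFieldQuery(field, text, quoted)
                    : phrase(field, terms);
        }
    }

    public static void main(String[] args) {
        ToyParser qp = new ToyParser();
        System.out.println(qp.parseClause("f:abcd"));
        System.out.println(qp.parseClause("f:\"abcd\""));
        System.out.println(new AlwaysPhraseParser().parseClause("f:abcd"));
    }
}
```

The point is only that the default follows what the user typed, and anyone who disagrees with the default changes one method.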

If the question is what the default behavior should be for (2), then I think
that, pending Version, it should create a PQ for f:"abcd" only. And we leave
it to whoever extends QP to determine the right behavior for their case.

Shai

On Sun, May 23, 2010 at 9:09 PM, Robert Muir <rcmuir@gmail.com> wrote:

> On Sun, May 23, 2010 at 1:00 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> > I just want to make the feature accessible and documented without
> > Version.
>
> I think it is just a bug (a shoddy implementation that does not use
> the syntax, whether it was quoted or not, since this has been thrown
> away). In this implementation no one thought about languages that
> don't use whitespace and that it would make all queries into
> phrasequeries.
>
> I really do not think this sort of code belongs inside core lucene, if
> you want to make uninternationalized code in your own code base that
> is not correct that is fine.
>
> Furthermore by preserving this kind of bug it makes the queryparser
> more complicated, and especially in the future. If at some point in
> the future you want to really have the QP not split on whitespace (as
> you yourself said on the issue you want) to enable support for
> multi-word synonyms and "real" n-grams at querytime, I hope you
> understand this buggy code conflicts and complicates this later goal.
>
> --
> Robert Muir
> rcmuir@gmail.com
