lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Boolean Query Issue
Date Mon, 02 Jun 2008 12:49:36 GMT
You need to include your code I think. This makes no sense on a quick
look, so unless we see some code it'll be hard to know whether
we're looking at anything relevant.

Best
Erick

On Mon, Jun 2, 2008 at 1:19 AM, Sonu Sudhakar <sonusr@gmail.com> wrote:

> Hi,
>
> I have done some more analysis on this issue. I think it is related to
> lucene's default operator.
> I am getting excat results, when I sets the default operator as 'OR', but
> facing problem when setting the default operator as 'AND'.
>
> The following are the lucene QueryParser outputs for both cases.
>
> Query :* TTL:store AND TTL:data OR TTL:variable
>
> *1. When lucene default operator is '*OR'
>
> *QueryParser output  using toString method: *
> +TTL:store +TTL:data TTL:variable
>
>
> *2. When lucene default operator is '*AND'
>
> *QueryParser output  using toString method:
> *+TTL:store TTL:data TTL:variable
>
> *The output of second case is confusing me.
>
> Could anybody please give me an explanation for this behavior?
>
> Thanks,
> Sonu
>
> On Thu, May 29, 2008 at 3:49 PM, Sonu Sudhakar <sonusr@gmail.com> wrote:
>
> > Erick,
> >
> > Thanks for your reply.
> >
> > I am working with approximately 1 million documents. They are indexed in
> 4
> > servers. Each document has multiple fields. I am using
> ParallelMultiSearcher
> > for searching purpose.
> >
> > I tried a few queries in the title(TTL) field.
> >
> > i started with a simple query without boolean operators.
> >
> > *1. TTL:data => 3733 results (all matches had "data" in title)*
> >
> > Then I tried a second one with AND operator.
> >
> > *2. TTL:data AND TTL:store => 19 results*
> >
> > I analyzed the results. the results had both "data" and "store" in the
> > title.
> >
> > *I then tried OR operator*
> >
> > *3. TTL:data AND TTL:store OR TTL:variable*
> >
> > I got 3,733 results., same as the query TTL:data.
> >
> > I even tried giving a meaningless query
> >
> > TTL:data AND TTL:storettttt OR TTL:variablettttt => 3,733 results (The
> > results were same as that of TTL:data.)
> >
> > TTL:data AND TTL:computer OR TTL:device => 3,733 results (this also
> showed
> > the same results as above)
> >
> > The same thing repeats for other cases too. The queries below also
> behaved
> > the same way.
> > i.e. -
> >
> > 1. TTL:store AND TTL:data OR TTL:variable => 76 results
> > 2. TTL:store AND TTL:datatttt OR TTL:variabletttt => 76 results
> > 3. TTL:store AND TTL:computer OR TTL:device => 76 results
> >
> >
> > 1. TTL:variable AND TTL:data OR  TTL:store => 1,496 results
> > 2. TTL:variable AND TTL:datatttt OR  TTL:storetttt => 1,496 results
> > 3. TTL:variable AND TTL:computer OR  TTL:device => 1,496 results
> >
> > I hope you have a clearer picture of my issue now.
> >
> > Thanks,
> > Sonu
> >
> >
> > On Wed, May 28, 2008 at 7:09 PM, Erick Erickson <erickerickson@gmail.com
> >
> > wrote:
> >
> >> It's unclear what you *should* expect. How is your data
> >> distributed?
> >>
> >> In other words, how many documents do you have? In this example,
> >> for instance,
> >> 1. TTL:data AND TTL:store OR TTL:variable => 3,733 results
> >> it considered the TTL:data part only.
> >>
> >> it's perfecily reasonable if every document that had "variable" in the
> >> field *also* has "data" and "store" in the field. So your numbers
> >> don't give us much to work with.....
> >>
> >> Remember, though, that Lucene syntax isn't a pure boolean syntax. See
> >>
> >> http://wiki.apache.org/lucene-java/BooleanQuerySyntax
> >>
> >> And when in doubt parenthesize <G>...
> >>
> >> Best
> >> Erick
> >>
> >> On Wed, May 28, 2008 at 7:44 AM, Sonu Sudhakar <sonusr@gmail.com>
> wrote:
> >>
> >> > Hi,
> >> >
> >> > I have some issue with boolean queries.
> >> >
> >> > I am using Lucene-core-2.3.1.
> >> >
> >> > I have done test on boolean query with 3 terms (data, store, variable)
> >> in
> >> > my
> >> > TTL field. The TTL field is indexed and searched using
> StandardAnalyzer.
> >> >
> >> > The three terms when searched individually gave the following result
> >> >
> >> > 1. TTL:data  => 3,733 results
> >> > 2. TTL:store  => 76 results
> >> > 3. TTL:variable  => 1,496 results
> >> >
> >> > But found issue when combining these terms with boolean operators.
> >> >
> >> > e.g.
> >> > 1. TTL:data AND TTL:store OR TTL:variable => 3,733 results
> >> > it considered the TTL:data part only.
> >> >
> >> > 2. TTL:store AND TTL:data OR TTL:variable => 76 results
> >> > it considered  the TTL:store part only.
> >> >
> >> > 3. TTL:variable AND TTL:data OR  TTL:store => 1,496 results
> >> > it considered  the TTL:variable part only.
> >> >
> >> > But I am getting correct result when combining terms with 'AND'
> >> operator. I
> >> > think the issue is with 'OR' operator.
> >> >
> >> >
> >> > Could anybody give an explanation for this behavior of lucene?
> >> > Could you give suggestions to rectify this?
> >> >
> >> > Thanks,
> >> > Sonu
> >> >
> >>
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message