lucene-java-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: First result in the group
Date Wed, 02 Sep 2009 11:51:09 GMT
I see ... the solution I have in mind is not simple, but it follows the
Collector approach. Index the categories as document payloads: a single field
(cats:all, for example) has one posting list covering all documents, and each
posting carries the categories that document is associated with in its payload:
cats:all --> 0 [cat1, cat2] 1 [cat1, cat3, cat4] 4 [cat2, cat3, cat4] ...

Then, when Collector.collect() is called, skip to that doc ID in the posting
list, read its categories, and add the doc ID to the map of each category
(i.e., you'll have maps for cat1, cat2, ... catN, each holding the doc IDs
collected by this query for that category, sorted by score).

Then you can fetch all categories whose maps/sets are not empty and display
M docs from each.
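
To make the steps above concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: the payloads of cats:all are simulated with an in-memory map, and the class name, the Hit helper, and the scores are all hypothetical. In a real Collector you would read the payload for the doc ID and obtain the score from the scorer instead.

```java
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical stand-in for a Lucene Collector; no Lucene classes are used here.
class CategoryGroupingSketch {
    /** One collected hit: a doc ID and the score it was collected with. */
    static final class Hit {
        final int docId;
        final float score;
        Hit(int docId, float score) { this.docId = docId; this.score = score; }
    }

    // Simulates the payloads of the cats:all posting list: docId -> categories.
    private final Map<Integer, List<String>> payloads;
    // One list of hits per category seen so far (TreeMap keeps category names ordered).
    private final Map<String, List<Hit>> groups = new TreeMap<>();

    CategoryGroupingSketch(Map<Integer, List<String>> payloads) {
        this.payloads = payloads;
    }

    /** Plays the role of Collector.collect(): called once per matching document. */
    void collect(int docId, float score) {
        List<String> cats = payloads.getOrDefault(docId, Collections.<String>emptyList());
        for (String cat : cats) {
            groups.computeIfAbsent(cat, k -> new ArrayList<>()).add(new Hit(docId, score));
        }
    }

    /** Returns up to m doc IDs for every non-empty category, highest score first. */
    Map<String, List<Integer>> topPerCategory(int m) {
        Map<String, List<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, List<Hit>> e : groups.entrySet()) {
            List<Integer> ids = e.getValue().stream()
                    .sorted(Comparator.comparingDouble((Hit h) -> h.score).reversed())
                    .limit(m)
                    .map(h -> h.docId)
                    .collect(Collectors.toList());
            result.put(e.getKey(), ids);
        }
        return result;
    }

    public static void main(String[] args) {
        // Payloads from the cats:all example above; the scores are made up.
        Map<Integer, List<String>> payloads = new HashMap<>();
        payloads.put(0, Arrays.asList("cat1", "cat2"));
        payloads.put(1, Arrays.asList("cat1", "cat3", "cat4"));
        payloads.put(4, Arrays.asList("cat2", "cat3", "cat4"));

        CategoryGroupingSketch collector = new CategoryGroupingSketch(payloads);
        collector.collect(0, 0.5f);
        collector.collect(1, 0.9f);
        collector.collect(4, 0.7f);

        // M = 1 gives one document per category.
        System.out.println(collector.topPerCategory(1));
    }
}
```

With M = 1 this returns the single best-scoring document of each category, which is the "first result in the group" case. Sorting each group only at the end keeps collect() cheap; a bounded priority queue per category would be an alternative if the groups get large.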

If you know in advance which categories are requested to be grouped by, for
example I want a group by on categories 1, 3, 4 and 7, you can optimize the
solution further, but I'm not sure if that's what you requested.

Also, if you can translate category Strings to integers, you can store more
efficient payloads ...
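
As a rough illustration of the integer idea, the sketch below assigns each category String a small int and packs a document's category ints into a byte array using a variable-length encoding (7 bits per byte). The encoding choice, the class, and its method names are my assumptions for the example, not something prescribed above, and it assumes non-negative IDs.

```java
import java.io.ByteArrayOutputStream;
import java.util.*;

// Hypothetical helper: maps category Strings to ints and packs them into a payload.
class CategoryPayloadCodec {
    private final Map<String, Integer> ordinals = new HashMap<>();

    /** Returns the int assigned to a category, assigning the next free one if new. */
    int ordinalOf(String category) {
        Integer existing = ordinals.get(category);
        if (existing != null) {
            return existing;
        }
        int next = ordinals.size();
        ordinals.put(category, next);
        return next;
    }

    /** Encodes non-negative ints, 7 bits per byte; the high bit marks "more bytes follow". */
    static byte[] encode(int[] ids) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int id : ids) {
            int v = id;
            while ((v & ~0x7F) != 0) {
                out.write((v & 0x7F) | 0x80);
                v >>>= 7;
            }
            out.write(v);
        }
        return out.toByteArray();
    }

    /** Reverses encode(): rebuilds the ints from the packed bytes. */
    static int[] decode(byte[] bytes) {
        List<Integer> ids = new ArrayList<>();
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) != 0) {
                shift += 7;          // continuation bit set: keep accumulating
            } else {
                ids.add(value);      // last byte of this int
                value = 0;
                shift = 0;
            }
        }
        int[] result = new int[ids.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ids.get(i);
        }
        return result;
    }
}
```

Category IDs below 128 take a single byte each, so a document with three categories needs a three-byte payload instead of three Strings. The String-to-int mapping has to be kept somewhere persistent so the IDs can be translated back at search time.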

Shai

On Wed, Sep 2, 2009 at 2:38 PM, Ganesh <emailgane@yahoo.co.in> wrote:

> I have a field called category, and every document belongs to some
> category (say some belong to X and some to Y, etc.). The field values may
> change dynamically. I want the search results to be filtered to retrieve one
> document per category.
>
> This is similar to the 'group by' feature in a database.
>
> Regards
> Ganesh
>
>
> ----- Original Message -----
> From: "Shai Erera" <serera@gmail.com>
> To: <java-user@lucene.apache.org>
> Sent: Wednesday, September 02, 2009 4:33 PM
> Subject: Re: First result in the group
>
>
> > What do you mean by "first result in the group"? What is a group?
> >
> > On Wed, Sep 2, 2009 at 1:36 PM, Ganesh <emailgane@yahoo.co.in> wrote:
> >
> >> Hello all,
> >>
> >> I want to retrieve the first result in the group. How to achieve this?
> >> Currently I am parsing all the results, using a hash and avoiding
> >> duplicate entries.
> >>
> >> Is there any better way?
> >>
> >> Regards
> >> Ganesh
> >> Send instant messages to your online friends http://in.messenger.yahoo.com
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
