lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-1421) Ability to group search results by field
Date Wed, 11 May 2011 22:46:47 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032145#comment-13032145 ]

Michael McCandless commented on LUCENE-1421:
--------------------------------------------

{quote}
I think we also need a strategy mechanism (or at least a GroupCollector class hierarchy)
inside this module. The mechanism should select the right group collector(s) for a given
request. Some users may only care about the top group document, so a second pass won't be
necessary. Another example, with faceting in mind: when group-based faceting is necessary,
the top N groups don't suffice. You'll need all group docs (I currently don't see another
way). These group docs are then used to create a grouped Solr DocSet. But this should be
a completely different implementation.
{quote}

I agree, there's much more we could do here!  Specialized collection for the maxDocsPerGroup=1
case, and for the "I want all groups" case, would be nice.  For the "not many unique values
in the group field" case we could do a single-pass collector, I think.
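To make the single-pass idea concrete, here is a minimal sketch in plain Java. It is not code from the patch and does not use the Lucene Collector API; the class SinglePassGrouper, its Hit type, and the collect/topDocs methods are hypothetical names, and it assumes hits are ranked by score only. It keeps one small bounded heap per group value, so memory grows with (number of unique group values x maxDocsPerGroup), which is why the approach only fits a group field with few unique values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical sketch; not part of the Lucene grouping module. */
public class SinglePassGrouper {

    /** A scored hit: document id plus relevance score. */
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int maxDocsPerGroup;

    // One min-heap per group value; the weakest retained hit sits at the head.
    private final Map<String, PriorityQueue<Hit>> groups =
            new HashMap<String, PriorityQueue<Hit>>();

    public SinglePassGrouper(int maxDocsPerGroup) {
        this.maxDocsPerGroup = maxDocsPerGroup;
    }

    /** Called once per matching document, in any order. */
    public void collect(String groupValue, int doc, float score) {
        PriorityQueue<Hit> queue = groups.get(groupValue);
        if (queue == null) {
            queue = new PriorityQueue<Hit>(
                    Comparator.comparingDouble((Hit h) -> h.score));
            groups.put(groupValue, queue);
        }
        queue.add(new Hit(doc, score));
        if (queue.size() > maxDocsPerGroup) {
            queue.poll(); // drop the lowest-scoring hit of this group
        }
    }

    /** Returns the retained hits for one group, best score first. */
    public List<Hit> topDocs(String groupValue) {
        PriorityQueue<Hit> queue = groups.get(groupValue);
        List<Hit> hits = new ArrayList<Hit>();
        if (queue != null) {
            hits.addAll(queue);
        }
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits;
    }

    /** Number of distinct group values seen so far. */
    public int groupCount() {
        return groups.size();
    }
}
```

With maxDocsPerGroup=1 each heap degenerates to "remember the best hit per group", which is the specialized case mentioned above.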

Grouping by a multi-valued field should be possible (we now have DocTermOrds in Lucene, but
it doesn't load the term byte[] data), as well as support for sharding, i.e., by merging top
groups and docs within each group (but I think we need an addition to the FieldComparator API
for this).
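As a rough illustration of the "merging top groups" step, here is a plain-Java sketch. It is an assumption-laden toy, not the eventual Lucene API: ShardGroupMerger, ShardGroup, and mergeTopGroups are hypothetical names, and groups are ranked purely by the score of their best document, which sidesteps the FieldComparator question raised above. Merging the docs within each selected group would be a separate second step and is not shown.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of merging per-shard top groups; score sort only. */
public class ShardGroupMerger {

    /** One group as reported by a shard: its value and its best doc's score. */
    public static final class ShardGroup {
        public final String groupValue;
        public final float topScore;

        public ShardGroup(String groupValue, float topScore) {
            this.groupValue = groupValue;
            this.topScore = topScore;
        }
    }

    /** Merges each shard's top groups and returns the overall top N groups. */
    public static List<ShardGroup> mergeTopGroups(
            List<List<ShardGroup>> perShard, int topN) {
        // Keep, for every group value, the best entry seen on any shard.
        Map<String, ShardGroup> best = new HashMap<String, ShardGroup>();
        for (List<ShardGroup> shard : perShard) {
            for (ShardGroup group : shard) {
                ShardGroup seen = best.get(group.groupValue);
                if (seen == null || group.topScore > seen.topScore) {
                    best.put(group.groupValue, group);
                }
            }
        }
        // Rank by score descending; break ties by group value for stable output.
        List<ShardGroup> merged = new ArrayList<ShardGroup>(best.values());
        merged.sort(Comparator.comparingDouble((ShardGroup g) -> g.topScore)
                .reversed()
                .thenComparing(g -> g.groupValue));
        return new ArrayList<ShardGroup>(
                merged.subList(0, Math.min(topN, merged.size())));
    }
}
```

The same group value can appear on several shards, so the merge has to deduplicate by group value before truncating to the top N.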

I think we should commit this starting point, today, and then iterate from there...

Martijn, thank you for persisting for so long on SOLR-236!  We are
finally getting grouping functionality accessible from Lucene and
Solr...


> Ability to group search results by field
> ----------------------------------------
>
>                 Key: LUCENE-1421
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1421
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>            Reporter: Artyom Sokolov
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-1421.patch, LUCENE-1421.patch, lucene-grouping.patch
>
>
> It would be awesome to group search results by a specified field. Some functionality was
provided for Apache Solr, but I think it should be done in core Lucene. There could be some
useful information about the collapsed data, like the total hit count and so on.
> Thanks,
> Artyom

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


