lucene-dev mailing list archives

From "Martijn van Groningen (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (LUCENE-3482) Refactor grouping module to be more maintainable
Date Sun, 02 Oct 2011 20:07:33 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119052#comment-13119052 ]

Martijn van Groningen edited comment on LUCENE-3482 at 10/2/11 8:06 PM:
------------------------------------------------------------------------

Attached initial patch. The patch introduces a concept named GroupHolder that knows how to
efficiently collect groups. All grouping collectors use that concept to avoid subclassing.

I didn't change the existing collectors in the grouping module. Instead, I added a research
package inside the grouping package and kept the original collectors around, so that their
performance can be compared with the collectors that use the GroupHolder.

The grouping module now depends on the queries module. I added a method to DocValues to retrieve
the ord for a value:
{code}public int ord(MutableValue value) { throw new UnsupportedOperationException(); }{code}
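
A minimal sketch of how a term-backed source might override that method. Everything except the ord(MutableValue) signature is an assumption for illustration: the MutableValue, MutableValueStr and DocValues classes below are simplified stand-ins rather than the real Lucene classes, SortedTermDocValues is a hypothetical name, and returning -1 for an unknown value is a choice made for this sketch, not something the patch specifies.

```java
import java.util.Arrays;

// Simplified stand-in for Lucene's MutableValue; only what this sketch needs.
abstract class MutableValue {
    abstract Object toObject();
}

// Simplified stand-in for a string-valued MutableValue.
final class MutableValueStr extends MutableValue {
    private final String value;
    MutableValueStr(String value) { this.value = value; }
    @Override Object toObject() { return value; }
}

// Stand-in for DocValues carrying the method quoted in the comment.
abstract class DocValues {
    // Default behaviour: sources without ordinals do not support the lookup.
    public int ord(MutableValue value) {
        throw new UnsupportedOperationException();
    }
}

// Hypothetical term-backed implementation: the ord is the position of the
// term in the sorted term list.
final class SortedTermDocValues extends DocValues {
    private final String[] sortedTerms;

    SortedTermDocValues(String[] terms) {
        this.sortedTerms = terms.clone();
        Arrays.sort(this.sortedTerms);
    }

    @Override
    public int ord(MutableValue value) {
        int idx = Arrays.binarySearch(sortedTerms, (String) value.toObject());
        // Assumption of this sketch: -1 signals "value not present".
        return idx >= 0 ? idx : -1;
    }
}
```

With the terms "pear", "apple" and "fig", the sorted order is apple, fig, pear, so the ord of "fig" is 1. A source that does not override the method keeps throwing UnsupportedOperationException.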

The attached patch also supports grouping by bytes IndexDocValues. We already have NumericIndexDocValueSource;
maybe it can be merged with ByteRefIndexDocDV (included in this patch) into a single IndexDocValueSource?
                
> Refactor grouping module to be more maintainable
> ------------------------------------------------
>
>                 Key: LUCENE-3482
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3482
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>    Affects Versions: 4.0
>            Reporter: Martijn van Groningen
>             Fix For: 4.0
>
>         Attachments: LUCENE-3482.patch
>
>
> Currently we have 4 types of grouping collectors and 8 concrete subclasses in Lucene / Solr.
> In the current architecture, two concrete subclasses need to be created for each type of
> collector: an implementation optimized for single term based groups, and a more general
> implementation that works with MutableValue to also support grouping by functions. If we
> want, for example, to group by IndexDocValues, each type of grouping collector needs to
> have three concrete subclasses. This design isn't very maintainable.
> I think it is best to introduce a concept that knows how to deal with groups for all the
> different sources. Therefore the grouping module should depend on the queries module, so
> that grouping can reuse the ValueSource concept. A term based concrete impl. of this concept
> knows, for example, to use the DocValues.ord() method, while a more generic concrete impl.
> will use DocValues.ValueFiller.
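
The idea in the description (one concept that resolves groups for every source, so collectors need no per-source subclasses) could look roughly like the sketch below. It is not the code from the attached patch: the slotFor method, the OrdGroupHolder and ValueGroupHolder classes, and GroupCountCollector are hypothetical names, and plain arrays stand in for the per-document values that DocValues.ord() or DocValues.ValueFiller would supply.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract: map a document to a dense slot for its group.
interface GroupHolder {
    int slotFor(int doc);
}

// Term based variant: every document already has an ordinal, so the ord is the slot.
final class OrdGroupHolder implements GroupHolder {
    private final int[] docToOrd;
    OrdGroupHolder(int[] docToOrd) { this.docToOrd = docToOrd; }
    @Override public int slotFor(int doc) { return docToOrd[doc]; }
}

// Generic variant: arbitrary values (for example function results) are
// assigned a slot the first time they are seen.
final class ValueGroupHolder implements GroupHolder {
    private final Object[] docToValue;
    private final Map<Object, Integer> slots = new HashMap<>();
    ValueGroupHolder(Object[] docToValue) { this.docToValue = docToValue; }
    @Override public int slotFor(int doc) {
        Integer slot = slots.get(docToValue[doc]);
        if (slot == null) {
            slot = slots.size();
            slots.put(docToValue[doc], slot);
        }
        return slot;
    }
}

// A single collector that counts documents per group. It works with any
// GroupHolder, so no per-source subclass is needed.
final class GroupCountCollector {
    private final GroupHolder holder;
    private int[] counts = new int[4];

    GroupCountCollector(GroupHolder holder) { this.holder = holder; }

    void collect(int doc) {
        int slot = holder.slotFor(doc);
        if (slot >= counts.length) {
            counts = Arrays.copyOf(counts, Math.max(slot + 1, counts.length * 2));
        }
        counts[slot]++;
    }

    int count(int slot) { return slot < counts.length ? counts[slot] : 0; }
}
```

Adding a third source, such as IndexDocValues, would then mean writing one more GroupHolder implementation instead of one more subclass per collector type.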

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

