lucene-dev mailing list archives

From "Yonik Seeley (Commented) (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3482) Refactor grouping module to be more maintainable
Date Sun, 02 Oct 2011 21:25:33 GMT


Yonik Seeley commented on LUCENE-3482:

With respect to some modules depending on other modules, I'm not sure where I stand... some
have the opinion that the modules should be independent.

What caught my attention in this issue though is that the refactoring seems extensive enough
that we should do performance testing of all the current cases to ensure there are no regressions.
 Performance is the most important factor here, and I'm not sure if trying to introduce more
layers of abstraction (when it's really just an implementation detail) is worth it.

The addition of remove() to SentinelIntSet also seems erroneous:
+  public void remove(int key) {
+    int s = find(key);
+    if (s >= 0) {
+      count--;
+      keys[s] = emptyVal;
+    }
+  }
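That won't work, as it will foil future lookups for some keys: the set uses open addressing,
so resetting a slot to emptyVal cuts the probe sequence, and any key that collided and was
stored past that slot can no longer be found.  If we need a Set that supports removal, it
should probably be implemented in a different class.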
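To illustrate the lookup problem, here is a minimal sketch. It assumes a linear-probing set
with a sentinel value for empty slots. TinyIntSet, naiveRemove and the slot function
(key & mask) are made up for this example and are not the actual SentinelIntSet code; only
the body of naiveRemove mirrors the patch.

```java
import java.util.Arrays;

/** Simplified open-addressing int set (hypothetical, NOT the Lucene class). */
final class TinyIntSet {
    private final int[] keys;
    private final int emptyVal;
    private int count;

    /** capacity must be a power of two; the sketch never resizes, so keep it under-filled. */
    TinyIntSet(int capacity, int emptyVal) {
        this.keys = new int[capacity];
        this.emptyVal = emptyVal;
        Arrays.fill(keys, emptyVal);
    }

    /** Returns the slot holding key, or -(slot)-1 for the first empty slot on its probe path. */
    int find(int key) {
        int mask = keys.length - 1;
        int s = key & mask;                 // assumed slot function, for illustration only
        while (keys[s] != emptyVal) {       // probing stops at the first empty slot
            if (keys[s] == key) return s;
            s = (s + 1) & mask;             // linear probing
        }
        return -s - 1;
    }

    boolean exists(int key) { return find(key) >= 0; }

    int size() { return count; }

    void put(int key) {
        int s = find(key);
        if (s < 0) {
            keys[-s - 1] = key;
            count++;
        }
    }

    /** Same logic as the remove() in the patch: just blank the slot. */
    void naiveRemove(int key) {
        int s = find(key);
        if (s >= 0) {
            count--;
            keys[s] = emptyVal;
        }
    }

    public static void main(String[] args) {
        TinyIntSet set = new TinyIntSet(8, -1);
        set.put(1);            // lands in slot 1
        set.put(9);            // 9 & 7 == 1, collides, probes on to slot 2
        set.naiveRemove(1);    // slot 1 becomes empty again
        // 9 is still stored in slot 2, but the probe for it stops at slot 1
        System.out.println(set.exists(9)); // prints false
    }
}
```

A set that supports removal would typically need a separate "deleted" marker for such slots,
or would have to re-insert the keys that follow the removed one. Either way the lookup and
insert paths change, which fits the suggestion to do it in a different class.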
> Refactor grouping module to be more maintainable
> ------------------------------------------------
>                 Key: LUCENE-3482
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>    Affects Versions: 4.0
>            Reporter: Martijn van Groningen
>             Fix For: 4.0
>         Attachments: LUCENE-3482.patch
> Currently we have 4 types of grouping collectors and 8 concrete subclasses in Lucene / Solr.
> In the current architecture, two concrete subclasses need to be created for each type of
> collector: an implementation optimized for single term based groups, and a more general
> implementation that works with MutableValue to also support grouping by functions. If we
> want, for example, to group by IndexDocValues, each type of grouping collector needs to have
> three concrete subclasses. This design isn't very maintainable.
> I think it is best to introduce a concept that knows how to deal with groups for all the
> different sources. Therefore the grouping module should depend on the queries module, so
> that grouping can reuse the ValueSource concept. A term based concrete impl. of this concept
> knows, for example, to use the DocValues.ord() method. Or a more generic concrete impl. will use
