asterixdb-notifications mailing list archives

From "Wenhai Li (Code Review)" <>
Subject Change in asterixdb[master]: RangeGenerator aggfunc for the numeric/asciiString datatype ...
Date Mon, 07 Nov 2016 16:27:03 GMT
Wenhai Li has posted comments on this change.

Change subject: RangeGenerator aggfunc for the numeric/asciiString datatype based on parallel
streaming histogram.

Patch Set 35:


Hi, Yingyi and Preston.

I didn't know you couldn't see the comments without publishing them. :( Hope it's not too late.
File asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg_double/rg_double.3.query.aql:

Line 20: set partitions '2'
> can "partitions" be an argument of the function?
File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/range/

Line 38:         SHORT,
> Those types are not AsterixDB types.  

Line 52:     public void appendPair(E item, int count, boolean soft) throws HyracksDataException;
> What's the difference between addPair and appendPair?  Do we need both at a
addPair is kept for extensibility. Currently, appendPair is called when the local
histograms are merged one after another into the global one.

Line 54:     public List<Entry<E, Integer>> generate(boolean isGlobal) throws
> Must there be an final step to call generate? Can generate be called multip
For a local histogram, it is more like a "get" without merging, while for the global one it
needs an inner merge. In both cases the result is generated once and can be served multiple times.
File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/range/

Line 29:     public GenericQuantile<K, V> get(int index);
> Can you add some high-level annotation to describe what each method means?
File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/range/structures/

Line 34:     private List<GenericQuantile<Double, Integer>> bins;
> Use a fixed size array?
A fixed-size array can hardly handle frequent inserts.

Line 38:     protected DominantQuantile<Number> heapHead;
> Use Java's PriorityQueue?
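
For reference, the PriorityQueue suggestion could look roughly like the sketch below. The Bin class, its fields, and the ordering by count are hypothetical and only illustrate java.util.PriorityQueue. They are not taken from the patch, and what the heap in the patch actually orders by is not shown in this thread.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class HeapSketch {
    // Hypothetical stand-in for a histogram bin: a key and its count.
    static final class Bin {
        final double key;
        final int count;

        Bin(double key, int count) {
            this.key = key;
            this.count = count;
        }
    }

    // Returns the key of the bin with the smallest count (assumes at least one bin).
    static double keyOfSmallestBin(double[] keys, int[] counts) {
        // PriorityQueue is a min-heap with respect to the given comparator.
        PriorityQueue<Bin> heap =
                new PriorityQueue<>(Comparator.comparingInt((Bin b) -> b.count));
        for (int i = 0; i < keys.length; i++) {
            heap.add(new Bin(keys[i], counts[i]));
        }
        return heap.peek().key;
    }
}
```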

Line 40:     protected Random prng;
> prng is not used in this abstract class?
It is used in the derived class. Should I keep it here or move it there?

Line 46:         prng = new Random(31183);
> Why is the random seed a fixed number?
It may be a trick, but it is well accepted in histogram implementations.
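
One practical effect of a fixed seed is reproducibility: java.util.Random produces the same sequence for the same seed. The sketch below only illustrates that property; the draw helper and the bound of 100 are hypothetical and do not come from the patch.

```java
import java.util.Random;

class SeedSketch {
    // Draws n pseudo-random ints in [0, 100) from a generator created with the given seed.
    static int[] draw(long seed, int n) {
        Random prng = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = prng.nextInt(100);
        }
        return out;
    }
}
```

Two calls such as draw(31183L, 5) return identical arrays, which makes runs repeatable in tests.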

Line 60:                 } else {
> Use JDK library binarySearch?
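
A minimal sketch of the JDK call the reviewer is likely referring to, assuming the keys are held in a sorted double[] (the patch keeps its bins in a List, for which Collections.binarySearch is the counterpart). The insertionIndex helper is hypothetical.

```java
import java.util.Arrays;

class BinarySearchSketch {
    // Returns the index at which key should be inserted to keep sortedKeys sorted.
    static int insertionIndex(double[] sortedKeys, double key) {
        int pos = Arrays.binarySearch(sortedKeys, key);
        // When the key is absent, the result is -(insertionPoint) - 1.
        return pos >= 0 ? pos : -(pos + 1);
    }
}
```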

Line 102:     public void add(Double key, Integer value) {
> Double->double?
This is required by the generic (template) type parameters.
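
The constraint behind this answer is that Java type parameters must be reference types, so a class implementing a generic interface with K = Double has to declare Double in its method signatures. The sketch below illustrates this with a hypothetical Histogram interface; it is not the interface from the patch.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingSketch {
    // Hypothetical generic interface, loosely modeled on a histogram's add method.
    interface Histogram<K, V> {
        void add(K key, V value);
    }

    // K and V cannot be primitives, so the override must use Double and Integer.
    static final class DoubleHistogram implements Histogram<Double, Integer> {
        final List<Double> keys = new ArrayList<>();
        int total = 0;

        @Override
        public void add(Double key, Integer value) {
            keys.add(key);
            total += value; // auto-unboxing of Integer to int
        }
    }
}
```

Callers can still pass primitives, for example h.add(1.5, 2), because the compiler boxes the arguments.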

Line 120:     public void add(Double key) throws HyracksDataException {
> Double->double?
This is required by the generic (template) type parameters.

Line 125:         else if (bins.size() < cardinality) {
> is the code formatted correctly? "else if" should be appended to the previo

Line 175:     public void setKey(int index, Double key) {
> Double->double?
This is required by the generic (template) type parameters.
File hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/range/structures/

Line 70:         return super.hashCode();
> This doesn't seem right. hashCode implementation should be consistent with 
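
The reviewer's comment is cut off in this archive, so what hashCode should be consistent with is not stated. The general Java contract is that objects that are equal under equals must return the same hashCode. A minimal sketch of that contract follows, with a hypothetical class and fields that are not taken from the patch.

```java
import java.util.Objects;

class QuantileKeySketch {
    final double key;
    final int count;

    QuantileKeySketch(double key, int count) {
        this.key = key;
        this.count = count;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof QuantileKeySketch)) {
            return false;
        }
        QuantileKeySketch other = (QuantileKeySketch) o;
        return Double.compare(key, other.key) == 0 && count == other.count;
    }

    // Hash the same fields that equals compares, so equal objects share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(key, count);
    }
}
```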


Gerrit-MessageType: comment
Gerrit-Change-Id: I450d0962fbeacfb2b6ab9fae0750f025ef17ba01
Gerrit-PatchSet: 35
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Wenhai Li <>
Gerrit-Reviewer: Jenkins <>
Gerrit-Reviewer: Jianfeng Jia <>
Gerrit-Reviewer: Michael Blow <>
Gerrit-Reviewer: Preston Carman <>
Gerrit-Reviewer: Till Westmann <>
Gerrit-Reviewer: Wenhai Li <>
Gerrit-Reviewer: Yingyi Bu <>
Gerrit-HasComments: Yes
