asterixdb-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jianfeng Jia (Code Review)" <do-not-re...@asterixdb.incubator.apache.org>
Subject Change in hyracks[master]: Implemented the memory-bounded HashGroupby and HashJoin for ...
Date Sat, 16 Jan 2016 06:16:27 GMT
Jianfeng Jia has posted comments on this change.

Change subject: Implemented the memory-bounded HashGroupby and HashJoin for BigObject
......................................................................


Patch Set 7:

(11 comments)

@Yingyi, I'd like to address the IFrame related changes to a separate patch. Otherwise this
one will become too big.

https://asterix-gerrit.ics.uci.edu/#/c/398/7/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java
File algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/ExternalGroupByPOperator.java:

Line 256:                 true);
> Do we always need to sort the output?
No. We don't need to sort if using ExternalHashGroupby operator. Did you mean that "true"?
That "true" is about "isFinalStage".


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/comm/FixedSizeFrame.java
File hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/comm/FixedSizeFrame.java:

Line 43:     public ByteBuffer getBuffer() {
> What is the user-level abstraction after this change?
I've updated all the IFrame to the IFrameReader. Haven't done it for the IFrameWriter. I think
it should be a separate patch, because it will be a big change again.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAppenderWrapper.java
File hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/FrameTupleAppenderWrapper.java:

Line 45:     public IFrameTupleAppender getFrameTupleAppender() {
> This method is never called by others. 
Done


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AbstractTupleBufferAccessor.java
File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/AbstractTupleBufferAccessor.java:

Line 132:         throw new IllegalAccessError("Should never call this reset");
> Again, about the abstraction, should the user-code, e.g., operators and fun
I will address this as a separate CR.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/DeletableFramePool.java
File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/DeletableFramePool.java:

Line 54:     public ByteBuffer allocateFrame(int frameSize) throws HyracksDataException {
> Should this return IFrame, if IFrame is the abstraction that hides all the 
Will address this issue in another CR.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/FrameFreeSlotLastFit.java
File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/FrameFreeSlotLastFit.java:

Line 45:         frameSpaces = new FrameSpace[initial_frame_number];
> variable name should be camel case.
Done


Line 55:         for (int i = size - 1; i >= 0; i--) {
> Can that be a binary search, or based on a SortedMap?
This why it is called "lastFit". BinarySearch is introduced in "bestFit" code.
Actually, based on my experiment the lastFit is the best. BinarySearch will introduce more
overhead. Especially when we assume most of the case is the normal object case, the size of
each Frame is equal, then picking the last one is the most efficient way.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java
File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IDeletableFramePool.java:

Line 26:     void deAllocateBuffer(ByteBuffer buffer);
> Should that be
Will cover IFrame related changes in a separate patch.


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IFramePool.java
File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IFramePool.java:

Line 40:     ByteBuffer allocateFrame(int frameSize) throws HyracksDataException;
> Should this return IFrame?
Will cover the IFrame related change in a separate patch


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IPartitionedTupleBufferManager.java
File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/IPartitionedTupleBufferManager.java:

Line 61:     void flushPartition(int pid, IFrameWriter writer) throws HyracksDataException;
> Please add a comment here about what this function does. How does it differ
Done


https://asterix-gerrit.ics.uci.edu/#/c/398/7/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/NonSerializableHashTable.java
File hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/structures/NonSerializableHashTable.java:

Line 24: public class NonSerializableHashTable implements IBatchDeleteTable<TuplePointer>
{
> Why do we switch to this NoSerializable implementation for ExternalGroupBy?
removed


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/398
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I248f3a374fdacad7d57e49cf18d8233745e55460
Gerrit-PatchSet: 7
Gerrit-Project: hyracks
Gerrit-Branch: master
Gerrit-Owner: Jianfeng Jia <jianfeng.jia@gmail.com>
Gerrit-Reviewer: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Jianfeng Jia <jianfeng.jia@gmail.com>
Gerrit-Reviewer: Pouria Pirzadeh <pouria.pirzadeh@gmail.com>
Gerrit-Reviewer: Preston Carman <prestonc@apache.org>
Gerrit-Reviewer: Till Westmann <tillw@apache.org>
Gerrit-Reviewer: Yingyi Bu <buyingyi@gmail.com>
Gerrit-HasComments: Yes

Mime
View raw message