impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tianyi Wang (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5976: Remove equivalence class computation in FE
Date Fri, 03 Nov 2017 01:36:08 GMT
Tianyi Wang has posted comments on this change. ( )

Change subject: IMPALA-5976: Remove equivalence class computation in FE

Patch Set 4:

File fe/src/main/java/org/apache/impala/analysis/
PS3, Line 281: 
> Let's please avoid code changes unrelated to the purpose of this patch as m
File fe/src/main/java/org/apache/impala/analysis/
PS3, Line 91: /**
> We need a precise definition. The original definition was more precise.
I think If the predicate transfer is the starting point to consider whether there is a value
transfer between slots, it should be defined as so. The original sentence "is computed based
on..." is not really a definition to me.
PS3, Line 93:  *
> Not sure this part adds value. What's the significance of these statements 
Removed. I intended to use these statements to show that the symmetric closure is an equivalence
PS3, Line 277:     // Tracks all warnings (e.g. non-fatal errors) that were generated during
> The SCC-condensed graph representation of all slot value transfers.
PS3, Line 278:     // These are passed to the backend and eventually propagated to the shell.
Maps from
> public/private?
Done. private.
PS3, Line 1146:           conjunctIds.add(e.getId());
> a mutual value transfer
PS3, Line 1467:     List<TupleId> tids = Lists.newArrayList();
> how about: if every source slot has a value transfer to a slot in destTid
PS3, Line 1513:       boolean hasOuterJoinedTuple = false;
> Let's simplify the example and make it as clear as possible:
PS3, Line 1618:         // check if we already created this predicate
> Need a definition of equivalence class in the Analyzer class comment. You d
I think they are the same.
PS3, Line 1636:    * equivalent slot sets of lhsTids and rhsTids.
> Garbled sentence, please clean up
PS3, Line 1956:   public boolean isVisible(TupleId tid) {
> the registered value transfers
PS3, Line 1973:         new WritableGraph(globalState_.descTbl.getMaxSlotId().asInt() + 1);
> missing space in string
PS3, Line 1982:       RandomAccessibleGraph reference =
> Add value-transfer edges to 'g' based on the registered equi-join conjuncts
PS3, Line 2059:       if (tblRef.getJoinOp() == JoinOperator.LEFT_OUTER_JOIN
> of the given slot id
PS3, Line 2062:         g.addEdge(outerSlot.asInt(), innerSlot.asInt());
> slot -> sid
PS3, Line 2066:       }
> Unfortunate that we have to do this. Should probably clean this up later by
PS3, Line 2073:    * Time complexity: O(V) where V = number of slots
> Simpler to avoid "image set" and just use your second sentence to describe.
PS3, Line 2087:    * Time complexity: O(V) where V = number of slots
> Second sentence should be part of the equivalence class definition in the A
PS3, Line 2089:   public List<SlotId> getValueTransferTargets(SlotId srcSid) {
> We generally avoid these @param for brevity. I concede they might make sens
File fe/src/main/java/org/apache/impala/analysis/
PS3, Line 136:   public boolean localEquals(Expr that) { return true; }
> Missing implementation?
localEquals only checks members defined in this subclass. ArithmeticExprs' node-local information
is fn_, which is defined and compared in Expr.
File fe/src/main/java/org/apache/impala/analysis/
PS3, Line 687:     return Joiner.on(", ").join(strings);
> * Confusing name because we have a BinaryPredicate Expr and it's not clear 
PS3, Line 700:    * 3. For every pair of corresponding non-SlotRef exprs, localEquals returns
> I think similar() is confusing, how about matches() or equivalent()
PS3, Line 720:     if (fn_ == null || that.fn_ == null) return false; // One null, one not
> It returns whether this expr is equal to that ignoring children.
PS3, Line 721:     // Both fn_'s are not null
> Fix comment: Only one expr given to this function
PS3, Line 723:   }
> protected? Doesn't seem like a good public interface to force callers to ch
PS3, Line 728:    * Caller should guarantee that 'this' and 'that' have the same type.
> public? Move to the other predicates above. Also LOCAL_EQ_PREDICATE
Changed to SlotrefComparator and moved into SlotRef.
PS3, Line 784:     if (id_ == null) {
> update comment, should this function live in Analyzer instead?
Comment is updated and it's moved to Analyzer.
File fe/src/main/java/org/apache/impala/analysis/
PS3, Line 121:     if (v == 0 && 1.0 / v == Double.NEGATIVE_INFINITY) return false;
> Let's keep this patch focused on minimize unrelated changes like this one.
PS3, Line 182:       if (type_ == null) {
> Let's keep this patch focused on minimize unrelated changes like this one.
File fe/src/main/java/org/apache/impala/planner/
PS3, Line 522:     public boolean isCompatible(AnalyticExpr analyticExpr) {
> Please revert. This reformatted condition is pretty much incomprehensible t
File fe/src/main/java/org/apache/impala/planner/
PS3, Line 399:     boolean partitionedByRight = node.getJoinOp() == RIGHT_OUTER_JOIN ||
> I don't understand this change. What does the join op have to do with how t
This is the output partitioning of this fragment. If it is a right outer join then the output
is partitioned by the rhs join exprs, not lhs because the nulls in lhs can be in any partition.
The partitioning compatibility check doesn't use equivalence class anymore and is wrong without
File fe/src/main/java/org/apache/impala/planner/
PS3, Line 1377:       TableRef rhsTblRef = analyzer.getTableRef(rhsId);
> simplify with analyzer.getTupleId(lhsSid)
File fe/src/main/java/org/apache/impala/util/
PS3, Line 31: 
> The flow of calls/code is somewhat hard to follow for our specific use case
Specialized most of them except for reflexiveTransitiveClosure(). It is called with different
types for validation purpose.
PS3, Line 40:       sb.append(i).append(" => ");
> mention that BFS is implemented iteratively to avoid unbounded stack usage
PS3, Line 43:       }
> This can be called on any type of Graph but we don't test that it actually 
It is called on RandomAccessibleGraph and WritableGraph. The former is called in condensedReflexiveTransitiveClosure
and the latter is called in Analyzer.computeValueTransferGraph to generate a reference for
graph validation. Maybe use a precondition check to exclude SccCondensedGraph?
PS3, Line 49:   /**
> How about moving this out of the loop and preallocating it to numVertices()
PS3, Line 52:    */
> This line in particular is confusing because dsts() is abstract and it's no
Yes, reflexiveTransitiveClosure.dsts() is expensive. But I think this is more of a problem
with the dsts interface. dsts() is not a good API because:
1. SccCondensedGraph.dsts() constructs a new array every time it's called.
2. The return value of SccCondensedGraph.dsts is owned by the caller. On the other hand the
return value of RandomAccessibleGraph.dsts() and WritableGraph.dsts() is read only. There
isn't a good way to enforce the read-only property.
3. RandomAccessibleGraph just needs an int[] rather than an IntArrayList. But dsts() forces
it to use the latter.

I changed dsts() interface to returning an iterator. In this way SccCondensedGraph.dsts()
doesn't need to "decompress" the result and the caller cannot modify the data in Graphs. Does
this solve this problem?
PS3, Line 53:   public RandomAccessibleGraph reflexiveTransitiveClosure() {
> style: ++j
PS3, Line 98:     public void addEdge(int srcVid, int dstVid) {
> Duplicate edges are allowed.
PS3, Line 128:       int idx = Arrays.binarySearch(adjList_[srcVid], dstVid);
> lists
I just checked Wikipedia and a lecture note. An adjacency list is a list of lists. According
to this, "sorted adjacency list" might be vague. I added some comments there.
PS3, Line 132: 
> Construct from a list of sorted adjacency lists.
Reworded a little. But an adjacency list is a list of lists.
PS3, Line 150:     }
> for the typical -> because the typical
PS3, Line 210:      * Get the ID of the strongly connected component containing 'vid'.
> What types of input Graphs are really supported/expected?
This function is merged into condensedReflexiveTransitiveClosure(). See below.
PS3, Line 222:     /**
> This takes a Graph as input. Does it really work on any type of Graph? I do
Changed to WritableGraph.
PS3, Line 223:      * Get an array of vertices IDs in the scc. The caller shouldn't modify
the returned
> Flow here is confusing because we create several intermediate SccCondensedG
PS3, Line 257:         // SCC as vid.
> Imo this doesn't really give the intuition for why the algorithm works.
Changed to a link.
PS3, Line 258:         int unAssignedStackPos;
> lowest -> smallest?
PS3, Line 268:       int nextDfsIndex = 0;
> What kind of graph is expected here?
WritableGraph. Done.
PS3, Line 271: u
> vertex ids
PS3, Line 273:           DfsContext ctx = dfsStack.peek();
> A mapping from vertex id to its DFS preordering index. -1 means not visited
PS3, Line 282:             // All successors have been searched. Check if this is the root
of an SCC.
> * final
PS3, Line 299:               // Tree edge. DFS this successor. ctx.dstIt is not advanced until
the DFS
> int nextDfsIndex = 0;
PS3, Line 313:     ;
> typo: been
PS3, Line 359:         while (refIt.hasNext() || condensedIt.hasNext()) {
> What kind of graph is expected here?
WritableGraph. Done.
File fe/src/main/java/org/apache/impala/util/
PS3, Line 26: public class IntArrayList {
> private, you can add a data() getter
PS3, Line 29: 
> public
PS3, Line 33:     data_ = new int[capacity];
> public
PS3, Line 44:       int[] newData = new int[Math.max(data_.length * 2, capacity)];
> style: remove spaces after and before paranthesis
PS3, Line 58:   /**
> remove "some"
PS3, Line 67: 
> single line where it fits; apply everywhere in this file
PS3, Line 115: 
> Do we really need this? If not, please remove
Removed. It was used to enable Collections.sort() in Graph.validate().
File fe/src/test/java/org/apache/impala/util/
PS3, Line 49:       fail();
> We should be sure to test all public functionality. As far as I can tell th

To view, visit
To unsubscribe, visit

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4cb1d8be46efa8fd61a97048cc79dabe2ffa51a
Gerrit-Change-Number: 8317
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Tianyi Wang <>
Gerrit-Comment-Date: Fri, 03 Nov 2017 01:36:08 +0000
Gerrit-HasComments: Yes

  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message