phoenix-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [phoenix] comnetwork commented on a change in pull request #444: PHOENIX-5148
Date Fri, 15 Mar 2019 13:33:28 GMT
comnetwork commented on a change in pull request #444: PHOENIX-5148
URL: https://github.com/apache/phoenix/pull/444#discussion_r265831803
 
 

 ##########
 File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinMoreIT.java
 ##########
 @@ -423,7 +424,7 @@ public void testBug2894() throws Exception {
                         "         ) L\n" +
                         "     ON L.BUCKET = E.BUCKET AND L.TIMESTAMP = E.TIMESTAMP\n" +
                         " ) C\n" +
-                        " GROUP BY C.BUCKET, C.TIMESTAMP";
+                        " GROUP BY C.BUCKET, C.TIMESTAMP ORDER BY C.BUCKET, C.TIMESTAMP";
 
 Review comment:
   I added the order by here mainly for:
   
   1.Make the query result stable,because the original sql is only have  group by clause,
and the group by is not `orderPreserving` before this patch, so must add a additional `SORTED
BY [E.TIMESTAMP, E.BUCKET]`, but after this patch, the group by is  `orderPreserving`, so
the query result is not same as without this patch.
   
   2.To test if the GroupBy and OrderBy could both be optimized by this patch, from the explain
we can see the 
   `GROUP BY C.BUCKET, C.TIMESTAMP` is  `orderPreserving` and 
   `ORDER BY C.BUCKET, C.TIMESTAMP` is compiled out, which are meeting the expectation.
   
   If I do not add the order by, the explain assertions and query result assertions are same
as the order by is added. 
   
   Another problem  I want to explain is  why the order of query result is reversed after
this patch:
   
   the query result before this patch is :
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993520000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993515000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993510000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993505000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993490000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993485000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993480000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993475000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993470000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993430000000000L, rs.getLong(2));
   
   the query result after this patch is :
                  assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993430000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993470000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993475000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993480000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993485000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993490000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993505000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993510000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993515000000000L, rs.getLong(2));
                   assertTrue(rs.next());
                   assertEquals("5SEC", rs.getString(1));
                   assertEquals(1462993520000000000L, rs.getLong(2));
   
   We can see that the query result order is reversed after the patch, that is because without
this patch , `CLIENT SORTED BY [E.TIMESTAMP, E.BUCKET]` (`E.TIMESTAMP` is moved first because
is fixed width and not null) is added in order to complete the aggregate, and we have to note
is  that the `SortOrder` of `E.TIMESTAMP` is `DESC` in table `eventCountTableName`'s DDL,
 meanwhile when we add `CLIENT SORTED BY [E.TIMESTAMP, E.BUCKET]` in `ClientAggregatePlan.iterator`,
the `isAscending` parameter in following line 159 is true, so after` OrderedResultIterator`
in line 167, the result tuple is sort by the "DESC of E.TIMESTAMP",  that is to say , `order
by E.TIMESTAMP DESC,E.BUCKET ASC`
   
   With this patch, the unnecessary sort  for completing GroupBy is avoid, after the SortMergeJoin,
the ordering is `order by E.BUCKET ASC, E.TIMESTAMP ASC`, so the order of query result is
reversed after this patch when the `E.BUCKET` is same.
    
   ```
   157              List<OrderByExpression> keyExpressionOrderBy = Lists.newArrayListWithExpectedSize(keyExpressions.size());
   158                for (Expression keyExpression : keyExpressions) {
   159                    keyExpressionOrderBy.add(new OrderByExpression(keyExpression, false,
true));
   160               }
   161
   162             if (useHashAgg) {
   163                   // Pass in orderBy to apply any sort that has been optimized away
   164                   aggResultIterator = new ClientHashAggregatingResultIterator(context,
iterator, serverAggregators, keyExpressions, orderBy);
   165                } else {
   166                    iterator =
   167                            new OrderedResultIterator(iterator, keyExpressionOrderBy,
   168                                    spoolingEnabled, thresholdBytes, null, null,
   169                                    projector.getEstimatedRowByteSize());
   170                    aggResultIterator = new ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator),
serverAggregators, keyExpressions);
   171                }
   172            }
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

Mime
View raw message