drill-dev mailing list archives

From "Sean Hsuan-Yi Chu" <hsua...@usc.edu>
Subject Re: Review Request 32248: DRILL-2139: Star is not expanded correctly in "select distinct" query
Date Thu, 19 Mar 2015 19:52:13 GMT


> On March 19, 2015, 6:35 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java, line 747
> > <https://reviews.apache.org/r/32248/diff/1/?file=900143#file900143line747>
> >
> >     Have you considered the case where * comes from a join?
> >     
> >     select distinct * from dept, emp where ...
> >     
> >     It seems the code you added will not work properly in such a case.

Hmm, I am not sure why that case would fail.

As soon as Aggr sees '*' in the plan, it knows it should group by all the columns.
This logic seems to me to hold for JOIN as well.

Besides, I tried some simple cases, such as:

String queryDistinct = String.format("select distinct * " +
              "from dfs.`%s` t1, dfs.`%s` t2 " +
              "where (t1.a1 = t2.a1);", root, root);
(t1 and t2 are simply .json files)

Drill gave the same result as Postgres.
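
To illustrate the reasoning only (this is plain Java, not Drill code, and not the patch under review), here is a minimal sketch of why grouping by all columns should behave the same over a join. The class `DistinctStarSketch`, its helper methods, and the column names `a1` and `b1` are hypothetical names made up for the example; it assumes an inner equi-join on `a1` and treats each whole joined row as the group key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class DistinctStarSketch {

  // Build a two-column row; the column names a1 and b1 are illustrative.
  static Map<String, Object> row(Object a1, Object b1) {
    Map<String, Object> m = new LinkedHashMap<>();
    m.put("a1", a1);
    m.put("b1", b1);
    return m;
  }

  // Nested-loop inner join on column a1. Columns are prefixed with t1./t2.
  // so that both sides survive in the joined row.
  static List<Map<String, Object>> join(List<Map<String, Object>> left,
                                        List<Map<String, Object>> right) {
    List<Map<String, Object>> out = new ArrayList<>();
    for (Map<String, Object> l : left) {
      for (Map<String, Object> r : right) {
        Object key = l.get("a1");
        if (key != null && key.equals(r.get("a1"))) {
          Map<String, Object> joined = new LinkedHashMap<>();
          l.forEach((k, v) -> joined.put("t1." + k, v));
          r.forEach((k, v) -> joined.put("t2." + k, v));
          out.add(joined);
        }
      }
    }
    return out;
  }

  // "select distinct *": group by every column, so the whole row is the key.
  // Map.equals compares all entries, which is exactly that grouping.
  static List<Map<String, Object>> distinctStar(List<Map<String, Object>> rows) {
    return new ArrayList<>(new LinkedHashSet<>(rows));
  }

  public static void main(String[] args) {
    List<Map<String, Object>> t = new ArrayList<>();
    t.add(row(1, "x"));
    t.add(row(1, "x")); // repeated row
    t.add(row(2, "y"));

    List<Map<String, Object>> joined = join(t, t);
    System.out.println("joined rows:   " + joined.size());
    System.out.println("distinct rows: " + distinctStar(joined).size());
  }
}
```

The point of the sketch is that `distinctStar` never needs to know whether its input came from a single table or from a join: it only sees rows and uses every column as the key.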


> On March 19, 2015, 6:35 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java, line 72
> > <https://reviews.apache.org/r/32248/diff/1/?file=900140#file900140line72>
> >
> >     Can you show example where you will expect to see "*" in the aggregate expression?
> >     For distinct * over schemaless table, I assume the * will appear in the GroupBy keys,
> >     not in the aggregated expressions. For count(*) as the agg expression, planner will
> >     replace * with a constant. So, I could not see why it is necessary to check if there
> >     is star in the aggregated expression.

Oh... I was not aware of the planner's smart substitution of * in that case.

As of now, I agree with you. But let me think more about it. Thanks.


- Sean Hsuan-Yi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32248/#review77076
-----------------------------------------------------------


On March 19, 2015, 6:31 p.m., Sean Hsuan-Yi Chu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32248/
> -----------------------------------------------------------
> 
> (Updated March 19, 2015, 6:31 p.m.)
> 
> 
> Review request for drill, Aman Sinha and Jinfeng Ni.
> 
> 
> Bugs: DRILL-2139
>     https://issues.apache.org/jira/browse/DRILL-2139
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Expand * at the run time
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java c29fbf2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java 33d2c7a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java a23780e 
>   exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java d2d97f8 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/agg/TestHashAggr.java 3786bfd 
>   exec/java-exec/src/test/resources/store/text/data/repeatedRows.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/32248/diff/
> 
> 
> Testing
> -------
> 
> Unit and all QA tests passed.
> 
> 
> Thanks,
> 
> Sean Hsuan-Yi Chu
> 
>

