drill-dev mailing list archives

From "Jinfeng Ni" <...@maprtech.com>
Subject Re: Review Request 30701: DRILL-2173 partition queries for dynamic partition pruning
Date Thu, 02 Apr 2015 17:50:27 GMT


> On April 1, 2015, 1 p.m., Jinfeng Ni wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/DirectoryExplorers.java, line 82
> > <https://reviews.apache.org/r/30701/diff/6/?file=906263#file906263line82>
> >
> >     Not sure whether case-insensitive comparison should be the default, whether it
> >     should depend on the schema (e.g., an HBase schema could use a different
> >     case-sensitivity policy from FileSystemSchema), or whether it should be passed in
> >     as a parameter of the maxdir() UDF. In the query
> >
> >     " select * 
> >       from dfs.my_workspace.data_directory 
> >       where dir0 in (select MAX(dir0) from dfs.my_workspace.data_directory)"
> >     
> >     the aggregate function max() could use case-sensitive string comparison. If the
> >     maxdir UDF uses case-insensitive comparison instead, the query might return
> >     different results after partition pruning.
> 
> Jason Altekruse wrote:
>     The primary use case we had in mind for this feature was actually just finding recent
>     data, so all of the partition names were numeric. For date formats arranged so that a
>     string comparison gives the correct result, e.g. YYYY-MM-DD or similar, case
>     sensitivity wouldn't matter. I think there are many ways users might want to query
>     their partition information, and it might be best to leave the interface open for
>     writing custom UDFs in these cases. I could pass a flag to this UDF, or write
>     another that does the same operation but case-sensitively.

Makes sense to either add a flag or provide another UDF that uses case-sensitive comparison
(similar to the "like" and "ilike" function implementations).
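
To make the concern concrete, here is a minimal sketch of how the two comparison policies
can pick different "maximum" directories. This is not the code in DirectoryExplorers.java;
the class MaxDirSketch, the helper maxDir(), and the directory names are hypothetical and
exist only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class MaxDirSketch {
    // Hypothetical helper (not part of Drill): returns the largest directory
    // name under either a case-sensitive or a case-insensitive ordering.
    static String maxDir(List<String> dirs, boolean caseSensitive) {
        Comparator<String> order = caseSensitive
            ? Comparator.<String>naturalOrder()   // compares UTF-16 code units
            : String.CASE_INSENSITIVE_ORDER;      // ignores case differences
        return Collections.max(dirs, order);
    }

    public static void main(String[] args) {
        // Made-up partition names with mixed case.
        List<String> mixed = Arrays.asList("Zeta", "alpha");
        System.out.println(maxDir(mixed, true));   // alpha ('Z' sorts before 'a')
        System.out.println(maxDir(mixed, false));  // Zeta

        // Date-like names: both policies agree.
        List<String> dates = Arrays.asList("2015-03-31", "2015-04-01");
        System.out.println(maxDir(dates, true));   // 2015-04-01
        System.out.println(maxDir(dates, false));  // 2015-04-01
    }
}
```

With mixed-case names the two policies disagree, which is the situation where the pruned
query could differ from the max() subquery. With numeric or date-like names they agree, as
Jason points out.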


- Jinfeng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30701/#review78562
-----------------------------------------------------------


On March 25, 2015, 5:54 p.m., Jason Altekruse wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30701/
> -----------------------------------------------------------
> 
> (Updated March 25, 2015, 5:54 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau, Mehant Baid, Parth Chandra, and Venki Korukanti.
> 
> 
> Bugs: DRILL-2173
>     https://issues.apache.org/jira/browse/DRILL-2173
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Adds a new interface for UDFs to access partition information. Together with 2060, which
> allows constant expression folding, this will allow UDFs to query partition information
> and then scan only a subset of the data. Example use case: find the most recent directory
> and scan only that partition's worth of data.
> 
> 
> Diffs
> -----
> 
>   contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSchemaFactory.java 7b76092
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/HiveSchemaFactory.java 023517b
>   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java 32c42ba
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionConverter.java ab121b0
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/DirectoryExplorers.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/interpreter/InterpreterEvaluator.java 35c35ec
>   exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 5e31e5c
>   exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java 3b51a69
>   exec/java-exec/src/main/java/org/apache/drill/exec/ops/UdfUtilities.java f7a1a04
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractSchema.java 90e3ef4
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractStoragePlugin.java b032fce
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/PartitionExplorer.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/PartitionExplorerImpl.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/PartitionNotFoundException.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/SchemaPartitionExplorer.java PRE-CREATION
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/SubSchemaWrapper.java 2c0d8b8
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemSchemaFactory.java 4a3eba9
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java 7c8d9b3
>   exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/TestConstantFolding.java PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/30701/diff/
> 
> 
> Testing
> -------
> 
> Tests have been run on a very recent version. I have made a few minor cleanup edits since
> then and am waiting on another run, but I do not anticipate issues.
> 
> 
> Thanks,
> 
> Jason Altekruse
> 
>

