drill-dev mailing list archives

From "Venki Korukanti" <venki.koruka...@gmail.com>
Subject Re: Review Request 38796: DRILL-3209: Support reading Hive tables using Drill's native parquet reader
Date Tue, 29 Sep 2015 16:23:50 GMT


> On Sept. 28, 2015, 10:18 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java, line 64
> > <https://reviews.apache.org/r/38796/diff/2/?file=1085484#file1085484line64>
> >
> >     The name of a function should not have a product name in it.

I am using the existing function in master. It is a convert function that converts data from
the Impala/Hive-specific format to the Drill format. Let me know if you want this changed; I
can log a separate JIRA to track it.


> On Sept. 28, 2015, 10:18 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetScan.java, line 63
> > <https://reviews.apache.org/r/38796/diff/2/?file=1085485#file1085485line63>
> >
> >     Since the RecordCount is the same regardless of the type of the reader, we should
> >     not divide it by the factor.  Dividing the cpu cost and disk cost seems ok.

If I understand correctly, we use only the rowcount when calculating the self cost
of the scan in ScanPrel.computeSelfCost, so we need to alter the rowcount here.
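
A minimal sketch of that reasoning, assuming (as the reply above does) that the scan's self cost is derived from the row count alone. The class, the method names, and the factor of 4 are hypothetical illustrations, not Drill's actual API or configured value:

```java
// Hypothetical sketch: none of these names are Drill classes or methods.
public class ScanCostSketch {
    /** Simplified stand-in for scan statistics. */
    public static final class Stats {
        public final double rowCount;
        public final double cpuCost;
        public final double diskCost;

        public Stats(double rowCount, double cpuCost, double diskCost) {
            this.rowCount = rowCount;
            this.cpuCost = cpuCost;
            this.diskCost = diskCost;
        }
    }

    /** Assumed planner behaviour: self cost depends on the row count only. */
    public static double selfCost(Stats s, int columnCount) {
        return s.rowCount * columnCount;
    }

    /** The reviewer's suggestion: divide CPU and disk cost, keep the row count. */
    public static Stats scaleCostsOnly(Stats s, double factor) {
        return new Stats(s.rowCount, s.cpuCost / factor, s.diskCost / factor);
    }

    /** The patch's approach: divide the row count as well. */
    public static Stats scaleAll(Stats s, double factor) {
        return new Stats(s.rowCount / factor, s.cpuCost / factor, s.diskCost / factor);
    }

    public static void main(String[] args) {
        Stats hive = new Stats(1000.0, 1000.0, 1000.0);
        double factor = 4.0; // hypothetical value
        System.out.println("Hive reader self cost:     " + selfCost(hive, 2));
        System.out.println("Native, costs scaled only: " + selfCost(scaleCostsOnly(hive, factor), 2));
        System.out.println("Native, all scaled:        " + selfCost(scaleAll(hive, factor), 2));
    }
}
```

Under that assumption, scaling only the CPU and disk cost leaves the self cost identical to the Hive reader's, so the planner would have no reason to prefer the native scan.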


> On Sept. 28, 2015, 10:18 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeScanBatchCreator.java, line 123
> > <https://reviews.apache.org/r/38796/diff/2/?file=1085487#file1085487line123>
> >
> >     Should there be a consideration for complex type data, similar to that of
> >     the ParquetScanBatchCreator?

Currently we don't support Hive complex types. I added a check in the rule so that we exit
early when the table contains them.
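
A rough sketch of what such an early-exit check could look like. The enum and method below are hypothetical stand-ins for illustration only; they are not the code in the patch or Hive's own type API:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names are illustrative, not taken from the patch.
public class ComplexTypeCheckSketch {
    /** Stand-in for a Hive column's type category; everything but PRIMITIVE is complex. */
    public enum Category { PRIMITIVE, LIST, MAP, STRUCT, UNION }

    /** Returns true only when every column is primitive, so the rule may convert the scan. */
    public static boolean canUseNativeReader(List<Category> columnCategories) {
        for (Category c : columnCategories) {
            if (c != Category.PRIMITIVE) {
                return false; // exit early: complex types are not supported
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canUseNativeReader(
            Arrays.asList(Category.PRIMITIVE, Category.PRIMITIVE)));
        System.out.println(canUseNativeReader(
            Arrays.asList(Category.PRIMITIVE, Category.MAP)));
    }
}
```

Placing the check in the planner rule rather than in the batch creator means an unsupported table keeps its original Hive scan instead of failing at execution time.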


- Venki


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38796/#review100878
-----------------------------------------------------------


On Sept. 29, 2015, 9:23 a.m., Venki Korukanti wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38796/
> -----------------------------------------------------------
> 
> (Updated Sept. 29, 2015, 9:23 a.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Please see JIRA DRILL-3209 for details.
> 
> 
> Diffs
> -----
> 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 11c6455
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java PRE-CREATION
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetScan.java PRE-CREATION
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetSubScan.java PRE-CREATION
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeScanBatchCreator.java PRE-CREATION
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScan.java 9ada569
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java 23aa37f
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveSubScan.java 2181c2a
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/DrillHiveTable.java b459ee4
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java f0b4bdc
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHiveProjectPushDown.java 6423a36
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestHiveStorage.java 9211af6
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestInfoSchemaOnHiveStorage.java 6118be5
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/store/hive/HiveTestDataGenerator.java 34a7ed6
>   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 66f9f03
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 5838bd1
> 
> Diff: https://reviews.apache.org/r/38796/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests covering reads of all supported types, project pushdown, and partition pruning.
> Manually tested with Hive tables containing large amounts of data (these tests will become
> part of the regression suite).
> 
> 
> Thanks,
> 
> Venki Korukanti
> 
>

