hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From j.prasant...@gmail.com
Subject Re: Review Request 62098: HIVE-17403: Fail concatenation for unmanaged and transactional tables
Date Wed, 06 Sep 2017 00:26:05 GMT


> On Sept. 6, 2017, 12:08 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java
> > Line 231 (original), 256 (patched)
> > <https://reviews.apache.org/r/62098/diff/1/?file=1815918#file1815918line256>
> >
> >     hmm... is this necessary?
> >     1) how does this interact with Hive duplicate file detection that can potentially
happen later?
> >     2) on the cloud, renaming files is very slow (one of the main reasons for MM
tables). We should not rename unless it's really needed.

Yes. This is necessary. If there are no incompatible files then this will essentially be directory
rename. If there are incompatible files and if the filename does not match Hive's convention,
then this has to go through file-by-file renaming to staging directory. On the cloud (or on-prem),
this will only affect users who already have managed or unmanaged hive tables with files externally
loaded (via copy or load data). In which case, renaming has to be done to avoid data loss
(more info in the jira). Also this patch disables concatenation on external/unmanaged tables
so it won't affect users with this patch. 

Hive's duplicate file detection, does not delete files if it has _copy_ suffix (which this
patch introduces on conflicts).


- Prasanth_J


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62098/#review184607
-----------------------------------------------------------


On Sept. 5, 2017, 9:44 p.m., Prasanth_J wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62098/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2017, 9:44 p.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Bugs: HIVE-17403
>     https://issues.apache.org/jira/browse/HIVE-17403
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-17403: Fail concatenation for unmanaged and transactional tables
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java b3ef9169c25c36c3d6c845f5000874fa78e51f82

>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java dfad6c192947c9ac80a1bbd86665f46aab128453

>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java aca99f2d833822a44f373ded4257af3589707baa

>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java feacdd8b605eb75166155fa2e7a1692ad4d52bd0

>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 230ca47e4a667b297cf2a6fef90dd18cf3d1a1c3

>   ql/src/test/queries/clientnegative/merge_negative_4.q PRE-CREATION 
>   ql/src/test/queries/clientnegative/merge_negative_5.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/orc_merge13.q PRE-CREATION 
>   ql/src/test/results/clientnegative/merge_negative_3.q.out 906336d4d3ea77ca3174a58fad05668d569f7492

>   ql/src/test/results/clientnegative/merge_negative_4.q.out PRE-CREATION 
>   ql/src/test/results/clientnegative/merge_negative_5.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/orc_merge13.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62098/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message