hive-dev mailing list archives

From Sahil Takiar <takiar.sa...@gmail.com>
Subject Re: Review Request 54451: HIVE-15367: CTAS with LOCATION should write temp data under location directory rather than database location
Date Wed, 07 Dec 2016 22:35:05 GMT


> On Dec. 7, 2016, 6:18 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 2140
> > <https://reviews.apache.org/r/54451/diff/2/?file=1578404#file1578404line2140>
> >
> >     Shouldn't we use the parent directory instead of the location? I think CTAS
> >     will do a rename() from the temp -> location once it finishes (test blobstores that
> >     specify only s3a://bucket/path or s3a://bucket).
> >     
> >     In this patch, the location will have a staging temp directory that when renamed
> >     will rename file per file because the staging is a subdirectory of the location.

Ever since the introduction of the variable `hive.exec.stagingdir`, the staging directory has
always been created inside the table location. Yes, the rename is done file by file, but I think
Hive does that intentionally in order to support HDFS encryption. I think it has some other
benefits too, like making permissions easier to deal with. HIVE-15215 is an open JIRA to see
if this behavior is necessary on blobstores. But until then, this is how Hive does things
(at least that's my understanding).
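
To make the behavior described above concrete, here is a minimal sketch on a local filesystem. It is not Hive code: the function `write_via_staging` and the `.hive-staging` prefix are assumptions made up for illustration. It only shows why a staging directory that lives inside the target location ends up being moved file by file, since a directory cannot be renamed onto its own parent.

```python
import os
import tempfile


def write_via_staging(location, files, staging_prefix=".hive-staging"):
    """Write files into a staging dir under `location`, then move them one by one.

    Hypothetical helper for illustration only; it is not part of Hive.
    """
    os.makedirs(location, exist_ok=True)
    # The staging directory is created inside the table location itself.
    staging = tempfile.mkdtemp(prefix=staging_prefix + "_", dir=location)
    for name, data in files.items():
        with open(os.path.join(staging, name), "w") as f:
            f.write(data)
    # The staging dir is a child of `location`, so it cannot simply be renamed
    # to `location`; each file is moved up individually instead.
    moved = []
    for name in sorted(os.listdir(staging)):
        os.rename(os.path.join(staging, name), os.path.join(location, name))
        moved.append(name)
    os.rmdir(staging)
    return moved
```

On HDFS each per-file rename is a cheap metadata operation, whereas on a blobstore each one is typically a copy, which is the cost the reviewer's question is about.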


- Sahil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54451/#review158356
-----------------------------------------------------------


On Dec. 7, 2016, 10:29 p.m., Sahil Takiar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54451/
> -----------------------------------------------------------
> 
> (Updated Dec. 7, 2016, 10:29 p.m.)
> 
> 
> Review request for hive, Sergio Pena and Yongzhi Chen.
> 
> 
> Bugs: HIVE-15367
>     https://issues.apache.org/jira/browse/HIVE-15367
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> CTAS with LOCATION should write temp data under location directory rather than database
> location
> 
> 
> Diffs
> -----
> 
>   itests/hive-blobstore/src/test/queries/clientpositive/ctas.q PRE-CREATION 
>   itests/hive-blobstore/src/test/results/clientpositive/ctas.q.out PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c88dbc8 
>   ql/src/test/queries/clientpositive/ctas_uses_table_location.q PRE-CREATION 
>   ql/src/test/results/clientpositive/ctas_uses_table_location.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/encrypted/encryption_ctas.q.out 5b503ac 
> 
> Diff: https://reviews.apache.org/r/54451/diff/
> 
> 
> Testing
> -------
> 
> Added qtests for hive-blobstore and for qtest
> 
> 
> Thanks,
> 
> Sahil Takiar
> 
>

