hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ashutosh Chauhan" <hashut...@apache.org>
Subject Re: Review Request 12050: HIVE-3756 (LOAD DATA does not honor permission inheritance)
Date Wed, 03 Jul 2013 01:46:16 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12050/#review22698
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
<https://reviews.apache.org/r/12050/#comment46400>

    I had quite a discussion on this with Rohini on HIVE-2936 on how to do these fs ops in
a performant and robust way. Feel free to follow that.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
<https://reviews.apache.org/r/12050/#comment46401>

    Why not use api instead of FsShell? Does filesystem doesn't offer an api for recursively
doing chmod and chgrp?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
<https://reviews.apache.org/r/12050/#comment46398>

    I think following way to write this is more robust:
    if(inheritPerms) {
    fs.mkdirs(destfp, destfp.getParent().getPerms);
    } else {
    fs.mkdirs(destfp);
    }



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
<https://reviews.apache.org/r/12050/#comment46399>

    same as previous comment.


Couple of comments on api usage.

- Ashutosh Chauhan


On July 2, 2013, 4:39 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12050/
> -----------------------------------------------------------
> 
> (Updated July 2, 2013, 4:39 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE- and HIVE-3756
>     https://issues.apache.org/jira/browse/HIVE-
>     https://issues.apache.org/jira/browse/HIVE-3756
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Problems:
> 1. When doing load data or insert overwrite to a table, the data files under database/table
directory could not inherit their parent's permissions (i.e. group) as described in HIVE-3756.
> 2. Beside the group issue, the read/write permission mode is also not inherited
> 3. Same problem affects the partition files (see HIVE-3094)
> 
> Cause:
> The task results (from load data or insert overwrite) are initially stored in scratchdir
and then loaded under warehouse table directory. FileSystem.rename is used in this step (e.g.
LoadTable/LoadPartition) to move the dirs/files but it preserves their permissions (including
group and mode) which are determined by scratchdir permission or umask. If the scratchdir
has different permissions from those of warehouse table directories, the problem occurs.
> 
> Solution:
> After the FileSystem.rename is called, changing all renamed (moved) files/dirs to their
destination parents' permissions if needed (say if hive.warehouse.subdir.inherit.perms is
true). Here I introduced a new method renameFile doing both rename and permission. It replaces
the FileSystem.rename used in LoadTable/LoadPartition. I do not replace rename used to move
files/dirs under same scratchdir in the middle of task processing. It looks to me not necessary
since they are temp files and also probably access protected by top scratchdir mode 700 (HIVE-4487).
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 17daaa1 
> 
> Diff: https://reviews.apache.org/r/12050/diff/
> 
> 
> Testing
> -------
> 
> The following cases tested that all created subdirs/files inherit their parents' permission
mode and group in : 1). create database; 2). create table; 3). load data; 4) insert overwrite;
5) partitions.
> {code}
> hive> dfs -ls -d /user/tester1/hive;                                             
                         
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:20 /user/tester1/hive
> 
> hive> create database tester1 COMMENT 'Database for user tester1' LOCATION '/user/tester1/hive/tester1.db';
> hive> dfs -ls -R /user/tester1/hive;                                             
                          
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:21 /user/tester1/hive/tester1.db
> 
> hive> use tester1;
> hive>  create table tester1.tst1(col1 int, col2 string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' STORED AS TEXTFILE;
> hive> dfs -ls -R /user/tester1/hive;                                             
                                      
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db/tst1
> 
> hive>  load data local inpath '/home/tester1/tst1.input' into table tst1;        
                                      
> hive> dfs -ls -R /user/tester1/hive;                                     
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> 
> hive> create table tester1.tst2(col1 int, col2 string) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' STORED AS SEQUENCEFILE;
> hive> dfs -ls -R /user/tester1/hive;                                             
                                      
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:24 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:24 /user/tester1/hive/tester1.db/tst2
> 
> hive> insert overwrite table tst2 select * from tst1;
> hive> dfs -ls -R /user/tester1/hive;                 
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2
> -rw-rw----   3 tester1 testgroup123        291 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2/000000_0
> 
> hive> create table tester1.tst3(col2 string) PARTITIONED BY (col1 int) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
> hive> dfs -ls -R /user/tester1/hive;                                             
                                      
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:27 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2
> -rw-rw----   3 tester1 testgroup123        291 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:27 /user/tester1/hive/tester1.db/tst3
> 
> hive> set hive.exec.dynamic.partition.mode=nonstrict;                            
                                      
> hive> insert overwrite table tester1.tst3 partition (col1) select t1.col2, t1.col1
from tst1 t1;
> hive> dfs -ls -R /user/tester1/hive;                                             
              
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:27 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2
> -rw-rw----   3 tester1 testgroup123        291 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=1111
> -rw-rw----   3 tester1 testgroup123         51 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=1111/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=2222
> -rw-rw----   3 tester1 testgroup123         51 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=2222/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=3333
> -rw-rw----   3 tester1 testgroup123         51 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=3333/000000_0
> {code}
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message