hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14323) Reduce number of FS permissions and redundant FS operations
Date Mon, 25 Jul 2016 19:47:21 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392555#comment-15392555
] 

Chris Nauroth commented on HIVE-14323:
--------------------------------------

[~rajesh.balamohan], thank you for the patch.

Is the change in {{FileUtils#mkdir}} required?  It appears that the {{inheritPerms}} argument
is already intended to capture the setting of {{HIVE_WAREHOUSE_SUBDIR_INHERIT_PERMS}}, so
looking it up again within the method might be confusing.  I see some call sites pass along
the value of that property and others hard-code it.  I see your patch is also updating some
of those call sites to respect the configuration.  Do you think this change should be handled
completely by updating the call sites?

{code}
-            if (fs.exists(ptnPath)){
-              fs.delete(ptnPath,true);
+            try {
+              fs.delete(ptnPath, true);
+            } catch (IOException ioe) {
+              //ignore
             }
{code}

I think the intent here is "try the delete, and if the path doesn't exist, just keep going."
 Catching every {{IOException}} could mask other I/O errors though.  Right now, exceptions
would propagate out to a wider {{catch (Exception)}} block, where there is additional cleanup
logic.  I wonder if catching every {{IOException}} would harm this cleanup logic.

According to the [FileSystem Specification|http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/filesystem/filesystem.html]
for delete, if there is a recursive delete attempted on a path that doesn't exist, then it
fails by returning {{false}}, not throwing an exception.  There are contract tests that verify
this behavior too.

{code}
      LOG.info("Patch..checking isEmptyPath for : " + dirPath);
{code}

Is this a leftover log statement from debugging, or is it intentional to include it in the
patch?

I don't feel confident commenting on the logic in {{Hive#replaceFiles}}, so I'll defer to
others more familiar with Hive to review that part.

> Reduce number of FS permissions and redundant FS operations
> -----------------------------------------------------------
>
>                 Key: HIVE-14323
>                 URL: https://issues.apache.org/jira/browse/HIVE-14323
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HIVE-14323.1.patch
>
>
> Some examples are given below.
> 1. When creating stage directory, FileUtils sets the directory permissions by running
a set of chgrp and chmod commands. In systems like S3, this would not be relevant.
> 2. In some cases, fs.delete() is followed by fs.exists(). In this case, it might be redundant
to check for exists() (lookup ops are expensive in systems like S3). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message