hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-7110) Implement chmod with JNI
Date Wed, 19 Jan 2011 01:11:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983474#action_12983474 ]

Todd Lipcon commented on HADOOP-7110:

Konstantin: depends which "current implementation" you're referring to.

In trunk right now, RawLocalFS uses the equivalent of system("chmod ... <path>") to
perform a chmod. This is very inefficient, since Java uses fork, not vfork, and thus incurs
a cost to set up new vm_area structures for the child process, etc. As I imagine you've seen
before, if the process has a very large heap (many GB) this can take several milliseconds.
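For illustration, a minimal sketch of what a fork-based chmod looks like from Java. This is not the RawLocalFS code; the class and method names (ForkChmod, chmod) are hypothetical, and it assumes a chmod binary is on the PATH. Every call pays for spawning a child process.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: changes a file's mode by launching the chmod binary. */
class ForkChmod {
    /** Runs "chmod <octalMode> <path>" in a child process and returns its exit code. */
    static int chmod(File file, String octalMode) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("chmod", octalMode, file.getAbsolutePath())
                .redirectErrorStream(true) // merge stderr into stdout so one stream is drained
                .start();
        // Drain the child's output so it cannot block on a full pipe.
        try (InputStream in = p.getInputStream()) {
            while (in.read() != -1) {
                // discard
            }
        }
        return p.waitFor();
    }
}
```

A caller would use it as ForkChmod.chmod(new File("/some/path"), "640"); the process launch, not the mode change itself, dominates the cost.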

HADOOP-6304 identifies this issue (in fact calls it an "emergency bug fix") and proposes using
the Java setReadable|setWritable|setExecutable APIs instead. These don't have the cost of
a fork, but have two problems:
1) They have limited power since, in typical Java fashion, they are a least-common-denominator API.
That is to say, they don't support setting a file to arbitrary modes, have no concept of "group"
ownership, etc. If you'll look at the implementation in that JIRA you'll see what I mean.
2) Because the APIs are awkward, the only way to get reasonably close to full chmod support
is to first chmod a file down to 000 and then work back up to the correct permissions. This
is race-prone and has been causing build failures for several months (see MAPREDUCE-2238).
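The two problems above can be sketched with the java.io.File setters. This is an illustrative approximation, not the HADOOP-6304 implementation; the class name FileApiChmod and its parameters are hypothetical. Note that the API only distinguishes "owner" from "everybody", so group and other always end up with the same bits, and that the file sits at mode 000 between the two steps.

```java
import java.io.File;

/** Hypothetical helper approximating chmod with the java.io.File setters. */
class FileApiChmod {
    /**
     * Applies owner bits and "everybody" bits. Group cannot be set separately
     * from other, so modes such as 750 are not expressible.
     * Returns true only if every underlying call succeeded.
     */
    static boolean approximateChmod(File f,
                                    boolean ownerR, boolean ownerW, boolean ownerX,
                                    boolean allR, boolean allW, boolean allX) {
        // Step 1: strip every permission for everybody (mode 000 on POSIX).
        boolean ok = f.setReadable(false, false);
        ok &= f.setWritable(false, false);
        ok &= f.setExecutable(false, false);

        // Race window: another thread or process touching the file here
        // sees it with no permissions at all.

        // Step 2: work back up. ownerOnly=false grants to owner, group and other;
        // ownerOnly=true grants to the owner alone.
        if (allR) ok &= f.setReadable(true, false);
        else if (ownerR) ok &= f.setReadable(true, true);

        if (allW) ok &= f.setWritable(true, false);
        else if (ownerW) ok &= f.setWritable(true, true);

        if (allX) ok &= f.setExecutable(true, false);
        else if (ownerX) ok &= f.setExecutable(true, true);

        return ok;
    }
}
```

Even in this small form the change takes up to six separate calls, none of which is atomic with respect to the others.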

So, this JIRA adds the JNI call, which fixes all three issues. It has the full ability
to chmod to any mode, is atomic, and is very efficient.
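The JNI code itself lives in the attached patch and is not reproduced here. As a stand-in only, the sketch below shows the same shape of call (an in-process mode change to an arbitrary mode, with no child process and no intermediate 000 state) using the java.nio.file API from later JDKs (Java 7 and up). It is an assumption-laden analogue, not the NativeIO method; the class name InProcessChmod is hypothetical, and it requires a file system that supports POSIX permissions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Hypothetical stand-in for a native chmod: one in-process call, any mode. */
class InProcessChmod {
    /** Sets the file's mode from a nine-character string such as "rwxr-x---". */
    static void chmod(Path path, String symbolicMode) throws IOException {
        // Owner, group and other bits are all expressed independently.
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(symbolicMode);
        // The full permission set is applied in a single call, without forking.
        Files.setPosixFilePermissions(path, perms);
    }
}
```

Unlike the setter-based approach, a mode such as "rwxr-x---" (750) is expressible directly because the group bits are set independently of the other bits.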

Does that answer the question?

> Implement chmod with JNI
> ------------------------
>                 Key: HADOOP-7110
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7110
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io, native
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>         Attachments: hadoop-7110.txt
> MapReduce is currently using a race-prone workaround to approximate chmod() because forking
chmod is too expensive. This race is causing build failures (and probably task failures too).
We should implement chmod in the NativeIO library so we can have good performance (ie not
fork) and still not be prone to races.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
