hadoop-common-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5958) Use JDK 1.6 File APIs in DF.java wherever possible
Date Fri, 27 Nov 2009 23:31:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783187#action_12783187 ]

Aaron Kimball commented on HADOOP-5958:

If java.io.File is somehow "faulty," then there are probably much, much larger problems with
the given platform. This case is not worth worrying about in detail.

You're correct that we could simply stitch the java.io.File methods directly into the existing
DF implementation. But any remaining use of {{DF}} would still launch a subprocess via {{fork()}}
rather than {{vfork()}}. The intent of this issue seems to be to eliminate the spurious memory
overallocation spikes that occur when a {{df}} process is exec'd, as well as to eliminate
dependencies on platform-specific tools when possible.

So maybe we should do both of the things you suggest:

# Replace uses of {{DF}} with direct uses of {{File}} so that callers that don't need to shell
out to a subprocess no longer do so.
# Replace the relevant internal methods of {{DF}} with uses of {{File}} so that the result
of using {{DF}} always matches the result returned by {{File}}.
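
As a rough illustration of the second point, the space-reporting methods could be backed by {{java.io.File}} along these lines. This is only a sketch: {{FileBackedDF}} and its method names are hypothetical stand-ins, not the actual {{DF}} class or the contents of the attached patches. The {{File}} calls themselves ({{getTotalSpace()}}, {{getFreeSpace()}}, {{getUsableSpace()}}) are the real JDK 1.6 APIs.

```java
import java.io.File;

/**
 * Hypothetical stand-in for the space-reporting part of DF, backed only by
 * java.io.File (JDK 1.6+). Illustrative only; not the real Hadoop class.
 */
final class FileBackedDF {
    private final File dir;

    FileBackedDF(File dir) {
        this.dir = dir;
    }

    /** Size in bytes of the partition holding dir; 0 if dir names no partition. */
    long getCapacity() {
        return dir.getTotalSpace();
    }

    /** Bytes this JVM can use, taking permissions into account where the platform reports them. */
    long getAvailable() {
        return dir.getUsableSpace();
    }

    /** Bytes in use, computed as total minus unallocated. */
    long getUsed() {
        return dir.getTotalSpace() - dir.getFreeSpace();
    }

    /** Used space as a whole percentage of capacity; 0 when capacity is unknown. */
    int getPercentUsed() {
        long capacity = getCapacity();
        return capacity == 0 ? 0 : (int) (getUsed() * 100.0 / capacity);
    }

    public static void main(String[] args) {
        FileBackedDF df = new FileBackedDF(new File(args.length > 0 ? args[0] : "."));
        System.out.println("capacity=" + df.getCapacity()
            + " used=" + df.getUsed()
            + " available=" + df.getAvailable()
            + " percentUsed=" + df.getPercentUsed());
    }
}
```

No subprocess is launched on this path, so nothing is forked. Note that {{File}} reports nothing about the mount point or device name, which is why a wrapper like this cannot cover everything {{DF}} does.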

This should avoid incompatible changes and get us closer to platform independence. (Currently
there's no way to get the mount points from within Java, so we can't just ditch DF wholesale.
Maybe with Java 7...)
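
On the Java 7 aside: the NIO.2 API that shipped with Java 7 does allow enumerating file stores via {{java.nio.file.FileSystems}} and {{java.nio.file.FileStore}}. A minimal sketch, assuming a Java 7+ runtime, might look like the following. {{StoreLister}} is a hypothetical name. {{FileStore}} documents {{name()}} and {{type()}} but no mount-point accessor, so whether this fully replaces the mount-point lookup in {{DF}} is an open assumption.

```java
import java.nio.file.FileStore;
import java.nio.file.FileSystems;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the file stores visible to the JVM (Java 7+). */
final class StoreLister {
    /** Returns one "name (type)" entry per file store of the default file system. */
    static List<String> listStores() {
        List<String> stores = new ArrayList<String>();
        for (FileStore store : FileSystems.getDefault().getFileStores()) {
            // name() and type() are documented; their format is implementation-specific.
            stores.add(store.name() + " (" + store.type() + ")");
        }
        return stores;
    }

    public static void main(String[] args) {
        for (String entry : listStores()) {
            System.out.println(entry);
        }
    }
}
```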

> Use JDK 1.6 File APIs in DF.java wherever possible
> --------------------------------------------------
>                 Key: HADOOP-5958
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5958
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Devaraj Das
>            Assignee: Aaron Kimball
>             Fix For: 0.22.0
>         Attachments: HADOOP-5958-hdfs.patch, HADOOP-5958-mapred.patch, HADOOP-5958.2.patch,
>                      HADOOP-5958.3.patch, HADOOP-5958.4.patch, HADOOP-5958.patch
> JDK 1.6 has File APIs like File.getFreeSpace() which should be used instead of spawning
> a command process to get the various disk/partition-related attributes. This would avoid
> spikes in memory consumption by tasks when things like LocalDirAllocator are used for creating
> paths on the filesystem.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
