hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-33) DF enhancement: performance and win XP support
Date Mon, 27 Feb 2006 18:39:35 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-33?page=comments#action_12368006 ] 

Doug Cutting commented on HADOOP-33:

Yes, DF_INTERVAL should be configurable.

Caching inside DF sounds fine.  We'd then want to add a DF field to FSDataset, so that we
always reuse the same instance.
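
The caching idea can be sketched as follows. This is only an illustration in present-day Java, not the code from the attached patch: the class name CachedDiskStats, the probe callback (standing in for the expensive "df" call) and the injectable clock are all hypothetical, and intervalMs merely plays the role of a configurable DF_INTERVAL.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper: caches a {capacity, available} pair and re-runs the
 * expensive probe only when the cached values are older than intervalMs.
 */
class CachedDiskStats {
    private final Supplier<long[]> probe; // expensive call, e.g. running "df"
    private final long intervalMs;        // stands in for a configurable DF_INTERVAL
    private final LongSupplier clock;     // millisecond clock, injectable for tests

    private boolean loaded = false;       // false until the first probe has run
    private long lastRefresh;             // clock value at the last probe
    private long capacity;
    private long available;

    CachedDiskStats(Supplier<long[]> probe, long intervalMs, LongSupplier clock) {
        this.probe = probe;
        this.intervalMs = intervalMs;
        this.clock = clock;
    }

    CachedDiskStats(Supplier<long[]> probe, long intervalMs) {
        this(probe, intervalMs, System::currentTimeMillis);
    }

    /** Runs the probe on first use, or once the cached values have expired. */
    private synchronized void refreshIfStale() {
        long now = clock.getAsLong();
        if (!loaded || now - lastRefresh >= intervalMs) {
            long[] stats = probe.get(); // assumed layout: {capacity, available}
            capacity = stats[0];
            available = stats[1];
            lastRefresh = now;
            loaded = true;
        }
    }

    synchronized long getCapacity() {
        refreshIfStale();
        return capacity;
    }

    synchronized long getAvailable() {
        refreshIfStale();
        return available;
    }
}
```

The cache only helps if a single instance is shared, which is why the owner (FSDataset in the comment above) would hold it in a field: asking for capacity and then available space within the same heartbeat then triggers at most one probe.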

By minimizing the code I primarily mean minimizing the total code committed to the repository.  Minimizing
the size of patches is also good, since it makes them easier to understand.

I do not see how removing the dependency on cygwin in this one case helps the project: it
makes it bigger but adds no functionality and removes no dependencies.  Dependencies are also
not bad: we don't want to re-invent things.  Cygwin has already solved this problem (and some
others) for us, permitting us to focus on Hadoop's more critical issues.

> DF enhancement: performance and win XP support
> ----------------------------------------------
>          Key: HADOOP-33
>          URL: http://issues.apache.org/jira/browse/HADOOP-33
>      Project: Hadoop
>         Type: Improvement
>   Components: fs, dfs
>  Environment: Unix, Cygwin, Win XP
>     Reporter: Konstantin Shvachko
>     Priority: Minor
>  Attachments: DF.patch, DFpatch.txt
> 1. DF is called twice for each heartbeat, which happens every 3 seconds.
> There is a simple fix for that in the attached patch.
> 2. cygwin is required to run the df program in a Windows environment.
> There is a class org.apache.commons.io.FileSystemUtils, which can return disk free space
> for different OSs, but it does not have means to get disk capacity.
> In general, Windows has no efficient and uniform way to calculate disk capacity
> using a shell command.
> The choices are 'chkdsk' and 'defrag -a', but both of them are too slow to be called
> every 3 seconds.
> WinXP and 2003 server have a new tool called fsutil, which provides all necessary info.
> I implemented a call to fsutil as a fallback when df fails and the OS is WinXP or 2003 server.
> Other win versions should still run cygwin.
> I tested this feature on linux, winXP and cygwin.
> See attached patch.

This message is automatically generated by JIRA.