hadoop-common-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3585) Hardware Failure Monitoring in large clusters running Hadoop/HDFS
Date Wed, 13 Aug 2008 06:25:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622114#action_12622114 ]

Hadoop QA commented on HADOOP-3585:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12387992/HADOOP-3585.3.patch
  against trunk revision 685425.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 1 new or modified test.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning message.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 release audit.  The applied patch generated 279 release audit warnings (more than the trunk's current 274 warnings).

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3051/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3051/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3051/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3051/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3051/console

This message is automatically generated.

> Hardware Failure Monitoring in large clusters running Hadoop/HDFS
> -----------------------------------------------------------------
>
>                 Key: HADOOP-3585
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3585
>             Project: Hadoop Core
>          Issue Type: New Feature
>         Environment: Linux
>            Reporter: Ioannis Koltsidas
>            Priority: Minor
>         Attachments: FailMon-standalone.zip, failmon.pdf, failmon.pdf, failmon2.pdf,
>                      FailMon_Package_descrip.html, FailMon_QuickStart.html, HADOOP-3585.2.patch,
>                      HADOOP-3585.3.patch, HADOOP-3585.patch, HADOOP-3585.patch
>
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> At IBM, we are interested in identifying hardware failures on large clusters running Hadoop/HDFS.
> We are working on a framework that will enable nodes to identify failures in their own hardware
> using the Hadoop log, the system log, and various OS hardware diagnostic utilities. The implementation
> details are not yet fully clear, but a draft of our design is in the attached document.
> We are particularly interested in Hadoop and system logs from failed machines; if you have
> such logs, you are very welcome to contribute them, as they would be of great value
> for hardware failure diagnosis.
> Some details about our design can be found in the attached document failmon.doc. More
> details will follow in a later post.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

