hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6937) Another issue in handling checksum errors in write pipeline
Date Thu, 26 May 2016 12:13:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301992#comment-15301992
] 

Hadoop QA commented on HDFS-6937:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue}
Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green}
The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color}
| {color:green} The patch appears to include 4 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 47s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 53s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s {color} | {color:green}
the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s {color} | {color:red}
hadoop-hdfs-project/hadoop-hdfs: patch generated 33 new + 529 unchanged - 4 fixed = 562 total
(was 533) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color}
| {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red}
The patch has 25 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green}
the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 9s {color} | {color:red}
hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s {color}
| {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 56s {color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDatanodeDeath |
|   | hadoop.hdfs.server.datanode.TestDiskError |
|   | hadoop.hdfs.TestDFSClientExcludedNodes |
|   | hadoop.hdfs.TestPipelineRecovery |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDecommissionWithStriped |
|   | hadoop.hdfs.TestAsyncDFSRename |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.tools.TestDebugAdmin |
|   | hadoop.hdfs.TestFileAppend |
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:2c91fd8 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806312/HDFS-6937.001.patch
|
| JIRA Issue | HDFS-6937 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs
 checkstyle  |
| uname | Linux cd955b145953 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 77202fa |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
|
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/whitespace-eol.txt
|
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
|
| unit test logs |  https://builds.apache.org/job/PreCommit-HDFS-Build/15576/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
|
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15576/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Another issue in handling checksum errors in write pipeline
> -----------------------------------------------------------
>
>                 Key: HDFS-6937
>                 URL: https://issues.apache.org/jira/browse/HDFS-6937
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs-client
>    Affects Versions: 2.5.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-6937.001.patch
>
>
> Given a write pipeline:
> DN1 -> DN2 -> DN3
> DN3 detected cheksum error and terminate, DN2 truncates its replica to the ACKed size.
Then a new pipeline is attempted as
> DN1 -> DN2 -> DN4
> DN4 detects checksum error again. Later when replaced DN4 with DN5 (and so on), it failed
for the same reason. This led to the observation that DN2's data is corrupted. 
> Found that the software currently truncates DN2's replca to the ACKed size after DN3
terminates. But it doesn't check the correctness of the data already written to disk.
> So intuitively, a solution would be, when downstream DN (DN3 here) found checksum error,
propagate this info back to upstream DN (DN2 here), DN2 checks the correctness of the data
already written to disk, and truncate the replica to to MIN(correctDataSize, ACKedSize).
> Found this issue is similar to what was reported by HDFS-3875, and the truncation at
DN2 was actually introduced as part of the HDFS-3875 solution. 
> Filing this jira for the issue reported here. HDFS-3875 was filed by [~tlipcon]
> and found he proposed something similar there.
> {quote}
> if the tail node in the pipeline detects a checksum error, then it returns a special
error code back up the pipeline indicating this (rather than just disconnecting)
> if a non-tail node receives this error code, then it immediately scans its own block
on disk (from the beginning up through the last acked length). If it detects a corruption
on its local copy, then it should assume that it is the faulty one, rather than the downstream
neighbor. If it detects no corruption, then the faulty node is either the downstream mirror
or the network link between the two, and the current behavior is reasonable.
> {quote}
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message