hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11311) HDFS fsck continues to report all blocks present when DataNode is restarted with empty data directories
Date Tue, 07 Feb 2017 13:28:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855977#comment-15855977 ]

Hadoop QA commented on HDFS-11311:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m  2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 119 unchanged - 0 fixed = 120 total (was 119) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 13s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 15s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11311 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12851364/HDFS-11311.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 96c9d9404e39 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9dbfab1 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/18337/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18337/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18337/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18337/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> HDFS fsck continues to report all blocks present when DataNode is restarted with empty data directories
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11311
>                 URL: https://issues.apache.org/jira/browse/HDFS-11311
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.3, 3.0.0-alpha1
>            Reporter: André Frimberger
>         Attachments: HDFS-11311.001.patch, HDFS-11311-branch-3.0.0-alpha2.001.patch, HDFS-11311.reproduce.patch
>
>
> During cluster maintenance we had to change parameters of the underlying disk filesystem, so we stopped the DataNode, reformatted all of its data directories, and restarted it in under 10 minutes with no data and only the {{VERSION}} file present. Running {{fsck}} afterwards reports that all blocks are fully replicated, which does not reflect the true state of HDFS. If an administrator trusts {{fsck}} and goes on to replace further DataNodes, *data will be lost!*
> Steps to reproduce:
> 1. Shut down the DataNode
> 2. Remove all BlockPools from all data directories (so that only the {{VERSION}} file is present)
> 3. Start the DataNode again in under 10.5 minutes
> 4. Run {{hdfs fsck /}}
> *Actual result:* Average replication is falsely shown as 3.0
> *Expected result:* Average replication factor is < 3.0
> *Workaround:* Trigger a block report with {{hdfs dfsadmin -triggerBlockReport $dn_host:$ipc_port}}
> *Cause:* The NameNode handles the first block report differently from later ones: only added blocks are taken into account, so blocks that are missing from the report are not noticed. This behaviour was introduced in HDFS-7980 for performance reasons, but it is applied too widely, and in our case data can be lost.
> *Fix:* We suggest using stricter conditions on applying {{processFirstBlockReport}} in {{BlockManager:processReport()}}:
> Change
> {code}
> if (storageInfo.getBlockReportCount() == 0) {
>     // The first block report can be processed a lot more efficiently than
>     // ordinary block reports.  This shortens restart times.
>     processFirstBlockReport(storageInfo, newReport);
> } else {
>     invalidatedBlocks = processReport(storageInfo, newReport);
> }
> {code}
> to
> {code}
> if (storageInfo.getBlockReportCount() == 0
>     && storageInfo.getState() != State.FAILED
>     && newReport.getNumberOfBlocks() > 0) {
>     // The first block report can be processed a lot more efficiently than
>     // ordinary block reports.  This shortens restart times.
>     processFirstBlockReport(storageInfo, newReport);
> } else {
>     invalidatedBlocks = processReport(storageInfo, newReport);
> }
> {code}
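
The effect of the two conditions quoted above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual BlockManager code: the Storage class, the Set<Long> report type and the helper names are hypothetical stand-ins, and the fast path is reduced to "add whatever is reported". It assumes, as the description implies, that the NameNode still holds the old block list for the storage when the restarted DataNode sends its first (empty) report.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FirstBlockReportSketch {

    /** Hypothetical stand-in for the NameNode's view of one DataNode storage. */
    static final class Storage {
        final Set<Long> knownBlocks = new HashSet<>();
        int blockReportCount = 0;   // assumed to be 0 again after the DataNode re-registers
        boolean failed = false;     // models getState() == State.FAILED
    }

    /** Fast path: only adds reported blocks, never removes blocks that are absent. */
    static void processFirstBlockReport(Storage s, Set<Long> report) {
        s.knownBlocks.addAll(report);
    }

    /** Full path: afterwards the NameNode's view equals the report. */
    static void processFullReport(Storage s, Set<Long> report) {
        s.knownBlocks.retainAll(report);
        s.knownBlocks.addAll(report);
    }

    /** Chooses a path using either the original or the stricter condition. */
    static void processReport(Storage s, Set<Long> report, boolean strict) {
        boolean first = s.blockReportCount == 0;
        boolean useFastPath = strict
                ? first && !s.failed && !report.isEmpty()
                : first;
        if (useFastPath) {
            processFirstBlockReport(s, report);
        } else {
            processFullReport(s, report);
        }
        s.blockReportCount++;
    }

    /** Number of blocks the NameNode still believes in after an empty first report. */
    static int blocksAfterEmptyFirstReport(boolean strict) {
        Storage s = new Storage();
        s.knownBlocks.addAll(Arrays.asList(1L, 2L, 3L)); // state from before the restart
        processReport(s, new HashSet<>(), strict);       // reformatted disk: nothing to report
        return s.knownBlocks.size();
    }

    public static void main(String[] args) {
        System.out.println("original condition: " + blocksAfterEmptyFirstReport(false) + " stale blocks");
        System.out.println("stricter condition: " + blocksAfterEmptyFirstReport(true) + " stale blocks");
    }
}
```

In this model the original condition leaves all three blocks recorded for the emptied storage, while the stricter condition routes the empty report through the full comparison and drops them.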
> If the DataNode reports no blocks for a data directory, either it is a new DataNode or the data directory has been emptied for some reason (offline replacement of storage, reformatting of the data disk, etc.). In either case, the change should be reflected in the output of {{fsck}} in less than 6 hours to prevent data loss due to misleading output.
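
The report does not say where the 10.5 minute and 6 hour figures come from. One plausible reading is that they follow from the stock HDFS defaults: the NameNode declares a DataNode dead after 2 x dfs.namenode.heartbeat.recheck-interval + 10 x dfs.heartbeat.interval, and a DataNode sends a full block report every dfs.blockreport.intervalMsec. The sketch below only redoes that arithmetic; the default values are assumptions about an untuned cluster, and the class and method names are hypothetical.

```java
import java.time.Duration;

public class HdfsIntervalSketch {

    // Assumed stock defaults; the report does not state the cluster's settings.
    static final long HEARTBEAT_INTERVAL_SEC = 3L;             // dfs.heartbeat.interval
    static final long RECHECK_INTERVAL_MS = 300_000L;          // dfs.namenode.heartbeat.recheck-interval
    static final long BLOCK_REPORT_INTERVAL_MS = 21_600_000L;  // dfs.blockreport.intervalMsec

    /** Time without heartbeats after which the NameNode treats a DataNode as dead. */
    static Duration deadNodeInterval() {
        return Duration.ofMillis(2 * RECHECK_INTERVAL_MS + 10 * 1000 * HEARTBEAT_INTERVAL_SEC);
    }

    /** Time between two regular full block reports from a DataNode. */
    static Duration blockReportInterval() {
        return Duration.ofMillis(BLOCK_REPORT_INTERVAL_MS);
    }

    public static void main(String[] args) {
        System.out.println("dead-node interval:    " + deadNodeInterval());    // PT10M30S
        System.out.println("block report interval: " + blockReportInterval()); // PT6H
    }
}
```

Under these assumptions, a DataNode that comes back within 10.5 minutes is never marked dead, so its old block list stays in place, and the next regular full block report may not arrive for up to 6 hours.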



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

