hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
Date Mon, 11 May 2015 22:48:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538810#comment-14538810 ]

Hadoop QA commented on HDFS-8344:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 40s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 56s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |  11m  8s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 29s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 46s | The applied patch generated  3 new checkstyle issues (total was 499, now 500). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 50s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 10s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  75m 50s | Tests failed in hadoop-hdfs. |
| | | 126m 43s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.tools.TestHdfsConfigFields |
| Timed out tests | org.apache.hadoop.hdfs.TestSmallBlock |
|   | org.apache.hadoop.hdfs.TestDFSStorageStateRecovery |
|   | org.apache.hadoop.hdfs.TestRenameWhileOpen |
|   | org.apache.hadoop.hdfs.TestPread |
|   | org.apache.hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12732002/HDFS-8344.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ea11590 |
| checkstyle |  https://builds.apache.org/job/PreCommit-HDFS-Build/10919/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10919/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10919/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10919/console |


This message was automatically generated.

> NameNode doesn't recover lease for files with missing blocks
> ------------------------------------------------------------
>
>                 Key: HDFS-8344
>                 URL: https://issues.apache.org/jira/browse/HDFS-8344
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>         Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch
>
>
> I found another(?) instance in which the lease is not recovered. It is easily reproducible on a pseudo-distributed single-node cluster:
> # Before you start, it helps to set the following. This is not necessary; it simply reduces how long you have to wait.
> {code}
>       public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
>       public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;
> {code}
> # The client starts to write a file. (It could be less than 1 block, but it has hflushed, so some of the data has landed on the datanodes.) (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar)
> # The client crashes. (I simulate this by running kill -9 on the $(hadoop jar TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
> # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1.)
> I believe the lease should be recovered and the block should be marked missing. However, this is not happening. The lease is never recovered.
> The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the Namenode never released the leases (even after restarting the Namenode, and even months afterwards). There are actually several other cases too where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on that for another time.
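
To make the timing in step 1 of the description concrete, here is a minimal Java sketch of the shortened lease limits. Only the two constants come from the description. The class name LeaseLimits and both helper methods are hypothetical and are not the NameNode's lease-handling code. The sketch assumes the usual HDFS reading of the limits: once the soft limit has passed another client may take over the lease, and once the hard limit has passed the NameNode is expected to start lease recovery on its own.

```java
// Hypothetical illustration only; this is NOT the NameNode's lease code.
public class LeaseLimits {
    // Shortened values from step 1 of the description, in milliseconds.
    public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
    public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD;

    /** True once the writer has not renewed for longer than the soft limit. */
    public static boolean softLimitExpired(long lastRenewalMs, long nowMs) {
        return nowMs - lastRenewalMs > LEASE_SOFTLIMIT_PERIOD;
    }

    /** True once the writer has not renewed for longer than the hard limit. */
    public static boolean hardLimitExpired(long lastRenewalMs, long nowMs) {
        return nowMs - lastRenewalMs > LEASE_HARDLIMIT_PERIOD;
    }

    public static void main(String[] args) {
        long lastRenewal = 0L;   // moment the client was killed (assumed)
        long now = 61_000L;      // 61 seconds later
        System.out.println("soft limit expired: " + softLimitExpired(lastRenewal, now));
        System.out.println("hard limit expired: " + hardLimitExpired(lastRenewal, now));
    }
}
```

With these values the hard limit works out to 60 seconds, so roughly a minute after the client is killed one would expect lease recovery to begin. The report is that it never completes once the only datanode is gone.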



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
