hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8204) Mover/Balancer should not schedule two replicas to the same DN
Date Mon, 27 Apr 2015 07:05:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513638#comment-14513638 ]

Hadoop QA commented on HDFS-8204:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 40s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | javac |   7m 27s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   3m 59s | The applied patch generated 1 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  4s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 14s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests | 170m  9s | Tests passed in hadoop-hdfs. |
| | | 214m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728301/HDFS-8204.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1a2459b |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10404/artifact/patchprocess/checkstyle-result-diff.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10404/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10404/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10404/console |


This message was automatically generated.

> Mover/Balancer should not schedule two replicas to the same DN
> --------------------------------------------------------------
>
>                 Key: HDFS-8204
>                 URL: https://issues.apache.org/jira/browse/HDFS-8204
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>            Reporter: Walter Su
>            Assignee: Walter Su
>            Priority: Minor
>         Attachments: HDFS-8204.001.patch, HDFS-8204.002.patch, HDFS-8204.003.patch
>
>
> In versions before 2.6, the Balancer moves blocks between Datanodes.
> In version 2.6 and later, the Balancer moves blocks between StorageGroups (introduced by HDFS-6584).
> The function
> {code}
> class DBlock extends Locations<StorageGroup>
> DBlock.isLocatedOn(StorageGroup loc)
> {code}
> -is flawed and may cause 2 replicas to end up on the same node after running the Balancer.-
> For example:
> We have 2 nodes. Each node has two storages.
> We have (DN0, SSD), (DN0, DISK), (DN1, SSD), (DN1, DISK).
> We have a block with ONE_SSD storage policy.
> The block has 2 replicas. They are in (DN0,SSD) and (DN1,DISK).
> Replica in (DN0,SSD) should not be moved to (DN1,SSD) after running Balancer.
> Otherwise DN1 has 2 replicas.
> --------------
> UPDATE(Thanks [~szetszwo] for pointing it out):
> {color:red}
> This bug will *NOT* cause 2 replicas to end up on the same node after running the Balancer, because the Datanode rejects the move.
> {color}
> We see a lot of ERROR messages when running the test.
> {code}
> 2015-04-27 10:08:15,809 ERROR datanode.DataNode (DataXceiver.java:run(277)) - host1.foo.com:59537:DataXceiver error processing REPLACE_BLOCK operation  src: /127.0.0.1:52532 dst: /127.0.0.1:59537
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-264794661-9.96.1.34-1430100451121:blk_1073741825_1001 already exists in state FINALIZED and thus cannot be created.
>     at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1447)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:186)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.replaceBlock(DataXceiver.java:1158)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReplaceBlock(Receiver.java:229)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:77)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
>     at java.lang.Thread.run(Thread.java:722)
> {code}
> In the test, the Balancer runs 5~20 iterations before it exits.
> This is inefficient.
> The Balancer should not *schedule* such a move in the first place, even though it will fail anyway. In the test, it should exit after 5 iterations.
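
The difference between a storage-level and a node-level location check, using the two-node example from the description, can be sketched as follows. This is only an illustration under stated assumptions: `StorageLoc`, `ReplicaPlacementSketch`, `isLocatedOnStorage` and `isLocatedOnNode` are hypothetical names, not the real `Dispatcher`/`DBlock` API, and the sketch assumes the existing check compares (datanode, storage type) pairs.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a (datanode, storage type) pair; not the real StorageGroup class. */
final class StorageLoc {
    final String datanode;     // e.g. "DN0"
    final String storageType;  // e.g. "SSD" or "DISK"

    StorageLoc(String datanode, String storageType) {
        this.datanode = datanode;
        this.storageType = storageType;
    }
}

/** Hypothetical sketch of the two checks; assumes replicas are tracked per storage. */
final class ReplicaPlacementSketch {

    /** Storage-level check: true only if a replica sits on exactly this (datanode, storage type). */
    static boolean isLocatedOnStorage(List<StorageLoc> replicas, StorageLoc target) {
        for (StorageLoc r : replicas) {
            if (r.datanode.equals(target.datanode)
                    && r.storageType.equals(target.storageType)) {
                return true;
            }
        }
        return false;
    }

    /** Node-level check: true if any replica sits on the target's datanode, on any storage. */
    static boolean isLocatedOnNode(List<StorageLoc> replicas, StorageLoc target) {
        for (StorageLoc r : replicas) {
            if (r.datanode.equals(target.datanode)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Replicas on (DN0, SSD) and (DN1, DISK); candidate target is (DN1, SSD).
        List<StorageLoc> replicas = Arrays.asList(
                new StorageLoc("DN0", "SSD"),
                new StorageLoc("DN1", "DISK"));
        StorageLoc target = new StorageLoc("DN1", "SSD");

        // The storage-level check alone does not see the replica already on DN1.
        System.out.println("on storage: " + isLocatedOnStorage(replicas, target));
        // The node-level check does, so a scheduler could skip this move.
        System.out.println("on node:    " + isLocatedOnNode(replicas, target));
    }
}
```

With a check like the node-level one applied before scheduling, the move to (DN1, SSD) would be skipped up front instead of being rejected later by the Datanode.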



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
