hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-7820) Client Write fails after rolling upgrade rollback with "<block_id> already exist in finalized state"
Date Sat, 28 Feb 2015 00:25:05 GMT

     [ https://issues.apache.org/jira/browse/HDFS-7820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-7820:
--------------------------------
    Summary: Client Write fails after rolling upgrade rollback with "<block_id> already exist in finalized state"  (was: Client Write fails after rolling upgrade operation with "<block_id> already exist in finalized state")

> Client Write fails after rolling upgrade rollback with "<block_id> already exist in finalized state"
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7820
>                 URL: https://issues.apache.org/jira/browse/HDFS-7820
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: J.Andreina
>            Assignee: J.Andreina
>         Attachments: HDFS-7820.1.patch
>
>
> Steps to Reproduce:
> ===================
> Step 1:  Prepare the rolling upgrade using "hdfs dfsadmin -rollingUpgrade prepare".
> Step 2:  Shut down the SNN and NN.
> Step 3:  Start the NN with the "hdfs namenode -rollingUpgrade started" option.
> Step 4:  Execute "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade" and restart the Datanode.
> Step 5:  Write 3 files to HDFS (the assigned block IDs are blk_1073741831_1007, blk_1073741832_1008, blk_1073741833_1009).
> Step 6:  Shut down both the NN and DN.
> Step 7:  Start the NNs with the "hdfs namenode -rollingUpgrade rollback" option.
>          Start the DNs with the "-rollback" option.
> Step 8:  Write 2 files to HDFS.
> Issue:
> =======
> The client write failed with the exception below:
> {noformat}
> 2015-02-23 16:00:12,896 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 src: /XXXXXXXXXXX:48545 dest: /XXXXXXXXXXX:50010
> 2015-02-23 16:00:12,897 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 already exists in state FINALIZED and thus cannot be created.
> {noformat}
> Observations:
> =============
> 1. On the Namenode side, block invalidation was sent for only 2 of the 3 blocks.
> {noformat}
> 15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741833_1009 to XXXXXXXXXXX:50010
> 15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1073741831_1007 to XXXXXXXXXXX:50010
> {noformat}
> 2. The fsck report shows no information about blk_1073741832_1008.
> {noformat}
> FSCK started by Rex (auth:SIMPLE) from /XXXXXXXXXXX for path / at Mon Feb 23 16:17:57 CST 2015
> /File1:  Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741825_1001. Target Replicas is 3 but found 1 replica(s).
> /File11:  Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741827_1003. Target Replicas is 3 but found 1 replica(s).
> /File2:  Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741826_1002. Target Replicas is 3 but found 1 replica(s).
> /AfterRollback_2:  Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741831_1007. Target Replicas is 3 but found 1 replica(s).
> /Test1:  Under replicated BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741828_1004. Target Replicas is 3 but found 1 replica(s).
> Status: HEALTHY
>  Total size:    31620 B
>  Total dirs:    7
>  Total files:   6
>  Total symlinks:                0
>  Total blocks (validated):      5 (avg. block size 6324 B)
>  Minimally replicated blocks:   5 (100.0 %)
>  Over-replicated blocks:        0 (0.0 %)
>  Under-replicated blocks:       5 (100.0 %)
>  Mis-replicated blocks:         0 (0.0 %)
>  Default replication factor:    3
>  Average block replication:     1.0
>  Corrupt blocks:                0
>  Missing replicas:              10 (66.666664 %)
>  Number of data-nodes:          1
>  Number of racks:               1
> FSCK ended at Mon Feb 23 16:17:57 CST 2015 in 3 milliseconds
> {noformat}
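
The two observations can be cross-checked mechanically. The following is only a minimal sketch, not part of the attached patch: it assumes the log excerpts quoted in the report are available as plain strings, and the helper name `blocks_in` and the variable names are hypothetical. It compares the blocks written during the upgrade (step 5) against the blocks named in the Namenode's InvalidateBlocks lines, then checks whether the leftover block is the one the Datanode rejected.

```python
import re

# Block IDs in the quoted logs have the form blk_<id>_<generation stamp>.
BLOCK_RE = re.compile(r"blk_\d+_\d+")


def blocks_in(text, marker):
    """Return the block IDs found on lines of `text` that contain `marker`."""
    found = set()
    for line in text.splitlines():
        if marker in line:
            found.update(BLOCK_RE.findall(line))
    return found


# Blocks assigned while the rolling upgrade was in progress (step 5).
written_during_upgrade = {
    "blk_1073741831_1007",
    "blk_1073741832_1008",
    "blk_1073741833_1009",
}

# Namenode log excerpt from observation 1.
namenode_log = (
    "15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: "
    "add blk_1073741833_1009 to XXXXXXXXXXX:50010\n"
    "15/02/23 14:59:56 INFO BlockStateChange: BLOCK* InvalidateBlocks: "
    "add blk_1073741831_1007 to XXXXXXXXXXX:50010\n"
)

# Datanode log excerpt from the failed client write.
datanode_log = (
    "2015-02-23 16:00:12,897 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock "
    "BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 received "
    "exception org.apache.hadoop.hdfs.server.datanode."
    "ReplicaAlreadyExistsException: Block "
    "BP-1837556285-XXXXXXXXXXX-1423130389269:blk_1073741832_1008 already "
    "exists in state FINALIZED and thus cannot be created.\n"
)

invalidated = blocks_in(namenode_log, "InvalidateBlocks: add")
rejected = blocks_in(datanode_log, "ReplicaAlreadyExistsException")

# Blocks written during the upgrade that were never scheduled for invalidation.
never_invalidated = written_during_upgrade - invalidated

print("never invalidated:", sorted(never_invalidated))
print("rejected on write:", sorted(rejected))
print("same block:", never_invalidated == rejected)
```

On the excerpts quoted above, the only block missing from the invalidation list is blk_1073741832_1008, which is also the block named in the ReplicaAlreadyExistsException. The sketch only matches identifiers in log text; it does not inspect Datanode storage or Namenode state.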



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
