hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external storage tier as a result
Date Thu, 17 Sep 2015 10:08:45 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo Nicholas Sze resolved HDFS-8341.
---------------------------------------
    Resolution: Invalid

Resolving as invalid.  Please feel free to reopen if you disagree.

> (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external storage tier as a result
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8341
>                 URL: https://issues.apache.org/jira/browse/HDFS-8341
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.6.0
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>            Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage tier used for archive. We've had problems with that proprietary "hyperscale" storage product, which is why a couple of blocks here and there have checksum problems or premature EOF, as shown below. This should not prevent the mover from moving all the other blocks so that we can recover our data:
> {code}hdfs mover -p /apps/hive/warehouse/<custom_scrubbed>
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = {hdfs://nameservice1=[/apps/hive/warehouse/<custom_scrubbed>]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move blk_1075156654_1438349 with size=134217728 from <ip>:1019:ARCHIVE to <ip>:1019:DISK through <ip>:1019: block move is failed: opReplaceBlock BP-120244285-<ip>-1417023863606:blk_1075156654_1438349 received exception java.io.EOFException: Premature EOF: no length prefix available
> <NOW IT STARTS LOOPING ON SAME BLOCK>
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/<ip>:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move blk_1075156654_1438349 with size=134217728 from <ip>:1019:ARCHIVE to <ip>:1019:DISK through <ip>:1019: block move is failed: opReplaceBlock BP-120244285-<ip>-1417023863606:blk_1075156654_1438349 received exception java.io.EOFException: Premature EOF: no length prefix available
> ...<repeat indefinitely>...
> {code}
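
For anyone triaging a similar report, the repetition in the log above can be confirmed mechanically. The following is a minimal sketch, not part of the original report or of Hadoop itself. It assumes the Dispatcher warning keeps the "Failed to move blk_..." wording shown in the log; the function names and the threshold of 2 are hypothetical choices for illustration.

```python
import re
from collections import Counter

# Matches the Dispatcher warning wording seen in the log above.
# The exact wording may differ between Hadoop versions (assumption).
FAILED_MOVE = re.compile(
    r"WARN balancer\.Dispatcher: Failed to move (blk_\d+_\d+)"
)


def count_failed_moves(log_text):
    """Return a Counter mapping block ID to its number of failed move attempts."""
    return Counter(FAILED_MOVE.findall(log_text))


def looping_blocks(log_text, threshold=2):
    """Return block IDs that failed at least `threshold` times.

    `threshold` is a hypothetical cutoff, not a Hadoop setting.
    """
    counts = count_failed_moves(log_text)
    return sorted(block for block, n in counts.items() if n >= threshold)
```

Passing the captured mover output to looping_blocks() lists the blocks the mover keeps retrying, which can then be checked on the source DataNode.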



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
