hadoop-hdfs-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-8491) DN shutdown race conditions with open xceivers
Date Fri, 06 Jan 2017 22:03:58 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8491:
-----------------------------
    Target Version/s:   (was: 2.8.0)

> DN shutdown race conditions with open xceivers
> ----------------------------------------------
>
>                 Key: HDFS-8491
>                 URL: https://issues.apache.org/jira/browse/HDFS-8491
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>
> DN shutdowns, at least for restarts, have many race conditions, and shutdown is very noisy
> with exceptions.  The DN notifies writers of the restart, waits 1s, and then interrupts the
> xceiver threads but does not join them.  The IPC server is then stopped, followed by the bpos
> services.
> Xceivers then encounter NPEs in closeBlock because the block no longer exists in the
> volume map when transient storage is checked.  Just before that, the DN notifies the NN that
> the block was received.  That does not always appear to be true; rather, the thread was
> interrupted.  The xceivers race with bpos shutdown to send the block-received notification
> and, luckily, appear to lose.
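
The "interrupts the xceiver threads but does not join" step above can be illustrated with a
small standalone sketch. This is not DataNode code: the class XceiverShutdownSketch, the
methods interruptAndJoin and demo, and the 5s timeout are all hypothetical names and values
chosen for illustration. It only shows the general pattern of interrupting workers and then
waiting for them to exit before later shutdown stages tear down shared state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in Hadoop.
public class XceiverShutdownSketch {

    /**
     * Interrupts every worker, then waits up to timeoutMs in total for them
     * to exit. Returns the number of workers still alive after the wait.
     */
    static int interruptAndJoin(List<Thread> workers, long timeoutMs)
            throws InterruptedException {
        for (Thread t : workers) {
            t.interrupt();
        }
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        int alive = 0;
        for (Thread t : workers) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            // join(0) would wait forever, so only join while time remains.
            if (remainingMs > 0) {
                t.join(remainingMs);
            }
            if (t.isAlive()) {
                alive++;
            }
        }
        return alive;
    }

    /**
     * Starts four stand-in workers, shuts them down, and returns how many
     * finished their cleanup before shutdown was allowed to continue
     * (or -1 if any worker was still running after the timeout).
     */
    static int demo() {
        AtomicInteger cleanedUp = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(60_000); // stands in for a blocked xceiver
                } catch (InterruptedException e) {
                    cleanedUp.incrementAndGet(); // stands in for block cleanup
                }
            });
            t.setDaemon(true);
            t.start();
            workers.add(t);
        }
        try {
            int stillAlive = interruptAndJoin(workers, 5_000);
            // Only at this point would later stages (assumed here to be the
            // IPC server and bpos services) be stopped.
            return stillAlive == 0 ? cleanedUp.get() : -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("shutdown itself was interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("workers cleaned up before shutdown continued: " + demo());
    }
}
```

Joining with a bounded timeout is one way to keep a stuck worker from hanging shutdown
indefinitely while still letting well-behaved workers finish their cleanup first. Whether
that is the right fix for this issue is not stated in the report.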



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


