hadoop-hdfs-issues mailing list archives

From "Lei (Eddy) Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6877) Avoid calling checkDisk and stop active BlockReceiver thread when an HDFS volume is removed during a write.
Date Wed, 24 Sep 2014 03:26:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145827#comment-14145827
] 

Lei (Eddy) Xu commented on HDFS-6877:
-------------------------------------

[~cmccabe] Thank you very much for your quick reviews. 

I was anticipating the case where an application writes a block very slowly. For instance,
a light-traffic service opens a log file and appends only 10KB of logs each time it wakes
up, every 30 seconds. Or an application dumps intermediate results (e.g., 1MB) every 10
minutes. In such cases, the BlockReceiver thread can live for a long time, until the block
is full and is then closed. But from the user's point of view, once
{{hdfs dfsadmin -reconfig status}} indicates that the reconfiguration task has finished, the
user might expect to be able to physically remove the disks at any time. Since the
BlockReceiver thread stays alive for an indeterminate time and holds an open file descriptor
on the disk, it might be difficult to explain to the user why the disk cannot be unmounted
with {{umount}} and how long they should wait.

{code}
    for (ReplicaInPipeline pip : activeWriters) {
      try {
        LOG.info("Stopping active writer.");
        pip.stopWriter(datanode.getDnConf().getXceiverStopTimeout());
      } catch (IOException e) {
        LOG.warn("IOException when stopping active writer.", e);
      }
    }
{code}

The reason I moved the above out of the synchronized section is:
#  {{stopWriter()}} calls {{interrupt()}} and {{join()}}, which is slow, so I tried to keep
the synchronized section in {{removeVolumes()}} short.
#.
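
To illustrate the locking pattern described above, here is a minimal, self-contained sketch. It is not the code from the patch: {{VolumeRemovalSketch}}, {{addWriter()}}, {{removeVolume()}} and {{startSlowWriter()}} are hypothetical names, and plain threads stand in for BlockReceiver threads. The only part taken from the discussion is the idea itself: snapshot the active writers while holding the lock, then do the slow {{interrupt()}} and {{join()}} calls after releasing it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the DataNode dataset; not the real HDFS classes.
public class VolumeRemovalSketch {
    private final Object lock = new Object();
    private final List<Thread> activeWriters = new ArrayList<>();

    public void addWriter(Thread writer) {
        synchronized (lock) {
            activeWriters.add(writer);
        }
    }

    /**
     * Stops all registered writers and returns how many exited within the
     * timeout. stopTimeoutMs must be > 0, because join(0) waits forever.
     */
    public int removeVolume(long stopTimeoutMs) {
        List<Thread> toStop;
        // Short critical section: only copy and clear the bookkeeping list.
        synchronized (lock) {
            toStop = new ArrayList<>(activeWriters);
            activeWriters.clear();
        }
        // Slow part runs without the lock, so other callers are not blocked.
        int stopped = 0;
        for (Thread writer : toStop) {
            writer.interrupt();
            try {
                writer.join(stopTimeoutMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
            if (!writer.isAlive()) {
                stopped++;
            }
        }
        return stopped;
    }

    // Simulates a slow writer that wakes up every 30 seconds, indefinitely.
    public static Thread startSlowWriter() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(30_000);
                }
            } catch (InterruptedException e) {
                // Interrupted by removeVolume(): fall through and exit.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        VolumeRemovalSketch dataset = new VolumeRemovalSketch();
        dataset.addWriter(startSlowWriter());
        dataset.addWriter(startSlowWriter());
        System.out.println("writers stopped: " + dataset.removeVolume(1000));
    }
}
```

The trade-off in this sketch is that a writer may still be running for a short while after the lock is released, so the timeout passed to {{join()}} bounds how long the caller waits for each writer before giving up.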

> Avoid calling checkDisk and stop active BlockReceiver thread when an HDFS volume is removed
during a write.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6877
>                 URL: https://issues.apache.org/jira/browse/HDFS-6877
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-6877.000.consolidate.txt, HDFS-6877.000.delta-HDFS-6727.txt,
HDFS-6877.001.combo.txt, HDFS-6877.001.patch, HDFS-6877.002.patch, HDFS-6877.003.patch, HDFS-6877.004.patch,
HDFS-6877.005.patch
>
>
> Avoid calling checkDisk and stop active BlockReceiver thread when an HDFS volume is removed
during a write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
