hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2379) 0.20: Allow block reports to proceed without holding FSDataset lock
Date Tue, 11 Oct 2011 17:01:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125193#comment-13125193 ]

Suresh Srinivas commented on HDFS-2379:
---------------------------------------

Comments:
# In the new version of the patch, the FSDatasetInterface.java changes are missing, and so is
the asynchronous scan thread change. Please confirm that this is intentional.
#* There are some lines that are more than 80 chars.
# FSDataset.java
#* Why do you want to deprecate #getBlockInfo()? If there is a valid reason, please add
information on the new method/mechanism that should be used instead of the deprecated
method.
#* Add javadoc to scanBlockFilesInconsistent() explaining why it is not synchronized.
#* SANITY_CHECK code can be removed.
#* reconcileInconsistentDiskScan
#** What happens when volumeMap contains a block but the scanned block file does not exist,
or when the scanned block file exists but volumeMap does not contain it?
#** In the end, the scanned block info is made to look the same as the in-memory state. If
so, what is the purpose of the scan?
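
To make the two mismatch cases in the reconcileInconsistentDiskScan item concrete, here is a minimal Java sketch. It is not code from the patch: the class ScanReconcileSketch, its Diff type, and the map from block id to file path are hypothetical stand-ins for FSDataset and volumeMap. It assumes the disk scan has already run without the lock, so only the snapshot of the in-memory map is taken under it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names and structure are assumptions, not the HDFS-2379 patch.
final class ScanReconcileSketch {

    /** Outcome of comparing an unlocked disk scan with the in-memory map. */
    static final class Diff {
        final Set<Long> missingOnDisk;   // in memory, but the scan found no file
        final Set<Long> missingInMemory; // the scan found a file, but memory has no entry

        Diff(Set<Long> missingOnDisk, Set<Long> missingInMemory) {
            this.missingOnDisk = missingOnDisk;
            this.missingInMemory = missingInMemory;
        }
    }

    private final Object lock = new Object();
    // Stand-in for volumeMap: block id -> block file path.
    private final Map<Long, String> volumeMap = new HashMap<>();

    void addBlock(long blockId, String path) {
        synchronized (lock) {
            volumeMap.put(blockId, path);
        }
    }

    /** Only the snapshot of the in-memory keys holds the lock. */
    Diff reconcile(Set<Long> scannedBlockIds) {
        Set<Long> inMemory;
        synchronized (lock) {
            inMemory = new HashSet<>(volumeMap.keySet());
        }
        Set<Long> missingOnDisk = new TreeSet<>(inMemory);
        missingOnDisk.removeAll(scannedBlockIds);
        Set<Long> missingInMemory = new TreeSet<>(scannedBlockIds);
        missingInMemory.removeAll(inMemory);
        return new Diff(missingOnDisk, missingInMemory);
    }

    public static void main(String[] args) {
        ScanReconcileSketch dataset = new ScanReconcileSketch();
        dataset.addBlock(1L, "blk_1");
        dataset.addBlock(2L, "blk_2");
        dataset.addBlock(3L, "blk_3");

        // Pretend the unlocked scan saw files for blocks 2, 3 and 4.
        Set<Long> scanned = new HashSet<>(Arrays.asList(2L, 3L, 4L));
        Diff diff = dataset.reconcile(scanned);
        System.out.println("missing on disk:   " + diff.missingOnDisk);
        System.out.println("missing in memory: " + diff.missingInMemory);
    }
}
```

Because the scan runs without the lock, either set may contain blocks that were simply added or deleted while the scan was in progress. The sketch only reports the differences; how each case should be resolved is the open question above.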

                
> 0.20: Allow block reports to proceed without holding FSDataset lock
> -------------------------------------------------------------------
>
>                 Key: HDFS-2379
>                 URL: https://issues.apache.org/jira/browse/HDFS-2379
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20.206.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-2379.txt, hdfs-2379.txt, hdfs-2379.txt, hdfs-2379.txt
>
>
> As disks are getting larger and more plentiful, we're seeing DNs with multiple millions
of blocks on a single machine. When page cache space is tight, block reports can take multiple
minutes to generate. Currently, during the scanning of the data directories to generate a
report, the FSVolumeSet lock is held. This causes writes and reads to block, time out, etc.,
causing big problems, especially for clients like HBase.
> This JIRA is to explore some of the ideas originally discussed in HADOOP-4584 for the
0.20.20x series.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
