hadoop-hdfs-dev mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-9122) DN automatically add more volumes to avoid large volume
Date Tue, 22 Sep 2015 13:42:04 GMT
Walter Su created HDFS-9122:

             Summary: DN automatically add more volumes to avoid large volume
                 Key: HDFS-9122
                 URL: https://issues.apache.org/jira/browse/HDFS-9122
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Walter Su

Currently, if a DataNode has too many blocks, it partitions its block report by storage. In practice,
we've seen a single storage contain so many blocks that its report alone exceeds
the max RPC data length. Storage density is increasing quickly, so a DataNode can hold more and more
blocks, and it gets harder to fit them all into one RPC report. One option is "Support splitting
BlockReport of a storage into multiple RPC" (HDFS-9011).

I'm thinking we could instead add more "logical" volumes (more storage directories on one device).
A DatanodeStorageInfo in the NameNode is cheap. Also, processing a single block report requires the NN
to hold the lock, so splitting one big volume into many smaller volumes keeps any single report from
holding the lock too long.

We can support a wildcard in dfs.datanode.data.dir, like /physical-volume/dfs/data/dir*
When a volume exceeds a threshold (like 1m blocks), the DN automatically creates a new storage directory,
which is also a new volume. We have to change RoundRobinVolumeChoosingPolicy as well: once we have chosen a
physical volume, we choose the logical volume on it that has the fewest blocks.
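
A rough sketch of the selection logic described above, only to make the idea concrete. All class and method names here (LogicalVolumeSketch, PhysicalVolume, LogicalVolume, Chooser) are hypothetical and are not existing Hadoop APIs; the dir0, dir1, ... naming is just one assumed way to match the dir* wildcard, and the sketch creates a new directory once every logical volume on the device has reached the threshold.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: none of these types exist in Hadoop. */
final class LogicalVolumeSketch {

    /** One storage directory ("logical volume") and its block count. */
    static final class LogicalVolume {
        final String path;
        long blockCount;

        LogicalVolume(String path) {
            this.path = path;
        }
    }

    /** One device hosting one or more logical volumes. */
    static final class PhysicalVolume {
        final String baseDir;
        final long maxBlocksPerVolume; // assumed threshold, e.g. 1m blocks
        final List<LogicalVolume> logical = new ArrayList<>();

        PhysicalVolume(String baseDir, long maxBlocksPerVolume) {
            this.baseDir = baseDir;
            this.maxBlocksPerVolume = maxBlocksPerVolume;
            addLogicalVolume(); // every device starts with dir0
        }

        private LogicalVolume addLogicalVolume() {
            // Assumed naming scheme: baseDir/dir0, baseDir/dir1, ...
            LogicalVolume v = new LogicalVolume(baseDir + "/dir" + logical.size());
            logical.add(v);
            return v;
        }

        /** Pick the logical volume with the fewest blocks; grow if all are full. */
        LogicalVolume chooseLogicalVolume() {
            LogicalVolume least = logical.stream()
                .min(Comparator.comparingLong((LogicalVolume v) -> v.blockCount))
                .get(); // safe: the list is never empty
            return least.blockCount >= maxBlocksPerVolume ? addLogicalVolume() : least;
        }
    }

    /** Round-robin over physical volumes, then least-blocks within the device. */
    static final class Chooser {
        private final List<PhysicalVolume> physical;
        private int next = 0;

        Chooser(List<PhysicalVolume> physical) {
            this.physical = physical;
        }

        LogicalVolume chooseVolumeForNewBlock() {
            PhysicalVolume p = physical.get(next);
            next = (next + 1) % physical.size();
            LogicalVolume v = p.chooseLogicalVolume();
            v.blockCount++; // account for the block about to be written
            return v;
        }
    }
}
```

With two devices and a threshold of 2, the first four blocks alternate between the two dir0 directories, and the fifth block triggers creation of dir1 on the first device. A real change would also need thread safety and would count blocks from the actual replica map rather than a local counter.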

This message was sent by Atlassian JIRA
