hadoop-hdfs-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: Controlling the block placement and the file placement in HDFS writes
Date Thu, 18 Dec 2014 23:49:50 GMT
HBase would benefit from similar functionality. In our case, we'd like all
replicas for all files in a given HDFS path to land on the same set of
machines. That way, in the event of a failover, regions can be assigned to
one of these other machines that has local access to all blocks for all
region files.
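
To make the idea concrete, here is a minimal sketch of the kind of lookup this would need: a map from an HDFS path prefix to a fixed set of DataNodes. Everything in it is hypothetical (the class name PathAffinityMap, the methods, the hostnames); it is not an existing HDFS or HBase API, and it does not show how a BlockPlacementPolicy would consult it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not part of HDFS or HBase. */
final class PathAffinityMap {
    // Path prefix -> DataNode hostnames that should hold every replica under it.
    private final Map<String, List<String>> prefixToNodes = new HashMap<>();

    /** Pins everything under pathPrefix to the given set of DataNodes. */
    void pin(String pathPrefix, List<String> dataNodes) {
        prefixToNodes.put(normalize(pathPrefix), List.copyOf(dataNodes));
    }

    /** Returns the nodes pinned to the longest matching ancestor of filePath, if any. */
    Optional<List<String>> preferredNodes(String filePath) {
        String path = normalize(filePath);
        while (true) {
            List<String> nodes = prefixToNodes.get(path);
            if (nodes != null) {
                return Optional.of(nodes);
            }
            int slash = path.lastIndexOf('/');
            if (slash < 0 || path.equals("/")) {
                return Optional.empty(); // walked past the root without a match
            }
            // Move up one path component; "/a" becomes "/".
            path = (slash == 0) ? "/" : path.substring(0, slash);
        }
    }

    // Drops trailing slashes so "/hbase/data/" and "/hbase/data" are the same key.
    private static String normalize(String p) {
        int end = p.length();
        while (end > 1 && p.charAt(end - 1) == '/') {
            end--;
        }
        return p.substring(0, end);
    }
}
```

A placement policy could consult preferredNodes() when choosing targets and fall back to its default behaviour when the result is empty or the pinned nodes are out of space.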

On Thu, Dec 18, 2014 at 3:36 PM, Zhe Zhang <zhe.zhang.research@gmail.com> wrote:
> > The second aspect is that our queries are time based and this time window
> > follows a familiar pattern of old data not being queried much. Hence we
> > would like to preserve the most recent data in the HDFS cache (Impala is
> > helping us manage this aspect via their command set) but we would like the
> > next recent amount of data chunks to land on an SSD that is present on
> > every datanode. The remaining set of blocks which are "very old but in
> > large quantities" would land on spinning disks. The decision to choose a
> > given volume is based on the file name as we can control the filename that
> > is being used to generate the file.
> >
> Have you tried the 'setStoragePolicy' command? It's part of the HDFS
> "Heterogeneous Storage Tiers" work and seems to address your scenario.
> > 1. Is there a way to control that all file blocks belonging to a particular
> > hdfs directory & file go to the same physical datanode (and their
> > corresponding replicas as well)?
> This seems inherently hard: the file/dir could have more data than a
> single DataNode can host. Implementation-wise, it requires some sort
> of a map in BlockPlacementPolicy from inode or file path to DataNode
> address.
> My 2 cents..
> --
> Zhe Zhang
> Software Engineer, Cloudera
> https://sites.google.com/site/zhezhangresearch/
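
On the time-window tiering in the quoted question: since the tier is meant to be derived from the file name, the decision itself can be sketched as a small pure function. This is only an illustration under assumptions that are mine, not the original poster's: the file name embeds an ISO date (yyyy-MM-dd), and the 7-day and 30-day cut-offs are made up. The class and enum names are hypothetical, and translating the resulting tier into a cache directive or a storage policy is left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: picks a storage tier from a date embedded in the file name. */
final class TierByFileName {
    enum Tier { CACHE, SSD, DISK }

    // Assumed naming convention: the file name contains a yyyy-MM-dd date.
    private static final Pattern DATE = Pattern.compile("(\\d{4}-\\d{2}-\\d{2})");

    // Made-up thresholds; the thread does not say how wide each window is.
    static final long CACHE_DAYS = 7;
    static final long SSD_DAYS = 30;

    static Tier tierFor(String fileName, LocalDate today) {
        Matcher m = DATE.matcher(fileName);
        if (!m.find()) {
            return Tier.DISK; // no date in the name: treat as old data
        }
        LocalDate fileDate;
        try {
            fileDate = LocalDate.parse(m.group(1));
        } catch (DateTimeParseException e) {
            return Tier.DISK; // looks like a date but is not valid, e.g. month 13
        }
        long ageDays = ChronoUnit.DAYS.between(fileDate, today);
        if (ageDays <= CACHE_DAYS) {
            return Tier.CACHE; // most recent data: keep in the HDFS cache
        }
        if (ageDays <= SSD_DAYS) {
            return Tier.SSD;   // next most recent: SSD volume
        }
        return Tier.DISK;      // very old, large volume: spinning disks
    }
}
```

Because the tier changes as files age, something would have to re-evaluate existing files periodically and move them; the sketch only covers the decision, not the migration.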
