hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Question regarding data location in hdfs after hbase restarts
Date Tue, 12 Oct 2010 18:26:36 GMT
When you write to HDFS, you write N replicas.  By default, the first
replica is written to the local datanode.  When reading, the DFSClient
will try to read from the most local replica first.
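
A toy model of the placement and read rules above might look like the
following. It is only a sketch and does not call into HDFS:
`place_replicas` and `pick_replica` are hypothetical helper names, and
the remaining replicas are picked at random here, whereas real HDFS
placement also takes racks into account.

```python
import random


def place_replicas(writer, datanodes, n, rng):
    """First replica goes on the writer's own datanode, the rest elsewhere.

    Assumption: the other n - 1 replicas are chosen uniformly at random.
    """
    others = [d for d in datanodes if d != writer]
    return [writer] + rng.sample(others, n - 1)


def pick_replica(reader, replicas):
    """Prefer a replica on the reader's own node, else fall back to the first."""
    return reader if reader in replicas else replicas[0]


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = ["dn1", "dn2", "dn3", "dn4"]
    replicas = place_replicas("dn2", nodes, 3, rng)
    print(replicas[0])                    # dn2: first replica is local to the writer
    print(pick_replica("dn2", replicas))  # dn2: the reader on dn2 reads locally
```

With replica=1 the model keeps only the writer's copy, so any reader on
another node has to go remote.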

Compactions read from multiple files and write out a single merged
file.  The newly written file's blocks will all be on the local
datanode, barring some anomaly.  That is how a region that was moved
to another regionserver regains locality over time.
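
To make the effect concrete, here is a small simulation of a region
whose files were written on other nodes and are then compacted by its
current regionserver. It is a sketch under stated assumptions, not
HBase code: `locality` and `compact` are hypothetical names, the
replication factor is 1, and the merged file is assumed to have as many
blocks as the inputs combined.

```python
def locality(region_server, store_files):
    """Fraction of blocks that have a replica on the region server's node.

    store_files is a list of files; each file is a list of blocks; each
    block is the list of datanodes holding a replica of it.
    """
    blocks = [block for f in store_files for block in f]
    if not blocks:
        return 1.0
    local = sum(1 for replicas in blocks if region_server in replicas)
    return local / len(blocks)


def compact(region_server, store_files):
    """Merge all store files into one file written by region_server.

    Assumption: replication factor 1, so each new block has exactly one
    replica, and it lands on the writer's own node.
    """
    n_blocks = sum(len(f) for f in store_files)
    return [[[region_server] for _ in range(n_blocks)]]


if __name__ == "__main__":
    # The region is now served from dn1, but most blocks were written elsewhere.
    files = [[["dn2"], ["dn2"]], [["dn3"], ["dn1"]]]
    print(locality("dn1", files))                  # 0.25
    print(locality("dn1", compact("dn1", files)))  # 1.0
```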


On Tue, Oct 12, 2010 at 11:58 AM, Jack Levin <magnito@gmail.com> wrote:
> Ryan, can you elaborate how compactions create data locality?
> -Jack
> On Oct 11, 2010, at 10:12 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>> We don't attempt to optimize region placement with hdfs locations yet. One
>> reason is that on a long-lived cluster compactions create the
>> locality you are looking for. Furthermore, in the old master such an
>> optimization was really hard to do. The new master should make it easier to
>> write such one-off hacks.
>> On Oct 11, 2010 9:43 PM, "Tao Xie" <xietao.mailbox@gmail.com> wrote:
>>> hi, all
>>> I set hdfs replica=1 when running hbase, and DN and RS co-exist on each
>>> slave node. So the data in the regions managed by an RS will be stored on its
>>> local data node, right?
>>> But when I restart hbase and the hbase client does gets on an RS, the datanode
>>> reads data from remote data nodes. Does that mean that when the RS restarts, the
>>> regions are re-arranged? If so, is hbase clever enough to re-adjust the
>>> regions? I'm not clear about the underlying mechanism, so can anyone give me
>>> an explanation? Thanks.
