hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Regarding data storage in HBase
Date Thu, 19 Jan 2012 17:14:02 GMT
On Thu, Jan 19, 2012 at 3:34 AM, Praveen Sripati wrote:

> 1. When the memstore fills, is it flushed to HDFS or local file system?

> 2. If the region size (hbase.hregion.max.filesize) is set to 200MB and the
> HDFS Block Size is set to 64MB, will the region be split across 4 data
> nodes? I know that it doesn't make sense to split a single region's data
> across nodes in HDFS, but how is it handled in HBase?
You mean file in the above rather than region?

If so, yes, the file will be made of multiple HDFS blocks. The blocks will
be replicated. Usually one replica is on the datanode local to the
regionserver. See the reference guide for more on HBase locality.
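
As a rough illustration of the block arithmetic in the question (a sketch only, not HBase or HDFS code; the helper name `hdfs_block_sizes` is made up for this example): a file that grows to 200MB with a 64MB HDFS block size occupies four blocks, three full and one partial. Which datanodes hold the replicas of each block is decided by HDFS and is not modelled here.

```python
MB = 1024 * 1024


def hdfs_block_sizes(file_size, block_size):
    """Return the sizes (in bytes) of the blocks a file would occupy.

    Hypothetical helper for illustration; HDFS performs this split itself.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes.
        sizes.append(remainder)
    return sizes


sizes = hdfs_block_sizes(200 * MB, 64 * MB)
print(len(sizes), [s // MB for s in sizes])  # 4 [64, 64, 64, 8]
```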

> 3. Is region size (hbase.hregion.max.filesize) the size of commit log or
> the size of the file that has been flushed?
It's about the files under a region. WALs/logs have their own configs.

> 4. The commit log might become big over time, is there a similar concept of
> checkpoint in HBase for the commit logs?
WALs are rolled at a configurable size, usually 64MB. WALs whose edits have
all been flushed to hfiles are let go (deleted).
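
To make the "let go once everything is flushed" rule concrete, here is a simplified model (an assumption-laden sketch, not HBase's actual implementation; the function name, the dict layout, and the sequence ids are all invented for illustration). A rolled WAL is treated as deletable once, for every region with edits in it, the highest flushed sequence id is at least the highest sequence id that region wrote to that WAL.

```python
def deletable_wals(wals, flushed_seq_ids):
    """Return the names of WALs whose edits have all been flushed.

    wals: WAL name -> {region: highest sequence id of that region's edits in the WAL}
    flushed_seq_ids: region -> highest sequence id already flushed to hfiles
    Both structures are hypothetical stand-ins for illustration.
    """
    result = []
    for name, max_seq_by_region in wals.items():
        # A region with no recorded flush (-1) has flushed nothing yet.
        if all(
            flushed_seq_ids.get(region, -1) >= seq
            for region, seq in max_seq_by_region.items()
        ):
            result.append(name)
    return result


wals = {
    "wal.1": {"regionA": 10, "regionB": 12},
    "wal.2": {"regionA": 25, "regionB": 20},
}
flushed = {"regionA": 30, "regionB": 15}
print(deletable_wals(wals, flushed))  # ['wal.1']
```

In this model wal.2 has to stay because regionB still has unflushed edits in it (20 > 15); once regionB flushes past 20, wal.2 becomes deletable too.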

