hbase-user mailing list archives

From "Pamecha, Abhishek" <apame...@x.com>
Subject limit on number of blocks per HFile and files per region
Date Thu, 23 Aug 2012 23:21:15 GMT
I have a few questions about blocks per HFile and HFiles per region.

1. Can there be multiple row keys per block, and per HFile? Or is a block or HFile
dedicated to a single row key?

I have a scenario where, for the same column family, some rowkeys will have very wide rows,
say rowkey W, and some rowkeys will have very narrow rows, say rowkey N. In my case, puts
for rowkeys W and N are interleaved at a ratio of, say, 90 rowkey W puts to 10 rowkey N puts.
On the get side, my app fetches data for a single rowkey at a time.

Does that mean that, for rowkey N, the entries will be scattered across regions on the same region
server, given the interleaved puts? Or is there a way I can enforce contiguous writes
to a region/HFile reserved for rowkey N? That way, I could leverage the block cache and have
all or most of rowkey N fit in there for that session.

2. Is there a limit on the number of HFiles that can exist per region? Basically, on what
criteria does a rowkey's data get split into two regions [on the same region server]? I am assuming
there can be many regions per region server, and that multiple regions of the same table can reside
on the same region server.

3. Also, is there a limit on the number of blocks that are created per HFile? What determines
whether a split is required?

