hbase-dev mailing list archives

From Elliott Clark <ecl...@apache.org>
Subject HBASE-14978
Date Tue, 15 Dec 2015 04:43:34 GMT
Does anyone have any thoughts on HBASE-14978?

On a read-heavy workload we're seeing some serious GC issues, so I put in
a limit on the number of bytes of cells that can be returned. However, for
gets that take a small amount of data from a lot of different blocks, this
might not be enough. If every key value holds onto one block, then it's
pretty easy to OOME the heap without too large a list of gets.

The obvious answer is that our RPC is awful and should stream things into
the network layer asap. That is a longer project and something that will
probably have to be put off until 3.X.

So I propose using the size of the blocks that are being held onto as a
limit. That meshes nicely with the scan limit. The hard part, however, is
that we have no way of knowing who is referring to a block. If we go exact
and keep a set of blocks for every request, then we have an O(log n)
operation in a very hot code path. Instead I went with a heuristic that
checks whether this is likely the first time the block has been referenced
by a response. This is O(1), but it could over-count the memory a set of
gets is using. Over-counting the memory used would lead to extra round
trips; under-counting results in GC issues. So I think the trade-off is
probably worth it.
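
To make the trade-off concrete, here is a minimal sketch of one way such an
O(1) heuristic could look. It is an illustration only, not the patch attached
to HBASE-14978: the class names CellRef and ResponseSizeTracker are
hypothetical, and "the block differs from the previous cell's block" is an
assumed stand-in for "likely the first reference from this response".

```java
/** Hypothetical stand-in for a cell that pins the block it was read from. */
final class CellRef {
    final byte[] block;  // backing block kept alive while the cell is referenced
    final int cellSize;  // bytes of cell data actually returned to the client

    CellRef(byte[] block, int cellSize) {
        this.block = block;
        this.cellSize = cellSize;
    }
}

/** Per-request accounting of cell bytes and approximate retained block bytes. */
final class ResponseSizeTracker {
    private final long maxBlockBytes;
    private long cellBytes;
    private long blockBytes;
    private byte[] lastBlock;  // the only state kept: O(1) time and space per cell

    ResponseSizeTracker(long maxBlockBytes) {
        this.maxBlockBytes = maxBlockBytes;
    }

    void add(CellRef cell) {
        cellBytes += cell.cellSize;
        // Assumed heuristic: charge the block only when it differs (by identity)
        // from the block of the previous cell. Consecutive cells from the same
        // block are charged once; a block revisited later is charged again,
        // which over-counts but never under-counts.
        if (cell.block != lastBlock) {
            blockBytes += cell.block.length;
            lastBlock = cell.block;
        }
    }

    /** True once the estimated retained block bytes reach the limit. */
    boolean limitReached() {
        return blockBytes >= maxBlockBytes;
    }

    long cellBytes() {
        return cellBytes;
    }

    long blockBytes() {
        return blockBytes;
    }
}
```

In this sketch a request would stop adding results once limitReached()
returns true and hand back what it has, leaving the client to fetch the rest
in another round trip. An exact variant would replace lastBlock with a
per-request set of blocks, which is the extra hot-path cost described above.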

Thoughts?
