hbase-user mailing list archives

From fnord 99 <fnord...@googlemail.com>
Subject LazyFetching of Row Results in MapReduce
Date Mon, 22 Nov 2010 10:01:40 GMT
Hi all,

I recently filled an HBase table with many millions of columns in each row
(!). The problem that has now occurred is that I always get a heap space error
from the JVM, followed by a shutdown of every regionserver on which the
error occurs. Since the error isn't thrown in any of my own classes, I think
the problem is the following:

* a row is always completely read into memory upon access (at least all
column families that I'm interested in)
* the Result object holds the complete family-qualifier-value pairs in a
KeyValue[]
* this is sometimes more than the physical memory available to each map
task can hold, so a heap space error is thrown

My question now is: is there any lazy fetching technique implemented for
the individual key-values within one row? In my opinion there should be, but I
couldn't find anything in the source code or wiki that hints at it.
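
To make "lazy fetching" concrete, here is a minimal sketch of the behaviour being asked for: a wide row's cells are handed out in bounded chunks instead of one complete array. It is plain Java with no HBase classes; ChunkedRow and batchSize are hypothetical names used only for illustration, not an existing HBase API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper (not part of HBase): yields the cells of one wide row
// in chunks of at most batchSize, so only one chunk is held at a time.
final class ChunkedRow<T> implements Iterator<List<T>> {
    private final Iterator<T> cells;
    private final int batchSize;

    ChunkedRow(Iterator<T> cells, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.cells = cells;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        return cells.hasNext();
    }

    @Override
    public List<T> next() {
        if (!cells.hasNext()) {
            throw new NoSuchElementException();
        }
        // Pull at most batchSize cells from the underlying source.
        List<T> chunk = new ArrayList<>(batchSize);
        while (chunk.size() < batchSize && cells.hasNext()) {
            chunk.add(cells.next());
        }
        return chunk;
    }
}
```

With such an interface a map task would process one chunk, discard it, and ask for the next, so memory use is bounded by batchSize rather than by the row width. This only helps if the underlying source is itself streamed rather than fully materialized first.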

Any ideas on how to work around this problem? I had the idea of rebuilding the
table schema to store more data in the row key and less data in the column
families, which would make the tables "thinner" and "longer". That would work
in the current setup; however, it wouldn't solve the original problem...
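
The "thinner and longer" idea can be sketched as a composite row key: the old row key and the old column qualifier are joined into one new row key, so each former column becomes its own narrow row. This is only an illustration in plain Java. TallKey, the 0x00 separator, and the assumption that the old row keys never contain a 0x00 byte are all hypothetical choices, not taken from my actual schema.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical key layout: <old row key> 0x00 <old qualifier>.
// Assumes old row keys never contain the 0x00 separator byte.
final class TallKey {
    private static final byte SEP = 0x00;

    // Builds the composite row key for one former column.
    static byte[] encode(String row, String qualifier) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        byte[] key = Arrays.copyOf(r, r.length + 1 + q.length);
        key[r.length] = SEP;
        System.arraycopy(q, 0, key, r.length + 1, q.length);
        return key;
    }

    // Inclusive start of the key range that covers one former row.
    static byte[] startKey(String row) {
        return encode(row, "");
    }

    // Exclusive end of that range: same prefix, separator byte plus one.
    static byte[] stopKey(String row) {
        byte[] key = encode(row, "");
        key[key.length - 1] = (byte) (SEP + 1);
        return key;
    }
}
```

Under byte-wise lexicographic ordering of row keys, everything that used to live in one wide row falls between startKey(row) and stopKey(row), so it could be read with an ordinary range scan, one small row at a time.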

Thanks already in advance for any input on that,

fnord999
