hbase-user mailing list archives

From fnord 99 <fnord...@googlemail.com>
Subject Re: LazyFetching of Row Results in MapReduce
Date Tue, 23 Nov 2010 13:48:18 GMT
Hi,

Our machines have 24 GB of RAM (for 8 cores), and HBase gets 6 GB. The map
tasks each have 768 MB of memory.

Currently we're using CDH3b3.

We'll definitely implement my idea of distributing the rows into multiple
columns similarly to what Friso said.

A comment from somebody who has really wide rows would be interesting,
though.

Thanks,
fnord
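
The row-splitting idea mentioned above could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the schema actually
used in this thread: the class name BucketedRowKey, the "|" separator, the
four-digit bucket suffix and the hash-based bucket choice are all hypothetical.
It shows how one logical wide row can be spread over a fixed number of
narrower physical rows, so that no single row has to fit into a mapper's heap.
The sketch uses plain Java strings and no HBase classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: spreads one logical wide row over several narrower physical rows. */
public final class BucketedRowKey {
    private BucketedRowKey() {}

    /** Builds "logicalKey|NNNN"; assumes '|' never occurs in logicalKey and bucket < 10000. */
    private static String format(String logicalKey, int bucket) {
        return String.format(Locale.ROOT, "%s|%04d", logicalKey, bucket);
    }

    /** Physical row key under which the given column qualifier would be stored. */
    public static String rowKeyFor(String logicalKey, String qualifier, int buckets) {
        // floorMod keeps the bucket index non-negative even for negative hash codes.
        int bucket = Math.floorMod(qualifier.hashCode(), buckets);
        return format(logicalKey, bucket);
    }

    /** All physical row keys that together hold one logical row, in sorted order. */
    public static List<String> allRowKeys(String logicalKey, int buckets) {
        List<String> keys = new ArrayList<>(buckets);
        for (int b = 0; b < buckets; b++) {
            keys.add(format(logicalKey, b));
        }
        return keys;
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        System.out.println(rowKeyFor("user42", "col-a", 16));
        System.out.println(allRowKeys("user42", 4));
    }
}
```

Because the bucket suffix is zero-padded, the physical rows of one logical row
sort next to each other, so a scan over the key range of a logical key would
return them one after another. Each of them holds only a fraction of the
columns.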

2010/11/22 Todd Lipcon <todd@cloudera.com>

> Hi,
>
> Which version are you using?
>
> During the 0.89 development series we got a bunch of new work in trunk
> (mostly thanks to Facebook and TrendMicro) for wide rows. Maybe one of the
> FB guys can comment, but I believe they have some very wide rows in their
> application.
>
> Thanks
> -Todd
>
> On Mon, Nov 22, 2010 at 2:01 AM, fnord 99 <fnord999@googlemail.com> wrote:
>
> > Hi all,
> >
> > I recently filled an HBase table with many millions of columns in each
> > row (!). The problem that has now occurred is that I always get a heap
> > space error from the JVM, followed by a shutdown of all regionservers in
> > which the error occurs. Since the error isn't thrown in any of my own
> > classes, I think that the problem is the following:
> >
> > * a row is always completely read into memory upon access (at least all
> > column families that I'm interested in)
> > * the Result object holds the complete family-qualifier-value pairs in a
> > KeyValue[]
> > * this is sometimes too much to be handled by the physical memory each
> > map can get, therefore a heap space error is thrown
> >
> > My question is now: is there any lazy fetching technique implemented
> > for the single key-values within one row? In my opinion there should be,
> > but I couldn't find anything in the source code or wiki that hints at that.
> >
> > Any ideas on how to work around this problem? I had the idea to rebuild
> > the table schema to store more data in the row key and less data in the
> > column families, which would make the tables "thinner" and "longer". It
> > would work in the current setup; however, it wouldn't solve the original
> > problem...
> >
> > Thanks already in advance for any input on that,
> >
> > fnord999
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
