hbase-user mailing list archives

From Asaf Mesika <asaf.mes...@gmail.com>
Subject Re: OutOfMemoryError in MapReduce Job
Date Sat, 02 Nov 2013 16:27:43 GMT
I mean, if you take all those bytes of the bit set and zip them, wouldn't
you reduce it significantly? Less traffic on the wire, less memory in HBase,
etc.
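
Something like this, as a rough sketch (untested, plain java.util.zip; the
class and method names are just illustrative). Since your 10k columns are
hashed into 10^9 bits, the bit set is very sparse and should deflate well:

import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

public final class BitSetCodec {

    // Deflate the raw bit set bytes before putting them into HBase.
    // A sparse bit set is mostly zero bytes, so DEFLATE shrinks it
    // considerably even at the fastest compression level.
    public static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream(raw.length / 4);
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }
}

Then in the mapper you would write
BitSetCodec.compress(toByteArray(bitvector)) instead of the raw bytes, and
on the read side run the stored value through java.util.zip.Inflater before
rebuilding the BitSet.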

On Saturday, November 2, 2013, John wrote:

> I already use LZO compression in HBase. Or do you mean a compressed Java
> object? Do you know an implementation?
>
> kind regards
>
>
> 2013/11/2 Asaf Mesika <asaf.mesika@gmail.com>
>
> > I would try to compress this bit set.
> >
> > On Nov 2, 2013, at 2:43 PM, John <johnnyenglish739@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > thanks for your answer! I increased the "Map Task Maximum Heap Size" to
> > > 2 GB and it seems to work; the OutOfMemoryError is gone. But the HBase
> > > RegionServers are now crashing all the time :-/ I try to store the bit
> > > vector (120 MB in size) for some rows. This seems to be very memory
> > > intensive; the usedHeapMB increases very fast (up to 2 GB). I'm not sure
> > > whether it is the reading or the writing task that causes this, but I
> > > think it's the writing task. Any idea how to minimize the memory usage?
> > > My mapper looks like this:
> > > public class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {
> > >
> > >     private void storeBitvectorToHBase(Context context, byte[] name,
> > >             byte[] cf, BitSet bitvector)
> > >             throws IOException, InterruptedException {
> > >         Put row = new Put(name);
> > >         row.setWriteToWAL(false);
> > >         row.add(cf, Bytes.toBytes("columname"),
> > >                 toByteArray(bitvector));
> > >         ImmutableBytesWritable key = new ImmutableBytesWritable(name);
> > >         context.write(key, row);
> > >     }
> > > }
> > >
> > >
> > > kind regards
> > >
> > >
> > > 2013/11/1 Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> > >
> > >> Hi John,
> > >>
> > >> You might be better off asking this on the CDH mailing list, since it's
> > >> more related to Cloudera Manager than HBase.
> > >>
> > >> In the meantime, can you try to update the "Map Task Maximum Heap Size"
> > >> parameter too?
> > >>
> > >> JM
> > >>
> > >>
> > >> 2013/11/1 John <johnnyenglish739@gmail.com>
> > >>
> > >>> Hi,
> > >>>
> > >>> I have a problem with the memory. My use case is the following: I've
> > >>> created a MapReduce job and iterate over every row in it. If a row
> > >>> has more than, for example, 10k columns, I create a bloom filter (a
> > >>> BitSet) for that row and store it in the HBase structure. This worked
> > >>> fine so far.
> > >>>
> > >>> BUT, now I try to store a BitSet with 1000000000 elements = ~120 MB in
> > >>> size. In every map() function there exist 2 BitSets. If I try to
> > >>> execute the MR job I get this error: http://pastebin.com/DxFYNuBG
> > >>>
> > >>> Obviously, the TaskTracker does not have enough memory. I tried to
> > >>> adjust the memory configuration, but I'm not sure which parameter is
> > >>> the right one. I tried to change the "MapReduce Child Java Maximum
> > >>> Heap Size" value from 1 GB to 2 GB, but still got the same error.
> > >>>
> > >>> Which parameters do I have to adjust? BTW, I'm using CDH 4.4.0 with
> > >>> Cloudera Manager.
> > >>>
> > >>> kind regards
> > >>>
> > >>
> >
> >
>
