hbase-user mailing list archives

From Wayne <wav...@gmail.com>
Subject Re: mslab enabled jvm crash
Date Mon, 06 Jun 2011 17:06:02 GMT
I had a 25 sec CMF (concurrent mode failure) this morning... it looks like
bulk inserts are required, along with possibly weekly/daily scheduled rolling
restarts. Do most production clusters run rolling restarts on a regular basis
to give the JVM a fresh start?

Thanks.
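For the scheduled rolling restarts asked about above, a minimal dry-run sketch; the hostnames and restart command are illustrative assumptions, and echo stands in for ssh so nothing is actually executed:

```shell
# Hypothetical dry-run of a rolling region-server restart; rs1..rs3 and
# the hbase-daemon.sh command are assumptions, not a cluster's real config.
restart_cmd='hbase-daemon.sh restart regionserver'
plan=""
for rs in rs1 rs2 rs3; do
  plan="$plan ssh $rs $restart_cmd;"   # one graceful restart per node
done
echo "$plan"
```

In practice, recent HBase releases ship a bin/rolling-restart.sh helper that restarts the master and region servers in sequence, which may be a better fit for a cron-driven schedule than a hand-rolled loop.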

On Thu, Jun 2, 2011 at 1:56 PM, Wayne <wav100@gmail.com> wrote:

> JVM w/ 10g heap, settings below. Once we are "bored" with stability we will
> try to raise the 65 to 70, which seems to be standard.
>
> -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=65
> -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:NewSize=128m
> -XX:MaxNewSize=128m -XX:+UseParNewGC
>
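A sketch of how flags like these would typically be set in conf/hbase-env.sh; the file path is an illustrative assumption, and the truncated -XX:+UseParNewG in the mail is assumed to be -XX:+UseParNewGC:

```shell
# Hypothetical conf/hbase-env.sh fragment mirroring the flags and 10g heap
# described in this thread; treat it as a sketch, not Wayne's actual file.
export HBASE_OPTS="-Xms10g -Xmx10g \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:+CMSParallelRemarkEnabled \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -XX:CMSInitiatingOccupancyFraction=65 \
  -XX:NewSize=128m -XX:MaxNewSize=128m"
echo "$HBASE_OPTS" | grep -c 'UseParNewGC'   # -> 1
```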
> Our memstore settings are the defaults; the lower/upper limits are .35/.4.
>
> We do not currently use block cache. That may change in the future...
>
> Our table block size is 256k to keep the store file index size down. The
> index was averaging ~2.75G+ per node and has now gone down to ~1.5G, which
> should go even lower as regions are compacted.
>
> We have enabled the memstore MSLAB option. Not sure it is relevant but we
> have a 5G region size and 256m memstore flush size.
>
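The MSLAB, region-size, and flush-size settings just described map onto standard hbase-site.xml properties. A hedged sketch, writing the fragment from shell so it can be checked; the property names are the stock HBase keys, but this is an illustration, not Wayne's actual configuration file:

```shell
# Hypothetical hbase-site.xml fragment matching the thread's settings:
# MSLAB enabled, 5G max region size, 256m memstore flush size.
cat > /tmp/hbase-site-fragment.xml <<'EOF'
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>5368709120</value>   <!-- 5G region size -->
</property>
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>268435456</value>    <!-- 256m flush size -->
</property>
EOF
grep -c '<property>' /tmp/hbase-site-fragment.xml   # -> 3
```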
> Thanks.
>
>
>
> On Thu, Jun 2, 2011 at 11:48 AM, Stack <stack@duboce.net> wrote:
>
>> Thanks for writing back to the list Wayne.  Hopefully this message
>> hits you before the next CMF does.  Would you mind pasting your final
>> JVM args and any other configs you think one of us could use writing
>> up your war story for the 'book' as per Jeff Whiting's suggestion?
>>
>> Good stuff,
>> St.Ack
>>
>>
>> On Thu, Jun 2, 2011 at 8:09 AM, Wayne <wav100@gmail.com> wrote:
>> > I have finally been able to spend enough time to digest/test all
>> > recommendations and get this under control. I wanted to thank Stack,
>> > Jack Levin, and Ted Dunning for their input.
>> >
>> > Basically our memory was being pushed to the limit, and the JVM does
>> > not like / cannot handle this. We are successfully using Todd's MSLAB,
>> > enabled on u25, and have set parnew to 128m, increased the heap to
>> > 10g, and increased our block size 4x to 256k to reduce the size of the
>> > store file index (thanks Jack for pointing this out). The combination
>> > of all of these changes significantly reduces the pressure on memory,
>> > and now we are getting more throughput (40k writes/sec/node) sustained
>> > with no CMF errors (I expect to see one 5 min after hitting send on
>> > this email...).
>> >
>> > We are also going to move to bulk inserts where we can, which should
>> > help as well. The decreased performance of reads due to the block size
>> > increase is going to be the next challenge. Thanks everyone for your
>> > help, I am a believer again.
>> >
>> >
>> > On Fri, May 27, 2011 at 12:17 AM, Erik Onnen <eonnen@gmail.com> wrote:
>> >
>> >> On Thu, May 26, 2011 at 11:01 AM, Stack <stack@duboce.net> wrote:
>> >> > What JVM configs are you running Erik?
>> >> > St.Ack
>> >>
>> >> Omitting some of the irrelevant ones...
>> >>
>> >> JAVA_OPTS="-XX:+UseLargePages -Xms8192M -Xmx8192M
>> >> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC
>> >> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>> >> -Xloggc:/mnt/services/hbaseregion/var/log/hbaseregion-gc.log
>> >> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
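The -Xloggc setting above produces a CMS GC log; a quick sketch of scanning such a log for the concurrent mode failures discussed in this thread. The log path and the two sample lines below are fabricated for illustration only:

```shell
# Illustrative sketch: write two made-up CMS GC log lines, then count
# "concurrent mode failure" events the way you would against the real
# hbaseregion-gc.log written by -Xloggc above.
printf '%s\n' \
  '1234.567: [GC 1234.567: [ParNew: 104960K->8512K(118016K), 0.0456 secs]' \
  '2345.678: [Full GC 2345.678: [CMS (concurrent mode failure): ...]' \
  > /tmp/sample-gc.log
grep -c 'concurrent mode failure' /tmp/sample-gc.log   # -> 1
```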
>> >>
>> >> For our Cassandra servers we also explicitly pin the new gen tenuring
>> >> and occupancy thresholds but that hasn't been necessary for the HBase
>> >> workloads so far:
>> >>
>> >> "-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
>> >> -XX:+UseCMSInitiatingOccupancyOnly"
>> >>
>> >
>>
>
>
