hbase-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: mslab enabled jvm crash
Date Wed, 25 May 2011 18:13:19 GMT
For your GC settings:
- I wouldn't tune NewRatio or SurvivorRatio at all
- if you want to tame your young GC pauses, use -Xmn to pick a new
generation size, e.g. -Xmn256m
- turn off CMS incremental mode (-XX:+CMSIncrementalMode) if you're
running on real server hardware
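
A rough sketch of what those GC suggestions could look like in hbase-env.sh. It assumes CMS stays the old-generation collector (the GC log further down the thread shows CMS in use) and reuses the 256m example above; the exact new-generation size is something to tune per workload, not a recommendation.

```shell
# Sketch only: append GC flags to whatever HBASE_OPTS already holds.
# Assumption: CMS remains the old-generation collector.
# -Xmn256m fixes the new generation size directly, so -XX:NewRatio and
# -XX:SurvivorRatio are deliberately left unset.
# -XX:+CMSIncrementalMode is deliberately omitted (incremental mode off).
export HBASE_OPTS="${HBASE_OPTS:-} -XX:+UseConcMarkSweepGC -Xmn256m"

# Print the result so the effective flags can be checked before a restart.
echo "HBASE_OPTS=$HBASE_OPTS"
```

With incremental mode off, the icms_dc field should no longer appear in the GC log.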

HBase settings:
- 1% of heap for the block cache seems strange. Maybe you should just
turn it off at the table level?
- MSLAB is definitely experimental. If you can compare a run with it on
against a run with it off, that would be a good data point.
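
One possible way to turn the block cache off per table is through the HBase shell's BLOCKCACHE column family attribute. The sketch below only prints the commands for review; the table name "mytable" and family name "cf" are hypothetical placeholders, and the disable/enable steps assume a release that requires the table to be offline for schema changes.

```shell
# Hypothetical names; substitute the real table and column family.
TABLE="mytable"
FAMILY="cf"

# Build the HBase shell commands. The disable/enable pair is an assumption
# for releases that need the table offline before a schema change.
CMDS="disable '$TABLE'
alter '$TABLE', {NAME => '$FAMILY', BLOCKCACHE => 'false'}
enable '$TABLE'"

# Dry run: print the commands. To apply them, pipe this output into
# `hbase shell` on a node with the HBase client installed.
printf '%s\n' "$CMDS"
```

MSLAB itself is toggled separately, through the hbase.hregion.memstore.mslab.enabled property in hbase-site.xml, followed by a region server restart.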


-Todd

On Wed, May 25, 2011 at 11:08 AM, Wayne <wav100@gmail.com> wrote:
> I tried to turn off all special JVM settings we have tried in the past.
> Below are links to the requested configs. I will try to find more logs for
> the full GC. We just made the switch and on this node it has
> only occurred once in the scope of the current log (it may have rolled?).
>
> Thanks.
>
> http://pastebin.com/ca13aMRu
>
> http://pastebin.com/9KfRZFBW
>
>
> On Wed, May 25, 2011 at 1:42 PM, Todd Lipcon <todd@cloudera.com> wrote:
>
>> Hi Wayne,
>>
>> Looks like your RAM might be oversubscribed. Could you paste your
>> hbase-site.xml and hbase-env.sh files? Also looks like you have some
>> strange GC settings on (e.g. perm gen collection, which we don't really
>> need).
>>
>> If you can paste a larger segment of GC logs (enough to include at
>> least two or three of the full gc pauses) that would be helpful.
>>
>> -Todd
>>
>> On Wed, May 25, 2011 at 10:32 AM, Wayne <wav100@gmail.com> wrote:
>> > We switched to u25 and reverted the JVM settings to those recommended. Now
>> > we see concurrent mode failures lasting more than 60 seconds while under
>> > hardly any load....
>> >
>> > Below are the entries from the JVM log. Of course we can up the ZooKeeper
>> > timeout to 2 min or 10 min for that matter, but that does not address the
>> > underlying issue. Sorry, but I cannot confirm that the new GC settings
>> > have any effect. It appears no better, or even worse, as the problem
>> > below occurred while the cluster was almost idle.
>> >
>> >
>> > 2011-05-25T14:15:45.518+0000: 150358.023: [GC 150358.023: [ParNew:
>> > 230155K->27648K(249216K), 0.0653880 secs] 7754007K->7586719K(8360960K)
>> > icms_dc=100 , 0.0654900 secs] [Times: user=0.78 sys=0.00, real=0.06 secs]
>> > 2011-05-25T14:15:45.906+0000: 150358.410: [GC 150358.410: [ParNew
>> > (promotion failed): 249216K->249216K(249216K), 0.5768350 secs]150358.987:
>> > [CMS2011-05-25T14:16:44.404+0000: 150416.909: [CMS-concurrent-sweep:
>> > 87.667/92.820 secs] [Times: user=182.64 sys=1.37, real=92.80 secs]
>> >  (concurrent mode failure)[Unloading class
>> > sun.reflect.GeneratedMethodAccessor20]
>> > [Unloading class sun.reflect.GeneratedMethodAccessor29]
>> > [Unloading class sun.reflect.GeneratedMethodAccessor31]
>> > [Unloading class sun.reflect.GeneratedMethodAccessor30]
>> > [Unloading class sun.reflect.GeneratedMethodAccessor32]
>> > [Unloading class sun.reflect.GeneratedMethodAccessor1]
>> > [Unloading class sun.reflect.GeneratedMethodAccessor17]
>> > [Unloading class sun.reflect.GeneratedMethodAccessor28]
>> > : 7621159K->2503625K(8111744K), 63.3195660 secs]
>> > 7798327K->2503625K(8360960K), [CMS Perm : 20128K->20106K(33580K)]
>> > icms_dc=100 , 63.8965450 secs] [Times: user=69.50 sys=0.01, real=63.89
>> > secs]
>> >
>> >
>> >
>> > On Mon, May 23, 2011 at 12:04 PM, Stack <stack@duboce.net> wrote:
>> >
>> >> On Mon, May 23, 2011 at 8:42 AM, Wayne <wav100@gmail.com> wrote:
>> >> > Our experience with any newer JVM was that fragmentation was much, much
>> >> > worse and Concurrent Mode Failures were rampant. We kept moving back in
>> >> > releases to get to what we use now. We are on CentOS 5.5. We will try
>> >> > to use u24.
>> >> >
>> >>
>> >> You should be able to configure around the CMS failures.  u21 was
>> >> supposed to make improvements to put off fragmentation but apparently
>> >> made it worse.  Try u25, the latest.  Also google for others'
>> >> experiences with JVMs on CentOS 5.5.
>> >>
>> >> St.Ack
>> >>
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>



-- 
Todd Lipcon
Software Engineer, Cloudera
