hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pablo Musa <pa...@psafe.com>
Subject Re: RegionServers Crashing every hour in production env
Date Wed, 03 Apr 2013 20:24:45 GMT
> Have you looked at http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired , suggested
below ?

Yes I did, but GC is not the issue here.

> I think zookeeper timeout should be more closely watched.

What do you mean?
My ZK timeout today is 150 secs, which is very big. However, my
problem is finding where this timeout come from?! Any ideas?

Thanks,
Pablo

On 04/03/2013 03:36 PM, Ted Yu wrote:
> Thanks for the sharing.
>
> Have you looked at
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired , suggested
> below ?
>
> I think zookeeper timeout should be more closely watched.
>
>
> On Wed, Apr 3, 2013 at 11:21 AM, Pablo Musa <pablo@psafe.com> wrote:
>
>> Hello guys,
>> I stopped my research on HBase ZK timeout for while due to
>> other issues I had to do, but I am back.
>>
>> A very weird behavior that I would like your comments is that my
>> RegionServers perform better (less crashes) under heavy load instead
>> of light load.
>> There is, if I let HBase alone with 50 requestsPerSecond along the
>> whole day the crashes are higher than if I run a mapred Job every hour.
>>
>>
>> Another weird thing is the following:
>>
>> RS startTime = Mon Apr 01 13:22:35 BRT 2013
>>
>> [...]$ grep slept /var/log/hbase/hbase-hbase-**regionserver-PSLBHDN00*.log
>> 2013-04-01 20:09:21,135 WARN org.apache.hadoop.hbase.util.**Sleeper: We
>> slept 45491ms instead of 3000ms, this is likely due to a long garbage
>> collecting pause and it's usually bad, see http://hbase.apache.org/book.**
>> html#trouble.rs.runtime.**zkexpired<http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired>
>> 2013-04-01 22:45:59,407 WARN org.apache.hadoop.hbase.util.**Sleeper: We
>> slept 101271ms instead of 3000ms, this is likely due to a long garbage
>> collecting pause and it's usually bad, see http://hbase.apache.org/book.**
>> html#trouble.rs.runtime.**zkexpired<http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired>
>>
>> [...]$ egrep 'real=[1-9][0-9][0-9][0-9]*\.[**0-9][0-9]'
>> /var/log/hbase/hbase-hbase-**regionserver-PSLBHDN00*.out
>> * the below report is the above command for each time range.
>> 0.0 - 0.1  secs GCs = 5084
>> 0.1 - 0.5  secs GCs = 9
>> 0.5 - 1.0  secs GCs = 3
>> 1.0 - 010  secs GCs = 0
>> 010 - 100  secs GCs = 0
>> 100 - 1000 secs GCs = 0
>>
>> So, my script for getting every gc time ("real=... secs") says that
>> there is no gc that took longer than 1 second.
>> However the RS log says twice that the RS slept more than 40 seconds
>> instead of 3.
>>
>> "this is likely due to a long garbage collecting pause", yes
>> this is likely but I dont think it is the case.
>>
>> The machine is a huge machine with 70GB RAM, 32 procs, light load,
>> no swap or iowait.
>>
>> Any ideas?
>>
>> Thanks,
>> Pablo
>>
>>
>> On 03/12/2013 12:43 PM, Pablo Musa wrote:
>>
>>> Guys,
>>> thank you very much for the help.
>>>
>>> Yesterday I spent 14 hours trying to tune the whole cluster.
>>> The cluster is not ready yet needs a lot of tunning, but at least is
>>> working.
>>>
>>> My first big problem was namenode + datanode GC. They were not using
>>> CMS and thus were taking "incremental" time to run. Ii started in 0.01
>>> ms and
>>> in 20 minutes was taking 150 secs.
>>> After setting CMSGC this time is much smaller taking a maximum of 70 secs,
>>> which is VERY HIGH, but for now does not stop HBase.
>>>
>>> With this issue solved, it was clear that the RS was doing a long pause
>>> GC,
>>> taking up to 220 secs. Zookeeper expired the RS and it shutdown.
>>> I tried a lot of different flags configuration (MORE than 20), and could
>>> not
>>> get small gcs. Eventually it would take more than 150 secs (zookeeper
>>> timeout)
>>> and shutdown.
>>>
>>> Finally I tried a config that so far, 12 hours, is working with a maximum
>>> GC
>>> time of 90 secs. Which of course is a terrible problem since HBase is a
>>> database, but at least the cluster is stable while I can tune it a
>>> little more.
>>>
>>> In my opinion, my biggest problem is to have a few "monster" machines in
>>> the
>>> cluster instead of a bunch of commodities machines. I don't know if
>>> there are
>>> a lot companies using this kind of machines inside a hadoop cluster, but
>>> a fast search on google could not find a lot of tunes for big heap GCs.
>>>
>>> I guess my next step will be search for big heap gc tuning.
>>>
>>> Back to some questions ;)
>>>
>>>    > You have ganglia or tsdb running?
>>>
>>> I use zabbix for now, and no there is nothing going on when the big
>>> pause happens.
>>>
>>>    > When you see the big pause above, can you see anything going on on the
>>>    > machine? (swap, iowait, concurrent fat mapreduce job?)
>>>    > what are you doing during long GC happened? read or write? if
>>> reading, what
>>>    > the block cache size?
>>>
>>> The cpu for the RS process goes to 100% and the logs "pause", until it
>>> gets out.
>>> Ex: [NewPar
>>>
>>> IO and SWAP are normal. There is no MR running, just normal database
>>> load, which is
>>> very low. I am probably doing reads AND writes to the database with
>>> default block
>>> cache size.
>>> One problem in this moment might be the big number of regions (1252)
>>> since I am
>>> using only one RS to be able to track the problem.
>>>
>>> The links and ideas were very helpful. Thank you very much guys.
>>>
>>> I will post my future researches as I find a solution ;)
>>>
>>> If you have more ideas or info (links, flag suggestions, etc.), please
>>> post it :)
>>>
>>> Abs,
>>> Pablo
>>>
>>> On 03/10/2013 11:24 PM, Andrew Purtell wrote:
>>>
>>>> Be careful with GC tuning, throwing changes at an application without
>>>> analysis of what is going on with the heap is shooting in the dark. One
>>>> particular good treatment of the subject is here:
>>>> http://java.dzone.com/**articles/how-tame-java-gc-**pauses<http://java.dzone.com/articles/how-tame-java-gc-pauses>
>>>>
>>>> If you have made custom changes to blockcache or memstore configurations,
>>>> back them out until you're sure everything else is ok.
>>>>
>>>> Watch carefully for swapping. Set the vm.swappiness sysctl to 0. Monitor
>>>> for spikes in page scanning or any swap activity. Nothing brings on
>>>> "Juliette" pauses better than a JVM partially swapped out. The Java GC
>>>> starts collection by examining the oldest pages, and those are the first
>>>> pages the OS swaps out...
>>>>
>>>>
>>>>
>>>> On Mon, Mar 11, 2013 at 10:13 AM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>>>>
>>>>   Hi Pablo,
>>>>> It'a terrible for a long minor GC. I don't think there are swaping from
>>>>> your vmstat log.
>>>>> but I just suggest you
>>>>> 1) add following JVM options:
>>>>> -XX:+DisableExplicitGC -XX:+UseCompressedOops -XX:GCTimeRatio=19
>>>>> -XX:SoftRefLRUPolicyMSPerMB=0 -XX:SurvivorRatio=2
>>>>> -XX:MaxTenuringThreshold=3 -XX:+UseFastAccessorMethods
>>>>>
>>>>> 2) -Xmn is two small, your total Mem is 74GB, just make -Xmn2g
>>>>> 3) what are you doing during long GC happened? read or write? if
>>>>> reading,
>>>>> what the block cache size?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Mar 11, 2013 at 6:41 AM, Stack <stack@duboce.net> wrote:
>>>>>
>>>>>   You could increase your zookeeper session timeout to 5 minutes while
>>>>>> you
>>>>>> are figuring why these long pauses.
>>>>>> http://hbase.apache.org/book.**html#zookeeper.session.timeout<http://hbase.apache.org/book.html#zookeeper.session.timeout>
>>>>>>
>>>>>> Above, there is an outage for almost 5 minutes:
>>>>>>
>>>>>>   We slept 225100ms instead of 3000ms, this is likely due to a long
>>>>>>> You have ganglia or tsdb running?  When you see the big pause
above,
>>>>>> can
>>>>>> you see anything going on on the machine?  (swap, iowait, concurrent
>>>>>> fat
>>>>>> mapreduce job?)
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Mar 10, 2013 at 3:29 PM, Pablo Musa <pablo@psafe.com>
wrote:
>>>>>>
>>>>>>   Hi Sreepathi,
>>>>>>> they say in the book (or the site), we could try it to see if
it is
>>>>>>>
>>>>>> really
>>>>>>
>>>>>>> a timeout error
>>>>>>> or there is something more. But it is not recomended for production
>>>>>>> environments.
>>>>>>>
>>>>>>> I could give it a try if five minutes will ensure to us that
the
>>>>>>>
>>>>>> problem
>>>>>> is the GC or
>>>>>>> elsewhere!! Anyway, I think it is hard to beleive a GC is taking
2:30
>>>>>>> minutes.
>>>>>>>
>>>>>>> Abs,
>>>>>>> Pablo
>>>>>>>
>>>>>>>
>>>>>>> On 03/10/2013 04:06 PM, Sreepathi wrote:
>>>>>>>
>>>>>>>   Hi Stack/Ted/Pablo,
>>>>>>>> Should we increase the hbase.rpc.timeout property to 5 minutes
(
>>>>>>>>
>>>>>>> 300000
>>>>>> ms
>>>>>>
>>>>>>> )  ?
>>>>>>>> Regards,
>>>>>>>> - Sreepathi
>>>>>>>>
>>>>>>>> On Sun, Mar 10, 2013 at 11:59 AM, Pablo Musa <pablo@psafe.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     That combo should be fine.
>>>>>>>>
>>>>>>>>> Great!!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     If JVM is full GC'ing, the application is stopped.
>>>>>>>>>
>>>>>>>>>> The below does not look like a full GC but that is
a long pause in
>>>>>>>>>> system
>>>>>>>>>> time, enough to kill your zk session.
>>>>>>>>>>
>>>>>>>>>>   Exactly. This pause is really making the zk expire
the RS which
>>>>>>>> shutsdown
>>>>>>> (logs
>>>>>>>>> in the end of the email).
>>>>>>>>> But the question is: what is causing this pause??!!
>>>>>>>>>
>>>>>>>>>     You swapping?
>>>>>>>>> I don't think so (stats below).
>>>>>>>>>
>>>>>>>>>     Hardware is good?
>>>>>>>>> Yes, it is a 16 processor machine with 74GB of RAM and
plenty disk
>>>>>>>>>
>>>>>>>> space.
>>>>>>> Below are some metrics I have heard about. Hope it helps.
>>>>>>>>>
>>>>>>>>> ** I am having some problems with the datanodes[1] which
are having
>>>>>>>>> trouble to
>>>>>>>>> write. I really think the issues are related, but cannot
solve any
>>>>>>>>> of
>>>>>>>>> them
>>>>>>>>> :(
>>>>>>>>>
>>>>>>>>> Thanks again,
>>>>>>>>> Pablo
>>>>>>>>>
>>>>>>>>> [1] http://mail-archives.apache.
>>>>>>>>>
>>>>>>>> ****org/mod_mbox/hadoop-hdfs-**user/****
>>>>> 201303.mbox/%3CCAJzooYfS-F1KS+**********jGOPUt15PwFjcCSzigE0APeM9FXaCr*
>>>>> *****
>>>>>
>>>>>> QfVbyQ@mail.gmail.com%3E<http:****//mail-archives.apache.org/****<http://mail-archives.apache.org/**>
>>>>>>>>> mod_mbox/hadoop-hdfs-user/****201303.mbox/%3CCAJzooYfS-F1KS+****
>>>>>>>>> jGOPUt15PwFjcCSzigE0APeM9FXaCr****QfVbyQ@mail.gmail.com%3E<
>>>>>>>>>
>>>>>>>> http://mail-archives.apache.**org/mod_mbox/hadoop-hdfs-user/**
>>>>> 201303.mbox/%3CCAJzooYfS-F1KS+**jGOPUt15PwFjcCSzigE0APeM9FXaCr**
>>>>> QfVbyQ@mail.gmail.com%3E<http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201303.mbox/%3CCAJzooYfS-F1KS+jGOPUt15PwFjcCSzigE0APeM9FXaCrQfVbyQ@mail.gmail.com%3E>
>>>>>
>>>>>> top - 15:38:04 up 297 days, 21:03,  2 users,  load average: 4.34,
>>>>>>>> 2.55,
>>>>>> 1.28
>>>>>>>>> Tasks: 528 total,   1 running, 527 sleeping,   0 stopped,
  0 zombie
>>>>>>>>> Cpu(s):  0.1%us,  0.2%sy,  0.0%ni, 99.7%id,  0.0%wa,
 0.0%hi,
>>>>>>>>> 0.0%si,
>>>>>>>>>      0.0%st
>>>>>>>>> Mem:  74187256k total, 29493992k used, 44693264k free,
 5836576k
>>>>>>>>>
>>>>>>>> buffers
>>>>>>> Swap: 51609592k total,   128312k used, 51481280k free,  1353400k
>>>>>>>> cached
>>>>>> ]$ vmstat -w
>>>>>>>>> procs -------------------memory-----******-------------
---swap--
>>>>>>>>> -----io----
>>>>>>>>> --system-- -----cpu-------
>>>>>>>>>      r  b       swpd       free       buff      cache
  si   so
>>>>>>>>>   bi bo
>>>>>>>>> in
>>>>>>>>>       cs  us sy  id wa st
>>>>>>>>>      2  0     128312   32416928    5838288    5043560
   0    0
>>>>>>>>> 202 53
>>>>>>>>>     0
>>>>>>>>>        0   2  1  96  1  0
>>>>>>>>>
>>>>>>>>> ]$ sar
>>>>>>>>> 02:20:01 PM     all     26.18      0.00      2.90   
  0.63 0.00
>>>>>>>>> 70.29
>>>>>>>>> 02:30:01 PM     all      1.66      0.00      1.25   
  1.05 0.00
>>>>>>>>> 96.04
>>>>>>>>> 02:40:01 PM     all     10.01      0.00      2.14   
  0.75 0.00
>>>>>>>>> 87.11
>>>>>>>>> 02:50:01 PM     all      0.76      0.00      0.80   
  1.03 0.00
>>>>>>>>> 97.40
>>>>>>>>> 03:00:01 PM     all      0.23      0.00      0.30   
  0.71 0.00
>>>>>>>>> 98.76
>>>>>>>>> 03:10:01 PM     all      0.22      0.00      0.30   
  0.66 0.00
>>>>>>>>> 98.82
>>>>>>>>> 03:20:01 PM     all      0.22      0.00      0.31   
  0.76 0.00
>>>>>>>>> 98.71
>>>>>>>>> 03:30:01 PM     all      0.24      0.00      0.31   
  0.64 0.00
>>>>>>>>> 98.81
>>>>>>>>> 03:40:01 PM     all      1.13      0.00      2.97   
  1.18 0.00
>>>>>>>>> 94.73
>>>>>>>>> Average:        all      3.86      0.00      1.38   
  0.88 0.00
>>>>>>>>> 93.87
>>>>>>>>>
>>>>>>>>> ]$ iostat
>>>>>>>>> Linux 2.6.32-220.7.1.el6.x86_64 (PSLBHDN002)     03/10/2013
_x86_64_
>>>>>>>>>      (16 CPU)
>>>>>>>>>
>>>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>>>                1.86    0.00    0.96    0.78    0.00 
 96.41
>>>>>>>>>
>>>>>>>>> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read
Blk_wrtn
>>>>>>>>> sda               1.23        20.26        23.53  521533196
>>>>>>>>> 605566924
>>>>>>>>> sdb               6.51       921.55       241.90 23717850730
>>>>>>>>>
>>>>>>>> 6225863488
>>>>>> sdc               6.22       921.83       236.41 23725181162
>>>>>>>> 6084471192
>>>>>> sdd               6.25       925.13       237.26 23810004970
>>>>>>>> 6106357880
>>>>>> sde               6.19       913.90       235.60 23521108818
>>>>>>>> 6063722504
>>>>>> sdh               6.26       933.08       237.77 24014594546
>>>>>>>> 6119511376
>>>>>> sdg               6.18       914.36       235.31 23532747378
>>>>>>>> 6056257016
>>>>>> sdf               6.24       923.66       235.33 23772251810
>>>>>>>> 6056604008
>>>>>> Some more logging which reinforce that the RS crash is happening
>>>>>>>> because
>>>>>>> of
>>>>>>>>> timeout. However this time the GC log is not accusing
a big time.
>>>>>>>>>
>>>>>>>>> #####RS LOG#####
>>>>>>>>> 2013-03-10 15:37:46,712 INFO org.apache.zookeeper.******ClientCnxn:
>>>>>>>>>
>>>>>>>> Client
>>>>>>> session timed out, have not heard from server in 257739ms for
>>>>>>>> sessionid
>>>>>> 0x13d3c4bcba6014a, closing socket connection and attempting reconnect
>>>>>>>>> 2013-03-10 15:37:46,712 INFO org.apache.zookeeper.******ClientCnxn:
>>>>>>>>>
>>>>>>>> Client
>>>>>>> session timed out, have not heard from server in 226785ms for
>>>>>>>> sessionid
>>>>>> 0x13d3c4bcba60149, closing socket connection and attempting reconnect
>>>>>>>>> 2013-03-10 15:37:46,712 DEBUG org.apache.hadoop.hbase.io.******
>>>>>>>>> hfile.LruBlockCache:
>>>>>>>>> Stats: total=61.91 MB, free=1.94 GB, max=2 GB, blocks=1254,
>>>>>>>>> accesses=60087,
>>>>>>>>> hits=58811, hitRatio=97.87%, , cachingAccesses=60069,
>>>>>>>>>
>>>>>>>> cachingHits=58811,
>>>>>>> cachingHitsRatio=97.90%, , evictions=0, evicted=0, evictedPerRun=NaN
>>>>>>>>> 2013-03-10 15:37:46,712 WARN
>>>>>>>>>
>>>>>>>> org.apache.hadoop.hbase.util.******Sleeper:
>>>>>> We
>>>>>>>>> slept 225100ms instead of 3000ms, this is likely due
to a long
>>>>>>>>>
>>>>>>>> garbage
>>>>>> collecting pause and it's usually bad, see
>>>>>>>>> http://hbase.apache.org/book.******<http://hbase.apache.org/book.****><
>>>>>>>>> http://hbase.apache.org/book.**** <http://hbase.apache.org/book.**>
>>>>>>>>> html#trouble.rs.runtime.******zkexpired<http://hbase.apache.****
>>>>>>>>> org/book.html#trouble.rs.****runtime.zkexpired<
>>>>>>>>>
>>>>>>>> http://hbase.apache.org/book.**html#trouble.rs.runtime.**zkexpired<http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired>
>>>>>>> 2013-03-10 15:37:46,714 WARN org.apache.hadoop.hdfs.******DFSClient:
>>>>>>>>> DFSOutputStream ResponseProcessor exception  for block
>>>>>>>>> BP-43236042-172.17.2.10-******1362490844340:blk_-**
>>>>>>>>> 6834190810033122569_25150229
>>>>>>>>>
>>>>>>>>> java.io.EOFException: Premature EOF: no length prefix
available
>>>>>>>>>             at org.apache.hadoop.hdfs.******
>>>>>>>>> protocol.HdfsProtoUtil.**
>>>>>>>>> vintPrefixed(HdfsProtoUtil.******java:171)
>>>>>>>>>             at org.apache.hadoop.hdfs.******protocol.datatransfer.**
>>>>>>>>> PipelineAck.readFields(******PipelineAck.java:114)
>>>>>>>>>             at
>>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.******DFSOutputStream$DataStreamer$******
>>>>>>> ResponseProcessor.run(******DFSOutputStream.java:670)
>>>>>>>>> 2013-03-10 15:37:46,716 ERROR org.apache.hadoop.hbase.**
>>>>>>>>> regionserver.HRegionServer:
>>>>>>>>> org.apache.hadoop.hbase.ipc.******CallerDisconnectedException:
>>>>>>>>> Aborting
>>>>>>>>> call
>>>>>>>>> get([B@7caf69ed, {"timeRange":[0,******9223372036854775807],"**
>>>>>>>>> totalColumns":1,"cacheBlocks":******true,"families":{"details"**
>>>>>>>>> :[**"**
>>>>>>>>> ALL"]},"maxVersions":1,"row":"******\\x00\\x00\\x00\\x00\\x00\**
>>>>>>>>> \***
>>>>>>>>> *x12\\x93@"}),
>>>>>>>>> rpc version=1, client version=29, methodsFingerPrint=1891768260
from
>>>>>>>>> 172.17.1.71:51294 after 0 ms, since caller disconnected
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> #####GC LOG#####
>>>>>>>>> 2716.635: [GC 2716.635: [ParNew: 57570K->746K(59008K),
0.0046530
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 354857K->300891K(1152704K), 0.0047370 secs] [Times: user=0.03
>>>>>>>> sys=0.00,
>>>>>> real=0.00 secs]
>>>>>>>>> 2789.478: [GC 2789.478: [ParNew: 53226K->1192K(59008K),
0.0041370
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 353371K->301337K(1152704K), 0.0042220 secs] [Times: user=0.04
>>>>>>>> sys=0.00,
>>>>>> real=0.00 secs]
>>>>>>>>> 2868.435: [GC 2868.435: [ParNew: 53672K->740K(59008K),
0.0041570
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 353817K->300886K(1152704K), 0.0042440 secs] [Times: user=0.03
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> 2920.309: [GC 2920.309: [ParNew: 53220K->6202K(59008K),
0.0058440
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 353366K->310443K(1152704K), 0.0059410 secs] [Times: user=0.05
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> 2984.239: [GC 2984.239: [ParNew: 58680K->5282K(59008K),
0.0048070
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 362921K->310977K(1152704K), 0.0049180 secs] [Times: user=0.03
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> 3050.620: [GC 3050.620: [ParNew: 57524K->5292K(59008K),
0.0055510
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 363219K->313384K(1152704K), 0.0056520 secs] [Times: user=0.04
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> 3111.331: [GC 3111.331: [ParNew: 57542K->5637K(59008K),
0.0056900
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 365634K->318079K(1152704K), 0.0057950 secs] [Times: user=0.03
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> 3163.510: [GC 3163.510: [ParNew: 58117K->1138K(59008K),
0.0046540
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 370559K->316086K(1152704K), 0.0047470 secs] [Times: user=0.03
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> 3217.642: [GC 3217.643: [ParNew: 53618K->1001K(59008K),
0.0038530
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 368566K->315949K(1152704K), 0.0039450 secs] [Times: user=0.03
>>>>>>>> sys=0.00,
>>>>>> real=0.00 secs]
>>>>>>>>> 3464.067: [GC 3464.067: [ParNew: 53481K->5348K(59008K),
0.0068370
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 368429K->322343K(1152704K), 0.0069170 secs] [Times: user=0.07
>>>>>>>> sys=0.01,
>>>>>> real=0.00 secs]
>>>>>>>>> 3465.648: [GC 3465.648: [ParNew: 57828K->1609K(59008K),
0.0111520
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 374823K->321496K(1152704K), 0.0112170 secs] [Times: user=0.07
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> 3466.291: [GC 3466.291: [ParNew: 54089K->2186K(59008K),
0.0093190
>>>>>>>>>
>>>>>>>> secs]
>>>>>> 373976K->322073K(1152704K), 0.0093800 secs] [Times: user=0.06
>>>>>>>> sys=0.00,
>>>>>> real=0.01 secs]
>>>>>>>>> Heap
>>>>>>>>>      par new generation   total 59008K, used 15497K
>>>>>>>>> [0x00000005fae00000,
>>>>>>>>> 0x00000005fee00000, 0x00000005fee00000)
>>>>>>>>>       eden space 52480K,  25% used [0x00000005fae00000,
>>>>>>>>>
>>>>>>>> 0x00000005fbaff8c8,
>>>>>>> 0x00000005fe140000)
>>>>>>>>>       from space 6528K,  33% used [0x00000005fe140000,
>>>>>>>>>
>>>>>>>> 0x00000005fe362b70,
>>>>>> 0x00000005fe7a0000)
>>>>>>>>>       to   space 6528K,   0% used [0x00000005fe7a0000,
>>>>>>>>>
>>>>>>>> 0x00000005fe7a0000,
>>>>>> 0x00000005fee00000)
>>>>>>>>>      concurrent mark-sweep generation total 1093696K,
used 319887K
>>>>>>>>> [0x00000005fee00000, 0x0000000641a10000, 0x00000007fae00000)
>>>>>>>>>      concurrent-mark-sweep perm gen total 30464K, used
30427K
>>>>>>>>> [0x00000007fae00000, 0x00000007fcbc0000, 0x0000000800000000)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 03/08/2013 07:02 PM, Stack wrote:
>>>>>>>>>
>>>>>>>>>     On Fri, Mar 8, 2013 at 10:58 AM, Pablo Musa <pablo@psafe.com>
>>>>>>>>>
>>>>>>>> wrote:
>>>>>>      0.94 currently doesn't support hadoop 2.0
>>>>>>>>>>   Can you deploy hadoop 1.1.1 instead ?
>>>>>>>>>>>>      I am using cdh4.2.0 which uses this
version as default
>>>>>>>>>>>>
>>>>>>>>>>> installation.
>>>>>>> I think it will be a problem for me to deploy 1.1.1 because I
would
>>>>>>>>>>> need
>>>>>>>>>>> to
>>>>>>>>>>> "upgrade" the whole cluster with 70TB of data
(backup everything,
>>>>>>>>>>>
>>>>>>>>>> go
>>>>>> offline, etc.).
>>>>>>>>>>> Is there a problem to use cdh4.2.0?
>>>>>>>>>>> I should send my email to cdh list?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>      That combo should be fine.
>>>>>>>>>>>
>>>>>>>>>>>         You Full GC'ing around this time?
>>>>>>>>>>   The GC shows it took a long time. However it does
not make any
>>>>>>>>>> sense
>>>>>> to be it, since the same ammount of data was cleaned before and
>>>>>>>>>> AFTER
>>>>>> in just 0.01 secs!
>>>>>>>>>>>
>>>>>>>>>>>      If JVM is full GC'ing, the application is
stopped.
>>>>>>>>>>>
>>>>>>>>>>>       [Times: user=0.08 sys=137.62, real=137.62
secs]
>>>>>>>>>>   Besides the whole time was used by system. That
is what is bugging
>>>>>>>>>> me.
>>>>>>>      The below does not look like a full GC but that is a long
pause
>>>>>>>>>> in
>>>>>> system
>>>>>>>>>> time, enough to kill your zk session.
>>>>>>>>>>
>>>>>>>>>> You swapping?
>>>>>>>>>>
>>>>>>>>>> Hardware is good?
>>>>>>>>>>
>>>>>>>>>> St.Ack
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>        ...
>>>>>>>>>>
>>>>>>>>>>   1044.081: [GC 1044.081: [ParNew: 58970K->402K(59008K),
0.0040990
>>>>>>>>>> secs]
>>>>>>> 275097K->216577K(1152704K), 0.0041820 secs] [Times: user=0.03
>>>>>>>>>> sys=0.00,
>>>>>>> real=0.01 secs]
>>>>>>>>>>> 1087.319: [GC 1087.319: [ParNew: 52873K->6528K(59008K),
0.0055000
>>>>>>>>>>>
>>>>>>>>>> secs]
>>>>>>> 269048K->223592K(1152704K), 0.0055930 secs] [Times: user=0.04
>>>>>>>>>> sys=0.01,
>>>>>>> real=0.00 secs]
>>>>>>>>>>> 1087.834: [GC 1087.834: [ParNew: 59008K->6527K(59008K),
>>>>>>>>>>> 137.6353620
>>>>>>>>>>> secs] 276072K->235097K(1152704K), 137.6354700
secs] [Times:
>>>>>>>>>>>
>>>>>>>>>> user=0.08
>>>>>> sys=137.62, real=137.62 secs]
>>>>>>>>>>> 1226.638: [GC 1226.638: [ParNew: 59007K->1897K(59008K),
0.0079960
>>>>>>>>>>>
>>>>>>>>>> secs]
>>>>>>> 287577K->230937K(1152704K), 0.0080770 secs] [Times: user=0.05
>>>>>>>>>> sys=0.00,
>>>>>>> real=0.01 secs]
>>>>>>>>>>> 1227.251: [GC 1227.251: [ParNew: 54377K->2379K(59008K),
0.0095650
>>>>>>>>>>>
>>>>>>>>>> secs]
>>>>>>> 283417K->231420K(1152704K), 0.0096340 secs] [Times: user=0.06
>>>>>>>>>> sys=0.00,
>>>>>>> real=0.01 secs]
>>>>>>>>>>>
>>>>>>>>>>> I really appreciate you guys helping me to find
out what is wrong.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Pablo
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 03/08/2013 02:11 PM, Stack wrote:
>>>>>>>>>>>
>>>>>>>>>>>      What RAM says.
>>>>>>>>>>>
>>>>>>>>>>>   2013-03-07 17:24:57,887 INFO
>>>>>>>>>>> org.apache.zookeeper.**********ClientCnxn:
>>>>>>> Client
>>>>>>>>>>>> session timed out, have not heard from server
in 159348ms for
>>>>>>>>>>>> sessionid
>>>>>>>>>>>> 0x13d3c4bcba600a7, closing socket connection
and attempting
>>>>>>>>>>>>
>>>>>>>>>>> reconnect
>>>>>>> You Full GC'ing around this time?
>>>>>>>>>>>> Put up your configs in a place where we can
take a look?
>>>>>>>>>>>>
>>>>>>>>>>>> St.Ack
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 8, 2013 at 8:32 AM, ramkrishna
vasudevan <
>>>>>>>>>>>> ramkrishna.s.vasudevan@gmail.********com
>>>>>>>>>>>>
>>>>>>>>>>> <ramkrishna.s.vasudevan@gmail.
>>>>>>> ****
>>>>>>>>>>>> com <ramkrishna.s.vasudevan@gmail.****com<
>>>>>>>>>>>>
>>>>>>>>>>> ramkrishna.s.vasudevan@gmail.**com<ramkrishna.s.vasudevan@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>>>       I think it is with your GC config.
 What is your heap size?
>>>>>>>>>>>>
>>>>>>>>>>>     What
>>>>>> is
>>>>>>>>>>>> the
>>>>>>>>>>>>
>>>>>>>>>>>>     data that you pump in and how much is
the block cache size?
>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Ram
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Mar 8, 2013 at 9:31 PM, Ted Yu
<yuzhihong@gmail.com>
>>>>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>       0.94 currently doesn't support hadoop 2.0
>>>>>>>>>>>>>     Can you deploy hadoop 1.1.1 instead
?
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Are you using 0.94.5 ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Mar 8, 2013 at 7:44 AM, Pablo
Musa <pablo@psafe.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>       Hey guys,
>>>>>>>>>>>>>>     as I sent in an email a long
time ago, the RSs in my cluster
>>>>>>>>>>>>>>
>>>>>>>>>>>>> did
>>>>>> not
>>>>>>>>>>>>>>>      get
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   along
>>>>>>>>>>>>>>     and crashed 3 times a day. I
tried a lot of options we
>>>>>>>>>>>>>>
>>>>>>>>>>>>> discussed
>>>>>> in
>>>>>>
>>>>>>> the
>>>>>>>>>>>>>>> emails, but it not solved the
problem. As I used an old
>>>>>>>>>>>>>>> version
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> of
>>>>>>>      hadoop I
>>>>>>>>>>>>>>>       thought this was the problem.
>>>>>>>>>>>>>>   So, I upgraded from hadoop 0.20
- hbase 0.90 - zookeeper 3.3.5
>>>>>>>>>>>>>> to
>>>>>>      hadoop
>>>>>>>>>>>>>>>   2.0.0
>>>>>>>>>>>>>>     - hbase 0.94 - zookeeper 3.4.5.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Unfortunately the RSs did not
stop crashing, and worst! Now
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> they
>>>>>> crash
>>>>>>>>>>>>>>> every
>>>>>>>>>>>>>>> hour and some times when the
RS that holds the .ROOT. crashes
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> all
>>>>>>      cluster
>>>>>>>>>>>>>>>   get
>>>>>>>>>>>>>>     stuck in transition and everything
stops working.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In this case I need to clean
zookeeper znodes, restart the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> master
>>>>>> and
>>>>>>>>>>>>>>>      the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   RSs.
>>>>>>>>>>>>>>     To avoid this case I am running
on production with only ONE
>>>>>>>>>>>>>> RS
>>>>>>>>>>>>>>
>>>>>>>>>>>>> and
>>>>>>> a
>>>>>>>>>>>>>>> monitoring
>>>>>>>>>>>>>>> script that check every minute,
if the RS is ok. If not,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> restart
>>>>>> it.
>>>>>>>>>>>>>>> * This case does not get the
cluster stuck.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is driving me crazy, but
I really cant find a solution
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the
>>>>>>> cluster.
>>>>>>>>>>>>>>> I tracked all logs from the start
time 16:49 from all
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> interesting
>>>>>> nodes
>>>>>>>>>>>>>>> (zoo,
>>>>>>>>>>>>>>> namenode, master, rs, dn2, dn9,
dn10) and copied here what I
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> think
>>>>>>> is
>>>>>>>>>>>>>>> usefull.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There are some strange errors
in the DATANODE2, as an error
>>>>>>>>>>>>>>> copiyng a
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      block
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       to itself.
>>>>>>>>>>>>>>   The gc log points to GC timeout.
However it is very weird that
>>>>>>>>>>>>>> the
>>>>>>> RS
>>>>>>>>>>>>>>>      spend
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       so much time in GC while
in the other cases it takes
>>>>>>>>>>>>>> 0.001sec.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   Besides,
>>>>>>>>>>>>>>> the time
>>>>>>>>>>>>>>> spent, is in sys which makes
me think that might be a problem
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> in
>>>>>>      another
>>>>>>>>>>>>>>>   place.
>>>>>>>>>>>>>>     I know that it is a bunch of
logs, and that it is very
>>>>>>>>>>>>>>
>>>>>>>>>>>>> difficult
>>>>>> to
>>>>>>
>>>>>>>      find
>>>>>>>>>>>>>>>   the
>>>>>>>>>>>>>>     problem without much context.
But I REALLY need some help.
>>>>>>>>>>>>>> If
>>>>>>>>>>>>>>
>>>>>>>>>>>>> it
>>>>>> is
>>>>>>
>>>>>>> not
>>>>>>>>>>>>>>>      the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       solution, at least what
I should read, where I should
>>>>>>>>>>>>>> look, or
>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       cases
>>>>>>>>>>>>>>>       I
>>>>>>>>>>>>>>   should monitor.
>>>>>>>>>>>>>>> Thank you very much,
>>>>>>>>>>>>>>> Pablo Musa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   --
>>>>>>>> *Regards,*
>>>>>>>> --- *Sreepathi *
>>>>>>>>
>>>>>>>>
>>>> --
>>>> Best regards,
>>>>
>>>>       - Andy
>>>>
>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>> (via Tom White)
>>>>


Mime
View raw message