hbase-user mailing list archives

From "Marcos Ortiz Valmaseda" <mlor...@uci.cu>
Subject Re: hbase-master-server slept
Date Tue, 12 Feb 2013 04:35:42 GMT
Well my friend, my first advice is to upgrade your entire stack:
- Upgrade Hadoop to the 1.x branch
- Upgrade HBase to 0.94.4
- Upgrade Zookeeper to 3.4.5

Or simply upgrade your CDH version to 4.1 or 4.2

----- Original Message -----

From: "So Hibino" <hibino.so@lab.ntt.co.jp>
To: user@hbase.apache.org
Sent: Monday, February 11, 2013 23:06:25
Subject: Re: hbase-master-server slept

Hi, 


>The master doesn't have memstores so this wouldn't help. In fact it's 
>pretty rare that we see the master with GC issues. I recall seeing
>issues with time travelling (machine clock's too slow and ntpd resets 
>it) or on EC2 where sometimes you'd see random machine pauses out of 
>nowhere (although that was a long time ago and haven't used EC2 
>since). 
We don't use EC2, but this server runs on KVM.

The software versions, logs, and configuration files are shown below.

software version 
---------------------------------------- 
HBase version: 0.90.6-cdh3u4 
Hadoop version: 0.20.2+923.256-1 
Zookeeper version: 3.3.5+19.1-1 
Operating System: CentOS release 5.8 
Linux kernel version: 2.6.18-308.el5 
Java version: 1.6.0_31 
---------------------------------------- 

master log 
------------------ 
2013-02-12 00:10:24,309 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: 
Server information: VM_11,60020,1359691508001=3 
2013-02-12 00:10:24,310 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
Skipping load balancing. servers=1 regions=3 average=3.0 mostloaded=3 
leastloaded=3 
2013-02-12 00:10:24,318 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s) 
2013-02-12 00:13:21,105 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
13417ms instead of 1000ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 
2013-02-12 00:13:55,239 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
34132ms instead of 10000ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 
2013-02-12 00:13:55,242 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
24949ms instead of 1000ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 
2013-02-12 00:14:18,441 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
73255ms instead of 60000ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 
2013-02-12 00:14:18,442 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
23203ms instead of 10000ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 
2013-02-12 00:14:18,444 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
14017ms instead of 1000ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 
2013-02-12 00:15:24,358 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: 
Server information: VM_11,60020,1359691508001=3 
2013-02-12 00:15:24,358 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
Skipping load balancing. servers=1 regions=3 average=3.0 mostloaded=3 
leastloaded=3 
2013-02-12 00:15:24,361 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s) 
------------------ 
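For anyone reading the master log above, the overshoot behind each Sleeper warning (time actually slept minus the intended period) can be pulled out with a short script. This is only a sketch: the regular expression is an assumption based on the message format shown in this log, and parse_sleeper_warnings and SAMPLE are hypothetical names, not part of HBase.

```python
import re

# Assumed pattern, derived from the warning text in the log above.
# \s+ is used between words because the mail client wrapped the lines.
SLEEPER_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) WARN \S*Sleeper:"
    r"\s+We slept\s+(\d+)ms\s+instead of\s+(\d+)ms"
)


def parse_sleeper_warnings(text):
    """Return (timestamp, slept_ms, expected_ms, overshoot_ms) per warning."""
    rows = []
    for ts, slept, expected in SLEEPER_RE.findall(text):
        slept_ms, expected_ms = int(slept), int(expected)
        rows.append((ts, slept_ms, expected_ms, slept_ms - expected_ms))
    return rows


# First three warnings copied from the master log, wrapped as in the mail.
SAMPLE = """\
2013-02-12 00:13:21,105 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
13417ms instead of 1000ms, this is likely due to a long garbage collecting
pause and it's usually bad, see
2013-02-12 00:13:55,239 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
34132ms instead of 10000ms, this is likely due to a long garbage collecting
pause and it's usually bad, see
2013-02-12 00:13:55,242 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
24949ms instead of 1000ms, this is likely due to a long garbage collecting
pause and it's usually bad, see
"""

if __name__ == "__main__":
    for ts, slept_ms, expected_ms, overshoot_ms in parse_sleeper_warnings(SAMPLE):
        print(ts, "slept", slept_ms, "expected", expected_ms, "overshoot", overshoot_ms)
```

Comparing the overshoot rather than the raw sleep time makes warnings from threads with different periods (1000ms, 10000ms, 60000ms) directly comparable.
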


master GC log 
------------------ 
2013-02-11T23:46:37.285+0900: 902498.189: [GC 902498.189: [DefNew: 
17041K->16K(19136K), 0.0017450 secs] 20049K->3025K(83008K) icms_dc=0 , 
0.0018270 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2013-02-12T00:35:25.628+0900: 905426.532: [GC 905426.532: [DefNew: 
17040K->18K(19136K), 0.0017430 secs] 20049K->3026K(83008K) icms_dc=0 , 
0.0018370 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 
2013-02-12T01:20:26.110+0900: 908127.014: [GC 908127.014: [DefNew: 
17034K->27K(19136K), 0.0023420 secs] 20043K->3036K(83008K) icms_dc=0 , 
0.0025090 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
------------------- 
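The master GC log above can be checked the same way. The sketch below extracts the wall-clock ("real") time of each collection; the regular expression is an assumption based on the entry format shown here, and gc_real_pauses and SAMPLE are hypothetical names. Note also that the log records no collection between 23:46:37 and 00:35:25, which brackets the 00:13 to 00:14 warnings if both logs use the same local time.

```python
import re

# Assumed pattern for the "[Times: ... real=N secs]" suffix of each GC entry.
REAL_RE = re.compile(r"real=(\d+\.\d+) secs")


def gc_real_pauses(text):
    """Return the wall-clock duration in seconds of every GC entry found."""
    return [float(value) for value in REAL_RE.findall(text)]


# The three entries from the master GC log, wrapped as in the mail.
SAMPLE = """\
2013-02-11T23:46:37.285+0900: 902498.189: [GC 902498.189: [DefNew:
17041K->16K(19136K), 0.0017450 secs] 20049K->3025K(83008K) icms_dc=0 ,
0.0018270 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2013-02-12T00:35:25.628+0900: 905426.532: [GC 905426.532: [DefNew:
17040K->18K(19136K), 0.0017430 secs] 20049K->3026K(83008K) icms_dc=0 ,
0.0018370 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2013-02-12T01:20:26.110+0900: 908127.014: [GC 908127.014: [DefNew:
17034K->27K(19136K), 0.0023420 secs] 20043K->3036K(83008K) icms_dc=0 ,
0.0025090 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
"""

if __name__ == "__main__":
    pauses = gc_real_pauses(SAMPLE)
    print("collections:", len(pauses), "longest pause (s):", max(pauses))
```

If the longest logged pause is a small fraction of a second, a garbage collection pause is an unlikely explanation for sleeps that overshoot by more than ten seconds.
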


region log 
------------------- 
2013-02-12 00:00:09,968 DEBUG 
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=1.64 MB, 
free=197.94 MB, max=199.59 MB, blocks=3, accesses=3022, hits=3015, 
hitRatio=99.76%%, cachingAccesses=3015, cachingHits=3012, 
cachingHitsRatio=99.90%%, evictions=0, evicted=0, evictedPerRun=NaN 
2013-02-12 00:05:09,971 DEBUG 
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=1.64 MB, 
free=197.94 MB, max=199.59 MB, blocks=3, accesses=3023, hits=3016, 
hitRatio=99.76%%, cachingAccesses=3016, cachingHits=3013, 
cachingHitsRatio=99.90%%, evictions=0, evicted=0, evictedPerRun=NaN 
2013-02-12 00:10:12,109 DEBUG 
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=1.64 MB, 
free=197.94 MB, max=199.59 MB, blocks=3, accesses=3024, hits=3017, 
hitRatio=99.76%%, cachingAccesses=3017, cachingHits=3014, 
cachingHitsRatio=99.90%%, evictions=0, evicted=0, evictedPerRun=NaN 
2013-02-12 00:15:09,969 DEBUG 
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=1.64 MB, 
free=197.94 MB, max=199.59 MB, blocks=3, accesses=3025, hits=3018, 
hitRatio=99.76%%, cachingAccesses=3018, cachingHits=3015, 
cachingHitsRatio=99.90%%, evictions=0, evicted=0, evictedPerRun=NaN 
2013-02-12 00:20:09,970 DEBUG 
org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=1.64 MB, 
free=197.94 MB, max=199.59 MB, blocks=3, accesses=3026, hits=3019, 
hitRatio=99.76%%, cachingAccesses=3019, cachingHits=3016, 
cachingHitsRatio=99.90%%, evictions=0, evicted=0, evictedPerRun=NaN 
------------------- 


region GC log 
------------------ 
2013-02-11T22:31:11.315+0900: 897964.350: [GC 897964.350: [DefNew: 
17062K->35K(19136K), 0.0036000 secs] 40262K->23234K(83008K) icms_dc=0 , 
0.0037710 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2013-02-12T00:27:13.313+0900: 904926.348: [GC 904926.348: [DefNew: 
17059K->43K(19136K), 0.0020250 secs] 40258K->23243K(83008K) icms_dc=0 , 
0.0021130 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2013-02-12T02:23:52.114+0900: 911925.149: [GC 911925.149: [DefNew: 
17067K->43K(19136K), 0.0018170 secs] 40267K->23243K(83008K) icms_dc=0 , 
0.0019330 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 
--------------------- 


zookeeper log 
------------------- 
No log entries at that time
------------------- 


hbase-site.xml 
---------------------- 
<configuration> 
<property> 
<name>hbase.cluster.distributed</name> 
<value>true</value> 
</property> 
<property> 
<name>hbase.rootdir</name> 
<value>/var/lib/hbase/cache/${user.name}/root</value> 
</property> 
<property> 
<name>hbase.tmp.dir</name> 
<value>/var/lib/hbase/cache/${user.name}/tmp</value> 
</property> 
<property> 
<name>hbase.zookeeper.quorum</name> 
<value>VM_11</value> 
</property> 
</configuration> 
------------------------ 


zoo.cfg 
---------------------- 
tickTime=2000 
initLimit=10 
syncLimit=5 
dataDir=/var/zookeeper 
clientPort=2181 
server.0=VM_11:2888:3888 
---------------------- 
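The zoo.cfg above sets tickTime but no explicit session timeout limits. Assuming ZooKeeper's documented defaults apply to this build (minSessionTimeout = 2 x tickTime, maxSessionTimeout = 20 x tickTime), the range a session timeout can be negotiated to can be worked out as in this sketch. parse_zoo_cfg and session_timeout_bounds_ms are hypothetical helper names, not ZooKeeper APIs.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from a zoo.cfg-style file into a dict of strings."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg


def session_timeout_bounds_ms(cfg):
    """Return (min, max) session timeout in ms.

    Assumption: when not set explicitly, the limits default to
    2 * tickTime and 20 * tickTime.
    """
    tick = int(cfg["tickTime"])
    lower = int(cfg.get("minSessionTimeout", 2 * tick))
    upper = int(cfg.get("maxSessionTimeout", 20 * tick))
    return lower, upper


# Settings copied from the zoo.cfg in this message.
ZOO_CFG = """\
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper
clientPort=2181
server.0=VM_11:2888:3888
"""

if __name__ == "__main__":
    lower, upper = session_timeout_bounds_ms(parse_zoo_cfg(ZOO_CFG))
    print("session timeout range (ms):", lower, "to", upper)
```

Under that assumption, a stall longer than the upper bound would be expected to expire the ZooKeeper session, so this range is a useful yardstick for the pauses reported in the master log.
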



-- 
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-master-server-slept-tp4038192p4038406.html

Sent from the HBase User mailing list archive at Nabble.com. 



-- 

Marcos Ortiz Valmaseda, 
Product Manager && Data Scientist at UCI 
Blog : http://marcosluis2186.posterous.com 
LinkedIn: http://www.linkedin.com/in/marcosluis2186 
Twitter : @marcosluis2186 
