hbase-user mailing list archives

From Cyril Scetbon <cyril.scet...@free.fr>
Subject hosts unreachables
Date Tue, 29 May 2012 10:13:29 GMT
Hi,

I've installed HBase with the following configuration:

12 x (HBase REST + HBase regionserver + Hadoop datanode)
2 x (ZooKeeper + HBase master)
1 x (ZooKeeper + HBase master + Hadoop namenode)

The OS is Ubuntu Lucid (10.04).

The issue is that when I try to load data using the REST API, some hosts 
become unreachable even though I can still ping them. I can no longer 
connect to them, and even the monitoring tools stop working for a period 
of time. For example, I run SAR on each host, and you can see that between 
7:10 and 7:35 PM the host does not record any information:

06:45:01 PM     all      0.18      0.00      0.37      3.61      0.25     95.58
06:45:01 PM       0      0.24      0.00      0.54      6.62      0.35     92.25
06:45:01 PM       1      0.12      0.00      0.20      0.61      0.15     98.92
06:50:02 PM     all      5.69      0.00      1.79      4.23      1.94     86.36
06:50:02 PM       0      5.68      0.00      3.00      7.91      2.21     81.21
06:50:02 PM       1      5.70      0.00      0.59      0.55      1.66     91.51
06:55:01 PM     all      0.68      0.00      0.14      1.62      0.23     97.33
06:55:01 PM       0      0.87      0.00      0.20      3.19      0.31     95.44
06:55:01 PM       1      0.49      0.00      0.08      0.05      0.15     99.22
06:58:36 PM     all      0.03      0.00      0.02      0.45      0.07     99.43
06:58:36 PM       0      0.01      0.00      0.02      0.40      0.13     99.43
06:58:36 PM       1      0.04      0.00      0.01      0.51      0.00     99.43
07:05:01 PM     all      0.03      0.00      0.00      0.10      0.07     99.80
07:05:01 PM       0      0.02      0.00      0.00      0.10      0.10     99.78
07:05:01 PM       1      0.04      0.00      0.01      0.09      0.03     99.83 <--- last measure before host becomes unreachable
07:40:07 PM     all     14.72      0.00     17.93      0.02     13.31     54.02 <--- first measure after host becomes reachable again
07:40:07 PM       0     29.43      0.00     35.87      0.00     26.57      8.13
07:40:07 PM       1      0.00      0.00      0.00      0.04      0.04     99.91
07:45:01 PM     all      0.55      0.00      0.25      0.04      0.27     98.89
07:45:01 PM       0      0.54      0.00      0.14      0.05      0.21     99.07
07:45:01 PM       1      0.55      0.00      0.36      0.04      0.33     98.72
07:50:01 PM     all      0.11      0.00      0.05      0.18      0.06     99.60
07:50:01 PM       0      0.12      0.00      0.06      0.13      0.09     99.60
07:50:01 PM       1      0.11      0.00      0.04      0.23      0.04     99.59
07:55:01 PM     all      0.00      0.00      0.01      0.05      0.07     99.88
07:55:01 PM       0      0.00      0.00      0.01      0.01      0.13     99.84
07:55:01 PM       1      0.00      0.00      0.00      0.08      0.00     99.91
08:05:01 PM     all      0.01      0.00      0.00      0.00      0.05     99.94
08:05:01 PM       0      0.00      0.00      0.00      0.00      0.08     99.91
08:05:01 PM       1      0.03      0.00      0.00      0.00      0.01     99.96
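
A gap like this can also be spotted automatically rather than by eye. The
following is only a minimal sketch in Python, assuming SAR lines in the
12-hour format shown above; the function name find_gaps and the 15-minute
threshold are illustrative choices of mine, not part of SAR itself, and the
sketch does not handle samples that cross midnight.

```python
from datetime import datetime, timedelta


def find_gaps(lines, max_gap=timedelta(minutes=15)):
    """Return (previous, current) timestamp pairs spaced further apart than max_gap.

    Only the aggregate "all" rows are used, so each sample counts once.
    """
    stamps = []
    for line in lines:
        parts = line.split()
        # Expected shape: "HH:MM:SS AM|PM all <values...>"
        if len(parts) < 3 or parts[2] != "all":
            continue
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1], "%I:%M:%S %p")
        except ValueError:
            continue  # skip anything that is not a timestamped sample
        stamps.append(stamp)
    # Compare each sample with the one that follows it.
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > max_gap]


# A few rows copied from the output above.
sample = [
    "06:55:01 PM     all      0.68      0.00      0.14      1.62      0.23     97.33",
    "07:05:01 PM     all      0.03      0.00      0.00      0.10      0.07     99.80",
    "07:05:01 PM       0      0.02      0.00      0.00      0.10      0.10     99.78",
    "07:40:07 PM     all     14.72      0.00     17.93      0.02     13.31     54.02",
    "07:45:01 PM     all      0.55      0.00      0.25      0.04      0.27     98.89",
]

for prev, cur in find_gaps(sample):
    print(f"no samples between {prev:%I:%M:%S %p} and {cur:%I:%M:%S %p}")
```

The threshold is deliberately loose, because the samples above are not
strictly 5 minutes apart even when the host is healthy.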

I suspect this is caused by high load, but I don't have any proof :( Is 
there a known bug related to this? I had a similar issue with Cassandra 
that forced me to upgrade to a Linux kernel > 3.0.

thanks.

-- 
Cyril SCETBON

