hbase-user mailing list archives

From Cyril Scetbon <cyril.scet...@free.fr>
Subject hosts unreachables
Date Tue, 29 May 2012 14:25:18 GMT
Hi,

I've installed HBase with the following configuration:

12 x (HBase REST server + HBase RegionServer + Hadoop DataNode)
2 x (ZooKeeper + HBase Master)
1 x (ZooKeeper + HBase Master + Hadoop NameNode)

The OS is Ubuntu Lucid (10.04).

The issue is that when I try to load data using the REST API, some hosts 
become unreachable even though I can still ping them. I can no longer 
connect to them, and even monitoring tools stop working for a period of 
time. For example, I run sar on each host, and you can see that between 
7:10 and 7:35 PM this host does not record any information:

06:45:01 PM     all      0.18      0.00      0.37      3.61      0.25     95.58
06:45:01 PM       0      0.24      0.00      0.54      6.62      0.35     92.25
06:45:01 PM       1      0.12      0.00      0.20      0.61      0.15     98.92
06:50:02 PM     all      5.69      0.00      1.79      4.23      1.94     86.36
06:50:02 PM       0      5.68      0.00      3.00      7.91      2.21     81.21
06:50:02 PM       1      5.70      0.00      0.59      0.55      1.66     91.51
06:55:01 PM     all      0.68      0.00      0.14      1.62      0.23     97.33
06:55:01 PM       0      0.87      0.00      0.20      3.19      0.31     95.44
06:55:01 PM       1      0.49      0.00      0.08      0.05      0.15     99.22
06:58:36 PM     all      0.03      0.00      0.02      0.45      0.07     99.43
06:58:36 PM       0      0.01      0.00      0.02      0.40      0.13     99.43
06:58:36 PM       1      0.04      0.00      0.01      0.51      0.00     99.43
07:05:01 PM     all      0.03      0.00      0.00      0.10      0.07     99.80
07:05:01 PM       0      0.02      0.00      0.00      0.10      0.10     99.78
07:05:01 PM       1      0.04      0.00      0.01      0.09      0.03     99.83 <--- last measure before host becomes unreachable
07:40:07 PM     all     14.72      0.00     17.93      0.02     13.31     54.02 <--- first measure after host becomes reachable again
07:40:07 PM       0     29.43      0.00     35.87      0.00     26.57      8.13
07:40:07 PM       1      0.00      0.00      0.00      0.04      0.04     99.91
07:45:01 PM     all      0.55      0.00      0.25      0.04      0.27     98.89
07:45:01 PM       0      0.54      0.00      0.14      0.05      0.21     99.07
07:45:01 PM       1      0.55      0.00      0.36      0.04      0.33     98.72
07:50:01 PM     all      0.11      0.00      0.05      0.18      0.06     99.60
07:50:01 PM       0      0.12      0.00      0.06      0.13      0.09     99.60
07:50:01 PM       1      0.11      0.00      0.04      0.23      0.04     99.59
07:55:01 PM     all      0.00      0.00      0.01      0.05      0.07     99.88
07:55:01 PM       0      0.00      0.00      0.01      0.01      0.13     99.84
07:55:01 PM       1      0.00      0.00      0.00      0.08      0.00     99.91
08:05:01 PM     all      0.01      0.00      0.00      0.00      0.05     99.94
08:05:01 PM       0      0.00      0.00      0.00      0.00      0.08     99.91
08:05:01 PM       1      0.03      0.00      0.00      0.00      0.01     99.96
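
Gaps like the one above can be located automatically across all hosts. The following is only a minimal sketch, not something from my actual setup: the function name find_gaps and the 10-minute threshold are assumptions (twice the roughly 5-minute sampling interval visible in the output), and it only reads the timestamp of each "all" row, so it does not depend on the exact column layout.

```python
import re
from datetime import datetime, timedelta

# Matches the start of a sar row for the "all" CPU line,
# e.g. "07:05:01 PM     all ...". Only the timestamp is captured.
ROW = re.compile(r"^(\d{2}:\d{2}:\d{2} [AP]M)\s+all\b")


def find_gaps(lines, max_gap=timedelta(minutes=10)):
    """Return (previous, current) timestamp pairs further apart than max_gap.

    Assumption: all rows belong to the same day, so a run that crosses
    midnight is not handled by this sketch.
    """
    stamps = []
    for line in lines:
        match = ROW.match(line.strip())
        if match:
            stamps.append(datetime.strptime(match.group(1), "%I:%M:%S %p"))
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > max_gap]


# A few "all" rows copied from the sar output above.
sample = """\
06:55:01 PM     all      0.68      0.00      0.14      1.62      0.23     97.33
06:58:36 PM     all      0.03      0.00      0.02      0.45      0.07     99.43
07:05:01 PM     all      0.03      0.00      0.00      0.10      0.07     99.80
07:40:07 PM     all     14.72      0.00     17.93      0.02     13.31     54.02
07:45:01 PM     all      0.55      0.00      0.25      0.04      0.27     98.89
"""

gaps = find_gaps(sample.splitlines())
for before, after in gaps:
    minutes = int((after - before).total_seconds() // 60)
    print(f"{before:%H:%M:%S} -> {after:%H:%M:%S} ({minutes} min without samples)")
```

Run against the saved sar output of each host, this would list the windows in which the host stopped recording, which makes it easier to correlate them with the REST load.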

I suppose it's caused by high load, but I don't have any proof :( Is 
there a known bug that matches this? I had a similar issue with Cassandra 
that forced me to upgrade to a Linux kernel > 3.0.

Thanks.

-- 
Cyril SCETBON

