hbase-user mailing list archives

From Cyril Scetbon <cyril.scet...@free.fr>
Subject Re: hosts unreachables
Date Wed, 06 Jun 2012 09:59:00 GMT
I forgot to mention that we're running on Amazon EC2 instances. Is there a 
known issue with that environment?

On 5/29/12 5:17 PM, Cyril Scetbon wrote:
> Hi,
>
> I've installed HBase with the following configuration:
>
> 12 x (rest hbase + regionserver hbase + datanode hadoop)
> 2 x (zookeeper + hbase master)
> 1 x (zookeeper + hbase master + namenode hadoop)
>
> The OS is Ubuntu Lucid (10.04).
>
> The issue is that when I try to load data using the REST API, some hosts 
> become unreachable even though I can still ping them. I can no longer 
> connect to them, and even monitoring tools stop working for a period of 
> time. For example, I run SAR on each host, and you can see that between 
> 7:10 and 7:35 PM the host did not record any data:
>
> 06:45:01 PM     all      0.18      0.00      0.37      3.61      0.25     95.58
> 06:45:01 PM       0      0.24      0.00      0.54      6.62      0.35     92.25
> 06:45:01 PM       1      0.12      0.00      0.20      0.61      0.15     98.92
> 06:50:02 PM     all      5.69      0.00      1.79      4.23      1.94     86.36
> 06:50:02 PM       0      5.68      0.00      3.00      7.91      2.21     81.21
> 06:50:02 PM       1      5.70      0.00      0.59      0.55      1.66     91.51
> 06:55:01 PM     all      0.68      0.00      0.14      1.62      0.23     97.33
> 06:55:01 PM       0      0.87      0.00      0.20      3.19      0.31     95.44
> 06:55:01 PM       1      0.49      0.00      0.08      0.05      0.15     99.22
> 06:58:36 PM     all      0.03      0.00      0.02      0.45      0.07     99.43
> 06:58:36 PM       0      0.01      0.00      0.02      0.40      0.13     99.43
> 06:58:36 PM       1      0.04      0.00      0.01      0.51      0.00     99.43
> 07:05:01 PM     all      0.03      0.00      0.00      0.10      0.07     99.80
> 07:05:01 PM       0      0.02      0.00      0.00      0.10      0.10     99.78
> 07:05:01 PM       1      0.04      0.00      0.01      0.09      0.03     99.83 <--- last measure before host becomes unreachable
> 07:40:07 PM     all     14.72      0.00     17.93      0.02     13.31     54.02 <--- new measure after host becomes reachable
> 07:40:07 PM       0     29.43      0.00     35.87      0.00     26.57      8.13
> 07:40:07 PM       1      0.00      0.00      0.00      0.04      0.04     99.91
> 07:45:01 PM     all      0.55      0.00      0.25      0.04      0.27     98.89
> 07:45:01 PM       0      0.54      0.00      0.14      0.05      0.21     99.07
> 07:45:01 PM       1      0.55      0.00      0.36      0.04      0.33     98.72
> 07:50:01 PM     all      0.11      0.00      0.05      0.18      0.06     99.60
> 07:50:01 PM       0      0.12      0.00      0.06      0.13      0.09     99.60
> 07:50:01 PM       1      0.11      0.00      0.04      0.23      0.04     99.59
> 07:55:01 PM     all      0.00      0.00      0.01      0.05      0.07     99.88
> 07:55:01 PM       0      0.00      0.00      0.01      0.01      0.13     99.84
> 07:55:01 PM       1      0.00      0.00      0.00      0.08      0.00     99.91
> 08:05:01 PM     all      0.01      0.00      0.00      0.00      0.05     99.94
> 08:05:01 PM       0      0.00      0.00      0.00      0.00      0.08     99.91
> 08:05:01 PM       1      0.03      0.00      0.00      0.00      0.01     99.96
>
> I suppose it's caused by high load, but I don't have any proof :( Is 
> there a known bug related to this? I had a similar issue with Cassandra 
> that forced me to upgrade to a Linux kernel newer than 3.0.
>
> thanks.
>
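
For anyone who wants to spot this kind of blackout across many hosts, here
is a minimal Python sketch that scans SAR text output for missing samples.
It is only an illustration: the function name find_gaps is hypothetical,
it assumes the 12-hour "HH:MM:SS AM/PM" timestamps and the "all" rows shown
in the quoted output, and the 10-minute threshold is an arbitrary choice
for a 5-minute sampling interval. It does not handle output that crosses
midnight.

```python
from datetime import datetime, timedelta

# A few "all" rows (plus one per-CPU row) copied from the quoted SAR output.
SAMPLE = """\
06:55:01 PM     all      0.68      0.00      0.14      1.62      0.23     97.33
06:58:36 PM     all      0.03      0.00      0.02      0.45      0.07     99.43
07:05:01 PM     all      0.03      0.00      0.00      0.10      0.07     99.80
07:05:01 PM       0      0.02      0.00      0.00      0.10      0.10     99.78
07:40:07 PM     all     14.72      0.00     17.93      0.02     13.31     54.02
07:45:01 PM     all      0.55      0.00      0.25      0.04      0.27     98.89
"""

TIME_FORMAT = "%I:%M:%S %p"


def find_gaps(text, max_interval=timedelta(minutes=10)):
    """Return (before, after) timestamp pairs separated by more than max_interval.

    Hypothetical helper: only the aggregate "all" rows are used to follow
    the sampling clock, and all timestamps are assumed to be on the same day.
    """
    stamps = []
    for line in text.splitlines():
        # Tolerate mail-style quoting ("> ") in front of each row.
        fields = line.lstrip("> ").split()
        if len(fields) < 3 or fields[2] != "all":
            continue
        stamps.append(datetime.strptime(" ".join(fields[:2]), TIME_FORMAT))

    gaps = []
    for before, after in zip(stamps, stamps[1:]):
        if after - before > max_interval:
            gaps.append((before.strftime(TIME_FORMAT), after.strftime(TIME_FORMAT)))
    return gaps


if __name__ == "__main__":
    for before, after in find_gaps(SAMPLE):
        print("no samples between", before, "and", after)
```

Running it on each host's SAR text report and comparing the reported
windows would show whether the hosts stall at the same time or
independently.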


-- 
Cyril SCETBON

