hbase-user mailing list archives

From Thanasis Naskos <anas...@csd.auth.gr>
Subject Re: Newly added regionserver is not serving requests
Date Fri, 04 Oct 2013 11:18:14 GMT
>One possibility could be that the regions got balanced after the write load
>was complete. That means that while the regions were being written they were
>on one RS, and once that was done, some regions got assigned to the idle RS.

I think that this is the case, but why is this wrong? I write the data 
to the database with 3 RS's, and when the write load is finished I add 
one more RS and run the hadoop and hbase load balancers to assign some 
blocks and regions (respectively) to this new node (without adding new 
data)... Shouldn't this work?
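
(By "running the load balancers" I mean roughly the following; exact paths
may differ:

    # HDFS block balancer, run on the master
    hadoop/bin/hadoop balancer -threshold 2

    # HBase region balancer, from the hbase shell
    hbase(main):001:0> balance_switch true
    hbase(main):002:0> balancer
)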

>Are you sure that YCSB writes to the regions after balancing too?

I should have mentioned that once the data is written to the RS's (3 
RS's), YCSB sends only READ requests; it doesn't write/insert/update 
anything else in the database, even after new nodes (RS's) are added.
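
(For context, the read phase is driven by something along these lines; the
workload file and thread count here are illustrative, not my exact
invocation. workloadc is YCSB's stock 100%-read workload:

    bin/ycsb run hbase -P workloads/workloadc -p columnfamily=family -threads 16
)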

>Also, you can run your benchmark now (after the regions are balanced), write
>some data to the regions on the idle RS, and see if it increases the request
>count.

I've tried to add (put) a new row to the database from the shell on an 
idle RS, and the row was inserted properly (I've checked it with "get")... 
but, as expected, nothing changed: I still have 2 idle RS's.
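
(Concretely, the check was along these lines; the row and column names are
just placeholders:

    hbase(main):001:0> put 'usertable', 'testrow1', 'family:field0', 'somevalue'
    hbase(main):002:0> get 'usertable', 'testrow1'
)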

Thank you for your interest!!

On 10/04/2013 12:45 PM, Bharath Vissapragada wrote:
> One possibility could be that the regions got balanced after the write load
> was complete. That means that while the regions were being written they were
> on one RS, and once that was done, some regions got assigned to the idle RS.
>
> Are you sure that YCSB writes to the regions after balancing too?
> Also, you can run your benchmark now (after the regions are balanced), write
> some data to the regions on the idle RS, and see if it increases the request
> count.
>
>
> On Fri, Oct 4, 2013 at 2:37 PM, Thanasis Naskos <anaskos@csd.auth.gr> wrote:
>
>> I'm setting up an HBase cluster on a cloud infrastructure.
>> HBase version: 0.94.11
>> Hadoop version: 1.0.4
>>
>> Currently I have 4 nodes in my cluster (1 master, 3 regionservers) and I'm
>> using YCSB (yahoo benchmarks) to create a table (500,000 rows) and send
>> requests (asynchronous requests). Everything works fine with this setup (I'm
>> monitoring the whole process with ganglia, and I'm getting lambda,
>> throughput, and latency combined with YCSB's output), but the problem
>> occurs when I add a new regionserver on-the-fly: it doesn't get any
>> requests.
>>
>> What "on-the-fly" means:
>> While YCSB is sending requests to the cluster, I'm adding new
>> regionservers using python scripts.
>>
>> Addition Process (while the cluster is serving requests; a condensed
>> sketch of steps 2-6 follows the list):
>>
>> 1. I'm creating a new VM which will act as the new regionserver and
>>     configuring every needed aspect (hbase, hadoop, /etc/hosts, connecting
>>     to the private network, etc)
>> 2. Stopping the **hbase** balancer
>> 3. Configuring every node in the cluster with the new node's information
>>       * adding the hostname to the regionservers file
>>       * adding the hostname to hadoop's slaves file
>>       * adding the hostname and IP to the /etc/hosts file of every node
>>       * etc
>> 4. Executing on the master node:
>>       * `hadoop/bin/start-dfs.sh`
>>       * `hadoop/bin/start-mapred.sh`
>>       * `hbase/bin/start-hbase.sh`
>>         (I've also tried running `hbase-daemon.sh start regionserver` on
>>         the newly added node; it does exactly the same as the last
>>         command - starts the regionserver)
>> 5. Once the newly added node is up and running, I'm executing the
>>     **hadoop** load balancer
>> 6. When the hadoop load balancer stops, I'm starting the **hbase**
>>     load balancer again
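>>
>> A condensed sketch of steps 2-6 as run from the master (paths and exact
>> invocations are illustrative; my actual python scripts differ):
>>
>>     # disable the HBase balancer before changing the cluster
>>     echo "balance_switch false" | hbase/bin/hbase shell
>>     # ... push the new node's hostname/IP to the regionservers, slaves
>>     # and /etc/hosts files on every node ...
>>     hadoop/bin/start-dfs.sh
>>     hadoop/bin/start-mapred.sh
>>     hbase/bin/start-hbase.sh
>>     # rebalance HDFS blocks, then re-enable and trigger the HBase balancer
>>     hadoop/bin/hadoop balancer -threshold 2
>>     echo "balance_switch true" | hbase/bin/hbase shell
>>     echo "balancer" | hbase/bin/hbase shell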
>>
>> I'm connecting over ssh to the master node and checking that the load
>> balancers (hbase/hadoop) did their job; both the blocks and the regions are
>> uniformly spread across all the regionservers/slaves, including the new one.
>> But when I run `status 'simple'` in the hbase shell I see that the new
>> regionservers are not getting any requests. (Below is the output of the
>> command after adding 2 new regionservers, "okeanos-nodes-4/5".)
>>
>> hbase(main):008:0> status 'simple'
>> 5 live servers
>>      okeanos-nodes-1:60020 1380865800330
>>          requestsPerSecond=5379, numberOfOnlineRegions=4, usedHeapMB=175,
>> maxHeapMB=3067
>>      okeanos-nodes-2:60020 1380865800738
>>          requestsPerSecond=5674, numberOfOnlineRegions=4, usedHeapMB=161,
>> maxHeapMB=3067
>>      okeanos-nodes-5:60020 1380867725605
>>          requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=27,
>> maxHeapMB=3067
>>      okeanos-nodes-3:60020 1380865800162
>>          requestsPerSecond=3871, numberOfOnlineRegions=5, usedHeapMB=162,
>> maxHeapMB=3067
>>      okeanos-nodes-4:60020 1380866702216
>>          requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=29,
>> maxHeapMB=3067
>> 0 dead servers
>> Aggregate load: 14924, regions: 19
>>
>> The fact that they don't serve any requests is also evidenced by the CPU
>> usage: on a serving regionserver it is about 70%, while on these 2
>> regionservers it is about 2%.
>>
>> Below is the output of `hadoop dfsadmin -report`; as you can see, the
>> blocks are evenly distributed (according to `hadoop balancer -threshold 2`).
>>
>> root@okeanos-nodes-master:~# /opt/hadoop-1.0.4/bin/hadoop dfsadmin
>> -report
>> Configured Capacity: 105701683200 (98.44 GB)
>> Present Capacity: 86440648704 (80.5 GB)
>> DFS Remaining: 84188446720 (78.41 GB)
>> DFS Used: 2252201984 (2.1 GB)
>> DFS Used%: 2.61%
>> Under replicated blocks: 0
>> Blocks with corrupt replicas: 0
>> Missing blocks: 0
>>
>> -------------------------------------------------
>> Datanodes available: 5 (5 total, 0 dead)
>>
>> Name: 10.0.0.11:50010
>> Decommission Status : Normal
>> Configured Capacity: 21140336640 (19.69 GB)
>> DFS Used: 309166080 (294.84 MB)
>> Non DFS Used: 3851579392 (3.59 GB)
>> DFS Remaining: 16979591168(15.81 GB)
>> DFS Used%: 1.46%
>> DFS Remaining%: 80.32%
>> Last contact: Fri Oct 04 11:30:31 EEST 2013
>>
>>
>> Name: 10.0.0.3:50010
>> Decommission Status : Normal
>> Configured Capacity: 21140336640 (19.69 GB)
>> DFS Used: 531652608 (507.02 MB)
>> Non DFS Used: 3852300288 (3.59 GB)
>> DFS Remaining: 16756383744(15.61 GB)
>> DFS Used%: 2.51%
>> DFS Remaining%: 79.26%
>> Last contact: Fri Oct 04 11:30:32 EEST 2013
>>
>>
>> Name: 10.0.0.5:50010
>> Decommission Status : Normal
>> Configured Capacity: 21140336640 (19.69 GB)
>> DFS Used: 502910976 (479.61 MB)
>> Non DFS Used: 3853029376 (3.59 GB)
>> DFS Remaining: 16784396288(15.63 GB)
>> DFS Used%: 2.38%
>> DFS Remaining%: 79.4%
>> Last contact: Fri Oct 04 11:30:32 EEST 2013
>>
>>
>> Name: 10.0.0.4:50010
>> Decommission Status : Normal
>> Configured Capacity: 21140336640 (19.69 GB)
>> DFS Used: 421974016 (402.43 MB)
>> Non DFS Used: 3852365824 (3.59 GB)
>> DFS Remaining: 16865996800(15.71 GB)
>> DFS Used%: 2%
>> DFS Remaining%: 79.78%
>> Last contact: Fri Oct 04 11:30:29 EEST 2013
>>
>>
>> Name: 10.0.0.10:50010
>> Decommission Status : Normal
>> Configured Capacity: 21140336640 (19.69 GB)
>> DFS Used: 486498304 (463.96 MB)
>> Non DFS Used: 3851759616 (3.59 GB)
>> DFS Remaining: 16802078720(15.65 GB)
>> DFS Used%: 2.3%
>> DFS Remaining%: 79.48%
>> Last contact: Fri Oct 04 11:30:29 EEST 2013
>>
>> I've tried stopping YCSB, restarting the hbase master, and restarting YCSB,
>> but with no luck... these 2 nodes don't serve any requests!
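>>
>> (The master restart was done with the standard daemon script, something
>> like:
>>
>>     hbase/bin/hbase-daemon.sh stop master
>>     hbase/bin/hbase-daemon.sh start master
>> )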
>>
>> As there are many log and conf files, I have created a zip file with the
>> logs and confs (both hbase and hadoop) of the master, a healthy regionserver
>> serving requests, and a regionserver not serving requests:
>> https://dl.dropboxusercontent.com/u/13480502/hbase_hadoop_logs__conf.zip
>>
>> Thank you in advance!!
>>
>>
>

