hbase-user mailing list archives

From Jonathan Gray <jl...@streamy.com>
Subject Re: Adding/Removing regionservers
Date Thu, 02 Jul 2009 18:31:14 GMT
That's how TableInputFormat works.  You get a Map task per Region, which
is basically a Scanner per Region.  And those can all run in parallel.
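
To make the "one Scanner per Region, all in parallel" idea concrete, here is a
minimal sketch in plain Python. It is a toy model only and does not use the
HBase API: the in-memory table (ROWS), the region boundaries (REGIONS), and the
helpers scan_region and parallel_scan are all hypothetical names made up for
illustration.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for a table: (row_key, value) pairs sorted by row key.
ROWS = [(f"row{i:04d}", i) for i in range(1000)]
KEYS = [key for key, _ in ROWS]

# Hypothetical region boundaries. Each region covers [start, end);
# "" means "from the first row" and None means "to the last row".
REGIONS = [
    ("", "row0250"),
    ("row0250", "row0500"),
    ("row0500", "row0750"),
    ("row0750", None),
]


def scan_region(start, end):
    """Scan one region's key range, like a single scanner (or one map task)."""
    lo = bisect_left(KEYS, start)
    hi = len(KEYS) if end is None else bisect_left(KEYS, end)
    return ROWS[lo:hi]


def parallel_scan(regions, workers=4):
    """Run one scan per region concurrently; results come back in region order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda region: scan_region(*region), regions)
        return [row for part in parts for row in part]
```

Calling parallel_scan(REGIONS) returns the same rows as one full scan of ROWS,
but each key range is read by its own worker. In a real cluster the gain comes
from those ranges living on different regionservers, which this sketch does not
model.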

What is the query you are trying to optimize?  Is it a scan?  How many
rows, what kind of rows, are you specifying start/stop rows, using filters, etc.?

llpind wrote:
> Thanks for the response.  Yes that makes sense, and it is the answer I
> expected, but needed to make sure from guys here.
> 
> I understand that, in theory, having thousands of scanners is expected to improve
> performance.  Has this been done in practice, or is your statement purely
> for explanation purposes?  Given a single query (or nested query), how can I
> split up the work per scanner?  What other methodologies can I use to
> improve from a single point on-demand query time?
> 
> 
> Jonathan Gray-2 wrote:
>> Adding additional regionservers does not directly impact scan 
>> performance if there is no other load.  A region only lives on a single 
>> server so scanning a region is limited to the speed of that server.
>>
>> The more servers you have, the more nodes you distribute your dataset
>> over.
>>
>> A single scanner in isolation will run approximately the same on a small 
>> cluster vs a large cluster.  But if you have thousands of scanners 
>> swamping a cluster of 10 machines vs 100 machines, you expect ~10x 
>> speedup with 10x the nodes.  Of course that is a simplification but the 
>> goal of a system like HBase is to scale linearly with the number of nodes.
>>
>> Make sense?
>>
>> JG
>>
>> llpind wrote:
>>> Is there a relation between # of regionservers to performance of
>>> scanners?
>>>
>>> Say I have 6 boxes and I up to 12, will I see much improvement?
>>>
>>>
>>> Jonathan Gray-2 wrote:
>>>> Depends on the use case.
>>>>
>>>> Generally most clusters colocate DataNodes and RegionServers, so an 
>>>> equal number.
>>>>
>>>>
>>>> llpind wrote:
>>>>> Okay looks like it worked that way.  
>>>>>
>>>>> Thanks.
>>>>>
>>>>> What is recommended?  datanodes to regionservers?   Is it best to have
>>>>> equal # or diff?
>>>>>
>>>>>
>>>>>
>>>>> Jonathan Gray-2 wrote:
>>>>>> You do not need to restart the system to add or remove regionservers
>>>>>> (or hdfs datanodes).
>>>>>>
>>>>>> You should update your files, yes.  But to add a new regionserver, set
>>>>>> up all of the configuration files like the other nodes, and then start
>>>>>> it up:
>>>>>>
>>>>>> $HBASE_HOME/bin/hbase-daemon.sh start regionserver
>>>>>>
>>>>>>
>>>>>> Without seeing what was happening prior to the Scanner expiration in
>>>>>> the regionserver logs, it's hard to tell what the problem is.  Can you
>>>>>> post the full logs somewhere?
>>>>>>
>>>>>> JG
>>>>>>
>>>>>> llpind wrote:
>>>>>>> Hey Guys,
>>>>>>>
>>>>>>> What do I need to do to add regionservers?  I thought all I had to do
>>>>>>> was modify the regionservers file.  I've added another box, and updated
>>>>>>> the regionservers file (and Hadoop slaves), restarted hadoop/hbase
>>>>>>> cluster, and receive the following when trying to access data via
>>>>>>> scanner:
>>>>>>>
>>>>>>> 09/07/02 09:10:02 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 0 time(s).
>>>>>>> 09/07/02 09:10:03 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 1 time(s).
>>>>>>> 09/07/02 09:10:04 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 2 time(s).
>>>>>>> 09/07/02 09:10:05 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 3 time(s).
>>>>>>> 09/07/02 09:10:06 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 4 time(s).
>>>>>>> 09/07/02 09:10:07 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 5 time(s).
>>>>>>> 09/07/02 09:10:08 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 6 time(s).
>>>>>>> 09/07/02 09:10:09 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 7 time(s).
>>>>>>> 09/07/02 09:10:10 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 8 time(s).
>>>>>>> 09/07/02 09:10:11 INFO ipc.HBaseClient: Retrying connect to server:
>>>>>>> /192.168.0.222:60020. Already tried 9 time(s).
>>>>>>> 09/07/02 09:10:11 INFO ipc.HbaseRPC: Server at /192.168.0.222:60020
>>>>>>> not available yet, Zzzzz...
>>>>>>>
>>>>>>>
>>>>>>> on the box: =======================================================
>>>>>>>
>>>>>>> 2009-07-02 08:59:59,265 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 2646315558964059972 lease expired
>>>>>>> 2009-07-02 08:59:59,272 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 1810370986171440259 lease expired
>>>>>>> 2009-07-02 08:59:59,273 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -4857942742243144770 lease expired
>>>>>>> 2009-07-02 08:59:59,288 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 3123162276497438016 lease expired
>>>>>>> 2009-07-02 08:59:59,289 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 608738648389654018 lease expired
>>>>>>> 2009-07-02 08:59:59,296 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -2521940152150106398 lease expired
>>>>>>> 2009-07-02 08:59:59,297 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 8091669741182809260 lease expired
>>>>>>> 2009-07-02 08:59:59,304 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 7639149275918227842 lease expired
>>>>>>> 2009-07-02 08:59:59,305 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -4243449001964507288 lease expired
>>>>>>> 2009-07-02 08:59:59,336 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 1015989167671983355 lease expired
>>>>>>> 2009-07-02 08:59:59,365 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 3137801875907893411 lease expired
>>>>>>> 2009-07-02 08:59:59,365 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -8325694931774559300 lease expired
>>>>>>> 2009-07-02 08:59:59,372 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -8936585826880409980 lease expired
>>>>>>> 2009-07-02 08:59:59,373 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -908187178943291682 lease expired
>>>>>>> 2009-07-02 08:59:59,380 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 2654166457442359069 lease expired
>>>>>>> 2009-07-02 08:59:59,381 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -6211248022427672712 lease expired
>>>>>>> 2009-07-02 08:59:59,381 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 3974694446665362397 lease expired
>>>>>>> 2009-07-02 08:59:59,388 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 7443667809791934770 lease expired
>>>>>>> 2009-07-02 08:59:59,389 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableA,954242,1245722360640
>>>>>>> 2009-07-02 08:59:59,390 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableA,4603057,1245718674937
>>>>>>> 2009-07-02 08:59:59,390 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableB,DATA\x7DATA\x20DATA\x7C1113950,1245707506310
>>>>>>> 2009-07-02 08:59:59,391 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableB,NAME\x7DATA,\x20DATA\x7C440700,1245711086612
>>>>>>> 2009-07-02 08:59:59,391 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableC,NAME\x7DATA,\x20DATA\x7C1304895,1246039336150
>>>>>>> 2009-07-02 08:59:59,391 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableA,9320921,1245722360640
>>>>>>> 2009-07-02 08:59:59,392 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableB,,1245707233007
>>>>>>> 2009-07-02 08:59:59,392 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableA,706403,1245720721671
>>>>>>> 2009-07-02 08:59:59,393 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableB,DATA\x7CTV\x20DATA\x7C1624424,1245711676808
>>>>>>> 2009-07-02 08:59:59,393 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableA,3809913,1245718111373
>>>>>>> 2009-07-02 09:00:02,049 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -2051187117514067432 lease expired
>>>>>>> 2009-07-02 09:00:02,057 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -5436750646911643669 lease expired
>>>>>>> 2009-07-02 09:00:02,057 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 3703189751852364720 lease expired
>>>>>>> 2009-07-02 09:00:02,065 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -1804107906626216054 lease expired
>>>>>>> 2009-07-02 09:00:02,065 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>>>>>> Closed tableA,1578094,1245716374054
>>>>>>> 2009-07-02 09:00:03,693 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 9131595104858116429 lease expired
>>>>>>> 2009-07-02 09:00:03,693 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 8992759827427237274 lease expired
>>>>>>> 2009-07-02 09:00:03,693 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -2019578407247616091 lease expired
>>>>>>> 2009-07-02 09:00:03,709 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> 3596045071882952182 lease expired
>>>>>>> 2009-07-02 09:00:03,753 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -7795892844843445855 lease expired
>>>>>>> 2009-07-02 09:00:03,789 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -3894684572515627828 lease expired
>>>>>>> 2009-07-02 09:00:12,649 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
>>>>>>> -2055258947739971980 lease expired
>>>>>>> 2009-07-02 09:00:12,650 INFO org.apache.hadoop.hbase.Leases:
>>>>>>> regionserver/0.0.0.0:60020.leaseChecker closing leases
>>>>>>> 2009-07-02 09:00:12,650 INFO org.apache.hadoop.hbase.Leases:
>>>>>>> regionserver/0.0.0.0:60020.leaseChecker closed leases
>>>>>>> 2009-07-02 09:02:29,906 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: compactions
>>>>>>> no longer limited
>>
> 
