hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: regions and tables
Date Thu, 01 Dec 2011 23:34:01 GMT
Excellent question.

I would say that if you are planning to have thousands of tables with
the same schema then instead you should use one table with prefixed
rows.
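
As a rough illustration of what "prefixed rows" could look like, here is a minimal sketch in plain Python (no HBase client involved). The separator byte and the names make_row_key and tenant_scan_range are assumptions made up for this example, not part of any HBase API; the idea is only that each former table name becomes a row key prefix, and a scan bounded by that prefix sees just that "table".

```python
from typing import Tuple

# Assumed delimiter between the prefix and the original row id.
# It must never occur inside a prefix, or ranges would overlap.
SEPARATOR = b"\x00"


def make_row_key(tenant: str, row_id: str) -> bytes:
    """Build a row key of the form <tenant><separator><row_id>."""
    prefix = tenant.encode("utf-8")
    if SEPARATOR in prefix:
        raise ValueError("tenant name must not contain the separator byte")
    return prefix + SEPARATOR + row_id.encode("utf-8")


def tenant_scan_range(tenant: str) -> Tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering every key of one tenant.

    start_row is inclusive and stop_row is exclusive, which is how
    byte-ordered range scans are usually expressed.
    """
    prefix = tenant.encode("utf-8")
    start_row = prefix + SEPARATOR
    # The smallest key that sorts after every key starting with start_row.
    stop_row = prefix + bytes([SEPARATOR[0] + 1])
    return start_row, stop_row


if __name__ == "__main__":
    start, stop = tenant_scan_range("orders")
    key = make_row_key("orders", "row1")
    print(key, start <= key < stop)
```

Because row keys sort as raw bytes, all rows sharing a prefix are contiguous, so the regions of the single table end up covering the former tables in key order.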

The 20 regions / region server is a general guideline that works best
in the single tenant case, meaning that you have only 1 table and it's
perfectly distributed. My first answer brings you back to that form.
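
For reference, the arithmetic in the question quoted below can be sketched as follows. It treats 20 regions per region server as a hard budget split evenly across tables, which is an assumption made purely for illustration; the function name is hypothetical.

```python
def regions_per_table(region_servers: int, regions_per_server: int, tables: int) -> int:
    """Evenly divide a cluster-wide region budget across tables.

    Assumes every table gets the same share, which real workloads
    rarely do.
    """
    if tables <= 0:
        raise ValueError("tables must be positive")
    total_regions = region_servers * regions_per_server
    return total_regions // tables


if __name__ == "__main__":
    # The two cases from the question: 5 region servers, 20 regions each.
    print(regions_per_table(5, 20, 5))
    print(regions_per_table(5, 20, 20))
```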

In the multi-tenant case where every table is different not only in
the nature of the data they contain but also in their usage patterns,
the answer basically is YMMV. There really is no universal answer at
the moment. At SU we have >250 tables and ~200 regions per
region server, and that works well for us.
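
Where tables do stay separate, each one is created on its own with its own list of split points, as Lars says in the quoted message below. Here is a minimal sketch of computing such a list in plain Python. It assumes row keys are spread uniformly over their first byte (for example because they start with a hash), which may well not hold for your data; the function name is hypothetical and nothing here calls HBase.

```python
from typing import List


def even_split_points(num_regions: int) -> List[bytes]:
    """Return num_regions - 1 one-byte split keys dividing 0x00..0xFF evenly.

    Assumes keys are uniformly distributed over the first byte; with
    skewed keys the resulting regions would be unevenly loaded.
    """
    if not 1 <= num_regions <= 256:
        raise ValueError("num_regions must be between 1 and 256")
    return [bytes([(i * 256) // num_regions]) for i in range(1, num_regions)]


if __name__ == "__main__":
    # One list per table; each table would be created with its own list.
    for table, regions in {"table_a": 4, "table_b": 20}.items():
        print(table, len(even_split_points(regions)))
```

A table with N regions needs N - 1 split points, so a single-region table gets an empty list.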

J-D

On Thu, Dec 1, 2011 at 12:26 PM, Sam Seigal <selekt86@yahoo.com> wrote:
> So is it fair to say that the number of tables one can create is also
> bounded by the number of regions that the cluster can support ?
>
> For example, given 5 region servers  and keeping 20 regions / region
> server - with 5 tables, I am restricted to only being able to scale a
> single table to 20 regions across the cluster - this might be fine.
> However, for 20 tables, I can only scale up to 5 regions / table across
> the cluster - which might not be a good idea.  Comments ?
>
>
> On Thu, Dec 1, 2011 at 5:31 AM, Doug Meil <doug.meil@explorysmedical.com> wrote:
>> To expand on what Lars said, there is an example of how this is laid out
>> on disk...
>>
>> http://hbase.apache.org/book.html#trouble.namenode.disk
>>
>> ... regions distribute the table, so two different tables will be
>> distributed by separate sets of regions.
>>
>>
>>
>>
>> On 12/1/11 3:14 AM, "Lars George" <lars.george@gmail.com> wrote:
>>
>>>Hi Sam,
>>>
>>>You need to handle them all separately. The note - I assume - was solely
>>>explaining the fact that the "load" of a region server is defined by the
>>>number of regions it hosts, not the number of tables. Whether you want to
>>>precreate the regions for one table or for several, it is the same work:
>>>create the tables (one by one) with the list of split points.
>>>
>>>Lars
>>>
>>>On Dec 1, 2011, at 7:50 AM, Sam Seigal wrote:
>>>
>>>> Hi,
>>>>
>>>> I had a question about the relationship  between regions and tables.
>>>>
>>>> Is there a way to pre-create regions for multiple tables ? or each
>>>> table has its own set of regions managed independently ?
>>>>
>>>> I read on one of the threads that there is really no limit on the
>>>> number of tables, but that what we need to be careful about is the number
>>>> of regions. Does this mean that the regions can be pre created for
>>>> multiple tables ?
>>>>
>>>> Thank you,
>>>>
>>>> Sam
>>>
>>>
>>
>>
