hbase-issues mailing list archives

From "Francis Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
Date Tue, 25 Nov 2014 07:35:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224150#comment-14224150 ]

Francis Liu commented on HBASE-6721:

Enis, thanks for the review:

Will it be better to have one-to-many relationship from servers to groups? All servers will
have the default group, as well as any more groups that it is assigned. We can do the same
thing for tables as well. By default, tables will belong to group default, but can be added
I skimmed through the YARN-796 doc. To me it seems the notions of YARN labels and RS groups are
very different. Labels are intended to add attributes to certain resources (e.g. gpu, high-memory,
etc.) so you can ask for specific attributes when requesting resources. Groups, as
mentioned in this jira, are meant to identify a logical grouping of resources that guarantees
an amount of resources/capacity for a tenant. We have not seen a need for overlapping server groups to date. What
do we gain from doing the same for tables? It seems to me the purpose of groups has no need
for overlaps either way. Please correct me if I missed something.
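
To make the non-overlapping model concrete, here is a minimal sketch in which every server and every table belongs to exactly one group. All class and method names are hypothetical and are not taken from the attached patches; it only illustrates the idea, not the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; these names do not come from the HBASE-6721 patch.
public class GroupSketch {
    public static final String DEFAULT_GROUP = "default";

    // One group per server and one group per table, so groups cannot overlap.
    private final Map<String, String> serverToGroup = new HashMap<>();
    private final Map<String, String> tableToGroup = new HashMap<>();

    // A newly registered server starts out in the default group.
    public void addServer(String server) {
        serverToGroup.put(server, DEFAULT_GROUP);
    }

    // Moving a server overwrites its previous membership.
    public void moveServer(String server, String group) {
        serverToGroup.put(server, group);
    }

    public void moveTable(String table, String group) {
        tableToGroup.put(table, group);
    }

    // Tables that were never moved belong to the default group.
    public String groupOfTable(String table) {
        return tableToGroup.getOrDefault(table, DEFAULT_GROUP);
    }

    // Servers that may host regions of the given table: those in the table's group.
    public SortedSet<String> candidateServers(String table) {
        String group = groupOfTable(table);
        SortedSet<String> result = new TreeSet<>();
        for (Map.Entry<String, String> e : serverToGroup.entrySet()) {
            if (e.getValue().equals(group)) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

With a one-to-many model the two maps would have to hold sets of groups, and the candidate lookup would need a rule for servers shared between tenants, which is where the capacity guarantee gets harder to reason about.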

For all new features, I think it is fair to request one single source of truth and also some
more transactional guarantees for operations. If possible, we should not use zk at all (for
caching as done in patch). Instead we can just open the region from master. This is different
than the co-location discussion for master and meta. This table is tiny, and not query'able
from clients. The data is just there for the master to access. If we do this, we do not even
have to have a cache.
I have thought of this as well. We can do this, but it would require handling the group table in
a special way, i.e. make sure it is assigned after meta and before any other table, have a groupWAL,
have its own set of handlers similar to meta, etc., as well as have a special case for making
sure meta and root end up in their designated groups even when the group table hasn't been assigned
yet. So we're essentially trading one set of complexities for another. Having said that, I'm
ok with either approach. Should I write up a patch with this approach?
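
As a rough sketch of the ordering constraint only (meta first, then the group table, then everything else), something like the following could decide assignment order. The group table name and all method names here are assumptions made for illustration; the special WAL, handlers, and the meta/root placement case are not covered.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration only; not code from the HBASE-6721 patch.
public class AssignmentOrderSketch {
    public static final String META = "hbase:meta";
    // Assumed name for the table holding group information.
    public static final String GROUP_TABLE = "hbase:rsgroup";

    // Lower value means the table is assigned earlier.
    public static int priority(String table) {
        if (table.equals(META)) {
            return 0;
        }
        if (table.equals(GROUP_TABLE)) {
            return 1;
        }
        return 2;
    }

    // Returns the tables sorted by priority, then by name for a stable result.
    public static List<String> assignmentOrder(Collection<String> tables) {
        List<String> ordered = new ArrayList<>(tables);
        ordered.sort((a, b) -> {
            int byPriority = Integer.compare(priority(a), priority(b));
            return byPriority != 0 ? byPriority : a.compareTo(b);
        });
        return ordered;
    }
}
```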

On a related note, region servers consume group information from ZK, similar to security ACLs.
This is used in replication and other features we are working on. As mentioned, it is possible
to remove the caching usage of ZK, but it would be good to keep the information in there until
the facility that security ACLs will eventually migrate to is available. Thoughts?

I am ok with bringing this as a core functionality rather than CP + LB. The reasoning is that
as clusters grow bigger, more users might be interested in this for better isolation.
Great, with this we can do the approach mentioned earlier if need be.

> RegionServer Group based Assignment
> -----------------------------------
>                 Key: HBASE-6721
>                 URL: https://issues.apache.org/jira/browse/HBASE-6721
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Francis Liu
>            Assignee: Vandana Ayyalasomayajula
>         Attachments: 6721-master-webUI.patch, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf,
HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721_10.patch, HBASE-6721_8.patch,
HBASE-6721_9.patch, HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721_94_2.patch,
HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, HBASE-6721_94_4.patch, HBASE-6721_94_5.patch,
HBASE-6721_94_6.patch, HBASE-6721_94_7.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk.patch,
HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, HBASE-6721_trunk2.patch
> In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving
out regions from a number of different tables owned by various client applications. Being
able to group a subset of running RegionServers and assign specific tables to it provides
a client application a level of isolation and resource allocation.
> The proposal essentially is to have an AssignmentManager which is aware of RegionServer
groups and assigns tables to region servers based on groupings. Load balancing will occur
on a per group basis as well. 
> This is essentially a simplification of the approach taken in HBASE-4120. See attached

This message was sent by Atlassian JIRA
