Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 18 Mar 2013 12:06:18 +0000 (UTC)
From: "nkeywal (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12627933.1358372719743.2748.1363608378661@arcas>
In-Reply-To: <JIRA.12627933.1358372719743@arcas>
References: <JIRA.12627933.1358372719743@arcas>
Subject: [jira] [Updated] (HBASE-7590) Add a costless notifications
 mechanism from master to regionservers & clients
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-7590:
---------------------------

    Status: Open  (was: Patch Available)
    
> Add a costless notifications mechanism from master to regionservers & clients
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-7590
>                 URL: https://issues.apache.org/jira/browse/HBASE-7590
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, master, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>         Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to the clients and regionservers. In particular, it would be useful to know globally (regionservers + client apps) that some regionservers are dead. This would allow:
> - to lower the load on the system, since clients would no longer use stale information and contact dead machines
> - to make recovery faster from the client's point of view. It's common to use large timeouts on the client side, so the client may need a lot of time before declaring a region server dead and trying another one. If the client receives information about a region server's state separately, it can make the right decision and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' to tell the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to all servers over TCP), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. If not, nothing should break.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message about the dead servers on a multicast socket. If the socket is not configured, the thread does nothing. On the client side, when we receive information that a node is dead, we refresh the cache entries for it.
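
The design described above can be sketched roughly as follows. This is a minimal illustration only, not the code in the attached patches: the class and method names (DeadServerNotifier, publish, LocationCache, onDeadServers) are hypothetical, and a newline-separated UTF-8 payload stands in for the protobuf message to keep the example self-contained. The sender is best effort and does nothing when no multicast group is configured, matching the "optional" and "should not break anything" requirements.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names and wire format are assumptions, not HBase APIs.
final class DeadServerNotifier {

    // Encodes dead server names as newline-separated UTF-8
    // (a stand-in for the protobuf message mentioned in the issue).
    static byte[] encode(List<String> deadServers) {
        return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
    }

    // Decodes the first `length` bytes of a received datagram.
    static List<String> decode(byte[] data, int length) {
        List<String> servers = new ArrayList<>();
        String text = new String(data, 0, length, StandardCharsets.UTF_8);
        for (String name : text.split("\n")) {
            if (!name.isEmpty()) {
                servers.add(name);
            }
        }
        return servers;
    }

    // Master side: sends one datagram to the multicast group.
    // Returns false without sending when the group is not configured,
    // and swallows I/O errors because the mechanism is best effort.
    static boolean publish(String group, int port, List<String> deadServers) {
        if (group == null || group.isEmpty()) {
            return false; // feature disabled
        }
        byte[] payload = encode(deadServers);
        try (MulticastSocket socket = new MulticastSocket()) {
            InetAddress address = InetAddress.getByName(group);
            socket.send(new DatagramPacket(payload, payload.length, address, port));
            return true;
        } catch (IOException e) {
            return false; // a lost notification must not break anything
        }
    }

    // Client side: a toy region -> server location cache.
    static final class LocationCache {
        private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

        void put(String region, String server) {
            regionToServer.put(region, server);
        }

        String get(String region) {
            return regionToServer.get(region);
        }

        // Drops every cached location that points at a dead server, so the
        // next lookup is forced to refresh. Returns the number of entries removed.
        int onDeadServers(List<String> deadServers) {
            int before = regionToServer.size();
            regionToServer.values().removeIf(deadServers::contains);
            return before - regionToServer.size();
        }
    }
}
```

In this sketch the master would call publish(...) from a scheduled thread every 10 seconds or so, and a client listener thread would pass each decoded packet to onDeadServers(...). A client that never receives a packet simply keeps its existing timeout-based behaviour.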

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira