Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 6 Mar 2013 02:46:14 +0000 (UTC)
From: "Jeffrey Zhong (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12607449.1347540746920.388585.1362537974238@arcas>
In-Reply-To: <JIRA.12607449.1347540746920@arcas>
References: <JIRA.12607449.1347540746920@arcas>
Subject: [jira] [Assigned] (HBASE-6772) Make the Distributed Split HDFS
 Location aware
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HBASE-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeffrey Zhong reassigned HBASE-6772:
------------------------------------

    Assignee: Jeffrey Zhong
    
> Make the Distributed Split HDFS Location aware
> ----------------------------------------------
>
>                 Key: HBASE-6772
>                 URL: https://issues.apache.org/jira/browse/HBASE-6772
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: Jeffrey Zhong
>
> During an hlog split, each log file (a single HDFS block) is allocated to a different region server. This region server reads the file and creates the recovery edit files.
> The allocation to the region server is random. We could take into account the locations of the log file to split:
> - the reads would be local, hence faster. This allows short-circuit reads as well.
> - less network i/o used during a failure (and this is important)
> - we would be sure to read from a working datanode, hence we're sure we won't have read errors. Read errors slow the split process a lot, as we often enter the "timeouted world". 
> We need to limit the calls to the namenode however.
> Typical algo could be:
> - the master gets the locations of the hlog files
> - it writes them into ZK, if possible in one transaction (this way all the tasks are visible all together, allowing some arbitration by the region servers).
> - when the regionserver receives the event, it checks for all logs and all locations.
> - if there is a match, it takes it
> - if not, it waits something like 0.2s (to give other regionservers time to take it if the location matches), then takes any remaining task.
> Drawbacks are:
> - a 0.2s delay added if there is no regionserver available on one of the locations. It's likely possible to remove it with some extra synchronization.
> - a small increase in complexity and in dependency on HDFS
> Considering the advantages, it's worth it imho.
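
The regionserver side of the algorithm described above could be sketched roughly as follows. This is only an illustration in Python, not HBase code: SplitTask, TaskBoard, and pick_task are hypothetical names, and the in-memory TaskBoard stands in for the ZK znodes the master would publish. It is single-threaded, so it does not model the real race between regionservers.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SplitTask:
    log_path: str                  # hlog file to split
    locations: frozenset           # hosts holding a replica of the file's HDFS block
    owner: Optional[str] = None    # regionserver that claimed the task, if any


class TaskBoard:
    """Hypothetical in-memory stand-in for the split tasks written to ZK."""

    def __init__(self, tasks):
        # All tasks are published together, mirroring the "one transaction" idea.
        self._tasks = {t.log_path: t for t in tasks}

    def unclaimed(self):
        return [t for t in self._tasks.values() if t.owner is None]

    def try_claim(self, task, host):
        # A real implementation would need an atomic claim; this one is not.
        if task.owner is None:
            task.owner = host
            return True
        return False


def pick_task(board, host, grace_seconds=0.2, sleep=time.sleep):
    """Prefer a log with a local replica; otherwise wait, then take any task."""
    for task in board.unclaimed():
        if host in task.locations and board.try_claim(task, host):
            return task
    # No local match: leave time for regionservers that do have a local replica.
    sleep(grace_seconds)
    for task in board.unclaimed():
        if board.try_claim(task, host):
            return task
    return None
```

In this sketch a regionserver with a local replica claims its task immediately, while one without a match only pays the 0.2s delay mentioned in the drawbacks before falling back to any remaining task.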
