Date: Wed, 5 Dec 2012 19:35:58 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1956757064.64586.1354736158753.JavaMail.jiratomcat@arcas>
In-Reply-To: <555659583.45965.1354297919182.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

    [ https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510712#comment-13510712 ]

stack commented on HBASE-7247:
------------------------------

bq. Implicitly, it means we still have a race condition here, just that the probability is quite low.

Yeah. By-product of our keeping state across multiple systems (up in zk and then some state in .META.). We could change this to a checkAndPut. Read .META. at start of the opening or have master pass over the .META. timestamp or something key to .META.
and we'd use it doing a checkAndSet into the .META. table... that would be stricter than this updating of zk.

bq. It would be a huge simplification imho. It's worth trying, I would say. It actually makes sense to do it now, because once the current trunk code is production proven, touching it will be scarier.

I'd go along. We should discuss out on dev first. I have short-term memory. I'm currently of the opinion that this expensive facility of master failing an open because it has been taking too long on a particular regionserver has been of no use -- worse, it has only caused headache -- but I may be just not remembering and others out on the dev list will have better recall than I.

> Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-7247
>                 URL: https://issues.apache.org/jira/browse/HBASE-7247
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, Region Assignment, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: 7247.v1.patch
>
>
> The regionserver.OpenRegionHandler#tickleOpening updates the region znode with the comment "Do this so master doesn't timeout this region-in-transition.".
> However, on the usual test, this makes the assignment time of 1500 regions go from 70s to 100s, that is, we're 50% slower because of this.
> More generally, ZooKeeper commits every data update to disk, and this takes time. Using it to provide a keep-alive seems overkill. At the very least, it could be made asynchronous.
> I'm not sure how necessary these updates are (I need to go deeper into the internals, feedback welcome), but it seems very important to optimize this... The trivial fix would be to make this optional.
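
A minimal sketch of the checkAndPut idea from the comment above, assuming a plain in-memory map as a stand-in for .META.. The class, the method names and the marker strings are all hypothetical and this is not the HBase client API; it only shows the shape of the check: the region server completes the open only if the row still holds the marker it was handed when the open started.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Toy model of a conditional update on a region's row in .META.
 * Every name here is hypothetical; this is not HBase code.
 */
final class MetaCheckAndPutSketch {
    // Stand-in for .META.: region name -> current state marker.
    private final ConcurrentMap<String, String> meta = new ConcurrentHashMap<>();

    /** Master records the marker (e.g. a timestamp) it hands to the opener. */
    void assign(String region, String marker) {
        meta.put(region, marker);
    }

    /**
     * Region server finishes the open only if the row still holds the
     * marker it was given at the start. If the row changed in between
     * (for example the master reassigned the region), the update is refused.
     */
    boolean completeOpen(String region, String expectedMarker, String openedMarker) {
        // Atomic compare-and-replace: succeeds only on an exact match.
        return meta.replace(region, expectedMarker, openedMarker);
    }

    /** Current marker stored for the region, or null if none. */
    String current(String region) {
        return meta.get(region);
    }

    public static void main(String[] args) {
        MetaCheckAndPutSketch sketch = new MetaCheckAndPutSketch();
        sketch.assign("region-1", "OPENING@ts1");
        // Master reassigns before the first server finishes its open.
        sketch.assign("region-1", "OPENING@ts2");
        // The first server holds a stale marker, so its update is refused.
        System.out.println(sketch.completeOpen("region-1", "OPENING@ts1", "OPEN@rs1"));
        // The second server holds the current marker, so its update goes through.
        System.out.println(sketch.completeOpen("region-1", "OPENING@ts2", "OPEN@rs2"));
    }
}
```

In this model the race is closed by the single atomic compare-and-replace at the end of the open, rather than by periodic znode updates while the open is in progress.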