Return-Path: X-Original-To: apmail-hbase-dev-archive@www.apache.org Delivered-To: apmail-hbase-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 62FBA17D99 for ; Mon, 16 Mar 2015 22:55:31 +0000 (UTC) Received: (qmail 14812 invoked by uid 500); 16 Mar 2015 22:55:30 -0000 Delivered-To: apmail-hbase-dev-archive@hbase.apache.org Received: (qmail 14728 invoked by uid 500); 16 Mar 2015 22:55:30 -0000 Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hbase.apache.org Delivered-To: mailing list dev@hbase.apache.org Received: (qmail 14716 invoked by uid 99); 16 Mar 2015 22:55:30 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 16 Mar 2015 22:55:30 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of syuanjiangdev@gmail.com designates 209.85.213.178 as permitted sender) Received: from [209.85.213.178] (HELO mail-ig0-f178.google.com) (209.85.213.178) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 16 Mar 2015 22:55:04 +0000 Received: by igbue6 with SMTP id ue6so57203838igb.1 for ; Mon, 16 Mar 2015 15:52:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=vAL5APiJYTuv6R1/myW5bZUwZnGhSlxIgorp0XKPRk0=; b=xZrbyHfMzL/2lF7sCEBvYD5ZVrqlkl/kfPoree9rO1NCeuh3f7g/0fiFkmh0ENZbQN u+4Hvkh923RxKAmodeQJU2tMSpgginIYOOXsxoaqp9bvIg8zf86xMDJrRV4ZXd5djJcA QcpvCqXfBsG7jok8cQyA9jKeo9SrqdxStJfgts0HcXlALQ/zV6cHMu+KiyxcQ/9NPILz CpdU0c5k/b3fLlff1uGgvjDISHOQ8WcTj8MYUpHunapTP3NMvKNsFZSgSz//ku9KaF3W W+oed0HEVFbBUQ8jhYjq0xRkKFWHkbpqNmV0QLReTLcMAZv+5rX7fA00QnLkmxV1kuP9 vsVQ== MIME-Version: 1.0 X-Received: by 10.107.17.155 with SMTP id 27mr938344ior.64.1426546367089; Mon, 16 Mar 2015 15:52:47 -0700 (PDT) Received: by 10.36.127.87 with HTTP; Mon, 16 Mar 2015 15:52:47 -0700 (PDT) In-Reply-To: References: Date: Mon, 16 Mar 2015 15:52:47 -0700 Message-ID: Subject: Re: Question on EnableTableHandler code From: Stephen Jiang To: Andrey Stepachev Cc: dev@hbase.apache.org Content-Type: multipart/alternative; boundary=001a113ea86677564f05116fb511 X-Virus-Checked: Checked by ClamAV on apache.org --001a113ea86677564f05116fb511 Content-Type: text/plain; charset=UTF-8 Now (1) is under control (HBASE-13254). Let us talk about (2). Looks like we are doing best effort to online all regions of a table during 'enable table' operation. My argument is that we should be consistent with all conditions. Currently, we fail if bulk assignment failed with some reason; but if we don't even do assignment, we declare successful. It is not consistent; if we are doing best effort, then we should always succeed on 'making regions online' operation (with warning messages - by the way, warning message is good for debugging, but client could not see it). Here is the current logic * done = false;* * if (we can find servers to do bulk assignment) {* * if (bulk assignment is complete) {* * done = true; * * } else { // either bulk assignment failed or interrupted * * done = false;* * }* * }* * } else { // bulk plan could not be found* * done = true;* * } * On Mon, Mar 16, 2015 at 11:48 AM, Andrey Stepachev wrote: > Thanks Stephen. > > on (2): I think that much better to guarantee that table was enabled (i.e. > all internal structures reflect that fact and balancer knows about new > table). But result of that could be checked asyncronically from Admin. > Does it make sense? > > On Mon, Mar 16, 2015 at 6:10 PM, Stephen Jiang > wrote: > >> Andrey, I will take care of (1). >> >> And (2) :-) if your guys agree. Because it is not consistent, if the >> bulk assigned failed, we would fail the enabling table; however, if the >> bulk assign not starts, we would enable table with offline regions - really >> inconsistent - we either all fail in those scenarios or all succeed with >> offline region (best effort approach). >> >> Thanks >> Stephen >> >> On Mon, Mar 16, 2015 at 11:01 AM, Andrey Stepachev >> wrote: >> >>> Stephen, would you like to create jira for case (1)? >>> >>> Thank you. >>> >>> On Mon, Mar 16, 2015 at 5:58 PM, Andrey Stepachev >>> wrote: >>> >>> > Thanks Stephen. >>> > >>> > Looks like you are right. For (1) case we really don't need there >>> > state cleanup. That is a bug. Should throw TableNotFoundException. >>> > >>> > As for (2) in case of no online region servers available we could >>> > leave table enabled, but no regions would be assigned. >>> > >>> > Actually that rises good question what enable table means, >>> > i.e. do we really need to guarantee that on table enable absolutely >>> > all regions are online, or that could be done in Admin on client side. >>> > >>> > So for now it seems that Enable handler do what is best, and leave >>> > table enabled but unassigned to be later assigned by Balancer. >>> > >>> > On Mon, Mar 16, 2015 at 5:34 PM, Stephen Jiang < >>> syuanjiangdev@gmail.com> >>> > wrote: >>> > >>> >> I want to make sure that the following logic in EnableTableHandler is >>> >> correct: >>> >> >>> >> (1). In EnableTableHandler#prepare - if the table is not existed, it >>> >> marked >>> >> the table as deleted and not throw exception. The result is the table >>> >> lock >>> >> is released and the caller has no knowledge that the table not exist >>> or >>> >> already deleted, it would continue the next step. >>> >> >>> >> Currently, this would happen during recovery (the caller is >>> >> AssignmentManager#recoverTableInEnablingState()) - however, looking at >>> >> recovery code, it expects TableNotFoundException Should we always >>> throw >>> >> exception - if the table not exist? I want to make sure that I don't >>> >> break >>> >> recovery logic by modifying. >>> >> >>> >> public EnableTableHandler prepare() { >>> >> >>> >> ... >>> >> >>> >> // Check if table exists >>> >> >>> >> if (!MetaTableAccessor.tableExists(this.server.getConnection(), >>> >> tableName)) { >>> >> >>> >> // retainAssignment is true only during recovery. In normal >>> case >>> >> it is false >>> >> >>> >> if (!this.skipTableStateCheck) { >>> >> >>> >> throw new TableNotFoundException(tableName); >>> >> >>> >> } >>> >> >>> >> this.assignmentManager.getTableStateManager().setDeletedTable( >>> >> tableName); >>> >> >>> >> } >>> >> >>> >> ... >>> >> >>> >> } >>> >> (2). In EnableTableHandler#handleEnableTable() - if the bulk assign >>> plan >>> >> could not be find, it would leave regions to be offline and declare >>> enable >>> >> table succeed - i think this is a bug and we should retry or fail - >>> but I >>> >> want to make sure that there are some merit behind this logic >>> >> >>> >> private void handleEnableTable() { >>> >> >>> >> Map> bulkPlan = >>> >> >>> >> this.assignmentManager.getBalancer().retainAssignment( >>> >> regionsToAssign, onlineServers); >>> >> >>> >> if (bulkPlan != null) { >>> >> >>> >> ... >>> >> >>> >> } else { >>> >> >>> >> LOG.info("Balancer was unable to find suitable servers for >>> table " + >>> >> tableName >>> >> >>> >> + ", leaving unassigned"); >>> >> >>> >> done = true; >>> >> >>> >> } >>> >> >>> >> if (done) { >>> >> >>> >> // Flip the table to enabled. >>> >> >>> >> this.assignmentManager.getTableStateManager().setTableState( >>> >> >>> >> this.tableName, TableState.State.ENABLED); >>> >> >>> >> LOG.info("Table '" + this.tableName >>> >> >>> >> + "' was successfully enabled. Status: done=" + done); >>> >> >>> >> } >>> >> >>> >> ... >>> >> >>> >> } >>> >> >>> >> >>> >> thanks >>> >> Stephen >>> >> >>> > >>> > >>> > >>> > -- >>> > Andrey. >>> > >>> >>> >>> >>> -- >>> Andrey. >>> >> >> > > > -- > Andrey. > --001a113ea86677564f05116fb511--