Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
Received-SPF: pass (nike.apache.org: domain of syuanjiangdev@gmail.com
 designates 209.85.213.178 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAADCt4KVD5kJM0cQnu_fYWSWoiv1OC6j-6B9PEX5AAZsuQ2npQ@mail.gmail.com>
References: 
 <CAPQQKxd1xBUDDPSdYU-oA9P02yYj0AW3XH_Z3XT0xWzoG+5vYg@mail.gmail.com>
	<CAADCt4+7GEGUMcauKjNjqep0k4VwfOJdWDoMUcv7ex0UzFohEw@mail.gmail.com>
	<CAADCt4J5vvopvaZjbd+yeJbeqCrSvRRLVLEqngeOg1k4HSrYSw@mail.gmail.com>
	<CAPQQKxcDxwacFs8uE5b-KdNsNiR1mRzvyXex04crCL=EeXJtuQ@mail.gmail.com>
	<CAADCt4KVD5kJM0cQnu_fYWSWoiv1OC6j-6B9PEX5AAZsuQ2npQ@mail.gmail.com>
Date: Mon, 16 Mar 2015 15:52:47 -0700
Message-ID: 
 <CAPQQKxdD7Ch9qG3+H8Aq_NOz7BVwoKdwPpuCODqLbXX_7EW0pg@mail.gmail.com>
Subject: Re: Question on EnableTableHandler code
From: Stephen Jiang <syuanjiangdev@gmail.com>
To: Andrey Stepachev <octo47@gmail.com>
Cc: dev@hbase.apache.org
Content-Type: multipart/alternative; boundary=001a113ea86677564f05116fb511

--001a113ea86677564f05116fb511
Content-Type: text/plain; charset=UTF-8

Now (1) is under control (HBASE-13254).  Let us talk about (2).  Looks like
we are doing best effort to online all regions of a table during 'enable
table' operation.  My argument is that we should be consistent with all
conditions.  Currently, we fail if bulk assignment failed with some reason;
but if we don't even do assignment, we declare successful.  It is not
consistent; if we are doing best effort, then we should always succeed on
'making regions online' operation (with warning messages - by the way,
warning message is good for debugging, but client could not see it).

Here is the current logic
*    done = false;*
*    if (we can find servers to do bulk assignment) {*
*       if (bulk assignment is complete) {*
*          done = true; *
*        } else { // either bulk assignment failed or interrupted   *
*          done = false;*
*        }*
*      }*
*    } else { // bulk plan could not be found*
*      done = true;*
*    }  *


On Mon, Mar 16, 2015 at 11:48 AM, Andrey Stepachev <octo47@gmail.com> wrote:

> Thanks Stephen.
>
> on (2): I think that much better to guarantee that table was enabled (i.e.
> all internal structures reflect that fact and balancer knows about new
> table). But result of that could be checked asyncronically from Admin.
> Does it make sense?
>
> On Mon, Mar 16, 2015 at 6:10 PM, Stephen Jiang <syuanjiangdev@gmail.com>
> wrote:
>
>> Andrey, I will take care of (1).
>>
>> And (2) :-) if your guys agree.  Because it is not consistent, if the
>> bulk assigned failed, we would fail the enabling table; however, if the
>> bulk assign not starts, we would enable table with offline regions - really
>> inconsistent - we either all fail in those scenarios or all succeed with
>> offline region (best effort approach).
>>
>> Thanks
>> Stephen
>>
>> On Mon, Mar 16, 2015 at 11:01 AM, Andrey Stepachev <octo47@gmail.com>
>> wrote:
>>
>>> Stephen, would you like to create jira for case (1)?
>>>
>>> Thank you.
>>>
>>> On Mon, Mar 16, 2015 at 5:58 PM, Andrey Stepachev <octo47@gmail.com>
>>> wrote:
>>>
>>> > Thanks Stephen.
>>> >
>>> > Looks like you are right. For (1) case we really don't need there
>>> > state cleanup. That is a bug. Should throw TableNotFoundException.
>>> >
>>> > As for (2) in case of no online region servers available we could
>>> > leave table enabled, but no regions would be assigned.
>>> >
>>> > Actually that rises good question what enable table means,
>>> > i.e. do we really need to guarantee that on table enable absolutely
>>> > all regions are online, or that could be done in Admin on client side.
>>> >
>>> > So for now it seems that Enable handler do what is best, and leave
>>> > table enabled but unassigned to be later assigned by Balancer.
>>> >
>>> > On Mon, Mar 16, 2015 at 5:34 PM, Stephen Jiang <
>>> syuanjiangdev@gmail.com>
>>> > wrote:
>>> >
>>> >> I want to make sure that the following logic in EnableTableHandler is
>>> >> correct:
>>> >>
>>> >> (1). In EnableTableHandler#prepare - if the table is not existed, it
>>> >> marked
>>> >> the table as deleted and not throw exception.  The result is the table
>>> >> lock
>>> >> is released and the caller has no knowledge that the table not exist
>>> or
>>> >> already deleted, it would continue the next step.
>>> >>
>>> >> Currently, this would happen during recovery (the caller is
>>> >> AssignmentManager#recoverTableInEnablingState()) - however, looking at
>>> >> recovery code, it expects TableNotFoundException  Should we always
>>> throw
>>> >> exception - if the table not exist?  I want to make sure that I don't
>>> >> break
>>> >> recovery logic by modifying.
>>> >>
>>> >> public EnableTableHandler prepare() {
>>> >>
>>> >> ...
>>> >>
>>> >> // Check if table exists
>>> >>
>>> >>       if (!MetaTableAccessor.tableExists(this.server.getConnection(),
>>> >> tableName)) {
>>> >>
>>> >>         // retainAssignment is true only during recovery.  In normal
>>> case
>>> >> it is false
>>> >>
>>> >>         if (!this.skipTableStateCheck) {
>>> >>
>>> >>           throw new TableNotFoundException(tableName);
>>> >>
>>> >>         }
>>> >>
>>> >>         this.assignmentManager.getTableStateManager().setDeletedTable(
>>> >> tableName);
>>> >>
>>> >>       }
>>> >>
>>> >> ...
>>> >>
>>> >> }
>>> >> (2). In EnableTableHandler#handleEnableTable() - if the bulk assign
>>> plan
>>> >> could not be find, it would leave regions to be offline and declare
>>> enable
>>> >> table succeed - i think this is a bug and we should retry or fail -
>>> but I
>>> >> want to make sure that there are some merit behind this logic
>>> >>
>>> >>   private void handleEnableTable() {
>>> >>
>>> >>     Map<ServerName, List<HRegionInfo>> bulkPlan =
>>> >>
>>> >>         this.assignmentManager.getBalancer().retainAssignment(
>>> >> regionsToAssign, onlineServers);
>>> >>
>>> >>     if (bulkPlan != null) {
>>> >>
>>> >>           ...
>>> >>
>>> >>       } else {
>>> >>
>>> >>       LOG.info("Balancer was unable to find suitable servers for
>>> table " +
>>> >> tableName
>>> >>
>>> >>           + ", leaving unassigned");
>>> >>
>>> >>       done = true;
>>> >>
>>> >>     }
>>> >>
>>> >>     if (done) {
>>> >>
>>> >>       // Flip the table to enabled.
>>> >>
>>> >>       this.assignmentManager.getTableStateManager().setTableState(
>>> >>
>>> >>         this.tableName, TableState.State.ENABLED);
>>> >>
>>> >>       LOG.info("Table '" + this.tableName
>>> >>
>>> >>       + "' was successfully enabled. Status: done=" + done);
>>> >>
>>> >>    }
>>> >>
>>> >>      ...
>>> >>
>>> >> }
>>> >>
>>> >>
>>> >> thanks
>>> >> Stephen
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Andrey.
>>> >
>>>
>>>
>>>
>>> --
>>> Andrey.
>>>
>>
>>
>
>
> --
> Andrey.
>

--001a113ea86677564f05116fb511--