Date: Thu, 12 Oct 2017 20:28:02 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-18998) processor.getRowsToLock() always assumes there is some row being locked

    [ https://issues.apache.org/jira/browse/HBASE-18998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202576#comment-16202576 ]

Josh Elser commented on HBASE-18998:
------------------------------------

This is a bit strange looking (why, you might ask, is Phoenix calling {{mutateRowsWithLocks}} without specifying any rows to lock?). The Phoenix CP MetaDataEndpointImpl has already grabbed the row locks it needs.

So here, when it's actually submitting the mutations that it's built, it knows it has already exclusively locked that row (necessary to prevent concurrent, conflicting DDL operations), so it provides no row locks. Pretty simple fix from our side that just looks a little weird.
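For illustration only, here is a minimal sketch of that call pattern (hypothetical class and method names, not the actual MetaDataEndpointImpl source), assuming the HBase 1.x {{Region}} API:

{code}
import java.io.IOException;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.regionserver.Region;

// Hypothetical helper: the coprocessor is assumed to have already taken an
// exclusive lock on the header row (to serialize DDL), so it deliberately
// hands mutateRowsWithLocks an empty rowsToLock collection.
public final class MetadataMutationHelper {

  private MetadataMutationHelper() {
  }

  public static void mutateAlreadyLockedRows(Region region, List<Mutation> mutations)
      throws IOException {
    // Empty collection: the region takes no additional row locks, because the
    // caller already holds the only lock that matters.
    region.mutateRowsWithLocks(mutations, Collections.<byte[]>emptyList(),
        HConstants.NO_NONCE, HConstants.NO_NONCE);
  }
}
{code}

Passing an empty collection here is deliberate, not an oversight: the lock was acquired earlier in the coprocessor's own control flow.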
> processor.getRowsToLock() always assumes there is some row being locked
> -----------------------------------------------------------------------
>
>                 Key: HBASE-18998
>                 URL: https://issues.apache.org/jira/browse/HBASE-18998
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>
> During testing, we observed the following exception:
> {code}
> 2017-10-12 02:52:26,683|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|1/1 DROP TABLE testTable;
> 2017-10-12 02:52:30,320|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|17/10/12 02:52:30 WARN ipc.CoprocessorRpcChannel: Call failed on IOException
> 2017-10-12 02:52:30,320|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: TESTTABLE: null
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:93)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.dropTable(MetaDataEndpointImpl.java:1671)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:14347)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7849)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1980)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1962)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32389)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|Caused by: java.util.NoSuchElementException
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at java.util.Collections$EmptyIterator.next(Collections.java:4189)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.HRegion.processRowsWithLocks(HRegion.java:7137)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6980)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.mutateRowsWithLocks(MetaDataEndpointImpl.java:1966)
> 2017-10-12 02:52:30,323|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.dropTable(MetaDataEndpointImpl.java:1650)
> {code}
> Here is code from branch-1.1:
> {code}
> if (!mutations.isEmpty() && !walSyncSuccessful) {
>   LOG.warn("Wal sync failed. Roll back " + mutations.size() +
>       " memstore keyvalues for row(s):" + StringUtils.byteToHexString(
>       processor.getRowsToLock().iterator().next()) + "...");
> {code}
> The assumption that processor.getRowsToLock().iterator() would always be non-empty was wrong.
> In other branches, taking the iterator seems to have the same issue.
> Thanks to [~elserj] who spotted this issue.
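To make the failure mode concrete: the quoted branch-1.1 snippet calls iterator().next() on whatever processor.getRowsToLock() returns, and with an empty collection java.util throws exactly the NoSuchElementException seen in the stack trace above. A minimal, self-contained demonstration, with one possible shape of a guard (hypothetical names; not necessarily the committed patch):

{code}
import java.util.Collection;
import java.util.Collections;
import java.util.NoSuchElementException;

public class EmptyRowsToLockDemo {

  public static void main(String[] args) {
    // What the coprocessor effectively passes when it already holds the lock:
    Collection<byte[]> rowsToLock = Collections.emptyList();

    // Unguarded access, as in the quoted branch-1.1 logging code:
    try {
      byte[] first = rowsToLock.iterator().next();
      System.out.println("first row: " + toHex(first));
    } catch (NoSuchElementException e) {
      // Collections$EmptyIterator.next() throws, matching the stack trace.
      System.out.println("empty iterator throws: " + e);
    }

    // A guarded version never touches the iterator when nothing was locked:
    String rowDesc = rowsToLock.isEmpty()
        ? "(no rows locked)"
        : toHex(rowsToLock.iterator().next()) + "...";
    System.out.println("Wal sync failed. Roll back N memstore keyvalues for row(s): " + rowDesc);
  }

  private static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }
}
{code}

Any fix along these lines only has to check isEmpty() before logging the first locked row; the rollback itself does not depend on the row locks.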
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)