hbase-issues mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18998) processor.getRowsToLock() always assumes there is some row being locked
Date Thu, 12 Oct 2017 20:28:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202576#comment-16202576 ]

Josh Elser commented on HBASE-18998:
------------------------------------

This is a bit strange looking (why, you might ask, is Phoenix calling {{mutateRowsWithLocks}} without specifying any rows to lock?).

The Phoenix CP MetaDataEndpointImpl has already acquired the row locks it needs. So here, when it actually submits the mutations it has built, it knows it already holds an exclusive lock on that row (necessary to prevent concurrent, conflicting DDL operations), and it provides no row locks. A pretty simple fix on our side that just looks a little weird.
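
For illustration, the Phoenix-side call then looks roughly like the following. This is a hedged sketch against the branch-1 {{Region}} API, not the actual Phoenix patch; the class and method names are made up for the example, and the earlier lock acquisition is elided:

{code}
import java.io.IOException;
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.regionserver.Region;

final class MetaDataMutationSketch {
  // Sketch only: the coprocessor already took its row locks earlier in the
  // DDL path, so it hands the region an empty rowsToLock collection rather
  // than asking it to lock anything.
  static void commitMutations(Region region, List<Mutation> mutations)
      throws IOException {
    region.mutateRowsWithLocks(mutations, Collections.<byte[]>emptySet(),
        HConstants.NO_NONCE, HConstants.NO_NONCE);
  }
}
{code}

Under this scheme, {{processor.getRowsToLock()}} is legitimately empty, which is exactly what the {{HRegion}} code quoted below does not expect.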

> processor.getRowsToLock() always assumes there is some row being locked
> -----------------------------------------------------------------------
>
>                 Key: HBASE-18998
>                 URL: https://issues.apache.org/jira/browse/HBASE-18998
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>
> During testing, we observed the following exception:
> {code}
> 2017-10-12 02:52:26,683|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|1/1 DROP TABLE testTable;
> 2017-10-12 02:52:30,320|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|17/10/12 02:52:30 WARN ipc.CoprocessorRpcChannel: Call failed on IOException
> 2017-10-12 02:52:30,320|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: TESTTABLE: null
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:93)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.dropTable(MetaDataEndpointImpl.java:1671)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:14347)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7849)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1980)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1962)
> 2017-10-12 02:52:30,321|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32389)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|Caused by: java.util.NoSuchElementException
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at java.util.Collections$EmptyIterator.next(Collections.java:4189)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.HRegion.processRowsWithLocks(HRegion.java:7137)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:6980)
> 2017-10-12 02:52:30,322|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.mutateRowsWithLocks(MetaDataEndpointImpl.java:1966)
> 2017-10-12 02:52:30,323|INFO|MainThread|machine.py:164 - run()||GUID=f4cd2a25-3040-41cc-b423-9ec7990048f4|at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.dropTable(MetaDataEndpointImpl.java:1650)
> {code}
> Here is the code from branch-1.1:
> {code}
>         if (!mutations.isEmpty() && !walSyncSuccessful) {
>           LOG.warn("Wal sync failed. Roll back " + mutations.size() +
>               " memstore keyvalues for row(s):" + StringUtils.byteToHexString(
>               processor.getRowsToLock().iterator().next()) + "...");
> {code}
> The assumption that processor.getRowsToLock().iterator() would always be non-empty was wrong.
> In other branches, the iterator is taken the same way, so they appear to have the same issue.
> Thanks to [~elserj], who spotted this issue.
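
For reference, one way the HBase side could be hardened is to check for an empty lock collection before taking the iterator. A minimal sketch of that idea (not the committed patch):

{code}
// Sketch only, not the committed fix: tolerate an empty getRowsToLock(),
// which is legal when the caller (e.g. a coprocessor) already holds its
// row locks and passed none in.
if (!mutations.isEmpty() && !walSyncSuccessful) {
  String rowDesc = processor.getRowsToLock().isEmpty()
      ? "(no rows locked by processor)"
      : StringUtils.byteToHexString(
            processor.getRowsToLock().iterator().next()) + "...";
  LOG.warn("Wal sync failed. Roll back " + mutations.size()
      + " memstore keyvalues for row(s): " + rowDesc);
}
{code}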



