phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-3111) Possible Deadlock/delay while building index, upsert select, delete rows at server
Date Thu, 28 Jul 2016 22:17:20 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398303#comment-15398303 ]

James Taylor edited comment on PHOENIX-3111 at 7/28/16 10:17 PM:
-----------------------------------------------------------------

Thanks again for the patch, [~rajeshbabu] & [~sergey.soldatov]. Here are some comments/questions:
- What sets {{isRegionClosing}} back to false if it's set to true in preClose()? Since the
region is closing, do we always get a new instance of the UngroupedAggregateRegionObserver,
so that when the region opens again the flag is initialized to false? Or should we also explicitly
set {{isRegionClosing}} to false in preOpen() or postOpen()? Is there any chance that it would get
stuck in a true state (i.e. can preClose() complete and then the close not actually happen)?
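To make this concrete, here is a rough sketch of the explicit-reset idea. Only the {{isRegionClosing}}
name comes from the patch; the volatile boolean type and the exact hooks are assumptions for illustration.
{code}
// Sketch only (not from the patch): assumes the flag is a volatile boolean
// field on UngroupedAggregateRegionObserver.
private volatile boolean isRegionClosing = false;

@Override
public void preClose(ObserverContext<RegionCoprocessorEnvironment> c,
        boolean abortRequested) throws IOException {
    // Signal server-side writers (index build, UPSERT SELECT, DELETE) to stop.
    isRegionClosing = true;
}

@Override
public void postOpen(ObserverContext<RegionCoprocessorEnvironment> c) {
    // Explicitly reset in case the same observer instance serves the region when it re-opens.
    isRegionClosing = false;
}
{code}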
- We need more comments to explain this (even the mighty Lars is having a hard time - imagine
an HBase novice like me trying to understand it). Can we have Sergey's very nice comment from
above as javadoc around the declaration of the lock object? Plus, add to it a (5), (6), (7)...
that describes at a high level the approach taken to solve this.
- How about adding an {{@GuardedBy("lock")}} to the {{scansReferenceCount}} declaration?
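For illustration, the annotated declaration might look roughly like this (the field type and
initializer are assumptions, not taken from the patch):
{code}
@GuardedBy("lock")
private int scansReferenceCount = 0;
{code}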
- What does {{blockingMemStoreSize}} represent exactly? Is it an upper bound on how many bytes
can accumulate before the memstore is considered full? Please add a comment.
{code}
+        final long blockingMemStoreSize = flushSize * (
+                conf.getLong(HConstants.HREGION_MEMSTORE_BLOCK_MULTIPLIER,
+                        HConstants.DEFAULT_HREGION_MEMSTORE_BLOCK_MULTIPLIER)-1) ;
{code}
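As a worked example, using common defaults that may differ per cluster: with a 128 MB flush size
and a memstore block multiplier of 4, this would be 128 MB * (4 - 1) = 384 MB.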
- Then this code will essentially delay writing to the memstore for up to 3 seconds while we're
over this threshold. Is this to give the flush a chance to happen (since that's what will cause
the memstore size to decrease)? It would be good to document this more. Will this throttling
occur in the normal course of things, and how significant a delay are the three seconds? In the
non-local-index case, we take this code path purely as an optimization - would it be better to
just not do this optimization and let the data go back to the client and be pushed back to
the server?
{code}
+      // We are waiting 3 seconds for the memstore flush happen
+      for (int i = 0; region.getMemstoreSize() > blockingMemstoreSize && i < 30; i++) {
+          try {
+              checkForRegionClosing();
+              Thread.sleep(100);
+          } catch (InterruptedException e) {
+              throw new IOException(e);
+          }
+      }
+      checkForRegionClosing();
{code}
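In other words, the loop polls {{region.getMemstoreSize()}} every 100 ms for at most 30 iterations
(roughly 3 seconds), calling {{checkForRegionClosing()}} on each iteration and once more after the loop.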
- Under what condition would the region close during the commitBatch call (since we're already
blocking splits)? Is it if the region gets reassigned for some reason? Can we document this
there?
- Should we be blocking merges too? Or are we not so worried about those because they're
less common and are user initiated?



> Possible Deadlock/delay while building index, upsert select, delete rows at server
> ----------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3111
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3111
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Sergio Peleato
>            Assignee: Rajeshbabu Chintaguntla
>            Priority: Critical
>             Fix For: 4.8.1
>
>         Attachments: PHOENIX-3111.patch
>
>
> There is a possible deadlock or long delay while building a local index or running UPSERT SELECT or DELETE at the server. The situation can arise as follows.
> In these queries we scan mutations from a table and write back to the same table, so the memstore may reach the blocking-memstore-size threshold; a RegionTooBusyException is then thrown back to the client and the queries may retry the scan.
> Take the local index build as an example: we first scan from the data table, prepare the index mutations, and write them back to the same table.
> So the memstore may fill up, in which case we try to flush the region. But if a split happens in between, the split waits for the write lock on the region in order to close it, and the flush waits for the read lock (because the write lock is ahead of it in the queue) until the local index build completes. The local index build won't complete because we are not allowed to write until there is a flush. This might not be a complete deadlock, but queries can take a very long time to complete in these cases.
> {noformat}
> "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5 os_prio=31 tid=0x00007f7fb2050800
nid=0x1c033 waiting on condition [0x0000000139b68000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000006ede72550> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>         at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>         at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422)
>         at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370)
>         - locked <0x00000006ede69d00> (a java.lang.Object)
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394)
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
>         at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561)
>         at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
>         at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>         - <0x00000006ee132098> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> {noformat}
> "MemStoreFlusher.0" #170 prio=5 os_prio=31 tid=0x00007f7fb6842000 nid=0x19303 waiting
on condition [0x00000001388e9000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000006ede72550> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>         at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1986)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1950)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:501)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> As a fix, we need to block region splits while index builds, UPSERT SELECT, or DELETE are running at the server.
> Thanks [~sergey.soldatov] for the help in understanding and analyzing the bug, and [~speleato] for finding it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
