hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
Date Wed, 02 Nov 2016 15:49:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629369#comment-15629369 ]

Ted Yu commented on HBASE-14417:
--------------------------------

I observed the following in TestHRegionServerBulkLoad-output for the patch versions (v11 and earlier)
where the bulk load marker is written directly to the hbase:backup table in the postAppend hook:
{code}
2016-09-13 23:10:14,072 DEBUG [B.defaultRpcServer.handler=4,queue=0,port=35667] ipc.CallRunner(112):
B.defaultRpcServer.handler=4,queue=0,port=35667: callId: 10646 service: ClientService methodName:
Scan size: 264 connection:    172.18.128.12:59780
org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 60000 ms. regionName=atomicBulkLoad,,1473808150804.6b6c67612b01bce3348c144b959b7f0e.,
server=cn012.l42scl.hortonworks.com,35667,1473808145352
  at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:7744)
  at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:7725)
  at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:7634)
  at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2588)
  at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2582)
  at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2569)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33516)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2229)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:136)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:111)
  at java.lang.Thread.run(Thread.java:745)
{code}
Here was the state of the BulkLoadHandler thread (stuck):
{code}
"RS:0;cn012:36301.append-pool9-t1" #453 prio=5 os_prio=0 tid=0x00007fc3945bb000 nid=0x18ec
in Object.wait() [0x00007fc30dada000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1727)
  - locked <0x0000000794750580> (a java.util.concurrent.atomic.AtomicLong)
  at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1756)
  at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:241)
  at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191)
  - locked <0x0000000794750048> (a org.apache.hadoop.hbase.client.BufferedMutatorImpl)
  at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
  at org.apache.hadoop.hbase.client.HTable.put(HTable.java:569)
  at org.apache.hadoop.hbase.backup.impl.BackupSystemTable.writeBulkLoadDesc(BackupSystemTable.java:227)
  at org.apache.hadoop.hbase.backup.impl.BulkLoadHandler.postAppend(BulkLoadHandler.java:83)
  at org.apache.hadoop.hbase.regionserver.wal.FSHLog.postAppend(FSHLog.java:1448)
{code}
Even increasing the handler count didn't help:
{code}
diff --git a/hbase-server/src/test/resources/hbase-site.xml b/hbase-server/src/test/resources/hbase-site.xml
index bca90a3..829fcc9 100644
--- a/hbase-server/src/test/resources/hbase-site.xml
+++ b/hbase-server/src/test/resources/hbase-site.xml
@@ -30,6 +30,10 @@
     </description>
   </property>
   <property>
+    <name>hbase.backup.enable</name>
+    <value>true</value>
+  </property>
+  <property>
     <name>hbase.defaults.for.version.skip</name>
     <value>true</value>
   </property>
@@ -48,11 +52,11 @@
   </property>
   <property>
     <name>hbase.regionserver.handler.count</name>
-    <value>5</value>
+    <value>50</value>
   </property>
   <property>
{code}
Post v11, the data stored in ZooKeeper is temporary: once an incremental backup is run for
the table receiving the bulk load, the data in ZooKeeper is stored for the backup Id and then
removed from ZooKeeper.
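
To illustrate the lifecycle described above, here is a minimal sketch in plain Java. It is not taken from the patch and does not use the ZooKeeper or HBase APIs: in-memory maps stand in for the temporary ZooKeeper data and for the per-backup record, and the class name, method names and file paths are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: in-memory maps stand in for ZooKeeper and
// for the per-backup record. None of these names come from the HBase patch.
class BulkLoadStagingSketch {
    // Stand-in for the temporary ZooKeeper data: table name -> bulk loaded file paths.
    private final Map<String, List<String>> staged = new HashMap<>();
    // Stand-in for what is kept per backup Id once an incremental backup has run.
    private final Map<String, List<String>> byBackupId = new HashMap<>();

    // Record a completed bulk load for a table in the temporary area.
    synchronized void recordBulkLoad(String table, String hfilePath) {
        staged.computeIfAbsent(table, t -> new ArrayList<>()).add(hfilePath);
    }

    // When an incremental backup runs for the table, store the staged entries
    // under the backup Id and remove them from the temporary area.
    synchronized List<String> onIncrementalBackup(String backupId, String table) {
        List<String> files = staged.remove(table);
        if (files == null) {
            files = new ArrayList<>();
        }
        byBackupId.computeIfAbsent(backupId, id -> new ArrayList<>()).addAll(files);
        return new ArrayList<>(files);
    }

    // Copy of the entries still waiting in the temporary area for a table.
    synchronized List<String> stagedFor(String table) {
        return new ArrayList<>(staged.getOrDefault(table, new ArrayList<>()));
    }

    // Copy of the entries stored for a backup Id.
    synchronized List<String> filesFor(String backupId) {
        return new ArrayList<>(byBackupId.getOrDefault(backupId, new ArrayList<>()));
    }

    public static void main(String[] args) {
        BulkLoadStagingSketch sketch = new BulkLoadStagingSketch();
        sketch.recordBulkLoad("atomicBulkLoad", "/hypothetical/path/hfile_1");
        sketch.onIncrementalBackup("backup_1", "atomicBulkLoad");
        System.out.println("staged after backup: " + sketch.stagedFor("atomicBulkLoad"));
        System.out.println("stored for backup_1: " + sketch.filesFor("backup_1"));
    }
}
```

The point of the sketch is only that the temporary area is emptied at backup time, so it holds entries for bulk loads that no incremental backup has yet covered.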

> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Ted Yu
>            Priority: Critical
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: 14417.v1.txt, 14417.v11.txt, 14417.v13.txt, 14417.v2.txt, 14417.v21.txt, 14417.v23.txt, 14417.v24.txt, 14417.v25.txt, 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading bypasses WALs
> for obvious reasons, breaking incremental backups. The only way to continue backups after
> bulk loading is to create new full backup of a table. This may not be feasible for customers
> who do bulk loading regularly (say, every day).
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
