hbase-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-7352) clone operation from HBaseAdmin can hang forever.
Date Fri, 14 Dec 2012 01:28:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531857#comment-13531857 ]

Jonathan Hsieh edited comment on HBASE-7352 at 12/14/12 1:27 AM:
-----------------------------------------------------------------

The snapshot can be cloned from another shell instance, but only to a different table name.

If you attempt to clone the snapshot (pe-11) to the same target table (pe-11-table), you get:
{code}
hbase(main):006:0> clone_snapshot 'pe-11', 'pe-11-table'

ERROR: org.apache.hadoop.hbase.snapshot.exception.RestoreSnapshotException: org.apache.hadoop.hbase.snapshot.exception.RestoreSnapshotException: Couldn't clone the snapshot=name: "pe-11"
table: "TestTable" 
creationTime: 1355441918484
type: FLUSH
version: 0
 on table=pe-11-table
        at org.apache.hadoop.hbase.master.snapshot.manage.SnapshotManager.cloneSnapshot(SnapshotManager.java:558)
        at org.apache.hadoop.hbase.master.snapshot.manage.SnapshotManager.restoreSnapshot(SnapshotManager.java:597)
        at org.apache.hadoop.hbase.master.HMaster.restoreSnapshot(HMaster.java:2528)
        at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.ProtobufRpcEngine$Server.call(ProtobufRpcEngine.java:356)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1816)
Caused by: org.apache.hadoop.hbase.TableExistsException: pe-11-table
        at org.apache.hadoop.hbase.master.handler.CreateTableHandler.<init>(CreateTableHandler.java:96)
        at org.apache.hadoop.hbase.master.snapshot.CloneSnapshotHandler.<init>(CloneSnapshotHandler.java:65)
        at org.apache.hadoop.hbase.master.snapshot.manage.SnapshotManager.cloneSnapshot(SnapshotManager.java:551)

        at org.apache.hadoop.hbase.master.snapshot.manage.SnapshotManager.cloneSnapshot(SnapshotManager.java:558)
        at org.apache.hadoop.hbase.master.snapshot.manage.SnapshotManager.restoreSnapshot(SnapshotManager.java:597)
        at org.apache.hadoop.hbase.master.HMaster.restoreSnapshot(HMaster.java:2528)
        at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.ProtobufRpcEngine$Server.call(ProtobufRpcEngine.java:356)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1816)
Caused by: org.apache.hadoop.hbase.TableExistsException: pe-11-table
        at org.apache.hadoop.hbase.master.handler.CreateTableHandler.<init>(CreateTableHandler.java:96)
        at org.apache.hadoop.hbase.master.snapshot.CloneSnapshotHandler.<init>(CloneSnapshotHandler.java:65)
        at org.apache.hadoop.hbase.master.snapshot.manage.SnapshotManager.cloneSnapshot(SnapshotManager.java:551)
        ... 7 more
{code}

The directory for table pe-11 seems to be present, but a large number of files seem to be
missing.

In this particular case, the snapshot has 16 regions, while the failed restore attempt
moved only 12 regions into the real table location.  This suggests that something failed
internally but the operation was still allowed to continue to the dir rename at some point.
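
The 16-versus-12 comparison above can be sketched as a simple set difference. This is only an illustration of the check, not code from HBase: the class RestoreCheck, the method missingRegions, and the region names are all made up for the example, and a real check would have to read the region names from the snapshot and from the table directory.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not part of HBase): finds regions a restore left out. */
final class RestoreCheck {
    private RestoreCheck() {}

    /** Returns the regions listed in the snapshot that are absent from the table dir. */
    static Set<String> missingRegions(Set<String> inSnapshot, Set<String> inTableDir) {
        Set<String> missing = new TreeSet<>(inSnapshot); // copy so the inputs stay untouched
        missing.removeAll(inTableDir);
        return missing;
    }

    public static void main(String[] args) {
        // Made-up region names: 16 in the snapshot, only the first 12 moved into place.
        Set<String> inSnapshot = new TreeSet<>();
        Set<String> inTableDir = new TreeSet<>();
        for (int i = 0; i < 16; i++) {
            String name = String.format("region-%02d", i);
            inSnapshot.add(name);
            if (i < 12) {
                inTableDir.add(name);
            }
        }
        Set<String> missing = missingRegions(inSnapshot, inTableDir);
        System.out.println(missing.size() + " of " + inSnapshot.size()
                + " regions missing: " + missing);
    }
}
```

A non-empty result would be one signal that the restore should be failed and rolled back rather than allowed to proceed to the rename.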

Even if one removes the bad data from HDFS, we still cannot clone to the same pe-11-table
name because some in-memory state blocks this.

(grammar fixes)
                
                  
> clone operation from HBaseAdmin can hang forever.
> -------------------------------------------------
>
>                 Key: HBASE-7352
>                 URL: https://issues.apache.org/jira/browse/HBASE-7352
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client, master, regionserver, snapshots, Zookeeper
>            Reporter: Jonathan Hsieh
>             Fix For: hbase-6055, 0.96.0
>
>
> Sometimes the clone operation from the hbase shell can hang.  The table has been created
> (it shows up in the web UI), but it does not have any entries in META.
> No clone, snapshot, enable, or disable operations appear in the master's jstack.
> Here's a trace from the HBaseAdmin:
> {code}
> "main" prio=10 tid=0x00007f782800d000 nid=0x25c waiting on condition [0x00007f782f9bf000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2413)
>         at org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2393)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:465)
>         at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:323)
>         at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:69)
>         at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:201)
>         at org.jruby.ast.CallTwoArgNode.interpret(CallTwoArgNode.java:59)
>         at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
> ... (more jruby stack) ... 
> {code}  
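
The trace above shows the client thread sleeping inside HBaseAdmin.cloneSnapshot; the loop around that sleep is not shown here. As a minimal sketch of how a wait of this kind could be bounded so that it cannot hang forever, assuming the caller can supply some "is it done yet" check: BoundedWait and waitUntil are hypothetical names and are not part of the HBase API.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of HBase): polls a condition until a deadline. */
final class BoundedWait {
    private BoundedWait() {}

    /**
     * Polls {@code done} every {@code pauseMs} milliseconds.
     * Returns true once the condition holds, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier done, long timeoutMs, long pauseMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!done.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // deadline passed: give up instead of sleeping again
            }
            try {
                Thread.sleep(pauseMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "has the clone finished?"; becomes true after roughly 50 ms.
        boolean finished = waitUntil(() -> System.currentTimeMillis() - start >= 50, 1_000, 10);
        System.out.println("finished = " + finished);

        // A condition that never becomes true returns false after the timeout.
        boolean stuck = waitUntil(() -> false, 100, 10);
        System.out.println("stuck = " + stuck);
    }
}
```

With a bound like this, a clone whose table never shows up in META would surface as a client-side failure instead of an indefinite sleep.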

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
