hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-3295) Dropping a 1k+ regions table likely ends in a client socket timeout and it's very confusing
Date Wed, 01 Dec 2010 19:41:11 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3295:
-------------------------

    Attachment: 3295.txt

This one-line change should do it.

It makes deleteTable async over on master.

In HBaseAdmin, the current code sends the deleteTable to the master, then spins waiting for
all of the table's entries in .META. to disappear.  Until now the deleteTable call itself was
synchronous, so the client blocked on the RPC while the master did the work.  With this change
the deleteTable is async, so the waiting happens in the client-side spin rather than
server-side.  That should get rid of the socket timeout.
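
To illustrate the client-side spin described above, here is a minimal sketch in Java. It is not the code in 3295.txt or in HBaseAdmin; the class name DeletePollSketch, the method waitUntilGone, and its parameters are hypothetical, and the "are there still entries in .META.?" check is stood in for by a caller-supplied BooleanSupplier.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a client-side wait loop; not the actual HBaseAdmin code.
public class DeletePollSketch {

    /**
     * Polls until the table is reported gone or the attempts run out.
     *
     * @param tableStillExists assumed stand-in for "the table still has entries in .META."
     * @param maxAttempts      how many times to check before giving up
     * @param pauseMillis      sleep between checks
     * @return true if the table was seen to be gone, false on give-up or interrupt
     */
    static boolean waitUntilGone(BooleanSupplier tableStillExists,
                                 int maxAttempts,
                                 long pauseMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Each check is a short, separate call, so no single call has to
            // stay open for the whole duration of the delete.
            if (!tableStillExists.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

The point of the shape is that the request that kicks off the delete returns right away, and the long wait is spread over many short checks, none of which should come near the 60000 ms socket timeout seen in the report.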

> Dropping a 1k+ regions table likely ends in a client socket timeout and it's very confusing
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3295
>                 URL: https://issues.apache.org/jira/browse/HBASE-3295
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.90.0
>
>         Attachments: 3295.txt
>
>
> I tried truncating a 1.6k regions table from the shell and, after the usual disabling
> timeout, I then got a socket timeout on the second invocation while it was dropping. It
> looked like this:
> {noformat}
> ERROR: java.net.SocketTimeoutException: Call to sv2borg180/10.20.20.180:61000 failed on socket timeout exception:
>  java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch :
>  java.nio.channels.SocketChannel[connected local=/10.20.20.180:59153 remote=sv2borg180/10.20.20.180:61000]
> {noformat}
> At first I thought that was coming from the master because HDFS was somehow slow, but
> then understood that it was my socket that timed out, meaning that the master was still
> dropping the table. Calling truncate again, I got:
> {noformat}
> ERROR: Unknown table TestTable!
> {noformat}
> Which means that the table would be deleted... I learned later that it wasn't totally
> deleted after I shut down the cluster. So it leaves me in a situation where I have to
> manually delete the files on the FS and the remaining .META. entries.
> Since I expect a few people will hit this issue rather soon, for 0.90.0, I propose we
> just set the socket timeout really high in the shell. For 0.90.1, or 0.92, we should do
> for drop what we do for disabling.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

