hbase-user mailing list archives

From "Kumar, Deepak8 " <deepak8.ku...@citi.com>
Subject RE: Regionserver goes down while endpoint execution
Date Wed, 13 Mar 2013 15:19:18 GMT
Thanks, guys, for the help. I am still getting the OOM exception. I have one question about endpoints.
Since an endpoint executes in parallel, if I have a table that is distributed over 101 regions
across 5 regionservers, would there be 101 threads of the endpoint executing in parallel?

Regards,
Deepak

From: Gary Helmling [mailto:ghelmling@gmail.com]
Sent: Tuesday, March 12, 2013 2:14 PM
To: user@hbase.apache.org
Cc: lars hofhansl; Kumar, Deepak8 [CCC-OT_IT NE]
Subject: Re: Regionserver goes down while endpoint execution

To expand on what Himanshu said, your endpoint is doing an unbounded scan on the region, so
for a region with a lot of rows it takes more than 60 seconds to run to the region end,
which is why the client side of the call is timing out.  In addition, you're building up an
in-memory list of all the values for that qualifier in that region, which could cause you
to run into OOM issues, depending on how big your values are and how sparse the given column
qualifier is.  If you trigger an OutOfMemoryError, the region server will abort.
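
As a rough illustration of the memory point, the sketch below uses plain Java with no HBase classes. The `collect` method, the `Page` type, and the `maxValues` cap are hypothetical names introduced here, and the iterator of rows merely stands in for a region scanner. It shows one way a per-call loop could stay bounded: stop after a fixed number of values, report whether rows remain, and skip rows that carry no value for the column instead of calling `get(0)` on a list that may be empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class BoundedCollect {
    /** Result of one bounded call: the values gathered and whether rows remain. */
    static final class Page {
        final List<String> values;
        final boolean hasMore;

        Page(List<String> values, boolean hasMore) {
            this.values = values;
            this.hasMore = hasMore;
        }
    }

    /**
     * Collects at most maxValues values from rows (a stand-in for a region scan).
     * Each element of rows holds the values found in one row and may be empty.
     */
    static Page collect(Iterator<List<String>> rows, int maxValues) {
        List<String> out = new ArrayList<String>();
        while (out.size() < maxValues && rows.hasNext()) {
            List<String> row = rows.next();
            if (!row.isEmpty()) {        // guard: the row may not contain the column
                out.add(row.get(0));
            }
        }
        // The caller can come back for the next page instead of receiving everything at once.
        return new Page(out, rows.hasNext());
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<List<String>>();
        rows.add(Arrays.asList("a.log"));
        rows.add(new ArrayList<String>());   // row without the column
        rows.add(Arrays.asList("b.log"));
        rows.add(Arrays.asList("c.log"));

        Iterator<List<String>> it = rows.iterator();
        Page first = collect(it, 2);
        System.out.println(first.values + " hasMore=" + first.hasMore);
        Page second = collect(it, 2);
        System.out.println(second.values + " hasMore=" + second.hasMore);
    }
}
```

With a cap like this, the memory held per call depends on `maxValues` and the value sizes, not on how many rows the region contains.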

For this usage specifically, though -- scanning through a single column qualifier for all
rows -- you would be better off just doing a normal client-side scan, i.e. HTable.getScanner().
Then you will avoid the client timeout and potential server-side memory issues.
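
To show why a chunked, client-driven read avoids the memory problem, here is a minimal sketch in plain Java. It deliberately uses no HBase classes: `drain`, `ValueHandler`, and `batchSize` are hypothetical names for illustration, and the iterator stands in for whatever produces the values. The idea is only analogous to a client-side scanner fetching rows a chunk at a time: each value is handled as it arrives, so no more than one batch is held in memory at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class BatchedStream {
    /** Callback invoked once per value; hypothetical, for illustration only. */
    interface ValueHandler {
        void handle(String value);
    }

    /**
     * Drains source in chunks of at most batchSize, passing each value to handler.
     * Returns the largest number of values held in memory at any one time.
     */
    static int drain(Iterator<String> source, int batchSize, ValueHandler handler) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int peak = 0;
        List<String> batch = new ArrayList<String>(batchSize);
        while (source.hasNext()) {
            batch.clear();                       // reuse the buffer for every chunk
            while (batch.size() < batchSize && source.hasNext()) {
                batch.add(source.next());
            }
            peak = Math.max(peak, batch.size());
            for (String value : batch) {
                handler.handle(value);           // process now instead of accumulating
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.log", "b.log", "c.log", "d.log", "e.log");
        final int[] seen = {0};
        int peak = drain(names.iterator(), 2, new ValueHandler() {
            public void handle(String value) {
                seen[0]++;
                System.out.println(value);
            }
        });
        System.out.println("values seen: " + seen[0] + ", peak held: " + peak);
    }
}
```

The trade-off is more round trips in exchange for a memory footprint that no longer grows with the size of the table.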

On Tue, Mar 12, 2013 at 9:29 AM, Ted Yu <yuzhihong@gmail.com> wrote:
From region server log:

2013-03-12 03:07:22,605 DEBUG org.apache.hadoop.hdfs.DFSClient: Error making BlockReader. Closing stale Socket[addr=/10.42.105.112,port=50010,localport=54114]
java.io.EOFException: Premature EOF: no length prefix available
        at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:407)

What versions of HBase and Hadoop are you using?
Do the versions of Hadoop on the Eclipse machine and in your cluster match?

Cheers

On Tue, Mar 12, 2013 at 4:46 AM, Kumar, Deepak8 <deepak8.kumar@citi.com> wrote:
> Lars,
>
> I am getting the following errors at the datanode and region servers.
>
> Regards,
> Deepak
>
> From: Kumar, Deepak8 [CCC-OT_IT NE]
> Sent: Tuesday, March 12, 2013 3:00 AM
> To: Kumar, Deepak8 [CCC-OT_IT NE]; 'user@hbase.apache.org'; 'lars hofhansl'
> Subject: RE: Regionserver goes down while endpoint execution
>
> Lars,
>
> I get the following errors when I execute the endpoint RPC client from
> Eclipse. It seems some of the regions at regionserver
> vm-8aa9-fe74.nam.nsroot.net are taking more time to respond.
>
> Could you guide me on how to fix it? I can't find any option to set
> hbase.rpc.timeout in the HBase configuration menu of the CDH4 CM server.
>
> Regards,
> Deepak
>
> 13/03/12 02:33:12 INFO zookeeper.ClientCnxn: Session establishment complete on server vm-15c2-3bbf.nam.nsroot.net/10.96.172.44:2181, sessionid = 0x53d591b77090026, negotiated timeout = 60000
> Mar 12, 2013 2:33:13 AM org.apache.hadoop.conf.Configuration warnOnceIfDeprecated
> WARNING: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> Mar 12, 2013 2:44:00 AM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation processExecs
> WARNING: Error executing for row 153299:1362780381523:2932572079500658:vm-ab1f-dd21.nam.nsroot.net:
> java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> Tue Mar 12 02:34:15 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2271 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:35:16 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2403 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:36:18 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2465 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:37:20 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2500 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:38:22 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2538 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:39:25 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2572 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:40:30 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2606 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:41:34 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2640 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:42:43 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2677 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:44:00 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2842 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
>
>       at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
>       at java.util.concurrent.FutureTask.get(Unknown Source)
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1466)
>       at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1577)
>       at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1557)
>       at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog.main(HBaseEndPointClientForElfLog.java:33)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> Tue Mar 12 02:34:15 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2271 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:35:16 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2403 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:36:18 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2465 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:37:20 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2500 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:38:22 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2538 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:39:25 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2572 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:40:30 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2606 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:41:34 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2640 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:42:43 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2677 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:44:00 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2842 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
>
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
>       at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
>       at $Proxy8.getValues(Unknown Source)
>       at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:38)
>       at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:1)
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1454)
>       at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>       at java.util.concurrent.FutureTask.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>       at java.lang.Thread.run(Unknown Source)
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> Tue Mar 12 02:34:15 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2271 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:35:16 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2403 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:36:18 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2465 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:37:20 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2500 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:38:22 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2538 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:39:25 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2572 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:40:30 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2606 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:41:34 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2640 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:42:43 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2677 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> Tue Mar 12 02:44:00 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2842 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
>
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
>       at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
>       at $Proxy8.getValues(Unknown Source)
>       at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:38)
>       at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:1)
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1454)
>       at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>       at java.util.concurrent.FutureTask.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>       at java.lang.Thread.run(Unknown Source)
>
> From: Kumar, Deepak8 [CCC-OT_IT NE]
> Sent: Tuesday, March 12, 2013 2:27 AM
> To: 'user@hbase.apache.org'; 'lars hofhansl'
> Subject: RE: Regionserver goes down while endpoint execution
>
> Lars,
>
> Thanks for your quick response. There is not much info in the region server
> log. I am executing it again with the DEBUG log level on the region servers.
>
> Here is the endpoint code:
>
> public class ColumnAggregationEndpoint extends BaseEndpointCoprocessor
>     implements ColumnAggregationProtocol {
>
>   @Override
>   public List<String> getValues(byte[] family, byte[] qualifier, int batchSize, int cacheSize)
>       throws IOException {
>     // aggregate at each region
>     Scan scan = new Scan();
>     scan.addColumn(family, qualifier);
>     scan.setCaching(cacheSize);
>     scan.setBatch(batchSize);
>     List<String> values = new ArrayList<String>();
>     RegionCoprocessorEnvironment environment =
>         (RegionCoprocessorEnvironment) getEnvironment();
>
>     InternalScanner scanner = environment.getRegion().getScanner(scan);
>     try {
>       List<KeyValue> curVals = new ArrayList<KeyValue>();
>       boolean hasMore = false;
>       do {
>         curVals.clear();
>         hasMore = scanner.next(curVals);
>         KeyValue kv = curVals.get(0);
>         values.add(Bytes.toString(kv.getValue()));
>       } while (hasMore);
>     } finally {
>       scanner.close();
>     }
>     return values;
>   }
> }
>
> The RPC client to invoke the endpoint is as follows:
>
> public class HBaseEndPointClientForElfLog {
>   public static void main(String[] args) {
>     try {
>       Configuration conf = HBaseConfiguration.create();
>       conf.set(
>           "hbase.zookeeper.quorum",
>           "vm-ab1f-dd21.nam.nsroot.net,vm-cb03-2277.nam.nsroot.net,vm-15c2-3bbf.nam.nsroot.net");
>       String tableName = "elf_log";
>       final String columnFamily = "content";
>       final String columnQualifier = "logFileName";
>       final String startRowKey =
>           "153299:1362780381523:2932572079500658:vm-ab1f-dd21.nam.nsroot.net:";
>       final String endRowKey = "153299:1362953388000";
>       HTableInterface table = new HTable(conf, tableName);
>       Scan scan;
>       Map<byte[], List<String>> results;
>
>       // scan: for all regions
>       scan = new Scan();
>
>       results = table.coprocessorExec(ColumnAggregationProtocol.class,
>           startRowKey.getBytes(), endRowKey.getBytes(),
>           new Batch.Call<ColumnAggregationProtocol, List<String>>() {
>             public List<String> call(ColumnAggregationProtocol instance)
>                 throws IOException {
>               return instance.getValues(columnFamily.getBytes(),
>                   columnQualifier.getBytes(), 2, 5);
>             }
>           });
>
>       for (Map.Entry<byte[], List<String>> e : results.entrySet()) {
>         System.out.println("Size of list returned: " + e.getValue().size());
>         for (String singleVal : e.getValue()) {
>           System.out.println(singleVal);
>         }
>       }
>     } catch (Throwable throwable) {
>       throwable.printStackTrace();
>     }
>   }
> }
>
> Regards,
> Deepak
>
> -----Original Message-----
> From: lars hofhansl [mailto:larsh@apache.org]
> Sent: Tuesday, March 12, 2013 2:01 AM
> To: user@hbase.apache.org
> Subject: Re: Regionserver goes down while endpoint execution
>
> What does the region server log say?
>
> Endpoints do not run in a sandbox. You can call System.exit(...) and your
> RegionServer will happily exit.
> If you can, please show us your endpoint code.
>
> -- Lars
>
> ________________________________
> From: "Kumar, Deepak8 " <deepak8.kumar@citi.com>
> To: "'user@hbase.apache.org'" <user@hbase.apache.org>
> Sent: Monday, March 11, 2013 10:51 PM
> Subject: Regionserver goes down while endpoint execution
>
> Hi,
>
> I have a table in HBase which has more than 5 GB of data; it is distributed
> over 101 regions on 5 regionservers.
>
> When I execute an endpoint, which is supposed to fetch a column qualifier
> value, using an endpoint RPC client, the region server goes down. The HBase
> master log says "Can't connect to region, retrying.." The same endpoint
> works fine for tables which have 300 records.
>
> Could you please help me find the reason the regionserver goes down?
>
> Regards,
> Deepak
>

