hbase-user mailing list archives

From jeff saremi <jeffsar...@hotmail.com>
Subject Re: Baffling RPC exceptions with our Thrift servers
Date Wed, 09 Aug 2017 17:30:27 GMT
We are getting inundated with RPC exceptions in our Thrift servers. Is there anyone here who
could point us to where the problem is?
According to the Master UI everything is good: no regions in transition, no FAILED_CLOSE,
no red tasks, no offline regions, nothing.


________________________________
From: jeff saremi <jeffsaremi@hotmail.com>
Sent: Friday, August 4, 2017 2:25:52 PM
To: user@hbase.apache.org
Subject: Re: Baffling RPC exceptions with our Thrift servers

Actually, going further back in the RS logs, I see these:

java.io.IOException: Got error, status message org.apache.hadoop.yarn.server.nodemanager.util.UtilizationBasedNodeBusyChecker
CPU: 18.97175> 10 , for OP_READ_BLOCK, self=/25.123.83.126:41098, remote=/10.27.138.10:10010,
for file /hbase/SomeData/data/default/SomeTable122016/bfce55b49e2ade82e1bac73c4205d967/info/995f0a2a24b84a048ea55a4879f46e28,
for pool BP-575538346-25.126.51.77-1446116651710 block 1096040129_23473372
    at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:142)
    at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:456)
    at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:424)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:821)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:700)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:358)
    at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:729)
    at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1651)
    at org.apache.hadoop.hdfs.DFSInputStream$3.call(DFSInputStream.java:1610)
    at org.apache.hadoop.hdfs.DFSInputStream$3.call(DFSInputStream.java:1602)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
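
To see how often these read rejections occur and which DataNodes they come from, the RS log entries can be tallied by remote address. The following is only a minimal sketch: the regular expression is inferred from the single message quoted above, the names `BUSY_READ`, `parse_busy_read` and `busiest_remotes` are hypothetical, and each log entry is assumed to have been joined into one string first.

```python
import re
from collections import Counter

# Pattern inferred from the one message quoted above (an assumption, not a
# documented log format): "CPU: <reading>> <limit> , for <OP>, self=/..., remote=/..."
BUSY_READ = re.compile(
    r"CPU:\s*(?P<cpu>[\d.]+)\s*>\s*(?P<threshold>[\d.]+)\s*,\s*"
    r"for\s+(?P<op>\w+),\s*self=/(?P<self_addr>[\d.]+:\d+),\s*"
    r"remote=/(?P<remote_addr>[\d.]+:\d+)"
)


def parse_busy_read(entry):
    """Return the CPU reading, limit, operation and addresses, or None if no match."""
    match = BUSY_READ.search(entry)
    if match is None:
        return None
    return {
        "cpu": float(match.group("cpu")),
        "threshold": float(match.group("threshold")),
        "op": match.group("op"),
        "self_addr": match.group("self_addr"),
        "remote_addr": match.group("remote_addr"),
    }


def busiest_remotes(entries):
    """Count matching entries per remote address, most frequent first."""
    counts = Counter()
    for entry in entries:
        parsed = parse_busy_read(entry)
        if parsed is not None:
            counts[parsed["remote_addr"]] += 1
    return counts.most_common()
```

If a few remote addresses dominate the tally, those DataNodes would be the first place to look.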


________________________________
From: jeff saremi <jeffsaremi@hotmail.com>
Sent: Friday, August 4, 2017 2:22:54 PM
To: user@hbase.apache.org
Subject: Baffling RPC exceptions with our Thrift servers

Every once in a while (and this is getting more frequent) our Thrift clients report errors
all over.

When I check, say, one of the Thrift server logs, I see a lot of lines like the following:


2017-08-04 14:15:17,089 INFO  [thrift-worker-29] client.RpcRetryingCaller: Call exception,
tries=14, retries=35, started=108853 ms ago, cancelled=false, msg=row 'http://hobartexchange.com.au/classifieds/_g397381.html'
on table 'ClickStreamTable122016' at region=ClickStreamTable122016,http://hifimov.com/youtube-videos/mcent-hack-unlimited-money-cracked-apk,1501285634230.bfce55b49e2ade82e1bac73c4205d967.,
hostname=co4aps197b537e,16020,1501339699340, seqNum=54295
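
To check whether the failing calls cluster on one region server or one region, lines like the one above can be parsed and counted. This is a rough sketch only: the regular expression is inferred from the single log line quoted above, the names `CALL_EXCEPTION`, `parse_call_exception` and `count_by_server` are hypothetical, and each log entry is assumed to be available as one string (line wraps inside an entry are tolerated).

```python
import re
from collections import Counter

# Pattern inferred from the quoted RpcRetryingCaller line (an assumption).
# DOTALL lets ".*?" span the line wraps that appear inside one log entry.
CALL_EXCEPTION = re.compile(
    r"tries=(?P<tries>\d+),\s*retries=(?P<retries>\d+),\s*"
    r"started=(?P<started_ms>\d+) ms ago.*?"
    r"on table '(?P<table>[^']+)'.*?"
    r"\.(?P<region>[0-9a-f]{32})\..*?"
    r"hostname=(?P<hostname>[^,\s]+),(?P<port>\d+),(?P<startcode>\d+)",
    re.DOTALL,
)


def parse_call_exception(entry):
    """Extract retry counters, table, encoded region name and server, or None."""
    match = CALL_EXCEPTION.search(entry)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("tries", "retries", "started_ms", "port"):
        fields[key] = int(fields[key])
    return fields


def count_by_server(entries):
    """Count call exceptions per (hostname, encoded region name)."""
    counts = Counter()
    for entry in entries:
        parsed = parse_call_exception(entry)
        if parsed is not None:
            counts[(parsed["hostname"], parsed["region"])] += 1
    return counts
```

A tally dominated by a single hostname or region would narrow down which region server log to read more closely.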


I go to the master and check its status. No issues whatsoever.

I check the logs for the RS mentioned in the log line. No issues that I can tell.

I restarted all Thrift servers and that didn't help. I bounced the active master. Still nothing.


What else can I check? What could be the reason? How can we get Thrift working again?

thanks

Jeff


