kylin-user mailing list archives

From Lijun Cao <641507...@qq.com>
Subject Re: Exception in the "Convert Cuboid Data to HFile" step
Date Mon, 22 Oct 2018 03:15:48 GMT
Hi 林荣任, 

The problem seems to be caused by the HBase region server, judging from the error "java.io.IOException: Call to node2/192.168.88.215:16020 failed on local exception".

Have you checked the health of HBase?
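A few quick checks may help narrow it down. These commands are only an illustrative sketch; the hostnames and ports (node1/node2, 2181) are taken from your log, and the exact tool paths depend on your installation:

```shell
# Verify HBase is up and regions are assigned
# (run on a node where the HBase client is installed):
echo "status 'summary'" | hbase shell -n

# Your log also shows ZooKeeper connection loss; confirm each quorum
# member responds, and that the meta znode HBase depends on exists:
echo ruok | nc node2 2181                                  # healthy server replies "imok"
echo "get /hbase/meta-region-server" | zkCli.sh -server node1:2181

# Check that the region server process is alive on node2:
jps | grep HRegionServer
```

If the region server or a ZooKeeper node is down, restarting it and then resuming the cube build from the Monitor page is usually enough.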

Best Regards

Lijun Cao

> On Oct 22, 2018, at 10:50, 林荣任 <3546059729@qq.com> wrote:
> 
> Cluster environment:
> 
> hadoop 2.7.7
> hive 1.2.1
> hbase 1.1.3
> zookeeper 3.4.6
> apache-kylin 2.3.1
> 
> When building a cube and watching it in the Monitor page,
> the job fails with an exception at the "Convert Cuboid Data to HFile" step, and I cannot find the cause.
> 
> The log shows:
> 
> 2018-10-20 10:21:18,861 ERROR [Scheduler 125013446 FetcherRunner-40] dao.ExecutableDao:139 : error get all Jobs:
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:316)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
>         at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
>         at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
>         at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
>         at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:180)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:124)
>         at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:131)
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:129)
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:243)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:221)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Failed to get result within timeout, timeout=10000ms
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:206)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:228)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:202)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1277)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
>         ... 21 more
> 2018-10-20 10:21:18,863 ERROR [Scheduler 125013446 FetcherRunner-40] execution.ExecutableManager:245 : error get All Job Ids
> org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:140)
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:243)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:221)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:316)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
>         at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
>         at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
>         at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
>         at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:180)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:124)
>         at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:131)
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:129)
>         ... 9 more
> Caused by: java.io.IOException: Failed to get result within timeout, timeout=10000ms
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:206)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:228)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:202)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1277)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
>         ... 21 more
> 2018-10-20 10:21:18,865 WARN  [pool-9-thread-1] threadpool.DefaultScheduler:273 : Job Fetcher caught a exception 
> java.lang.RuntimeException: org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:246)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:221)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:140)
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:243)
>         ... 8 more
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:316)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
>         at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
>         at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
>         at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
>         at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:180)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:124)
>         at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:131)
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:129)
>         ... 9 more
> Caused by: java.io.IOException: Failed to get result within timeout, timeout=10000ms
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:206)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:228)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:202)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1277)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
>         ... 21 more
> 2018-10-20 10:21:24,371 INFO  [localhost-startStop-1-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x1668f182edf0001, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:24,372 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x3668f184a920007, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:24,386 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50003, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:24,388 INFO  [Thread-8-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50002, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:24,558 ERROR [hconnection-0x20233fed-metaLookup-shared--pool2-t3] zookeeper.RecoverableZooKeeper:274 : ZooKeeper getData failed after 4 attempts
> 2018-10-20 10:21:24,558 WARN  [hconnection-0x20233fed-metaLookup-shared--pool2-t3] zookeeper.ZKUtil:632 : hconnection-0x20233fed-0x1668f182edf0001, quorum=node1:2181,node2:2181,node3:2181, baseZNode=/hbase Unable to get data of znode /hbase/meta-region-server
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:624)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionState(MetaTableLocator.java:486)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionLocation(MetaTableLocator.java:167)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:606)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:587)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:560)
>         at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1213)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1180)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1154)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:303)
>         at org.apache.hadoop.hbase.client.ReversedScannerCallable.prepare(ReversedScannerCallable.java:105)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.prepare(ScannerCallableWithReplicas.java:376)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:135)
>         at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2018-10-20 10:21:24,559 ERROR [hconnection-0x20233fed-metaLookup-shared--pool2-t3] zookeeper.ZooKeeperWatcher:719 : hconnection-0x20233fed-0x1668f182edf0001, quorum=node1:2181,node2:2181,node3:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:624)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionState(MetaTableLocator.java:486)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionLocation(MetaTableLocator.java:167)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:606)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:587)
>         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:560)
>         at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1213)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1180)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1154)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:303)
>         at org.apache.hadoop.hbase.client.ReversedScannerCallable.prepare(ReversedScannerCallable.java:105)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.prepare(ScannerCallableWithReplicas.java:376)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:135)
>         at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2018-10-20 10:21:25,382 INFO  [Thread-8-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:25,383 INFO  [Thread-8-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:25,384 INFO  [Thread-8-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50002, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:25,518 INFO  [localhost-startStop-1-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:25,519 INFO  [localhost-startStop-1-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:25,524 INFO  [localhost-startStop-1-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x1668f182edf0001, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:25,665 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:25,666 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:25,668 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x3668f184a920007, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:25,685 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:25,686 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:25,687 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50003, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:25,841 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:25,842 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:25,858 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x3668f184a920007, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:26,103 INFO  [Thread-8-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:26,104 INFO  [Thread-8-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:26,107 INFO  [Thread-8-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50002, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:26,166 INFO  [localhost-startStop-1-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:26,167 INFO  [localhost-startStop-1-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:26,174 INFO  [localhost-startStop-1-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x1668f182edf0001, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:26,327 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:26,329 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:26,784 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50003, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:26,905 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:26,905 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:26,907 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x3668f184a920007, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:27,232 INFO  [localhost-startStop-1-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:27,236 INFO  [localhost-startStop-1-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:27,239 INFO  [localhost-startStop-1-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x1668f182edf0001, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:27,299 INFO  [Thread-8-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:27,301 INFO  [Thread-8-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:27,306 INFO  [Thread-8-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50002, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:27,619 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:27,621 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:27,625 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node2:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50003, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:28,022 INFO  [Thread-8-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:28,023 INFO  [Thread-8-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:28,026 INFO  [Thread-8-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50002, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:28,185 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:28,187 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:28,191 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x3668f184a920007, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:28,798 INFO  [Thread-8-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:28,798 INFO  [Thread-8-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:28,799 INFO  [Thread-8-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50002, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:29,055 INFO  [localhost-startStop-1-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:29,056 INFO  [localhost-startStop-1-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:29,061 INFO  [localhost-startStop-1-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x1668f182edf0001, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:29,196 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:29,197 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:29,200 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50003, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:29,253 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node1/192.168.88.214:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:29,255 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node1/192.168.88.214:2181, initiating session
> 2018-10-20 10:21:29,259 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node1:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x3668f184a920007, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:29,517 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:29,518 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-359-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:29,798 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:29,800 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:29,802 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x2668f182ec50003, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:30,058 INFO  [localhost-startStop-1-SendThread(node3:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node3/192.168.88.216:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:30,060 INFO  [localhost-startStop-1-SendThread(node3:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node3/192.168.88.216:2181, initiating session
> 2018-10-20 10:21:30,063 INFO  [localhost-startStop-1-SendThread(node3:2181)] zookeeper.ClientCnxn:1098 : Unable to read additional data from server sessionid 0x1668f182edf0001, likely server has closed socket, closing socket connection and attempting reconnect
> 2018-10-20 10:21:30,133 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:30,136 INFO  [Scheduler 125013446 Job c61f7909-f57f-4ae6-8f3e-2992f3d1338c-615-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:30,399 INFO  [Thread-8-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:30,401 INFO  [Thread-8-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:30,931 INFO  [localhost-startStop-1-SendThread(node2:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server node2/192.168.88.215:2181. Will not attempt to authenticate using SASL (unknown error)
> 2018-10-20 10:21:30,932 INFO  [localhost-startStop-1-SendThread(node2:2181)] zookeeper.ClientCnxn:852 : Socket connection established to node2/192.168.88.215:2181, initiating session
> 2018-10-20 10:21:38,848 ERROR [Scheduler 125013446 FetcherRunner-40] dao.ExecutableDao:139 : error get all Jobs:
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:316)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
>         at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
>         at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
>         at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
>         at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:180)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:124)
>         at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:131)
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:129)
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:243)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:221)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Failed to get result within timeout, timeout=10000ms
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:206)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:228)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:202)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1277)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
>         ... 21 more
> 2018-10-20 10:21:38,849 ERROR [Scheduler 125013446 FetcherRunner-40] execution.ExecutableManager:245 : error get All Job Ids
> org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:140)
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:243)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:221)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:316)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
>         at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
>         at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
>         at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
>         at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:180)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:124)
>         at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:131)
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:129)
>         ... 9 more
> Caused by: java.io.IOException: Failed to get result within timeout, timeout=10000ms
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:206)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:228)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:202)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1277)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
>         ... 21 more
> 2018-10-20 10:21:38,849 WARN  [pool-9-thread-1] threadpool.DefaultScheduler:273 : Job Fetcher caught a exception 
> java.lang.RuntimeException: org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:246)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:221)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:140)
>         at org.apache.kylin.job.execution.ExecutableManager.getAllJobIds(ExecutableManager.java:243)
>         ... 8 more
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:316)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
>         at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
>         at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
>         at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
>         at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:180)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:124)
>         at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:131)
>         at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:129)
>         ... 9 more
> Caused by: java.io.IOException: Failed to get result within timeout, timeout=10000ms
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:206)
>         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:228)
>         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:202)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1277)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
>         ... 21 more
>   
>   
>   
> Exception shown in the web UI (http://192.168.88.214:7070/kylin):
> 
> org.apache.kylin.job.exception.PersistentException: 
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
> Sat Oct 20 11:49:33 GMT+08:00 2018, RpcRetryingCaller{globalStartTime=1540007366350, pause=100, retries=1}, 
> java.io.IOException: Call to node2/192.168.88.215:16020 failed on local exception: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=205, waitTime=7190, operationTimeout=5000 expired.
> 
>  
> I cannot find the cause. The hbase-site.xml configuration is as follows:
> 
> <configuration>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://node1:9000/hbase</value>
>   </property>
>   <property>
>     <name>hbase.master</name>
>     <value>node1</value>
>   </property>
>   <property>
>     <name>hbase.cluster.distributed</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>hbase.tmp.dir</name>
>     <value>/home/hadoop/hbase</value>
>   </property>
>   <property>
>     <name>hbase.master.maxclockskew</name>
>     <value>2000000</value>
>   </property>
>   <property>
>     <name>hbase.zookeeper.property.clientPort</name>
>     <value>2181</value>
>   </property>
>   <property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>node1,node2,node3</value>
>   </property>
>   <property>
>     <name>hbase.thrift.connection.max-idletime</name>
>     <value>180000000</value>
>   </property>
>   <property>
>     <name>hbase.client.operation.timeout</name>
>     <value>300000000</value>
>   </property>
>   <property>
>     <name>hbase.client.scanner.timeout.period</name>
>     <value>300000000</value>
>   </property>
>   <property>
>     <name>hbase.rpc.timeout</name>
>     <value>360000000</value>
>   </property>
>   <property>
>     <name>zookeeper.session.timeout</name>
>     <value>60000000</value>
>   </property>
>   <property>
>     <name>hbase.zookeeper.property.dataDir</name>
>     <value>/home/hadoop/hbase/zookeeper</value>
>   </property>
> </configuration>
> 
> <QQ图片20181022102953.png>
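Before tuning the timeouts in hbase-site.xml any further, it may help to first rule out basic TCP reachability of the regionserver RPC port (node2/192.168.88.215:16020) and the ZooKeeper quorum (port 2181) from the Kylin host, since the stack traces show both the RPC call and the ZooKeeper sessions failing. A minimal Python sketch (hostnames and ports are copied from the logs above; `port_open` is just an illustrative helper, not part of Kylin or HBase):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure, connection refused, and connect timeout.
        return False

# Endpoints taken from the log output above; adjust for your cluster.
endpoints = [
    ("node1", 2181),            # ZooKeeper quorum member
    ("192.168.88.215", 16020),  # node2 HBase regionserver RPC port
]
for host, port in endpoints:
    state = "reachable" if port_open(host, port) else "UNREACHABLE"
    print(f"{host}:{port} -> {state}")
```

If the regionserver port is unreachable, check whether the HRegionServer process on node2 is alive (`jps` on that node) and whether its ZooKeeper session is expiring, before raising client-side timeouts.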

