kylin-user mailing list archives

From ShaoFeng Shi <shaofeng...@apache.org>
Subject Re: RE: Kylin 2.3.1 cluster - some nodes fail to query against cube
Date Mon, 16 Jul 2018 05:55:10 GMT
I saw this error once on my colleague's server. It seems like an
environment issue. Later he restarted HBase and the problem
disappeared.

2018-07-12 21:53 GMT+08:00 Ge Silas <gosoy@live.cn>:

> This really sounds like an environmental issue. kylin.log on node 2 should
> have log entries showing a query ID being started and completed. Does
> kylin.log on node 3 have any entries like that?
>
> Also, your log shows this is already the 35th retry, so I think Kylin's
> status on node 3 was already abnormal long before the query failed.
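
The log comparison suggested above can be sketched in a few lines of Python. This is only a rough sketch: it assumes every query-related line carries a thread tag in the form seen in the quoted log ("[Query <uuid>-<number>]") and that the trailing number is a thread suffix rather than part of the query ID. The function name is made up for illustration.

```python
import re
from collections import Counter

# Thread tag as it appears in the quoted kylin.log excerpt, e.g.
# "[Query eaf48991-94fd-40cd-9834-1097e79c6840-74]".
# Assumption: the trailing "-74" is a thread suffix, not part of the ID.
QUERY_TAG = re.compile(
    r"\[Query ([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})-\d+\]"
)


def query_line_counts(log_lines):
    """Count log lines per query ID so that two nodes' logs can be compared."""
    counts = Counter()
    for line in log_lines:
        match = QUERY_TAG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "2018-07-02 16:38:25,629 DEBUG [Query eaf48991-94fd-40cd-9834-1097e79c6840-74] "
    "enumerator.OLAPEnumerator:120 : return TupleIterator...",
    "2018-07-02 16:38:46,337 INFO  [Scheduler 256150323 FetcherRunner-69] "
    "threadpool.DefaultScheduler:268 : Job Fetcher: 0 should running",
]
counts = query_line_counts(sample)
```

Running the same function over each node's kylin.log (for example by passing an open file object) and comparing the resulting query IDs would show whether a node logged the query at all, and whether it logged fewer lines for it than the healthy node did.
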
>
> Thanks,
> Silas
> ------------------------------
> *From:* Phil Scott <phil.scott@pricespider.com>
> *Sent:* July 6, 2018 8:24
> *To:* user@kylin.apache.org
> *Subject:* Fwd: Kylin 2.3.1 cluster - some nodes fail to query against cube
>
>
>
>
> *Problem*:
>
> When I perform a simple SUM() query on my built cube, it runs sub-second
> on 1 cluster node, but the other 2 cluster nodes don't recognize a cube for
> that query and they run forever (or fail silently without telling the UI
> that execution has halted).
>
>
>
> *Context*:
>
> Version: Kylin 2.3.1
>
> Mode: Clustered
>
>
>
> I have created a Kylin cube on top of a fact table in Hive and built a
> data segment using a sample date range. My Kylin configuration is running
> as a 3-node cluster.
>
> Node 1 is configured as a job & query server (in conf/kylin.properties the
> setting is: *kylin.server.mode=all*).
>
> Nodes 2 and 3 are configured as query-only servers (in
> conf/kylin.properties the setting is: *kylin.server.mode=query*).
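
To confirm that each node really runs in the mode described above, the setting can be read straight out of that node's conf/kylin.properties. The following is a minimal sketch, not Kylin code: the function name is hypothetical, and the fallback value "all" for a missing key is an assumption about Kylin's default.

```python
def server_mode(properties_text, default="all"):
    """Return the effective kylin.server.mode from kylin.properties text.

    Assumption: when the key is absent the node behaves as "all".
    As in Java properties files, the last occurrence of a key wins.
    """
    mode = default
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "kylin.server.mode":
            mode = value.strip()
    return mode


node1 = server_mode("# job and query server\nkylin.server.mode=all\n")
node2 = server_mode("kylin.server.mode = query\n")
```

Reading the real file on each node (for example `server_mode(open("conf/kylin.properties").read())`) and comparing the three results is a quick way to rule out a configuration mix-up.
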
>
> Once I have successfully built my cube with a data segment, I try to run a
> query like this in the Kylin UI Insight tab:
>
>          SELECT SUM(some_metric) AS value FROM my_fact_table
>
>
>
> If I execute this query from the web UI on node 1 or node 3, the query
> goes into [executing] status forever.
>
> If I execute the exact same query from the web UI on node 2, the query
> returns in 0.02 seconds.
>
> So, my nodes 1 and 3 are rendered useless as end-points for querying.
>
> See picture of results on node 2 and 3:
>
> [image: https://i.stack.imgur.com/V8Yvs.png]
>
>
>
> I've compared the kylin/logs/kylin.log files for node 1 (failing) and node
> 2 (working). Both logs matched each other message for message up until the
> following spot where node 1 fails... See below:
>
>
>
> 2018-07-02 16:38:25,629 DEBUG [Query eaf48991-94fd-40cd-9834-1097e79c6840-74] enumerator.OLAPEnumerator:120 : return TupleIterator...
>
> 2018-07-02 16:38:46,337 INFO  [Scheduler 256150323 FetcherRunner-69] threadpool.DefaultScheduler:268 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 588 already succeed, 43 error, 49 discarded, 0 others
>
> 2018-07-02 16:39:05,911 INFO  [kylin-coproc--pool3-t1] client.RpcRetryingCaller:146 : Call exception, tries=10, retries=35, started=68253 ms ago, cancelled=false, msg=*java.io.IOException: Message missing required fields: compressedRows, stats*
>
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2195)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> Caused by: com.google.protobuf.UninitializedMessageException: Message missing required fields: compressedRows, stats
>         at com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$Builder.build(CubeVisitProtos.java:5019)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$Builder.build(CubeVisitProtos.java:4949)
>         at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7866)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1980)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1962)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32389)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>
>
>
> So, the *client.RpcRetryingCaller* (I assume that's the Kylin server
> making an RPC call to HBase) is failing. The error message is:
>
>
>
>          java.io.IOException: Message missing required fields: compressedRows, stats
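
To see how far the retries have progressed on a failing node, the counters can be pulled out of the RpcRetryingCaller lines quoted above. This is a small sketch that relies only on the "tries=..., retries=..., started=... ms ago" wording visible in that log line; the function name and the dictionary keys are made up for illustration.

```python
import re

# Field layout copied from the quoted RpcRetryingCaller log line.
RETRY_FIELDS = re.compile(r"tries=(\d+), retries=(\d+), started=(\d+) ms ago")


def parse_retry(line):
    """Extract the retry counters from one RpcRetryingCaller log line.

    Returns None when the line does not contain the expected fields.
    """
    match = RETRY_FIELDS.search(line)
    if match is None:
        return None
    tries, retries, started_ms = (int(group) for group in match.groups())
    return {"tries": tries, "retries": retries, "started_ms": started_ms}


line = (
    "2018-07-02 16:39:05,911 INFO  [kylin-coproc--pool3-t1] "
    "client.RpcRetryingCaller:146 : Call exception, tries=10, retries=35, "
    "started=68253 ms ago, cancelled=false, msg=java.io.IOException: "
    "Message missing required fields: compressedRows, stats"
)
info = parse_retry(line)
```

As far as I know, in HBase 1.x client logs `tries` is the current attempt and `retries` is the configured maximum (`hbase.client.retries.number`), so this line would read as attempt 10 of up to 35, roughly 68 seconds after the call started. Please verify that against the HBase version in use.
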
>
>
>
>
>
>
>
> *Questions*
>
> 1.      What might cause this?
>
> 2.      Is there a way that I can make nodes 1 & 3 "sync up" or
> clear/reload from built cube data so that they respond (without having to
> rebuild my cube)?  Or is this an issue with Nodes 1 & 3 failing to
> communicate with HBase?  I’ve run command-line hbase queries on all 3 nodes
> to make sure they can all communicate with hbase…
>
> 3.      How can I diagnose whether a cube is being recognized by a
> particular cluster node?
>
>
>
>
>
>
>
> -Phil Scott
>
>
>
>


-- 
Best regards,

Shaofeng Shi 史少锋
