kylin-user mailing list archives

From Ge Silas <>
Subject Re: Kylin 2.3.1 cluster - some nodes fail to query against cube
Date Thu, 12 Jul 2018 13:53:41 GMT
This really sounds like an environmental issue. On node 2, kylin.log should contain entries
showing a query ID starting and completing. Does kylin.log on node 3 have any entries like that?
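
One way to run that check is a small script that groups kylin.log lines by query ID. This is only a sketch: it assumes every query-related line carries a thread tag of the form "[Query <uuid>-<number>]", as in the DEBUG line quoted further down, and the name summarize_queries is made up for illustration. It reports the first and last timestamp seen for each query ID; it does not know which messages mean "started" or "completed".

```python
import re
from collections import OrderedDict

# Matches the thread tag seen in the quoted log, for example:
# "2018-07-02 16:38:25,629 DEBUG [Query eaf48991-94fd-40cd-9834-1097e79c6840-74] ..."
# Assumption: the tag is a UUID followed by "-<digits>".
QUERY_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+\w+\s+"
    r"\[Query (?P<qid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})-\d+\]"
)


def summarize_queries(lines):
    """Map each query ID to (first_timestamp, last_timestamp, line_count).

    IDs are kept in order of first appearance. Lines without a
    "[Query ...]" tag are ignored.
    """
    seen = OrderedDict()
    for line in lines:
        match = QUERY_LINE.match(line)
        if match is None:
            continue
        qid, ts = match.group("qid"), match.group("ts")
        first, _, count = seen.get(qid, (ts, ts, 0))
        seen[qid] = (first, ts, count + 1)
    return seen
```

Passing an open kylin.log file handle from each node to summarize_queries and comparing the results should show whether a node logs anything at all for a given query ID. A query ID that is missing on one node, or whose last timestamp never advances, would be worth a closer look.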

Also, according to your log this is already the 35th retry, so I think Kylin on node 3 was
already in an abnormal state long before the query failed.

From: Phil Scott <>
Sent: July 6, 2018 8:24
Subject: Fwd: Kylin 2.3.1 cluster - some nodes fail to query against cube


When I perform a simple SUM() query on my built cube, it runs sub-second on 1 cluster node,
but the other 2 cluster nodes don't recognize a cube for that query and they run forever (or
fail silently without telling the UI that execution has halted).


Version: Kylin 2.3.1

Mode: Clustered

I have created a Kylin Cube on top of a Fact table in Hive, and Built a data segment using
a sample date range. My Kylin configuration is running as a 3 node cluster.

Node 1 is configured as a job & query server (in conf/ the setting is: kylin.server.mode=all).

Nodes 2 and 3 are configured as query-only servers (in conf/ the setting is: kylin.server.mode=query).

Once I have successfully built my cube with a data segment, I try to run a query like this
in the Kylin UI Insight tab:

         SELECT SUM(some_metric) AS value FROM my_fact_table

If I execute this query from the web UI on node 1 or node 3, the query goes into [executing]
status forever.

If I execute the exact same query from the web UI on node 2, the query returns in 0.02 seconds.

So, my nodes 1 and 3 are rendered useless as end-points for querying.

See picture of results on node 2 and 3:


I've compared the kylin/logs/kylin.log files for node 1 (failing) and node 2 (working). Both
logs matched each other message for message up until the following spot where node 1 fails...
See below:

2018-07-02 16:38:25,629 DEBUG [Query eaf48991-94fd-40cd-9834-1097e79c6840-74] enumerator.OLAPEnumerator:120 : return TupleIterator...

2018-07-02 16:38:46,337 INFO  [Scheduler 256150323 FetcherRunner-69] threadpool.DefaultScheduler:268 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 588 already succeed, 43 error, 49 discarded, 0 others

2018-07-02 16:39:05,911 INFO  [kylin-coproc--pool3-t1] client.RpcRetryingCaller:146 : Call exception, tries=10, retries=35, started=68253 ms ago, cancelled=false, Message missing required fields: compressedRows, stats



        at org.apache.hadoop.hbase.ipc.RpcExecutor$

        at org.apache.hadoop.hbase.ipc.RpcExecutor$

Caused by: Message missing required fields: compressedRows, stats




        at org.apache.hadoop.hbase.regionserver.HRegion.execService(

        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(

        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(

        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(


So, the client.RpcRetryingCaller (I assume that's the Kylin server making an RPC call to HBase)
is failing. The error message is:

 Message missing required fields: compressedRows, stats
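
To see how long this RPC call had already been retrying, the counters in the RpcRetryingCaller line can be pulled out with a short script. This is a rough sketch under two assumptions: the log entry sits on a single line in kylin.log (it is wrapped in this email), and "started=N ms ago" is measured back from the timestamp at the start of the line. The function name parse_retry_line is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Pattern for the "Call exception" line quoted above, for example:
# "2018-07-02 16:39:05,911 INFO  [kylin-coproc--pool3-t1] client.RpcRetryingCaller:146 :
#  Call exception, tries=10, retries=35, started=68253 ms ago, cancelled=false, ..."
RETRY_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*?"
    r"RpcRetryingCaller.*?Call exception, "
    r"tries=(?P<tries>\d+), retries=(?P<retries>\d+), "
    r"started=(?P<started_ms>\d+) ms ago"
)


def parse_retry_line(line):
    """Return the counters from one RpcRetryingCaller line, or None if it does not match.

    "started_at" is derived, not logged: it is the line's own timestamp
    minus the "started=... ms ago" value (an assumption about its meaning).
    """
    match = RETRY_LINE.search(line)
    if match is None:
        return None
    logged_at = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
    started_ms = int(match.group("started_ms"))
    return {
        "logged_at": logged_at,
        "tries": int(match.group("tries")),
        "retries": int(match.group("retries")),
        "started_ms": started_ms,
        "started_at": logged_at - timedelta(milliseconds=started_ms),
    }
```

Running this over every line of kylin.log on a failing node and sorting by "started_at" would show when the first failing call began, which can then be compared with the timestamps of the query that hung.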


1.      What might cause this?

2.      Is there a way that I can make nodes 1 & 3 "sync up" or clear/reload from built
cube data so that they respond (without having to rebuild my cube)?  Or is this an issue with
Nodes 1 & 3 failing to communicate with HBase?  I’ve run command-line hbase queries
on all 3 nodes to make sure they can all communicate with hbase…

3.      How can I diagnose whether a cube is being recognized by a particular cluster node?

-Phil Scott
