kylin-dev mailing list archives

From "张磊" <121762...@qq.com>
Subject Re: Exceed scan threshold at 10000001
Date Thu, 27 Oct 2016 09:56:28 GMT
Do you mean that I should add a WHERE clause to my query?
But in some cases the number of records still exceeds the threshold. What can I do then?
For example, when ordering by all groups, the number of groups is greater than the threshold.




------------------ Original Message ------------------
From: "Alberto Ramón" <a.ramonportoles@gmail.com>
Sent: Thursday, 27 Oct 2016, 5:47 PM
To: "dev" <dev@kylin.apache.org>

Subject: Re: Exceed scan threshold at 10000001



ERROR: Scan row count exceeded threshold

Mailing list thread:
<http://mail-archives.apache.org/mod_mbox/kylin-user/201608.mbox/%3CCALjEW7M_YYi7Xs55OqPdxS6pzNvD0%2BamN2AX3hetnF0%3D9uFnow%40mail.gmail.com%3E>
Kylin JIRA: KYLIN-1787 <https://issues.apache.org/jira/browse/KYLIN-1787>, v1.5.3

*Scan row count exceeded threshold: 1000000, please add filter condition to
narrow down backend scan range, like where clause*
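
As a rough illustration of that advice, the query from the original message could be restricted on one of the cube dimensions. This is only a sketch: the key range 1 to 1000 is a made-up value, and whether it brings the scan under the threshold depends on the actual data distribution.

```sql
-- Hypothetical filter: the LO_CUSTKEY range below is an illustrative value only.
SELECT COUNT(1), SUM(LO_REVENUE)
FROM lineorder
WHERE LO_CUSTKEY BETWEEN 1 AND 1000
GROUP BY LO_CUSTKEY, LO_PARTKEY
ORDER BY LO_CUSTKEY, LO_PARTKEY
LIMIT 50
```

Filtering on a dimension that is part of the cube should let the storage layer scan only the matching key range instead of every row.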


BR, Alb

2016-10-27 11:40 GMT+02:00 张磊 <121762713@qq.com>:

> Hi
>
>
> When I run a SQL query, I do not understand why it has to scan HBase. What can I do?
> Thanks!
>
>
> Table: lineorder  12,000,000 row records
> Dimensions: LO_CUSTKEY,LO_PARTKEY
> Measures: count(1), sum(LO_REVENUE)
>
>
> Query SQL: select count(1),sum(LO_REVENUE) from lineorder group by
> LO_CUSTKEY,LO_PARTKEY order by LO_CUSTKEY,LO_PARTKEY limit 50
>
>
> I built a cube with two dimensions and two measures (count and sum); the
> size of the HTable is 98 MB. When I execute the query in Insight, it shows
> "Error in coprocessor". I checked the HBase log and found the messages below:
>
>
> 2016-10-27 02:06:13,470 INFO  [B.defaultRpcServer.handler=4,queue=1,port=16020] gridtable.GTScanRequest: pre aggregation is not beneficial, skip it
> 2016-10-27 02:06:13,470 INFO  [B.defaultRpcServer.handler=4,queue=1,port=16020] endpoint.CubeVisitService: Scanned 1 rows from HBase.
>
>
> 2016-10-27 02:24:20,884 INFO  [B.defaultRpcServer.handler=6,queue=0,port=16020] endpoint.CubeVisitService: Scanned 9999001 rows from HBase.
> 2016-10-27 02:24:20,889 INFO  [B.defaultRpcServer.handler=6,queue=0,port=16020] endpoint.CubeVisitService: The cube visit did not finish normally because scan num exceeds threshold
> org.apache.kylin.gridtable.GTScanExceedThresholdException: Exceed scan threshold at 10000001
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService$2.hasNext(CubeVisitService.java:267)
>         at org.apache.kylin.storage.hbase.cube.v2.HBaseReadonlyStore$1$1.hasNext(HBaseReadonlyStore.java:111)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.visitCube(CubeVisitService.java:299)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService.callMethod(CubeVisitProtos.java:3952)
>         at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7815)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1986)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1968)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33652)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2178)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>         at java.lang.Thread.run(Thread.java:745)