drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Julian Hyde <jh...@apache.org>
Subject Re: Optimizing SUM(1) query
Date Tue, 15 Mar 2016 19:20:16 GMT
Is there any reason why Drill cannot transform SUM(1) to COUNT(*) at an early stage (i.e. using
a logical optimization rule) so that this optimization does not need to be done for each engine?

> On Mar 15, 2016, at 5:29 AM, Sudip Mukherjee <smukherjee@commvault.com> wrote:
> 
> I was trying to have an Optimizer rule for the solr storage plugin that I'm working on
for this query. Trying to use SOLR field stats for this , so that the query is faster..
> Getting the below exception while transforming project to scan. Could you please advise?
> 
> 
> 2016-03-15 08:20:35,149 [291801ee-33fc-064d-7aff-18391f15ae0e:foreman] DEBUG o.a.d.e.p.s.h.DefaultSqlHandler
- Drill Logical :
> DrillScreenRel: rowcount = 1.0, cumulative cost = {60.1 rows, 320.1 cpu, 0.0 io, 0.0
network, 176.0 memory}, id = 49
>  DrillProjectRel(EXPR$0=[$0]): rowcount = 1.0, cumulative cost = {60.0 rows, 320.0 cpu,
0.0 io, 0.0 network, 176.0 memory}, id = 48
>    DrillAggregateRel(group=[{}], EXPR$0=[SUM($0)]): rowcount = 1.0, cumulative cost =
{60.0 rows, 320.0 cpu, 0.0 io, 0.0 network, 176.0 memory}, id = 46
>      DrillProjectRel($f0=[1]): rowcount = 20.0, cumulative cost = {40.0 rows, 80.0 cpu,
0.0 io, 0.0 network, 0.0 memory}, id = 44
>        DrillScanRel(table=[[solr, ANalert_494]], groupscan=[SolrGroupScan [SolrScanSpec=SolrScanSpec
[solrCoreName=ANalert_494, solrUrl=http://localhost:20000/solr/ filter=[], solrDocFetchCount=-1,
aggreegation=[]], columns=[`*`]]]): rowcount = 20.0, cumulative cost = {20.0 rows, 0.0 cpu,
0.0 io, 0.0 network, 0.0 memory}, id = 26
> 
> 2016-03-15 08:20:35,201 [291801ee-33fc-064d-7aff-18391f15ae0e:foreman] DEBUG o.a.drill.exec.work.foreman.Foreman
- 291801ee-33fc-064d-7aff-18391f15ae0e: State change requested PENDING --> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment
initialization: index (0) must be less than size (0)
> 	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) [drill-java-exec.jar:1.4.0]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_65]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_65]
> 	at java.lang.Thread.run(Unknown Source) [na:1.8.0_65]
> Caused by: java.lang.IndexOutOfBoundsException: index (0) must be less than size (0)
> 	at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305) ~[com.google.guava-guava.jar:na]
> 	at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284) ~[com.google.guava-guava.jar:na]
> 	at com.google.common.collect.EmptyImmutableList.get(EmptyImmutableList.java:80) ~[com.google.guava-guava.jar:na]
> 	at org.apache.calcite.util.Pair$6.get(Pair.java:335) ~[org.apache.calcite-calcite-core.jar:1.4.0-drill-r10]
> 	at org.apache.drill.exec.planner.StarColumnHelper.containsStarColumnInProject(StarColumnHelper.java:60)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject(StarColumnConverter.java:138)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject(StarColumnConverter.java:45)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.physical.ProjectPrel.accept(ProjectPrel.java:69) ~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen(StarColumnConverter.java:77)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen(StarColumnConverter.java:45)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.physical.ScreenPrel.accept(ScreenPrel.java:64) ~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.insertRenameProject(StarColumnConverter.java:72)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:326)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:175)
~[drill-java-exec.jar:1.4.0]
> 	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:197)
~[drill-java-exec.jar:1.4.0]
> 
> 
> Thanks,
> Sudip
> 
> -----Original Message-----
> From: Sudip Mukherjee [mailto:smukherjee@commvault.com] 
> Sent: 07 March 2016 PM 04:45
> To: dev@drill.apache.org
> Subject: RE: Optimizing SUM(1) query
> 
> Hi,
> 
> Extremely sorry for the delayed response!
> 
> Basically , I am experimenting an Apache Solr plugin for Drill and tried to use Tableau.
On intial selection of a "Table" , I see these queries from drill webui profile
> 
> SELECT * FROM (SELECT * FROM `solr`.`CSVStringData`) T LIMIT 0
> 
> SELECT * FROM (SELECT SUM(1) AS `COL` FROM `solr`.`CSVStringData` `CSVStringData` HAVING
COUNT(1)>0) T LIMIT 0 
> 
> I have a dataset of 2,297,451,8 and when I try access that,the tableau screen in kind
of stuck in "Processing Request" for 5 minutes.
> 
> Thanks,
> Sudip
> 
> -----Original Message-----
> From: Andries Engelbrecht [mailto:aengelbrecht@maprtech.com]
> Sent: 20 February 2016 AM 03:42
> To: dev@drill.apache.org
> Subject: Re: Optimizing SUM(1) query
> 
> What are you trying to do in Tableau that generates these queries?
> 
> Do you have the actual full query that is being generated?
> 
> Tableau has a number of customization features, wonder if it is possible to optimize
those settings to avoid this issue.
> 
> --Andries
> 
> 
>> On Feb 19, 2016, at 10:16 AM, Sudip Mukherjee <smukherjee@commvault.com> wrote:
>> 
>> Hi,
>> 
>> Have anyone tried optimizing SUM(1) query in drill? Or is it implemented?
>> Getting these query while using Tableau. Mostly probably it is trying to figure out
NUMBER_OF_RECORDS.
>> 
>> Thanks,
>> Sudip
>> 
>> 
>> 
>> ***************************Legal Disclaimer***************************
>> "This communication may contain confidential and privileged material 
>> for the sole use of the intended recipient. Any unauthorized review, 
>> use or distribution by others is strictly prohibited. If you have 
>> received the message by mistake, please advise the sender by reply email and delete
the message. Thank you."
>> **********************************************************************
> 
> 
> 
> 
> 
> ***************************Legal Disclaimer***************************
> "This communication may contain confidential and privileged material for the sole use
of the intended recipient. Any unauthorized review, use or distribution by others is strictly
prohibited. If you have received the message by mistake, please advise the sender by reply
email and delete the message. Thank you."
> **********************************************************************
> 
> 
> 
> 
> ***************************Legal Disclaimer***************************
> "This communication may contain confidential and privileged material for the
> sole use of the intended recipient. Any unauthorized review, use or distribution
> by others is strictly prohibited. If you have received the message by mistake,
> please advise the sender by reply email and delete the message. Thank you."
> **********************************************************************


Mime
View raw message