drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Steven Phillips (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3975) Partition Planning rule causes query failure due to IndexOutOfBoundsException on HDFS
Date Sun, 25 Oct 2015 03:47:27 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972984#comment-14972984
] 

Steven Phillips commented on DRILL-3975:
----------------------------------------

My approach has been to remove the scheme and authority from the paths any time I encounter
code that uses the path as a key, or does any sort of string comparison. This is an area where
I think we need to clean up. I don't think we are very consistent throughout the code base
in how was handle paths.

The usual trick I use to strip away the schema and authority is the method Path.getPathWithoutSchemeAndAuthority(Path
p). If I have String objects and not Path objects, I will convert the String to a path, use
the utility method to remove scheme and authority, and then call toString().

> Partition Planning rule causes query failure due to IndexOutOfBoundsException on HDFS
> -------------------------------------------------------------------------------------
>
>                 Key: DRILL-3975
>                 URL: https://issues.apache.org/jira/browse/DRILL-3975
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Jacques Nadeau
>
> In attempting to run the extended test suite provided by MapR, there are a large number
of queries that fail due to issues in the PruneScanRule and specifically the DFSPartitionLocation
constructor line 31. It is likely due to issues with the code that are related to running
on HDFS where this code path has apparently not been tested.
> An example test query this type of failure occurred: 
> /src/drill-test-framework/resources/Functional/ctas/ctas_auto_partition/tpch0.01_multiple_partitions/data/q11.q
> Example stack trace below:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: StringIndexOutOfBoundsException:
String index out of range: -12
> [Error Id: f2941267-49b1-4f67-a17f-610ffb13fcb7 on ip-172-31-30-32.us-west-2.compute.internal:31010]
>         at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:742)
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:841)
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:786)
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:788)
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:894) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[na:1.7.0_85]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_85]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception
during fragment initialization: Internal error: Error while applying rule PruneScanRule:Filter_On_Scan_Parquet,
args [rel#43148:DrillFilterRel.LOGICAL.ANY([]).[](input=rel#43147:Subset#4.LOGICAL.ANY([]).[],condition==($0,
1)), rel#43241:DrillScanRel.LOGICAL.ANY([]).[](table=[dfs, ctasAutoPartition, tpch_multiple_partitions/lineitem_twopart_ordered2],groupscan=ParquetGroupScan
[entries=[ReadEntryWithPath [path=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2]],
selectionRoot=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2,
numFiles=1, usedMetadataFile=false, columns=[`l_modline`, `l_moddate`]])]
>         ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying rule PruneScanRule:Filter_On_Scan_Parquet,
args [rel#43148:DrillFilterRel.LOGICAL.ANY([]).[](input=rel#43147:Subset#4.LOGICAL.ANY([]).[],condition==($0,
1)), rel#43241:DrillScanRel.LOGICAL.ANY([]).[](table=[dfs, ctasAutoPartition, tpch_multiple_partitions/lineitem_twopart_ordered2],groupscan=ParquetGroupScan
[entries=[ReadEntryWithPath [path=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2]],
selectionRoot=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2,
numFiles=1, usedMetadataFile=false, columns=[`l_modline`, `l_moddate`]])]
>         at org.apache.calcite.util.Util.newInternal(Util.java:792) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         ... 3 common frames omitted
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -12
>         at java.lang.String.substring(String.java:1875) ~[na:1.7.0_85]
>         at org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>         at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>         ... 13 common frames omitted
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message