drill-issues mailing list archives

From "Sean Hsuan-Yi Chu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-2876) Data from RHS of Union All is lost, when column on LHS is non-existent
Date Sun, 26 Apr 2015 18:35:38 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513178#comment-14513178 ]

Sean Hsuan-Yi Chu commented on DRILL-2876:
------------------------------------------

This issue occurs because the type of a non-existent column is treated as NULLABLE INT. Under
our implicit casting rule, the varchar on the RHS is then cast to INT, which leads to this exception.
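
As a rough illustration of that failure mode (not Drill's actual code), the sketch below models the union output column as a nullable INT and casts every incoming value to it. The class and method names (`ImplicitCastSketch`, `castToNullableInt`, `unionAll`) are hypothetical and exist only for this example; the sample string is the value from the NumberFormatException in the stack trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ImplicitCastSketch {

    // Hypothetical stand-in for an implicit varchar-to-INT cast; this is an
    // assumption for illustration, not the function Drill generates.
    public static Integer castToNullableInt(String value) {
        if (value == null) {
            return null; // a non-existent column yields null for every row
        }
        // Non-numeric text makes parseInt throw NumberFormatException.
        return Integer.parseInt(value);
    }

    // Models UNION ALL where the output type was fixed as nullable INT by the
    // left-hand side, so right-hand side values are cast to that type.
    public static List<Integer> unionAll(List<String> lhs, List<String> rhs) {
        List<Integer> out = new ArrayList<>();
        for (String v : lhs) {
            out.add(castToNullableInt(v));
        }
        for (String v : rhs) {
            out.add(castToNullableInt(v));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lhs = Arrays.asList(null, null, null); // missing column
        List<String> rhs = Arrays.asList("njEl0iAivVwLEbAg"); // varchar data
        try {
            System.out.println(unionAll(lhs, rhs));
        } catch (NumberFormatException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

In this sketch the left-hand rows pass through as nulls, and the first non-numeric varchar value from the right-hand side aborts the union, which mirrors the nulls followed by the exception in the reported output.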


Because of that, we could say this behavior is "expected", so we are downgrading this issue to
Minor.

On the other hand, it is a reminder of how counterintuitive the type assigned to non-existent columns
is. Maybe we should rethink that? [~vicky] might also have an issue related
to this one.

> Data from RHS of Union All is lost, when column on LHS is non-existent
> ----------------------------------------------------------------------
>
>                 Key: DRILL-2876
>                 URL: https://issues.apache.org/jira/browse/DRILL-2876
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.9.0
>         Environment: 64e3ec52b93e9331aa5179e040eca19afece8317 | DRILL-2611: value vectors
should report valid value count | 16.04.2015 @ 13:53:34 EDT
>            Reporter: Khurram Faraaz
>            Assignee: Sean Hsuan-Yi Chu
>            Priority: Critical
>
> When the column projected on the left-hand side of the UNION ALL operator is non-existent (it does
not exist in the parquet file) and the column projected on the RHS is of varchar type and exists
in the parquet file, the UNION ALL results show that the varchar data coming from the RHS
is lost and returned as null. The test was performed on a 4-node cluster on CentOS. Data was read
from two different parquet files.
> Failing query is,
> {code}
> 0: jdbc:drill:> select dbl from prqFrmCSV104 union all select col_vchar from prqFrmCSV105;
> +------------+
> |    dbl     |
> +------------+
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> ...
> | null       |
> | null       |
> | null       |
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
> 	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> 	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> 	at sqlline.SqlLine.print(SqlLine.java:1809)
> 	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> 	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:889)
> 	at sqlline.SqlLine.begin(SqlLine.java:763)
> 	at sqlline.SqlLine.start(SqlLine.java:498)
> 	at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> CTAS statements were
> {code}
> create table prqFrmCSV104 as select cast(columns[0] as int) col_int, cast(columns[1]
as bigint) col_bgint, cast(columns[2] as char(10)) col_char, cast(columns[3] as varchar(18))
col_vchar, cast(columns[4] as timestamp) col_tmstmp, cast(columns[5] as date) col_date, cast(columns[6]
as boolean) col_boln, cast(columns[7] as double) col_dbl from `csvToPrqOrg.csv`;
> {code}
> {code}
> select col_int, col_bgint, col_char, col_vchar, col_tmstmp, col_date, col_boln, col_dbl
from prqFrmCSV104;
> create table prqFrmCSV105 as select cast(columns[0] as int) col_int, cast(columns[1]
as bigint) col_bgint, cast(columns[2] as char(10)) col_char, cast(columns[3] as varchar(18))
col_vchar, cast(columns[4] as timestamp) col_tmstmp, cast(columns[5] as date) col_date, cast(columns[6]
as boolean) col_boln, cast(columns[7] as double) col_dbl from `csvToPrqOrg.csv`;
> {code}
> Stack trace
> {code}
> 2015-04-24 22:43:19,226 [2ac538f7-bdf6-0db5-ef85-cad3f4e79add:frag:0:0] ERROR o.a.drill.exec.ops.FragmentContext
- Fragment Context received failure -- Fragment: 0:0
> org.apache.drill.common.exceptions.DrillUserException: SYSTEM ERROR: njEl0iAivVwLEbAg
> [0f4a4008-c885-43ea-9ea1-8e9cde0e4119 on centos-04.qa.lab:31010]
>         at org.apache.drill.common.exceptions.DrillUserException$Builder.build(DrillUserException.java:115)
~[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.exceptions.ErrorHelper.wrap(ErrorHelper.java:39) ~[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.ops.FragmentContext.fail(FragmentContext.java:151) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:182)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> Caused by: java.lang.NumberFormatException: njEl0iAivVwLEbAg
>         at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:97)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varCharToInt(StringFunctionHelpers.java:122)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.UnionAllerGen303.doEval(UnionAllerTemplate.java:41)
~[na:na]
>         at org.apache.drill.exec.test.generated.UnionAllerGen303.unionRecords(UnionAllerTemplate.java:43)
~[na:na]
>         at org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch.doWork(UnionAllRecordBatch.java:257)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch.innerNext(UnionAllRecordBatch.java:113)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:74)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:76)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:64)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:164)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         ... 4 common frames omitted
> 2015-04-24 22:43:19,227 [2ac538f7-bdf6-0db5-ef85-cad3f4e79add:frag:0:0] INFO  o.a.drill.exec.work.foreman.Foreman
- State change requested.  RUNNING --> FAILED
> org.apache.drill.common.exceptions.DrillRemoteException: SYSTEM ERROR: njEl0iAivVwLEbAg
> [17d1cc16-8b8f-43f0-8ed5-df944047fd42 on centos-04.qa.lab:31010]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:235)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:183)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
