drill-issues mailing list archives

From "Abhishek Girish (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-2608) Union all query fails when json.all_text_mode=false
Date Fri, 27 Mar 2015 20:25:52 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384553#comment-14384553 ]

Abhishek Girish commented on DRILL-2608:
----------------------------------------

This looks like an invalid use case. It works when you switch on store.json.all_text_mode
because every value is then read as a string, so the query ends up doing a union all
over columns whose values are all strings.

Irrespective of the data format (JSON / Parquet / ...), AFAIK this is not supported: you
are attempting a union all between incompatible data values.

However, you can try a union all between these two files. Although one file has integers and
the other has strings, it should be possible because the strings are numeric values enclosed in quotes:
file 1:
{code}
{"key":123}
{"key":456}
{code}

file 2:
{code}
{"key":"789"}
{"key":"159"}
{code}
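
To illustrate the reasoning above, here is a minimal Python sketch. It is an analogy only and not Drill's implementation: `read_keys`, `union_all_as_int`, and the `all_text_mode` parameter are hypothetical names, and the rule "coerce every value to an integer" is an assumption chosen to mimic the "For input string" failure in the report. It shows why quoted numeric strings can be combined with integers, why a value such as "itzVxYBb" cannot, and why reading everything as text sidesteps the problem.

```python
import json


def read_keys(lines, all_text_mode=False):
    """Parse one JSON object per line and return the "key" values.

    With all_text_mode=True every value is returned as a string, loosely
    mimicking the behaviour described for store.json.all_text_mode.
    """
    values = [json.loads(line)["key"] for line in lines]
    return [str(v) for v in values] if all_text_mode else values


def union_all_as_int(*columns):
    """Concatenate columns, coercing each value to int (assumed rule)."""
    return [int(v) for column in columns for v in column]


file1 = ['{"key":123}', '{"key":456}']
file2 = ['{"key":"789"}', '{"key":"159"}']
chars = ['{"key":"itzVxYBb"}']  # sample value taken from the error message

# Integers and numeric strings can be brought to a common type.
print(union_all_as_int(read_keys(file1), read_keys(file2)))

# A non-numeric string cannot, which is analogous to the reported failure.
try:
    union_all_as_int(read_keys(file1), read_keys(chars))
except ValueError as err:
    print("union failed:", err)

# Reading everything as text avoids the conversion entirely.
print(read_keys(file1, all_text_mode=True) + read_keys(chars, all_text_mode=True))
```

In this sketch the first call succeeds, the second raises a conversion error, and the third succeeds because no numeric conversion is attempted.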



> Union all query fails when json.all_text_mode=false
> ---------------------------------------------------
>
>                 Key: DRILL-2608
>                 URL: https://issues.apache.org/jira/browse/DRILL-2608
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.9.0
>         Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early
from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT | Unknown     | 26.03.2015
@ 16:53:21 EDT |
>            Reporter: Khurram Faraaz
>            Assignee: Sean Hsuan-Yi Chu
>
> A union all query over JSON data files fails when store.json.all_text_mode is set to false,
> and the same query returns correct results when store.json.all_text_mode is set to true. Each
> JSON data file had only one type of object, {"key":<value>}, and the values within each of
> the JSON data files were of the same datatype. The test was executed on a 4-node cluster.
> {code}
> 0: jdbc:drill:> select key from `charData.json` union all select key from `dateData.json`
union all select key from `doubleData.json` union all select key from `intData.json` union
all select key from `timeStmpData.json` union all select key from `vrChrData.json`;
> Query failed: RemoteRpcException: Failure while running fragment., For input string:
"itzVxYBb" [ f1f81073-161c-4f24-89e5-37379413b01b on centos-04.qa.lab:31010 ]
> [ f1f81073-161c-4f24-89e5-37379413b01b on centos-04.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> {code}
> Then I ran alter session set `store.json.all_text_mode`=true;
> After setting store.json.all_text_mode to true, the union all query returned correct results.
> {code}
> 0: jdbc:drill:> select key from `charData.json` union all select key from `dateData.json`
union all select key from `doubleData.json` union all select key from `intData.json` union
all select key from `timeStmpData.json` union all select key from `vrChrData.json`;
> ...
> +------------+
> 7,194 rows selected (0.462 seconds)
> {code}
> Resetting it back to false gives the same Exception
> {code}
> 0: jdbc:drill:> alter session set `store.json.all_text_mode`=false;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | store.json.all_text_mode updated. |
> +------------+------------+
> 1 row selected (0.049 seconds)
> 0: jdbc:drill:> select key from `charData.json` union all select key from `dateData.json`
union all select key from `doubleData.json` union all select key from `intData.json` union
all select key from `timeStmpData.json` union all select key from `vrChrData.json`;
> Query failed: RemoteRpcException: Failure while running fragment., For input string:
"itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-03-27 18:30:56,620 [2aea5e1e-88b9-3e4e-07b5-d7e46b29756f:frag:0:0] ERROR o.a.drill.exec.work.foreman.Foreman
- Error b9cb90bd-7d89-4061-8595-4c5ad983f3f3: RemoteRpcException: Failure while running fragment.,
For input string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010
]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment., For input
string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:230)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:182)
[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
