drill-issues mailing list archives

From "Steven Phillips (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1736) Cannot cast to other data types after using flatten + convert_from('json')
Date Wed, 26 Nov 2014 11:55:23 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226090#comment-14226090 ]

Steven Phillips commented on DRILL-1736:
----------------------------------------

I am testing my fix using a text file:

100|[10, 1000]
101|[20, 1200]

and running this query:

with tmp as (select b, flatten(convert_from(c, 'json')) as f from (select columns[0] as b,
columns[1] as c from t))
select * from tmp where cast(tmp.f as int) = 10;

My patch fixes part of the problem, but there is another problem that I did not address. With
my changes, this is the new error:

org.apache.drill.exec.exception.SchemaChangeException: Failure while trying to materialize
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Only ProjectRecordBatch could have complex writer
function. You are using complex writer function convert_fromJSON in a non-project operation!.
 Full expression: --UNKNOWN EXPRESSION--.
Error in expression at index -1.  Error: Missing function implementation: [flatten(LATE-OPTIONAL)].
 Full expression: --UNKNOWN EXPRESSION--..
	at org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer(FilterRecordBatch.java:197)
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema(FilterRecordBatch.java:117)
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
	...

and this is the physical plan:
Drill Physical : 
00-00    Screen: rowcount = 1.0, cumulative cost = {6.1 rows, 34.1 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 799
00-01      Project(b=[$0], f=[$2]): rowcount = 1.0, cumulative cost = {6.0 rows, 34.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 798
00-02        Flatten(flattenField=[$2]): rowcount = 1.0, cumulative cost = {5.0 rows, 26.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 797
00-03          Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[CONVERT_FROM($1, 'json')]): rowcount = 1.0, cumulative cost = {4.0 rows, 25.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 796
00-04            SelectionVectorRemover: rowcount = 1.0, cumulative cost = {3.0 rows, 13.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 795
00-05              Filter(condition=[=(CAST(FLATTEN(CONVERT_FROM($1, 'json'))):INTEGER NOT NULL, 10)]): rowcount = 1.0, cumulative cost = {2.0 rows, 12.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 794
00-06                Project(b=[ITEM($0, 0)], c=[ITEM($0, 1)]): rowcount = 1.0, cumulative cost = {1.0 rows, 8.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 793
00-07                  Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/t, numFiles=1, columns=[`columns`[0], `columns`[1]], files=[file:/tmp/t/file.tbl]]]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 792


The problem appears to be that the convert_from expression is used inside a filter operator,
but complex writer functions are currently allowed only inside a project operator.
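
For reference, here is a minimal sketch in plain Python (not Drill code) of the result the
test query is presumably meant to produce: the JSON array is converted and flattened first,
and the cast and comparison are applied to each flattened element afterwards. In the plan
above, the Filter (00-05) instead sits beneath the Project and Flatten operators. The names
flatten_rows and query are hypothetical and exist only for this illustration.

```python
import json

# The two sample rows from the pipe-delimited text file used in the test.
ROWS = ["100|[10, 1000]", "101|[20, 1200]"]


def flatten_rows(lines):
    """Yield one (b, f) pair per element of the JSON array in the second column."""
    for line in lines:
        b, c = line.split("|", 1)   # columns[0] as b, columns[1] as c
        for f in json.loads(c):     # stands in for flatten(convert_from(c, 'json'))
            yield b, f


def query(lines, wanted=10):
    """Model of: select * from tmp where cast(tmp.f as int) = 10."""
    # The filter runs on the already flattened rows, not on the raw column.
    return [(b, f) for b, f in flatten_rows(lines) if int(f) == wanted]


if __name__ == "__main__":
    print(list(flatten_rows(ROWS)))
    print(query(ROWS))
```

Under these assumptions, flattening the two sample rows gives four (b, f) pairs, and the
filter keeps only the pair for row 100 with f = 10.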

> Cannot cast to other data types after using flatten + convert_from('json')
> --------------------------------------------------------------------------
>
>                 Key: DRILL-1736
>                 URL: https://issues.apache.org/jira/browse/DRILL-1736
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 0.6.0, 0.7.0
>            Reporter: Hao Zhu
>            Assignee: Steven Phillips
>             Fix For: 0.7.0
>
>
> 1. This SQL looks good.
> {code}
> select cast(row_key as int) as b, flatten(convert_from(mat.i.n , 'json')) as d from dfs.root.`table/mat`
as mat;
> +------------+------------+
> |     b      |     d      |
> +------------+------------+
> | 100        | 10         |
> | 100        | 1000       |
> | 101        | 20         |
> | 101        | 1200       |
> +------------+------------+
> 4 rows selected (0.196 seconds)
> {code}
> 2. Cannot cast column 'd' to another data type.
> {code}
> with tmp as
> (select cast(row_key as int) as b, flatten(convert_from(mat.i.n , 'json')) as d from
dfs.root.`table/mat` as mat)
> select * from tmp where cast(tmp.d as int)=10;
>  
> Query failed: Failure while running fragment., Failure while trying to materialize incoming
schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: [castINT(MAP-REQUIRED)].
 Full expression: --UNKNOWN EXPRESSION--.. [ 744bffba-5ad9-40f4-a47e-25dc83565716 on n4a:31010
]
>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while trying to materialize
incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: [castINT(MAP-REQUIRED)].
 Full expression: --UNKNOWN EXPRESSION--..
>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():194
>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():114
>  org.apache.drill.exec.record.AbstractSingleRecordBatch.buildSchema():110
>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>  org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.buildSchema():64
>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>  org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.buildSchema():269
>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>  org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema():95
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():111
>     org.apache.drill.exec.work.WorkManager$RunnableWrapper.run():249
>     .......():0
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> {code}
> 3.  Still can not change data type after creating the view.
> {code}
> create or replace view testview as select cast(row_key as int) as b, flatten(convert_from(mat.i.n
, 'json')) as d from dfs.root.`table/mat` as mat;
>  
> describe testview;
> +-------------+------------+-------------+
> | COLUMN_NAME | DATA_TYPE  | IS_NULLABLE |
> +-------------+------------+-------------+
> | b           | INTEGER    | NO          |
> | d           | ANY        | NO          |
> +-------------+------------+-------------+
> 2 rows selected (0.505 seconds)
> select * from testview where cast(d as int)=10;
>  
> Query failed: Failure while running fragment., Failure while trying to materialize incoming
schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: [castINT(MAP-REQUIRED)].
 Full expression: --UNKNOWN EXPRESSION--.. [ e3a92573-3947-416e-b0ea-aa6dc4d47a20 on n4a:31010
]
>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while trying to materialize
incoming schema.  Errors:
> Error in expression at index -1.  Error: Missing function implementation: [castINT(MAP-REQUIRED)].
 Full expression: --UNKNOWN EXPRESSION--..
>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():194
>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():114
>  org.apache.drill.exec.record.AbstractSingleRecordBatch.buildSchema():110
>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>  org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.buildSchema():64
>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>  org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.buildSchema():269
>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>  org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema():95
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():111
>     org.apache.drill.exec.work.WorkManager$RunnableWrapper.run():249
>     .......():0
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
