incubator-drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hao Zhu <h...@maprtech.com>
Subject Re: error in flatten and join
Date Mon, 17 Nov 2014 19:52:30 GMT
DRILL-1736 <https://issues.apache.org/jira/browse/DRILL-1736> is opened for
the issue.

Thanks,
Hao

On Mon, Nov 17, 2014 at 9:50 AM, Hao Zhu <hzhu@maprtech.com> wrote:

> I could not find the workaround for this issue, after also trying to use
> views.
> The issue is after convert_from+flatten, the data type is 'ANY' and we
> could not explicitly change the type, eg, int or varchar.
>
> For example:
> *1. This SQL looks good.*
>
>> select cast(row_key as int) as b, flatten(convert_from(mat.i.n , 'json'))
>> as d from dfs.root.`table/mat` as mat;
>> +------------+------------+
>> |     b      |     d      |
>> +------------+------------+
>> | 100        | 10         |
>> | 100        | 1000       |
>> | 101        | 20         |
>> | 101        | 1200       |
>> +------------+------------+
>> 4 rows selected (0.196 seconds)
>
>
> *2. Can not cast column 'b' to other data type.*
>
>> with tmp as
>> (select cast(row_key as int) as b, flatten(convert_from(mat.i.n ,
>> 'json')) as d from dfs.root.`table/mat` as mat)
>> select * from tmp where cast(tmp.d as int)=10;
>>
>
>
>> Query failed: Failure while running fragment., Failure while trying to
>> materialize incoming schema.  Errors:
>> Error in expression at index -1.  Error: Missing function implementation:
>> [castINT(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.. [
>> 744bffba-5ad9-40f4-a47e-25dc83565716 on n4a:31010 ]
>>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while
>> trying to materialize incoming schema.  Errors:
>> Error in expression at index -1.  Error: Missing function implementation:
>> [castINT(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--..
>>
>>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():194
>>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():114
>>  org.apache.drill.exec.record.AbstractSingleRecordBatch.buildSchema():110
>>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>>  org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.buildSchema():64
>>
>> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.buildSchema():269
>>
>> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>>
>> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema():95
>>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():111
>>     org.apache.drill.exec.work.WorkManager$RunnableWrapper.run():249
>>     .......():0
>>
>> Error: exception while executing query: Failure while executing query.
>> (state=,code=0)
>
> *3.  Still can not change data type after creating the view.*
>
>> create or replace view testview as select cast(row_key as int) as b,
>> flatten(convert_from(mat.i.n , 'json')) as d from dfs.root.`table/mat` as
>> mat;
>>
>
>
>> describe testview;
>> +-------------+------------+-------------+
>> | COLUMN_NAME | DATA_TYPE  | IS_NULLABLE |
>> +-------------+------------+-------------+
>> | b           | INTEGER    | NO          |
>> | d           | ANY        | NO          |
>> +-------------+------------+-------------+
>> 2 rows selected (0.505 seconds)
>>
>
>
>> select * from testview where cast(d as int)=10;
>>
>
>
>> Query failed: Failure while running fragment., Failure while trying to
>> materialize incoming schema.  Errors:
>> Error in expression at index -1.  Error: Missing function implementation:
>> [castINT(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.. [
>> e3a92573-3947-416e-b0ea-aa6dc4d47a20 on n4a:31010 ]
>>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while
>> trying to materialize incoming schema.  Errors:
>> Error in expression at index -1.  Error: Missing function implementation:
>> [castINT(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--..
>>
>>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():194
>>
>>  org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():114
>>  org.apache.drill.exec.record.AbstractSingleRecordBatch.buildSchema():110
>>
>>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>>
>>  org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.buildSchema():64
>>
>>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>>
>>  org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.buildSchema():269
>>  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema():80
>>   org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema():95
>>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():111
>>     org.apache.drill.exec.work.WorkManager$RunnableWrapper.run():249
>>     .......():0
>
> Thanks,
>
> Hao
>
> On Sat, Nov 15, 2014 at 10:24 PM, Neeraja Rentachintala <
> nrentachintala@maprtech.com> wrote:
>
>> On the view  error, refer to
>>
>> https://cwiki.apache.org/confluence/display/DRILL/CREATE+TABLE+AS+(CTAS)+Command
>>
>> You can set a workspace as writable and then create views within it (for
>> ex: below)
>>
>> "tmp": {
>>       "location": "/tmp",
>>       "writable": true,
>>        }
>>
>> On Sat, Nov 15, 2014 at 1:05 AM, MohammadReza Mofateh <
>> mofateh72@gmail.com>
>> wrote:
>>
>> > Hi all
>> >  Steven
>> > Do you mean now it's not possible to do now?
>> > What's your idea about my query?How do I implement them?
>> >
>> > Neeraja
>> >
>> > I tried to make a view but an error occurred:
>> > Error: Current schema is not a Drill schema. Can't create new relations
>> > (tables or views) in non-Drill schemas
>> > How do I resolve It?
>> >
>> > Best
>> >
>> >
>> > On Sat, Nov 15, 2014 at 12:09 PM, Steven Phillips <
>> sphillips@maprtech.com>
>> > wrote:
>> >
>> > > This is a bug, and it has to do with the combination of "fast schema
>> > > return" and "convert_from". Fast schema return requires operators to
>> > return
>> > > a schema based solely on the input schema, before any actual data is
>> > > available. On the other hand, convert_from(VARBINARY, 'json') does not
>> > have
>> > > a known return type until the data is available. It could be map,
>> list,
>> > or
>> > > primitive type.
>> > >
>> > > I'll file a jira for this issue. Unfortunately, I don't know of any
>> work
>> > > arounds to use in the mean time.
>> > >
>> > > On Fri, Nov 14, 2014 at 10:23 PM, Neeraja Rentachintala <
>> > > nrentachintala@maprtech.com> wrote:
>> > >
>> > > > Hi
>> > > > Can you try creating a view on the hbase.act table to do the join.
>> > > > Yon can find a view creation example in the tutorial
>> > > > <
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/DRILL/Analyzing+Yelp+JSON+Data+with+Apache+Drill
>> > > > >
>> > > > (please search for create or replace view syntax)
>> > > >
>> > > >
>> > > > -Neeraja
>> > > >
>> > > >
>> > > > On Fri, Nov 14, 2014 at 10:16 PM, MohammadReza Mofateh <
>> > > > mofateh72@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > No body? :)
>> > > > >
>> > > > > On Thu, Nov 13, 2014 at 8:44 PM, MohammadReza Mofateh <
>> > > > mofateh72@gmail.com
>> > > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hi,
>> > > > > > Excuse me ,I write it bad:
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> **********************************************************************************
>> > > > > > Main query:
>> > > > > >
>> > > > > > select * from hbase.act ,
>> > > > > > (select cast(b as varchar(5))as r ,flatten(d) as f from
(select
>> > > row_key
>> > > > > as
>> > > > > > b, convert_from(mat.i.n,'json') as d from hbase.mat)) as
x
>> > > > > >  where  act.row_key =x.f
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> **********************************************************************************
>> > > > > > In last e-mail what I mean that if I run sub query ,sub
query
>> works
>> > > > well
>> > > > > > Sub query(It used as x in main query):
>> > > > > >
>> > > > > > select cast(b as varchar(5))as r ,flatten(d) as f from (select
>> > > row_key
>> > > > as
>> > > > > > b, convert_from(mat.i.n,'json') as d from hbase.mat)
>> > > > > >
>> > > > > > It returns me(Notice to additional details):
>> > > > > > 100 |10
>> > > > > > 100|1000
>> > > > > > 101|1200
>> > > > > > 101|20
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> **********************************************************************************
>> > > > > > Main query doesn't return any answer,It returns:
>> > > > > >
>> > > > > > Error in expression at index 0.  Error: Missing function
>> > > > implementation:
>> > > > > > [hash(MAP-REQUIRED)].  Full expression: null..
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> ******************************************************************************************************************
>> > > > > >
>> > > > > > Additional details:
>> > > > > > ______________________________________________________
>> > > > > > table mat:
>> > > > > > row_key  |   i:n
>> > > > > > 100         |  ["10","1000"]
>> > > > > > 101         |  ["20","1200"]
>> > > > > > ______________________________________________________
>> > > > > > table act:
>> > > > > > row_key  |    i:y
>> > > > > > 10           |   apple
>> > > > > > 20           |   orange
>> > > > > > 1000        |  banana
>> > > > > > 1200        |  peach
>> > > > > > 1400        |  onion
>> > > > > > ______________________________________________________
>> > > > > >
>> > > > > >
>> > > > > > On Thu, Nov 13, 2014 at 7:59 PM, Ted Dunning <
>> > ted.dunning@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > >> I am having a hard time understanding your question.
>> > > > > >>
>> > > > > >> Are you saying that the first query gives the correct
answer
>> but
>> > > > giving
>> > > > > >> the
>> > > > > >> second query which is a sub-query in the first one gives
an
>> > > unexpected
>> > > > > >> error?
>> > > > > >>
>> > > > > >> The first query that it appears that you have used is:
>> > > > > >>
>> > > > > >> select * from hbase.act, (
>> > > > > >>     select cast(b as varchar(5)) as r,
>> > > > > >>                flatten(d) as f
>> > > > > >>         from (
>> > > > > >>              select row_key as b, convert_from(mat.i.n,'json')
>> as
>> > d
>> > > > > >>              from hbase.mat
>> > > > > >>           )
>> > > > > >>      ) as x
>> > > > > >> where  act.row_key =x.f
>> > > > > >>
>> > > > > >> And the second seems to be this:
>> > > > > >>
>> > > > > >> (
>> > > > > >>     select cast(b as varchar(5)) as r,
>> > > > > >>                flatten(d) as f
>> > > > > >>     from (
>> > > > > >>          select row_key as b, convert_from(mat.i.n,'json')
as d
>> > > > > >>          from hbase.mat
>> > > > > >>     )
>> > > > > >> )
>> > > > > >>
>> > > > > >> As far as I can tell, this is identical to the sub-query
in the
>> > > first
>> > > > > one
>> > > > > >> except that you have left parentheses around the sub-query.
>> > > > > >>
>> > > > > >> Is that a correct story of what you did?
>> > > > > >>
>> > > > > >> If so, what happens if you remove the parens?  I don't
think
>> it is
>> > > > legal
>> > > > > >> SQL syntax to put parentheses around a query except
when it is
>> a
>> > > > > >> sub-query.
>> > > > > >>
>> > > > > >>
>> > > > > >>
>> > > > > >> On Thu, Nov 13, 2014 at 6:08 AM, MohammadReza Mofateh
<
>> > > > > >> mofateh72@gmail.com>
>> > > > > >> wrote:
>> > > > > >>
>> > > > > >> > Hi
>> > > > > >> > I run this query:
>> > > > > >> >
>> > > > > >> > select * from hbase.act , (select cast(b as varchar(5))as
r
>> > > > > ,flatten(d)
>> > > > > >> as
>> > > > > >> > f from (select row_key as b, convert_from(mat.i.n,'json')
as
>> d
>> > > from
>> > > > > >> > hbase.mat)) as x
>> > > > > >> >            where  act.row_key =x.f
>> > > > > >> >
>> > > > > >> > x=It return me a true answer:
>> > > > > >> > table mat:
>> > > > > >> > row_key  |i:n
>> > > > > >> > 100         |["10","1000"]
>> > > > > >> > 101         |["20","1200"]
>> > > > > >> > (select cast(b as varchar(5))as r ,flatten(d) as
f from
>> (select
>> > > > > row_key
>> > > > > >> as
>> > > > > >> > b, convert_from(mat.i.n,'json') as d from hbase.mat))
>> > > > > >> >  returns me:
>> > > > > >> > 100 |10
>> > > > > >> > 100|1000
>> > > > > >> > 101|1200
>> > > > > >> > 101|20
>> > > > > >> >
>> > > > > >> > table act:
>> > > > > >> > row_key  |i:y
>> > > > > >> > 10|apple
>> > > > > >> > 20|orange
>> > > > > >> > 1000|banana
>> > > > > >> > 1200|pich
>> > > > > >> > 1400|onion
>> > > > > >> >  As query it should return me:
>> > > > > >> > act.row
>> > > > > >> > 10
>> > > > > >> > 20
>> > > > > >> > 1000
>> > > > > >> > 1200
>> > > > > >> > 1400
>> > > > > >> >
>> > > > > >> > But i receive this error:
>> > > > > >> >
>> > > > > >> > Error in expression at index 0.  Error: Missing
function
>> > > > > implementation:
>> > > > > >> > [hash(MAP-REQUIRED)].  Full expression: null..
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > >  Steven Phillips
>> > >  Software Engineer
>> > >
>> > >  mapr.com
>> > >
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message