drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hao Zhu <h...@maprtech.com>
Subject Re: Doubt on Querying JSON
Date Tue, 11 Nov 2014 22:13:10 GMT
Hi Jim,

I can successfully reproduce the issue you mentioned.

1. Put 2 json files with the same content -- 1.json and 2.json.

$ cat 1.json
> {
>         "timestamp":1415688106710,
>         "status":"OK",
>         "total":17,
>         "data":[
>                 { "volumename":"a", "actualreplication":[0,0,0],
> "InodesExceededAlarm":0, "ContainersNonLocalAlarm":0 },
>                 { "volumename":"b", "actualreplication":[0,0,0],
> "InodesExceededAlarm":0, "ContainersNonLocalAlarm":0 },
>                 { "volumename":"c", "actualreplication":[0,0,0],
> "InodesExceededAlarm":0, "ContainersNonLocalAlarm":0 }
>                ]
> }

2. Then this SQL hung:

> > select `timestamp` as newtime,flatten(data) as newdata from
> dfs.tmp.`*.json`;
> +------------+------------+
> |  newtime   |  newdata   |
> +------------+------------+
> | 1415688106710 |
> {"volumename":"a","actualreplication":[0,0,0],"InodesExceededAlarm":0,"ContainersNonLocalAlarm":0}
> |
> | 1415688106710 |
> {"volumename":"b","actualreplication":[0,0,0],"InodesExceededAlarm":0,"ContainersNonLocalAlarm":0}
> |
> | 1415688106710 |
> {"volumename":"c","actualreplication":[0,0,0],"InodesExceededAlarm":0,"ContainersNonLocalAlarm":0}
> |
>
It should produce 6 rows instead.

Thanks,

Hao

On Tue, Nov 11, 2014 at 12:21 PM, Jim Bates <jbates@maprtech.com> wrote:

> I had trouble with flatten extending past one wile. If you include more
> than 1 json file in your data set it never seams to finish the command. In
> my case it returned in 20 sec with one file but when the data set had two
> files it never finiahed after 5 min so I gave up till a later release comes
> along.
> On Nov 11, 2014 11:57 AM, "Hao Zhu" <hzhu@maprtech.com> wrote:
>
> > Hi Mufeed,
> >
> > I tested on latest 0.7 build and flatten can work in your case.
> >
> > with sub as
> > > (select `timestamp` as newtime,flatten(data) as newdata from
> > > dfs.tmp.`1.json`)
> > > select sub.newtime, sub.newdata.volumename,
> > >
> >
> sub.newdata.actualreplication,sub.newdata.InodesExceededAlarm,sub.newdata.ContainersNonLocalAlarm
> > >  from sub;
> > >
> > > +------------+------------+------------+------------+------------+
> > > |  newtime   |   EXPR$1   |   EXPR$2   |   EXPR$3   |   EXPR$4   |
> > > +------------+------------+------------+------------+------------+
> > > | 1415688106710 | a          | [0,0,0]    | 0          | 0          |
> > > | 1415688106710 | b          | [0,0,0]    | 0          | 0          |
> > > | 1415688106710 | c          | [0,0,0]    | 0          | 0          |
> > > +------------+------------+------------+------------+------------+
> > > 3 rows selected (0.158 seconds)
> >
> > Regarding your errors:
> >
> > 1. select timestamp from `1.json`;
> >
> > Since "timestamp" is a reserved word in Drill, so it should be changed
> to:
> >
> > select `timestamp` from `1.json`;
> >
> >
> > 2. select timestamp[0] from `1.json`;
> >
> > Since "timestamp" is not an array, so we can not use "[0]" to specify the
> > 1st element.
> >
> > 3. select status[0] from `1.json`;
> >
> > The same as 2, "status" is not an array.
> >
> > Thanks,
> >
> > Hao
> >
> > On Tue, Nov 11, 2014 at 7:22 AM, Jacques Nadeau <jacques@apache.org>
> > wrote:
> >
> > > We're indeed seeing some issues with the current master (development)
> > > branch and flatten functionality.  There a number of open bugs right
> now
> > > against it.  Can you review the open bugs and see if they describe the
> > > problem you are seeing?
> > >
> > > thanks
> > >
> > > On Tue, Nov 11, 2014 at 5:35 AM, mufy <mufeed.usman@gmail.com> wrote:
> > >
> > > > I have the following.
> > > >
> > > > {
> > > >         "timestamp":1415688106710,
> > > >         "status":"OK",
> > > >         "total":17,
> > > >         "data":[
> > > >                 { "volumename":"a", "actualreplication":[0,0,0],
> > > > "InodesExceededAlarm":0, "ContainersNonLocalAlarm":0 },
> > > >                 { "volumename":"b", "actualreplication":[0,0,0],
> > > > "InodesExceededAlarm":0, "ContainersNonLocalAlarm":0 },
> > > >                 { "volumename":"c", "actualreplication":[0,0,0],
> > > > "InodesExceededAlarm":0, "ContainersNonLocalAlarm":0 }
> > > >                ]
> > > > }
> > > >
> > > > But query results are "partial".
> > > >
> > > > 0: jdbc:drill:zk=n67:5181> select data[0] from `1.json`;
> > > > +------------+
> > > > |   EXPR$0   |
> > > > +------------+
> > > > |
> > > >
> > >
> >
> {"volumename":"a","actualreplication":[0,0,0],"InodesExceededAlarm":0,"ContainersNonLocalAlarm":0}
> > > > |
> > > > +------------+
> > > > 1 row selected (0.139 seconds)
> > > > 0: jdbc:drill:zk=n67:5181> select data[1] from `1.json`;
> > > > +------------+
> > > > |   EXPR$0   |
> > > > +------------+
> > > > |
> > > >
> > >
> >
> {"volumename":"b","actualreplication":[0,0,0],"InodesExceededAlarm":0,"ContainersNonLocalAlarm":0}
> > > > |
> > > > +------------+
> > > > 1 row selected (0.125 seconds)
> > > >
> > > >
> > > >
> > > > 0: jdbc:drill:zk=n67:5181> select * from `1.json`;
> > > > +------------+------------+------------+------------+
> > > > | timestamp  |   status   |   total    |    data    |
> > > > +------------+------------+------------+------------+
> > > > | 1415688106710 | OK         | 17         |
> > > >
> > > >
> > >
> >
> [{"volumename":"a","actualreplication":[0,0,0],"InodesExceededAlarm":0,"ContainersNonLocalAlarm":0},{"volumename":"b","actualreplication":[0,0,0]
> > > > |
> > > > +------------+------------+------------+------------+
> > > > 1 row selected (0.22 seconds)
> > > >
> > > >
> > > >
> > > > 0: jdbc:drill:zk=n67:5181> select timestamp from `1.json`;
> > > > Query failed: Failure while parsing sql.
> > > >
> > > > Error: exception while executing query: Failure while executing
> query.
> > > > (state=,code=0)
> > > > 0: jdbc:drill:zk=n67:5181> select timestamp[0] from `1.json`;
> > > > Query failed: Failure while parsing sql.
> > > >
> > > > Error: exception while executing query: Failure while executing
> query.
> > > > (state=,code=0)
> > > > 0: jdbc:drill:zk=n67:5181> select status[0] from `1.json`;
> > > > +------------+
> > > > |   EXPR$0   |
> > > > +------------+
> > > > Query failed: Failure while running fragment.
> > > >
> > > > java.lang.RuntimeException: java.sql.SQLException: Failure while
> > > > executing query.
> > > >     at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> > > >     at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> > > >     at sqlline.SqlLine.print(SqlLine.java:1809)
> > > >     at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> > > >     at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> > > >     at sqlline.SqlLine.dispatch(SqlLine.java:889)
> > > >     at sqlline.SqlLine.begin(SqlLine.java:763)
> > > >     at sqlline.SqlLine.start(SqlLine.java:498)
> > > >     at sqlline.SqlLine.main(SqlLine.java:460)
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message