incubator-drill-user mailing list archives

From Mohit Anchlia <mohitanch...@gmail.com>
Subject Re: Error in running select over hdfs
Date Fri, 24 Oct 2014 21:36:07 GMT
Here is one of the lines from the json file:

[ec2-user@ip-10-225-156-201 ~]$ hadoop fs -cat /user/train/xd/tweets/tmp/tweets-0.json|more
{"created_at":"Wed Oct 22 18:04:43 +0000 2014","id":524984711660068864,"id_str":"524984711660068864","text":"Robinho niega las acusaciones de violaci\u00f3n sexual en Italia: El delantero brasile\u00f1o Robinho neg\u00f3 tajantemente ... http:\/\/t.co\/psgRSPbSgZ","source":"\u003ca href=\"http:\/\/twitterfeed.com\" rel=\"nofollow\"\u003etwitterfeed\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":2416621622,"id_str":"2416621622","name":"Daniel Romero","screen_name":"Dany_rom5","location":"Sevilla","url":null,"description":"Verde paz! ","protected":false,"verified":false,"followers_count":141,"friends_count":935,"listed_count":1,"favourites_count":4,"statuses_count":50478,"created_at":"Fri Mar 28 23:57:32 +0000 2014","utc_offset":7200,"time_zone":"Amsterdam","geo_enabled":false,"lang":"es","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"0084B4","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/449697097827098624\/9YmqsvgW_normal.jpeg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/449697097827098624\/9YmqsvgW_normal.jpeg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/2416621622\/1396053818","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"trends":[],"urls":[{"url":"http:\/\/t.co\/psgRSPbSgZ","expanded_url":"http:\/\/bit.ly\/1rgKStf","display_url":"bit.ly\/1rgKStf","indices":[114,136]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"medium","lang":"es","timestamp_ms":"1414001083666"}

On Fri, Oct 24, 2014 at 2:22 PM, Ramana Inukonda <rinukonda@maprtech.com>
wrote:

> Also,
> In order to minimize back-and-forth mails: if it's a JSON file, can you post or
> share the file, or a few lines from it?
>
> Regards
> Ramana
>
>
> On Fri, Oct 24, 2014 at 2:18 PM, Ramana Inukonda <rinukonda@maprtech.com>
> wrote:
>
> > Hey,
> >
> > Sorry to hear that you are having trouble with a simple case.
> > I can help you debug this. Is the file a JSON file or a txt file?
> >
> > If it's a JSON file, please give it the appropriate extension. If it's a txt file,
> > can you please have an entry like the following in your storage plugin
> > (accessible at http://<drillbit>:8047):
> >
> >   "formats": {
> >     "psv": {
> >       "type": "text",
> >       "extensions": [
> >         "txt"
> >       ],
> >       "delimiter": ","
> >     },
> >
> > This presumes the file is comma-separated; otherwise, change to the
> > appropriate delimiter.
> >
> >
> > Regards
> > Ramana
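For context, the "formats" section Ramana quotes is what ties a file extension to a reader. A minimal sketch of a formats block that would cover both the .json and the .txt files mentioned in this thread (the "psv" label and the comma delimiter come from his example and are assumptions about the data):

  "formats": {
    "json": {
      "type": "json"
    },
    "psv": {
      "type": "text",
      "extensions": [
        "txt"
      ],
      "delimiter": ","
    }
  }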
> >
> >
> > On Fri, Oct 24, 2014 at 1:47 PM, Mohit Anchlia <mohitanchlia@gmail.com>
> > wrote:
> >
> >> I can certainly do that; however, in the real world how would we go about
> >> troubleshooting and resolving issues over large data sets? Drill needs to
> >> have a better way to identify and troubleshoot such issues.
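For what it's worth, later Drill releases added a session option that attaches more context to query errors; whether 0.5.0-incubating supports it is an assumption, but on newer builds it can be turned on from sqlline before re-running the failing query:

  0: jdbc:drill:zk=local> alter session set `exec.errors.verbose` = true;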
> >>
> >>
> >> On Fri, Oct 24, 2014 at 1:41 PM, Abhishek Girish <abhishek.girish@gmail.com>
> >> wrote:
> >>
> >> > Can you try creating a new file with just one JSON record in it (copying,
> >> > say, the first record from the original JSON document) and see if you can
> >> > query the same?
> >> >
> >> > Also try creating a simple JSON file by copying the one on
> >> > http://json.org/example. Copy it to /tmp on HDFS and try querying the file
> >> > using Drill (specify the schema as "use dfs.tmp;"). If this works, then the
> >> > issue could be with your original JSON file. If not, it could be some
> >> > simple setup issue.
> >> >
> >> > Regards,
> >> > Abhishek
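Spelled out, the sanity check Abhishek suggests might look like this (the file name sample.json is illustrative):

  $ hadoop fs -put sample.json /tmp/sample.json
  0: jdbc:drill:zk=local> use dfs.tmp;
  0: jdbc:drill:zk=local> select * from `sample.json`;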
> >> >
> >> > On Fri, Oct 24, 2014 at 1:25 PM, Mohit Anchlia <mohitanchlia@gmail.com>
> >> > wrote:
> >> >
> >> > > Any clues? Not sure why I can't do a simple select.
> >> > >
> >> > > On Fri, Oct 24, 2014 at 9:19 AM, Mohit Anchlia <mohitanchlia@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Here is the exception
> >> > > >
> >> > > > 2014-10-23 20:09:08,689 [91b7d838-3128-4add-a686-7ceb05b8e765:frag:0:0] ERROR o.a.d.e.p.i.ScreenCreator$ScreenRoot - Error b6f84bc1-8f18-42e9-b79f-c889fa13a40e: Screen received stop request sent.
> >> > > > java.lang.IllegalArgumentException: null
> >> > > >         at org.apache.drill.common.expression.PathSegment$ArraySegment.<init>(PathSegment.java:52) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.PathSegment$ArraySegment.cloneWithNewChild(PathSegment.java:102) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.PathSegment$ArraySegment.cloneWithNewChild(PathSegment.java:29) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.PathSegment$NameSegment.cloneWithNewChild(PathSegment.java:179) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.PathSegment$NameSegment.cloneWithNewChild(PathSegment.java:113) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.PathSegment$NameSegment.cloneWithNewChild(PathSegment.java:179) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.PathSegment$NameSegment.cloneWithNewChild(PathSegment.java:113) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.PathSegment$NameSegment.cloneWithNewChild(PathSegment.java:179) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.common.expression.SchemaPath.getUnindexedArrayChild(SchemaPath.java:163) ~[drill-common-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.RepeatedListVector.addOrGet(RepeatedListVector.java:413) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.impl.RepeatedListWriter.float8(RepeatedListWriter.java:413) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:352) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:307) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:307) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:203) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:206) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >         at org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:206) ~[drill-java-exec-0.5.0-incubating-rebuffed.jar:0.5.0-incubating]
> >> > > >
> >> > > > On Thu, Oct 23, 2014 at 5:35 PM, Abhishek Girish <abhishek.girish@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > >> Can you look up the drillbit.log (it should be somewhere in your
> >> > > >> installation's log directory) and find
> >> > > >> "b6f84bc1-8f18-42e9-b79f-c889fa13a40e".
> >> > > >> Share the error that is shown.
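One way to pull that error out of the log by query id, assuming the log lives at log/drillbit.log under the Drill install directory:

  $ grep -A 40 "b6f84bc1-8f18-42e9-b79f-c889fa13a40e" log/drillbit.log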
> >> > > >>
> >> > > >> On Thu, Oct 23, 2014 at 5:10 PM, Mohit Anchlia <mohitanchlia@gmail.com>
> >> > > >> wrote:
> >> > > >>
> >> > > >> > I moved the file to .json and now I get:
> >> > > >> >
> >> > > >> > 0: jdbc:drill:zk=local> select * from `tweets-0.json`;
> >> > > >> > Query failed: Screen received stop request sent. null
> >> > > >> > [b6f84bc1-8f18-42e9-b79f-c889fa13a40e]
> >> > > >> > Error: exception while executing query: Failure while trying to get next
> >> > > >> > result batch. (state=,code=0)
> >> > > >> >
> >> > > >> > On Thu, Oct 23, 2014 at 11:28 AM, Abhishek Girish <abhishek.girish@gmail.com>
> >> > > >> > wrote:
> >> > > >> >
> >> > > >> > > Or if your data is indeed in json format, change the extension of your
> >> > > >> > > data file from ".txt" to ".json"
> >> > > >> > >
> >> > > >> > > On Thu, Oct 23, 2014 at 11:25 AM, Abhishek Girish <abhishek.girish@gmail.com>
> >> > > >> > > wrote:
> >> > > >> > >
> >> > > >> > > > Can you try replacing "storageformat": "json" with "storageformat": "csv"
> >> > > >> > > > in your plugin?
> >> > > >> > > >
> >> > > >> > > >
> >> > > >> > > > On Thu, Oct 23, 2014 at 11:11 AM, Mohit Anchlia <mohitanchlia@gmail.com>
> >> > > >> > > > wrote:
> >> > > >> > > >
> >> > > >> > > >> I've tried that too
> >> > > >> > > >>
> >> > > >> > > >> Error: exception while executing query: Failure while trying to get next
> >> > > >> > > >> result batch. (state=,code=0)
> >> > > >> > > >> 0: jdbc:drill:zk=local> select * from hdfs.json.`/tweets-0.txt`;
> >> > > >> > > >> Oct 23, 2014 2:10:40 PM org.eigenbase.sql.validate.SqlValidatorException <init>
> >> > > >> > > >> SEVERE: org.eigenbase.sql.validate.SqlValidatorException: Table
> >> > > >> > > >> 'hdfs.json./tweets-0.txt' not found
> >> > > >> > > >> Oct 23, 2014 2:10:40 PM org.eigenbase.util.EigenbaseException <init>
> >> > > >> > > >> SEVERE: org.eigenbase.util.EigenbaseContextException: From line 1, column
> >> > > >> > > >> 15 to line 1, column 18: Table 'hdfs.json./tweets-0.txt' not found
> >> > > >> > > >> Query failed: Failure while parsing sql. Table 'hdfs.json./tweets-0.txt'
> >> > > >> > > >> not found [619f0469-0606-4e8e-9ae5-17a305f527fe]
> >> > > >> > > >> Error: exception while executing query: Failure while trying to get next
> >> > > >> > > >> result batch. (state=,code=0)
> >> > > >> > > >> 0: jdbc:drill:zk=local>
> >> > > >> > > >>
> >> > > >> > > >> On Thu, Oct 23, 2014 at 11:04 AM, Neeraja Rentachintala <nrentachintala@maprtech.com>
> >> > > >> > > >> wrote:
> >> > > >> > > >>
> >> > > >> > > >> > Can you just try this:
> >> > > >> > > >> > select * from hdfs.json.`/tweets-0.txt`;
> >> > > >> > > >> >
> >> > > >> > > >> > On Thu, Oct 23, 2014 at 10:59 AM, Mohit Anchlia <mohitanchlia@gmail.com>
> >> > > >> > > >> > wrote:
> >> > > >> > > >> >
> >> > > >> > > >> > > This is what I see, looks like that file is showing up
> >> > > >> > > >> > >
> >> > > >> > > >> > > sqlline version 1.1.6
> >> > > >> > > >> > > 0: jdbc:drill:zk=local> use hdfs.json;
> >> > > >> > > >> > > +------------+------------+
> >> > > >> > > >> > > |     ok     |  summary   |
> >> > > >> > > >> > > +------------+------------+
> >> > > >> > > >> > > | true       | Default schema changed to 'hdfs.json' |
> >> > > >> > > >> > > +------------+------------+
> >> > > >> > > >> > > 1 row selected (1.112 seconds)
> >> > > >> > > >> > > 0: jdbc:drill:zk=local> show files
> >> > > >> > > >> > > . . . . . . . . . . . > ;
> >> > > >> > > >> > > +--------------+-------------+------------+------------+------------+------------+-------------+-------------------------+-------------------------+
> >> > > >> > > >> > > |     name     | isDirectory |   isFile   |   length   |   owner    |   group    | permissions |       accessTime        |    modificationTime     |
> >> > > >> > > >> > > +--------------+-------------+------------+------------+------------+------------+-------------+-------------------------+-------------------------+
> >> > > >> > > >> > > | tweets-0.txt | false       | true       | 2097437    | root       | supergroup | rw-r--r--   | 2014-10-22 19:26:15.458 | 2014-10-22 14:04:26.585 |
> >> > > >> > > >> > > | tweets-1.txt | false       | true       | 1998156    | root       | supergroup | rw-r--r--   | 2014-10-22 14:04:26.616 | 2014-10-22 14:04:37.123 |
> >> > > >> > > >> > > +--------------+-------------+------------+------------+------------+------------+-------------+-------------------------+-------------------------+
> >> > > >> > > >> > > 2 rows selected (0.264 seconds)
> >> > > >> > > >> > > 0: jdbc:drill:zk=local>
> >> > > >> > > >> > >
> >> > > >> > > >> > > On Thu, Oct 23, 2014 at 10:56 AM, Jason Altekruse <altekrusejason@gmail.com>
> >> > > >> > > >> > > wrote:
> >> > > >> > > >> > >
> >> > > >> > > >> > > > Could you try running 'show files' from the sqlline prompt to see if that
> >> > > >> > > >> > > > gives you any results for files Drill is able to find?
> >> > > >> > > >> > > >
> >> > > >> > > >> > > > On Thu, Oct 23, 2014 at 10:43 AM, Mohit Anchlia <mohitanchlia@gmail.com>
> >> > > >> > > >> > > > wrote:
> >> > > >> > > >> > > >
> >> > > >> > > >> > > > > Could somebody look at this error and advise what might be wrong? It
> >> > > >> > > >> > > > > seems I am doing everything that's documented.
> >> > > >> > > >> > > > >
> >> > > >> > > >> > > > > On Wed, Oct 22, 2014 at 2:20 PM, Mohit Anchlia <mohitanchlia@gmail.com>
> >> > > >> > > >> > > > > wrote:
> >> > > >> > > >> > > > >
> >> > > >> > > >> > > > > > I am getting the following error even though that file exists in hdfs
> >> > > >> > > >> > > > > >
> >> > > >> > > >> > > > > > 0: jdbc:drill:zk=local> select * from
> >> > > >> > > >> > > > > > hdfs.`/user/train/xd/tweets/tmp/tweets-0.txt`;
> >> > > >> > > >> > > > > > Oct 22, 2014 5:16:31 PM org.eigenbase.sql.validate.SqlValidatorException <init>
> >> > > >> > > >> > > > > > SEVERE: org.eigenbase.sql.validate.SqlValidatorException: Table
> >> > > >> > > >> > > > > > 'hdfs./user/train/xd/tweets/tmp/tweets-0.txt' not found
> >> > > >> > > >> > > > > > Oct 22, 2014 5:16:31 PM org.eigenbase.util.EigenbaseException <init>
> >> > > >> > > >> > > > > > SEVERE: org.eigenbase.util.EigenbaseContextException: From line 1, column
> >> > > >> > > >> > > > > > 15 to line 1, column 18: Table
> >> > > >> > > >> > > > > > 'hdfs./user/train/xd/tweets/tmp/tweets-0.txt' not found
> >> > > >> > > >> > > > > > Query failed: Failure while parsing sql. Table
> >> > > >> > > >> > > > > > 'hdfs./user/train/xd/tweets/tmp/tweets-0.txt' not found
> >> > > >> > > >> > > > > > [7e1d5c73-0521-480e-b74b-a4fa50e3f4a7]
> >> > > >> > > >> > > > > > Error: exception while executing query: Failure while trying to get next
> >> > > >> > > >> > > > > > result batch. (state=,code=0)
> >> > > >> > > >> > > > > >
> >> > > >> > > >> > > > > > I created a new plugin called hdfs.
> >> > > >> > > >> > > > > >
> >> > > >> > > >> > > > > > {
> >> > > >> > > >> > > > > >   "type": "file",
> >> > > >> > > >> > > > > >   "enabled": true,
> >> > > >> > > >> > > > > >   "connection": "hdfs://10.225.156.201:9000/",
> >> > > >> > > >> > > > > >   "workspaces": {
> >> > > >> > > >> > > > > >     "json": {
> >> > > >> > > >> > > > > >       "location": "/user/train/xd/tweets/tmp",
> >> > > >> > > >> > > > > >       "writable": false,
> >> > > >> > > >> > > > > >       "storageformat": "json"
> >> > > >> > > >> > > > > >     }
> >> > > >> > > >> > > > > >   },
> >> > > >> > > >> > > > > >   "formats": {
> >> > > >> > > >> > > > > >     "json": {
> >> > > >> > > >> > > > > >       "type": "json"
> >> > > >> > > >> > > > > >     }
> >> > > >> > > >> > > > > >   }
> >> > > >> > > >> > > > > > }
> >> > > >> > > >> > > > > >
> >> > > >> > > >> > > > >
> >> > > >> > > >> > > >
> >> > > >> > > >> > >
> >> > > >> > > >> >
> >> > > >> > > >>
> >> > > >> > > >
> >> > > >> > > >
> >> > > >> > >
> >> > > >> >
> >> > > >>
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
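A closing note on the plugin definition quoted above: the workspace key "storageformat" does not appear in current Drill documentation, which instead describes a workspace option named "defaultInputFormat"; whether 0.5.0-incubating accepts either spelling is an assumption. A sketch of the same plugin written with the documented key, alongside the json format it relies on:

  {
    "type": "file",
    "enabled": true,
    "connection": "hdfs://10.225.156.201:9000/",
    "workspaces": {
      "json": {
        "location": "/user/train/xd/tweets/tmp",
        "writable": false,
        "defaultInputFormat": "json"
      }
    },
    "formats": {
      "json": {
        "type": "json"
      }
    }
  }

Note that in the thread the "Table not found" error went away only after the file was renamed to .json, so matching the file extension to a registered format (or registering a format for the .txt extension) is the first thing to get right.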
