hive-dev mailing list archives

From "Jonathan Natkins (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-3333) Specified SerDe does not get used when executing a query over JSON data
Date Fri, 03 Aug 2012 20:17:02 GMT

     [ https://issues.apache.org/jira/browse/HIVE-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Natkins updated HIVE-3333:
-----------------------------------

    Attachment: hive-test-case.tar.gz

Attaching a small test case. The paths in hive.sql will have to be modified slightly, but
it should be an easy reproducer.
                
> Specified SerDe does not get used when executing a query over JSON data
> -----------------------------------------------------------------------
>
>                 Key: HIVE-3333
>                 URL: https://issues.apache.org/jira/browse/HIVE-3333
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Jonathan Natkins
>         Attachments: hive-test-case.tar.gz
>
>
> I found a JSON SerDe that I wanted to try out, and I ran into some issues attempting to use it. The script I was executing looks like this:
> ADD JAR /home/natty/hive-test-case/hive-json-serde-0.2.jar;
> CREATE TABLE bar (
>   id INT,
>   integers ARRAY<INT>,
>   datum STRING
> ) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde';
> LOAD DATA LOCAL INPATH '/home/natty/sample_data/json.sample' OVERWRITE INTO TABLE bar;
> SELECT * FROM bar;
> The data I loaded in looks like this:
> { "id": 1, "integers": [ 1, 2, 3 ], "datum": "hello" },
> When the "SELECT * FROM bar" query executes, it returns with a failure:
> hive> ADD JAR /home/natty/hive-test-case/hive-json-serde-0.2.jar;
> Added /home/natty/hive-test-case/hive-json-serde-0.2.jar to class path
> Added resource: /home/natty/hive-test-case/hive-json-serde-0.2.jar
> hive> SELECT * FROM bar;
> OK
> Failed with exception java.io.IOException:java.lang.ClassCastException: org.json.JSONArray cannot be cast to [Ljava.lang.Object;
> Time taken: 2.335 seconds
> Now, this alone doesn't bother me. What bothers me is that, if I look at the log file, I see the following exception:
> 2012-08-03 13:12:11,407 ERROR CliDriver (SessionState.java:printError(380)) - Failed with exception java.io.IOException:java.lang.ClassCastException: org.json.JSONArray cannot be cast to [Ljava.lang.Object;
> java.io.IOException: java.lang.ClassCastException: org.json.JSONArray cannot be cast to [Ljava.lang.Object;
> 	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:173)
> 	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1383)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:266)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Caused by: java.lang.ClassCastException: org.json.JSONArray cannot be cast to [Ljava.lang.Object;
> 	at org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getList(StandardListObjectInspector.java:98)
> 	at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:287)
> 	at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:213)
> 	at org.apache.hadoop.hive.serde2.DelimitedJSONSerDe.serializeField(DelimitedJSONSerDe.java:59)
> 	at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365)
> 	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:163)
> 	... 11 more
> Note that this exception indicates that Hive is executing code for the DelimitedJSONSerDe, rather than the one that I specified (JsonSerde from the jar file). Seems incorrect.
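
For readers following the trace, here is a minimal Java sketch of how a cast failure of this shape can arise. It rests on an assumption suggested by the "cannot be cast to [Ljava.lang.Object;" message: that the list inspector accepts a java.util.List as-is and otherwise casts the value to Object[]. The getList method below is an illustration of that assumption, not code copied from Hive, and FakeJsonArray is a hypothetical stand-in for org.json.JSONArray so the example needs no external jar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListCastSketch {

    /** Hypothetical stand-in for org.json.JSONArray: list-like, but neither a List nor an Object[]. */
    static final class FakeJsonArray {
        private final List<Object> items = new ArrayList<>();

        FakeJsonArray add(Object item) {
            items.add(item);
            return this;
        }
    }

    /**
     * Assumed behavior of a "standard" list inspector: return a List as-is,
     * otherwise treat the value as Object[]. Illustration only, not Hive source.
     */
    static List<?> getList(Object data) {
        if (data == null) {
            return null;
        }
        if (data instanceof List) {
            return (List<?>) data;
        }
        // Any other runtime type fails here with ClassCastException.
        return Arrays.asList((Object[]) data);
    }

    /** Returns the element count (0 for null), or -1 if the value cannot be viewed as a list. */
    static int sizeOrMinusOne(Object data) {
        try {
            List<?> list = getList(data);
            return list == null ? 0 : list.size();
        } catch (ClassCastException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // An Object[] and a List are both accepted.
        System.out.println(sizeOrMinusOne(new Object[] {1, 2, 3}));
        System.out.println(sizeOrMinusOne(Arrays.asList(1, 2, 3)));
        // A foreign array type is rejected by the Object[] cast.
        System.out.println(sizeOrMinusOne(new FakeJsonArray().add(1).add(2).add(3)));
    }
}
```

Under that assumption, any SerDe whose deserialized ARRAY values are neither a List nor an Object[] would fail at the same cast, whichever class later serializes the row for display.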

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
