hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Connell, Chuck" <Chuck.Conn...@nuance.com>
Subject RE: hive json serde
Date Tue, 18 Sep 2012 12:47:51 GMT
OK, but keep in mind that you could use get_json_object(whole-json-field, $.C.D). If using
a serde, then you can say just C.D directly.

Chuck Connell
Nuance R&D Data Team
Burlington, MA


From: Mark Golden [mailto:mark.golden1744@gmail.com]
Sent: Monday, September 17, 2012 7:31 PM
To: Connell, Chuck
Cc: user@hive.apache.org
Subject: Re: hive json serde

Thanks Chuck.

I tried both json_tuple & get_json_objects on top of the serde table and it works too.
I have nested jsons and some of the keys do not have a specific schema (You can throw whatever
you want there) so if I want to query from those keys then I have to use get_json_object on
top of the serde table. If I have the following json

{ "A" : "I love hive", "B" : "User mailing list is great", "C" : { "D" : "chuck", "E" : "is
awesome and thanks for your help" } }
{ "A" : "I love hive", "B" : "User mailing list is great", "C" : { "G" : "It will rain in
NY tonight" } }
{ "A" : "I love hive", "B" : "User mailing list is great", "C" : { "H" : "No it will snow
in NY" } }

If I create a table on top of json like this then the serde will work, json_tuple & get_json_object
will also work but to query the keys inside "C" I need to do get_json_object on "C" something
like

select *
from my_serde_table
where get_json_object(C, "$.D") ilike "chuck"

I like serde table because it is much cleaner than create a table with (value string) and
then doing get_json_object or json_tuple and extract all the columns out.

I'm exposing this table to other users in my company so I want to have the serde table. Thanks
for your help.
On Mon, Sep 17, 2012 at 3:49 PM, Connell, Chuck <Chuck.Connell@nuance.com<mailto:Chuck.Connell@nuance.com>>
wrote:
A few things that should help...

- There are three ways to handle JSON within a Hive table, get_json_object, json_tuple, and
JSON SerDe. These are separate mechanisms. So if you want to use get_json_object, you don't
need to load a SerDe.

- With a SerDe, Hive understands JSON fields natively, so you can just use the plain field
names, no special function call at all.

- I'm not sure if auxpath accepts a relative path, but I know the full path works.

Chuck


________________________________
From: Mark Golden [mark.golden1744@gmail.com<mailto:mark.golden1744@gmail.com>]
Sent: Monday, September 17, 2012 5:58 PM
To: Connell, Chuck
Cc: user@hive.apache.org<mailto:user@hive.apache.org>

Subject: Re: hive json serde

It works now. Looks like there is a bug in the code.

if you do hive --auxpath ./serde then I get an error but if I get the full path as
hive --auxpath /var/lib/hdfs/serde/ then get_json_object() works.

Thanks for your help. Appreciate it.

CCing hive user group as well.

On Mon, Sep 17, 2012 at 2:54 PM, Mark Golden <mark.golden1744@gmail.com<mailto:mark.golden1744@gmail.com>>
wrote:
:-) Thanks. I downloaded and got it compiled as well.

My table has nested JSON values so when I tried to do get_json_object(<column>,'$.<key>')
this serde doesn't work in that case. It throws an error. I'll email the developer too.

When you tried did you try this on a cluster of nodes or on a single instance?

On Mon, Sep 17, 2012 at 2:38 PM, Connell, Chuck <Chuck.Connell@nuance.com<mailto:Chuck.Connell@nuance.com>>
wrote:
Go to the Downloads link on the project page, near the upper right.

________________________________
From: Mark Golden [mark.golden1744@gmail.com<mailto:mark.golden1744@gmail.com>]
Sent: Monday, September 17, 2012 5:32 PM
To: Connell, Chuck
Subject: Re: hive json serde
Hey Chuck,

Where did you download the prebuild jar from?
On Mon, Sep 17, 2012 at 12:10 PM, Connell, Chuck <Chuck.Connell@nuance.com<mailto:Chuck.Connell@nuance.com>>
wrote:
I used his pre-built jar. No need to compile anything.

Be sure to add " --auxpath /path/to/jar/dir " to the Hive command line.

Chuck



From: Connell, Chuck [mailto:Chuck.Connell@nuance.com<mailto:Chuck.Connell@nuance.com>]
Sent: Monday, September 17, 2012 3:06 PM
To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: RE: hive json serde

I just finished testing this one. No problems found. The developer is also quite responsive
to issues raised.  I encouraged him to submit it to the Hive dev team as core code.

https://github.com/rcongiu/Hive-JSON-Serde/

Chuck Connell
Nuance R&D Data Team
Burlington, MA


From: Mark Golden [mailto:mark.golden1744@gmail.com]
Sent: Monday, September 17, 2012 2:56 PM
To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: hive json serde

Hi,

I'm looking for a hive json serde that works well. Any suggestions? I did a quick google and
found a few but none of them are complete.

Thanks,
Mark.





Mime
View raw message