hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Harris <Ryan.Har...@zionsbancorp.com>
Subject RE: Using json_tuple for Nested json Arrays
Date Tue, 27 Oct 2015 19:37:49 GMT
Do you have an example of the query that you tried (which failed).
In short, you probably want to use the get_json_object() UDF:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object

if you need the JSON array broken into individual records, you might require lateral view
explode as in these examples:
http://mechanics.flite.com/blog/2014/04/16/using-explode-and-lateral-view-in-hive/
http://stackoverflow.com/questions/28716165/how-to-query-struct-array-with-hive-get-json-object


From: Sam Joe [mailto:games2013.sam@gmail.com]
Sent: Tuesday, October 27, 2015 1:29 PM
To: user@hive.apache.org
Subject: Re: Using json_tuple for Nested json Arrays

I tried using EXPLODE function on the nested json array but it doesn't work and throws following
error:

FAILED: UDFArgumentException explode() takes an array or a map as a parameter


Thanks,
Joel

On Tue, Oct 27, 2015 at 3:20 PM, Sam Joe <games2013.sam@gmail.com<mailto:games2013.sam@gmail.com>>
wrote:
Hi,

Is it possible to use json_tuple function to extract data from json arrays (nested too). I
am trying to process json data as string and avoid using serdes since user data may be malformed.

Please see a sample json data given below:


{
"filter_level": "low",
"retweeted": false,
"in_reply_to_screen_name": null,
"possibly_sensitive": false,
"truncated": false,
"lang": "en",
"in_reply_to_status_id_str": null,
"id": 654395184428515332,
"extended_entities": {
"media": [{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg",
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g",
"id": 654301608990388224,
"media_url_https": "https://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg",
"expanded_url": "http://twitter.com/lordlancaster/status/654301626665189376/photo/1",
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301608990388224"
},
{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg",
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g",
"id": 654301608998764544,
"media_url_https": "https://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg",
"expanded_url": "http://twitter.com/lordlancaster/status/654301626665189376/photo/1",
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301608998764544"
},
{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg",
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g",
"id": 654301608994586624,
"media_url_https": "https://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg",
"expanded_url": "http://twitter.com/lordlancaster/status/654301626665189376/photo/1",
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301608994586624"
},
{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg",
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g",
"id": 654301609179115521,
"media_url_https": "https://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg",
"expanded_url": "http://twitter.com/lordlancaster/status/654301626665189376/photo/1",
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301609179115521"
}]
}
}

Appreciate any help!

Thanks,
Joel




======================================================================
THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL and may contain
information that is privileged and exempt from disclosure under applicable law. If you are
neither the intended recipient nor responsible for delivering the message to the intended
recipient, please note that any dissemination, distribution, copying or the taking of any
action in reliance upon the message is strictly prohibited. If you have received this communication
in error, please notify the sender immediately.  Thank you.
Mime
View raw message