Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 5B7E417809 for ; Tue, 27 Oct 2015 19:29:22 +0000 (UTC) Received: (qmail 48958 invoked by uid 500); 27 Oct 2015 19:28:46 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 48882 invoked by uid 500); 27 Oct 2015 19:28:46 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 48871 invoked by uid 99); 27 Oct 2015 19:28:46 -0000 Received: from Unknown (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 27 Oct 2015 19:28:46 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 040BFC391B for ; Tue, 27 Oct 2015 19:28:46 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.9 X-Spam-Level: ** X-Spam-Status: No, score=2.9 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=3, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-eu-west.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id K7pFXoBz4OFG for ; Tue, 27 Oct 2015 19:28:35 +0000 (UTC) Received: from mail-io0-f171.google.com (mail-io0-f171.google.com [209.85.223.171]) by mx1-eu-west.apache.org (ASF Mail Server at mx1-eu-west.apache.org) with ESMTPS id 4EFE821436 for ; Tue, 27 Oct 2015 19:28:35 +0000 (UTC) Received: by iody8 with SMTP id y8so77393384iod.1 for ; Tue, 27 Oct 2015 12:28:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=7UiOFRuVY3ZQ+z6HTt6r4oT6vFOz1sQRsGyxdIcOncQ=; b=qHSqqaIibjmMwfnN2dm2r4LjkpeMKbhQYHfc2TUEysLJTCySgPPIP3bd9L3RlQYFY4 vU+q5djBRd1NvqHpqpNWaAuxdCrTetCPRD82TT+y/ZjdecGSwCyVhUwSC8wakLfA6FIM 0LbbmvIvriJ6oPjfCy64Yg72QrF6jA+SxojyPR+zFqelBpdh3WPamV7Lt4ocEop1SHte rroVIWJyaHhSlJUXDMjvez3dypQvbyU29zaBbcLSUyjm91UEiztZ7ARud9cnejs5yxYn klC3rEOh/CfvAwP4dQyy2RoQ2bC6pDt0QyC6p6gcufW5YXR5KjZ59+zgsAXC3HIBxgsU 3gPA== MIME-Version: 1.0 X-Received: by 10.107.153.16 with SMTP id b16mr2267415ioe.65.1445974113706; Tue, 27 Oct 2015 12:28:33 -0700 (PDT) Received: by 10.107.141.69 with HTTP; Tue, 27 Oct 2015 12:28:33 -0700 (PDT) In-Reply-To: References: Date: Tue, 27 Oct 2015 15:28:33 -0400 Message-ID: Subject: Re: Using json_tuple for Nested json Arrays From: Sam Joe To: user@hive.apache.org Content-Type: multipart/alternative; boundary=001a1140fc06676a7005231b155c --001a1140fc06676a7005231b155c Content-Type: text/plain; charset=UTF-8 I tried using EXPLODE function on the nested json array but it doesn't work and throws following error: FAILED: UDFArgumentException explode() takes an array or a map as a parameter Thanks, Joel On Tue, Oct 27, 2015 at 3:20 PM, Sam Joe wrote: > Hi, > > Is it possible to use json_tuple function to extract data from json arrays > (nested too). I am trying to process json data as string and avoid using > serdes since user data may be malformed. > > Please see a sample json data given below: > > > { > > "filter_level": "low", > > "retweeted": false, > > "in_reply_to_screen_name": null, > > "possibly_sensitive": false, > > "truncated": false, > > "lang": "en", > > "in_reply_to_status_id_str": null, > > "id": 654395184428515332, > > "extended_entities": { > > "media": [{ > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301608990388224, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301608990388224" > > }, > > { > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301608998764544, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301608998764544" > > }, > > { > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301608994586624, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301608994586624" > > }, > > { > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301609179115521, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301609179115521" > > }] > > } > > } > > > Appreciate any help! > > > *Thanks,* > > *Joel* > > > > --001a1140fc06676a7005231b155c Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
I tried using EXPLODE function on the nested json array bu= t it doesn't work and throws following error:

F= AILED: UDFArgumentException explode() takes an array or a map as a paramete= r


Thanks,
Joel

On Tue, Oc= t 27, 2015 at 3:20 PM, Sam Joe <games2013.sam@gmail.com> wrote:
Hi,

Is it possible to use json_tuple function to extract data from jso= n arrays (nested too). I am trying to process json data as string and avoid= using serdes since user data may be malformed.

Pl= ease see a sample json data given below:


{

"filter_level": "low",

"ret= weeted": false,

"in_reply_to_screen_name": null,

"possibly_sens= itive": false,

"truncated": false,

"lang": "en",=

&qu= ot;in_reply_to_status_id_str": null,

"id": 654395184428515332,

"= ;extended_entities": {

"media": [{

"sizes": {

"thum= b": {

= "w": 150,

"resize": "crop",

"= h": 150

},

"small": {

"w": 340,

"resize": = "fit",

"h": 255

},

"large": {

"w": = 1024,

<= /span>"resize": "fit",

"h": 768

},

"medi= um": {

= "w": 600,

"resize": "fit",

"= h": 450

}

},

"source_user_id": 16864598,

"media_url": &qu= ot;http://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg",

"display= _url": "pic.twitter.com/i3004WyF4g",

"type": "photo&qu= ot;,

"url": "http://t.co/i3004WyF4g",

"id": 654301608990388224,

=

"= ;media_url_https": "https://pbs.twimg.com/media/CRSL2MPWsAAOnZ= o.jpg",

"expanded_url": "http:/= /twitter.com/lordlancaster/status/654301626665189376/photo/1",

=

"= ;source_user_id_str": "16864598",

= "indices": [143,

144= ],

"source_status_id_str": "654301626665189376",

"sour= ce_status_id": 654301626665189376,

"id_str": "65430160899= 0388224"

},

{

"sizes": {

"thumb": {

"w": 150,

=

&qu= ot;resize": "crop",

"h": 150

},

"small": {=

"w": 340,

"resize": "fit",

"h": 25= 5

},

"large": {

"w": 1024,

"resize": "fit&= quot;,

= "h": 768

},

"medium": {

= "w": 600,

"= resize": "fit",

"h": 450

}

},

"source_user_id"= ;: 16864598,

"media_url": "http://pbs.twimg.com/media/CRSL2M= RWgAAGOcj.jpg",

"display_url": "pic.twitter.com/i3004WyF4g&q= uot;,

"type": "photo",

"url": "http://t.co/i3004WyF4g",

&qu= ot;id": 654301608998764544,

"media_url_https": "http= s://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg",

"expanded_url&quo= t;: "http://twitter.com/lordlancaster/status/654= 301626665189376/photo/1",

"source_user_id_str": "1686= 4598",

= "indices": [143,

144],

"source_status_id_str": &= quot;654301626665189376",

"source_status_id": 6543016266651893= 76,

"id_str": "654301608998764544"

},

{

"sizes": {

<= p class=3D"MsoNormal"> "= ;thumb": {

"w": 150,

"resize": "crop",=

"h": 150

},

"small": {

"w": 340,

"resiz= e": "fit",

"h": 255

},

"large": {

"w&= quot;: 1024,

"resize": "fit",

"h": 768

},

<= p class=3D"MsoNormal"> "= ;medium": {

"w": 600,

"resize": "fit",<= /p>

= "h": 450

}

},

"source_user_id": 16864598,

"media_url"= ;: "http://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg",

"d= isplay_url": "pic.twitter.com/i3004WyF4g",

"type": "ph= oto",

= "url": "http://t.co/i3004WyF4g",

"id": 654301608994586= 624,

"media_url_https": "https://pbs.twimg.com/media/CRSL2M= QWwAAP4Qo.jpg",

"expanded_url": "http://twitter.com/lordlancaster/status/654301626665189376/photo/1&qu= ot;,

"source_user_id_str": "16864598",

"indices":= [143,

144],

= "source_status_id_str": "654301626665189376",<= /p>

&q= uot;source_status_id": 654301626665189376,

<= span style=3D"white-space:pre-wrap"> "id_str": "654= 301608994586624"

},

{

"sizes": {

"thumb": {

"w": = 150,

"resize": "crop",

"h": 150

},

"smal= l": {

= "w": 340,

"resize": "fit",

"h= ": 255

= },

"large": {

"w": 1024,

"resize": = "fit",

"h": 768

},

"medium": {

"w":= 600,

<= /span>"resize": "fit",

"h": 450

}

},

"sourc= e_user_id": 16864598,

"media_url": "http://pbs.twimg.co= m/media/CRSL2M8WcAEXowZ.jpg",

"display_url": "pic.twitter.com/i3= 004WyF4g",

"type": "photo",

"url": "= ;http://t.co/i3004WyF4= g",

"id": 654301609179115521,

"media_url_https": &= quot;https://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg",

"expa= nded_url": "http://twitter.com/lordlancaste= r/status/654301626665189376/photo/1",

"source_user_id_str"= : "16864598",

"indices": [143,

= 144],

"source_status_id_s= tr": "654301626665189376",

"source_status_id": 65430= 1626665189376,

"id_str": "654301609179115521"

}]

}

}


A= ppreciate any help!


Thanks,

Joel




--001a1140fc06676a7005231b155c--