hive-user mailing list archives

From Ajay Tirpude <tirpudeaj...@gmail.com>
Subject Re: Nested JSON Parsing
Date Sun, 13 Nov 2016 06:52:43 GMT
Hi Dudu,

I want to parse my JSON file and produce the CSV output that I pasted in
the output section. I can currently do this in bash with the jq command,
but that does not scale to JSON files that are terabytes in size, so I am
looking for a solution in Pig or Hive.

Regards,
Ajay T

On Sun, Nov 13, 2016 at 12:10 PM, Markovitz, Dudu <dmarkovitz@paypal.com>
wrote:

> And your issue/question is?
>
>
>
> *From:* Ajay Tirpude [mailto:tirpudeajay1@gmail.com]
> *Sent:* Sunday, November 13, 2016 4:46 AM
> *To:* user@hive.apache.org
> *Subject:* Nested JSON Parsing
>
>
>
> Dear All,
>
>
>
> I am trying to parse the JSON file given below and convert it into a CSV
> file.
>
>
>
> {
>   "devicetype": "SmartPhone",
>   "uuid": "sg76fdhh7gfxhxfhgxf67x",
>   "ts": {
>     "date": "2016-03-23T10:58:34.660Z"
>   },
>   "events": [
>     {
>       "timestamp": "2016-03-23T10:58:37Z",
>       "evt": "first",
>       "ad": "v6v75v88n98778mn",
>       "tkey": "ngbbc76fbc6fb6fb66fb6",
>       "mtp": "Wed Mar 23 2016 19:04:22 GMT 0800 (PHT)",
>       "eventid": "eytuy"
>     },
>     {
>       "timestamp": "2016-03-23T10:58:35Z",
>       "evt": "second",
>       "ad": "v6v75v88n98778mn",
>       "tkey": "ngbbc76fbc6fb6fb66fb6"
>     },
>     {
>       "timestamp": "2016-03-23T10:58:36Z",
>       "evt": "third",
>       "ad": "v6v75v88n98778mn",
>       "tkey": "ngbbc76fbc6fb6fb66fb6"
>     }
>   ],
>   "adid": "v6v75v88n98778mn",
>   "ad_tz": {
>     "date": "2016-03-23T10:58:34.660Z"
>   },
>   "ua": "Mozilla/5.0 (Linux; U; Android 4.3; en-gb; SM-N9005 Build/JSS15J) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
> }
>
>
>
> There are a few conditions that I need to apply when parsing:
>
>
>
> 1. I want all the fields except timestamp inside the nested events key.
>
> 2. I want to loop over the events key and emit one output row per evt. The
> input above has three evts, but the number is not fixed in the actual input
> files. There can be any number of evts, not just 3.
>
> 3. Not every evt block is the same. Each evt block can have different extra
> fields, but we need to extract every key. If a key is missing from an evt,
> its value should be blank for that evt. For example, evt "first" has two
> extra key:value pairs, eventid and mtp, and these values should be blank
> for the other evts. Similarly, other evts can have their own key:value
> pairs, and those should be blank in every evt that lacks them.
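
The three conditions above can be sketched for a single record as follows. This is only a minimal Python illustration of the intended row logic, not a Hive or Pig solution. The function name `flatten_record` is hypothetical, the `ua` value is shortened for readability, and the union of event keys is computed per record (a whole-file union would need a separate pass over the data).

```python
def flatten_record(record, drop=("timestamp",)):
    """Return one flat dict per entry in record["events"]."""
    events = record.get("events", [])

    # Rules 1 and 3: union of all event keys in first-seen order, minus "timestamp".
    event_keys = []
    for event in events:
        for key in event:
            if key not in drop and key not in event_keys:
                event_keys.append(key)

    rows = []
    for event in events:  # Rule 2: works for any number of events.
        row = {}
        for key, value in record.items():
            if key == "events":
                # Rule 3: a key missing from this event becomes a blank value.
                for event_key in event_keys:
                    row[f"events.{event_key}"] = event.get(event_key, "")
            elif isinstance(value, dict):
                # One-level nested objects become "parent.child" columns.
                for sub_key, sub_value in value.items():
                    row[f"{key}.{sub_key}"] = sub_value
            else:
                row[key] = value
        rows.append(row)
    return rows


# Sample record taken from the message above ("ua" shortened).
sample = {
    "devicetype": "SmartPhone",
    "uuid": "sg76fdhh7gfxhxfhgxf67x",
    "ts": {"date": "2016-03-23T10:58:34.660Z"},
    "events": [
        {"timestamp": "2016-03-23T10:58:37Z", "evt": "first",
         "ad": "v6v75v88n98778mn", "tkey": "ngbbc76fbc6fb6fb66fb6",
         "mtp": "Wed Mar 23 2016 19:04:22 GMT 0800 (PHT)", "eventid": "eytuy"},
        {"timestamp": "2016-03-23T10:58:35Z", "evt": "second",
         "ad": "v6v75v88n98778mn", "tkey": "ngbbc76fbc6fb6fb66fb6"},
        {"timestamp": "2016-03-23T10:58:36Z", "evt": "third",
         "ad": "v6v75v88n98778mn", "tkey": "ngbbc76fbc6fb6fb66fb6"},
    ],
    "adid": "v6v75v88n98778mn",
    "ad_tz": {"date": "2016-03-23T10:58:34.660Z"},
    "ua": "Mozilla/5.0",
}

rows = flatten_record(sample)
for row in rows:
    print(row)
```

Each input record yields as many rows as it has events, and the top-level fields are repeated on every row.
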
>
>
>
> Finally, I want the output to look like this:
>
>
>
> devicetype | uuid | ts.date | events.evt | events.ad | events.tkey | events.mtp | events.eventid | adid | ad_tz.date | ua
> SmartPhone | sg76fdhh7gfxhxfhgxf67x | 2016-03-23T10:58:34.660Z | first | v6v75v88n98778mn | ngbbc76fbc6fb6fb66fb6 | Wed Mar 23 2016 19:04:22 GMT 0800 (PHT) | eytuy | v6v75v88n98778mn | 2016-03-23T10:58:34.660Z | Mozilla/5.0 (Linux; U; Android 4.3; en-gb; SM-N9005 Build/JSS15J) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30
> SmartPhone | sg76fdhh7gfxhxfhgxf67x | 2016-03-23T10:58:34.660Z | second | v6v75v88n98778mn | ngbbc76fbc6fb6fb66fb6 | | | v6v75v88n98778mn | 2016-03-23T10:58:34.660Z | Mozilla/5.0 (Linux; U; Android 4.3; en-gb; SM-N9005 Build/JSS15J) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30
> SmartPhone | sg76fdhh7gfxhxfhgxf67x | 2016-03-23T10:58:34.660Z | third | v6v75v88n98778mn | ngbbc76fbc6fb6fb66fb6 | | | v6v75v88n98778mn | 2016-03-23T10:58:34.660Z | Mozilla/5.0 (Linux; U; Android 4.3; en-gb; SM-N9005 Build/JSS15J) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30
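
Writing rows of this shape as CSV could look roughly like the sketch below, which uses Python's standard `csv` module. The two `rows` and the `fieldnames` list are a hypothetical, cut-down subset of the columns above, with a shortened `ua` value. It is meant only to show that missing keys come out as empty fields and that a value containing a comma (as the `ua` string does) is quoted.

```python
import csv
import io

# Hypothetical, already-flattened rows; the second one lacks "events.eventid".
rows = [
    {"devicetype": "SmartPhone", "events.evt": "first",
     "events.eventid": "eytuy", "ua": "Mozilla/5.0 (KHTML, like Gecko)"},
    {"devicetype": "SmartPhone", "events.evt": "second",
     "ua": "Mozilla/5.0 (KHTML, like Gecko)"},
]
fieldnames = ["devicetype", "events.evt", "events.eventid", "ua"]

buffer = io.StringIO()
# restval="" fills in a blank for any key that a row does not have.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="",
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```
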
>
>
>
> Regards,
>
> Ajay T
>
