hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Hive JSON Serde question
Date Sun, 25 Jan 2015 16:11:32 GMT
Nested lists require nested lateral views.

On Sun, Jan 25, 2015 at 11:02 AM, Sanjay Subramanian <
sanjaysubramanian@yahoo.com> wrote:

> hey guys
>
> This is the Hive table definition I have created based on the JSON
> I am using this version of hive json serde
> https://github.com/rcongiu/Hive-JSON-Serde
>
> ADD JAR
> /home/sanjay/mycode/jar/jsonserde/json-serde-1.3.1-SNAPSHOT-jar-with-dependencies.jar
> ;
> DROP TABLE IF EXISTS
>   datafeed_json
> ;
> CREATE EXTERNAL TABLE IF NOT EXISTS
>    datafeed_json (
>    object STRING,
>    entry array
>           <struct
>             <id:STRING,
>              time:BIGINT,
>              changes:array
>                <struct
>                  <field:STRING,
>                   value:struct
>                     <item:STRING,
>                      verb:STRING,
>                      parent_id:STRING,
>                      sender_id:BIGINT,
>                      created_time:BIGINT>>>>>
> ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' STORED AS TEXTFILE
> LOCATION '/data/sanjay/datafeed'
> ;
>
>
> QUERY 1
> =======
> ADD JAR
> /home/sanjay/mycode/jar/jsonserde/json-serde-1.3.1-SNAPSHOT-jar-with-dependencies.jar
> ;
>  SELECT
>     object,
>     entry[0].id,
>     entry[0].time,
>     entry[0].changes[0].field,
>     entry[0].changes[0].value.item,
>     entry[0].changes[0].value.verb,
>     entry[0].changes[0].value.parent_id,
>     entry[0].changes[0].value.sender_id,
>     entry[0].changes[0].value.created_time
>   FROM
>     datafeed_json
> ;
>
> RESULT1
> ======
> foo123  113621765320467 1418608223 leads song1 rock
> 113621765320467_1107142375968396 100004748082019 1418608223
>
>
> QUERY2
> ======
> ADD JAR
> /home/sanjay/mycode/jar/jsonserde/json-serde-1.3.1-SNAPSHOT-jar-with-dependencies.jar
> ;
>  SELECT
>     object,
>     entry.id,
>     entry.time,
>     ntry
>   FROM
>     datafeed_json
>   LATERAL VIEW EXPLODE
>     (datafeed_json.entry.changes) oc1 AS ntry
> ;
>
> RESULT2
> =======
> This gives 4 rows but I was not able to iteratively do the LATERAL VIEW
> EXPLODE
>
>
> I tried various combinations of LATERAL VIEW , LATERAL VIEW EXPLODE,
> json_tuple to extract all fields in an exploded view from the JSON in tab
> separated format but no luck.
>
> Any thoughts ?
>
>
>
> Thanks
>
> sanjay
>
>
>

Mime
View raw message