Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 076C99526 for ; Mon, 2 Apr 2012 01:38:18 +0000 (UTC) Received: (qmail 89356 invoked by uid 500); 2 Apr 2012 01:38:18 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 89311 invoked by uid 500); 2 Apr 2012 01:38:17 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 89303 invoked by uid 99); 2 Apr 2012 01:38:17 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 02 Apr 2012 01:38:17 +0000 X-ASF-Spam-Status: No, hits=-2.8 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_HI,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of Anurag.Gulati@aexp.com designates 12.10.219.25 as permitted sender) Received: from [12.10.219.25] (HELO lppiu581.aexp.com) (12.10.219.25) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 02 Apr 2012 01:38:11 +0000 X-WSS-ID: 0M1TWJ1-0D-FLM-02 X-Outbound-AXP: PASSED X-M-MSG: Received: from lppiu579.aexp.com (lppim502.ipc.us.aexp.com [148.173.210.207]) by lppiu581.aexp.com (Axway MailGate 5.1.0) with ESMTP id 2AC6F25001E8 for ; Sun, 1 Apr 2012 18:37:49 -0700 (MST) X-M-MSG: X-WSS-ID: 0M1TWJ1-0C-OGL-04 X-Outbound-AXP: PASSED X-Disclaimer: AMEX Disclaimer Received: from mail.aexp.com (unknown [148.171.244.6]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by lppiu579.aexp.com (Axway MailGate 5.1.0) with ESMTP id 22B505551B2F for ; Sun, 1 Apr 2012 18:37:48 -0700 (MST) Received: from 024-SN1MPN1-012.024d.mgd.msft.net ([169.254.2.2]) by 024-SN1MMR1-006.024d.mgd.msft.net ([148.171.244.6]) with mapi id 14.01.0355.003; Sun, 1 Apr 2012 20:37:48 -0500 From: "Anurag Gulati" To: "user@hive.apache.org" Subject: Question - Nested JSON using Hive Thread-Topic: Question - Nested JSON using Hive Thread-Index: Ac0QcTQi5afpNyTWTBipCIxCdUf/eA== Date: Mon, 2 Apr 2012 01:37:47 +0000 Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.79.193.221] Content-Type: multipart/alternative; boundary=_000_DC0D0310B270F94D8A472D21CAE8B794A9F0EE024SN1MPN1012024d_ MIME-Version: 1.0 X-DLP-Flow: Processed X-Virus-Checked: Checked by ClamAV on apache.org --_000_DC0D0310B270F94D8A472D21CAE8B794A9F0EE024SN1MPN1012024d_ Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable I've been trying to figure this out for a couple days now and I haven't = gotten very far. Looking for your guidance on the matter. As a test, I'm trying to import Facebook Open Graph API data into Hive but = = am having a problem with the syntax. Here is a line of sample data I'm trying to import (my own personal data): {"id":"10011666","name":"Test = user","first_name":"Test","last_name":"user","link":"http:\/\/www.facebook.= com\/test.user","username":"test.user","birthday":"09\/19\/1983","hometown"= :{"id":"103102203064024","name":"West= Chester, Pennsylvania"},"location":{"id":"","name":null},"bio":"This is my= = Bio. I'm a geek that love to hack (in a good way)","quotes":"I like quotes.= = But I'm shortening this section cuz it was = wild!","work":[{"employer":{"id":"6185812851","name":"American = Eagle"},"location":{"id":"105540216147364","name":"Phoenix, = Arizona"},"position":{"id":"133619273341785","name":"Counter = Guy"},"start_date":"2012-01"},{"employer":{"id":"190876464341724","name":"C= ardiac= group"},"position":{"id":"105630109469647","name":"Executive = Producer"},"description":"We create music for Artist Placement and = TV\/Film.","start_date":"2002-01"},{"employer":{"id":"6185812851","name":"A= merican= Eagle"},"location":{"id":"105540216147364","name":"Phoenix, = Arizona"},"position":{"id":"116439401740213","name":"Floor = Guy"},"start_date":"2007-10","end_date":"2012-01"},{"employer":{"id":"11006= 7355684846","name":"Saint= Joseph Hospital"},"location":{"id":"105540216147364","name":"Phoenix, = Arizona"},"position":{"id":"202489236428627","name":"Pharmacy IT = Coordinator"},"start_date":"2005-10","end_date":"2007-10"},{"employer":{"id= ":"110067355684846","name":"Saint= Joseph Hospital"},"location":{"id":"105540216147364","name":"Phoenix, = Arizona"},"position":{"id":"144703015548786","name":"Pharmacy = Tech"},"start_date":"2001-02","end_date":"2005-10"}],"sports":[{"id":"10860= 6435830479","name":"Karate"}],"favorite_teams":[{"id":"87169796810","name":= "Philadelphia= Flyers"},{"id":"93625750491","name":"Philadelphia = Phillies"},{"id":"45898408995","name":"Phoenix = Suns"},{"id":"120163518021430","name":"Philadelphia = Eagles"}],"favorite_athletes":[{"id":"77922840249","name":"Steve = Nash"},{"id":"105590659475179","name":"Wayne = Gretzky"},{"id":"62975399193","name":"Michael = Jordan"}],"inspirational_people":[{"id":"106676942701904","name":"Gandhi"}]= ,"education":[{"school":{"id":"109324275761313","name":"Corona= del Sol High School"},"type":"High = School"},{"school":{"id":"23680344606","name":"Arizona State = University"},"type":"College"}],"gender":"male","interested_in":["female"],= "relationship_status":"Single","religion":"Hinduism= (One with all things)","political":"Liberal (Left of = Center)","email":"app+22c90gj.9hh9d.f7304b58ac646e08b5f0f10a73547e34\u0040p= roxymail.facebook.com","website":"www.slashdot.org\r\nwww.gizmodo.com","tim= ezone":-7,"locale":"en_US","languages":[{"id":"106059522759137","name":"Eng= lish"},{"id":"112969428713061","name":"Hindi"}],"verified":true,"updated_ti= me":"2012-03-22T17:24:25+0000"} Here is my HIVE Table Code: ADD JAR JARs/json-serde-1.1-jar-with-dependencies.jar; ADD JAR JARs/json-path-0.5.4.jar; ADD JAR JARs/json-smart-1.0.6.3.jar; CREATE TABLE USERS ( id STRING, name STRING, =66irst_name STRING, last_name STRING, link STRING, username STRING, birthday STRING, hometown MAP, location MAP, bio STRING, quotes STRING, work ARRAY< MAP, MAP, MAP, start_date STRING, end_date STRING >, education ARRAY< MAP, STRING >, gender STRING, interested_in ARRAY< STRING >, relationship_status STRING, religion STRING, political STRING, email STRING, website STRING, timezone INT, locale STRING, language ARRAY< STRING, STRING >, verified STRING, updated_time STRING ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/export/home/me/json_data.txt' OVERWRITE INTO TABLE= = USERS; When I run this code I get the error: =46AILED: Parse Error: line 10:0 mismatched input 'location' expecting = Identifier near ',' in column specification Can you please tell me what I'm doing wrong and how to fix it=3F I appreciate it!!! Regards, Anurag G. American Express made the following annotations on Sun Apr 01 2012 18:37:49= =20 ***************************************************************************= ***= =20 "This message and any attachments are solely for the intended recipient and= = may contain confidential or privileged information. If you are not the = intended recipient, any disclosure, copying, use, or distribution of the = information included in this message and any attachments is prohibited. If = = you have received this communication in error, please notify us by reply = e-mail and immediately and permanently delete this message and any = attachments. Thank you."=20 American Express a ajout=E9 le commentaire suivant le Sun Apr 01 2012 = 18:37:49=20 Ce courrier et toute pi=E8ce jointe qu'il contient sont r=E9serv=E9s au seu= l= destinataire indiqu=E9 et peuvent renfermer des renseignements = confidentiels et privil=E9gi=E9s. Si vous n'=EAtes pas le destinataire = pr=E9vu, toute divulgation, duplication, utilisation ou distribution du = courrier ou de toute pi=E8ce jointe est interdite. Si vous avez re=E7u cett= e= communication par erreur, veuillez nous en aviser par courrier et = d=E9truire imm=E9diatement le courrier et les pi=E8ces jointes. Merci.=20 ***************************************************************************= ***= =20 ---------------------------------------------------------------------------= ---- --_000_DC0D0310B270F94D8A472D21CAE8B794A9F0EE024SN1MPN1012024d_ Content-Type: text/html; charset=us-ascii Content-Transfer-Encoding: quoted-printable

I’ve been trying to figure this out for a = couple days now and I haven’t gotten very far.

Looking for your guidance on the = matter.

 

As a test, I’m trying to import Facebook Open = = Graph API data into Hive but am having a problem with the = syntax.

 

Here is a line of sample data I’m trying to = import (my own personal data):

 

{"id":"10011666","name":&= quot;Test= = user","first_name":"Test","last_name":&q= uot;user","link":"http:\/\/www.facebook.com\/test.user&= quot;,"username":"test.user","birthday":"= ;09\/19\/1983","hometown":{"id":"103102203064= 024","name":"West= Chester, = Pennsylvania"},"location":{"id":"","= ;name":null},"bio":"This is my Bio. I’m a geek that love to hack (in a good = way)","quotes":"I like quotes. But I’m shortening= = this section cuz it was = wild!","work":[{"employer":{"id":"6= 185812851","name":"American= = Eagle"},"location":{"id":"105540216147364&quo= t;,"name":"Phoenix,= = Arizona"},"position":{"id":"133619273341785&q= uot;,"name":"Counter = Guy"},"start_date":"2012-01"},{"employer"= ;:{"id":"190876464341724","name":"Cardia= c= = group"},"position":{"id":"105630109469647&quo= t;,"name":"Executive= Producer"},"description":"We create music for Artist = = Placement and = TV\/Film.","start_date":"2002-01"},{"employer= ":{"id":"6185812851","name":"Americ= an = Eagle"},"location":{"id":"105540216147364&quo= t;,"name":"Phoenix,= = Arizona"},"position":{"id":"116439401740213&q= uot;,"name":"Floor= = Guy"},"start_date":"2007-10","end_date":= "2012-01"},{"employer":{"id":"1100673556= 84846","name":"Saint= Joseph = Hospital"},"location":{"id":"105540216147364&= quot;,"name":"Phoenix, = Arizona"},"position":{"id":"202489236428627&q= uot;,"name":"Pharmacy= IT = Coordinator"},"start_date":"2005-10","end_dat= e":"2007-10"},{"employer":{"id":"11= 0067355684846","name":"Saint= Joseph = Hospital"},"location":{"id":"105540216147364&= quot;,"name":"Phoenix,= = Arizona"},"position":{"id":"144703015548786&q= uot;,"name":"Pharmacy = Tech"},"start_date":"2001-02","end_date"= :"2005-10"}],"sports":[{"id":"1086064358= 30479","name":"Karate"}],"favorite_teams"= ;:[{"id":"87169796810","name":"Philadelp= hia= = =46lyers"},{"id":"93625750491","name":&q= uot;Philadelphia= = Phillies"},{"id":"45898408995","name":&q= uot;Phoenix = Suns"},{"id":"120163518021430","name":&q= uot;Philadelphia= = Eagles"}],"favorite_athletes":[{"id":"7792284= 0249","name":"Steve= = Nash"},{"id":"105590659475179","name":&q= uot;Wayne= = Gretzky"},{"id":"62975399193","name":&qu= ot;Michael= = Jordan"}],"inspirational_people":[{"id":"1066= 76942701904","name":"Gandhi"}],"education&quo= t;:[{"school":{"id":"109324275761313","n= ame":"Corona del Sol High School"},"type":"High = School"},{"school":{"id":"23680344606",&= quot;name":"Arizona= State = University"},"type":"College"}],"gender"= :"male","interested_in":["female"],"rela= tionship_status":"Single","religion":"Hinduis= m= (One with all things)","political":"Liberal (Left of = Center)","email":"app+22c90gj.9hh9d.f7304b58ac646e0= 8b5f0f10a73547e34\u0040proxymail.facebook.com","website":&qu= ot;www.slashdot.org\r\nwww.gizmodo.com","timezone":-7,"= locale":"en_US","languages":[{"id":"= ;106059522759137","name":"English"},{"id"= ;:"112969428713061","name":"Hindi"}],"ve= rified":true,"updated_time":"2012-03-22T17:24:25+00= 00"}

 

Here is my HIVE Table Code:

 

ADD JAR = JARs/json-serde-1.1-jar-with-dependencies.jar;

ADD JAR JARs/json-path-0.5.4.jar;

ADD JAR JARs/json-smart-1.0.6.3.jar;

 

CREATE TABLE USERS (

id STRING,

name STRING,

first_name STRING,

last_name STRING,

link STRING,

username STRING,

birthday STRING,

hometown MAP<STRING,STRING>,

location MAP<id STRING,name = STRING>,

bio STRING,

quotes STRING,

work ARRAY<

MAP<STRING,STRING>,

MAP<STRING,STRING>,

MAP<STRING,STRING>,

start_date STRING,

end­_date STRING

>,

education ARRAY<

MAP<STRING,STRING>,

STRING

>,

gender STRING,

interested_in ARRAY<

STRING

>,

relationship_status STRING,

religion STRING,

political STRING,

email STRING,

website STRING,

timezone INT,

locale STRING,

language ARRAY<

STRING,

STRING

>,

verified STRING,

updated_time STRING

)

ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe= '= STORED AS TEXTFILE;

 

LOAD DATA LOCAL INPATH = '/export/home/me/json_data.txt' OVERWRITE INTO TABLE USERS;

 

 

When I run this code I get the error:

FAILED: Parse Error: line 10:0 mismatched input = 'location' expecting Identifier near ',' in column = specification

 

Can you please tell me what I’m doing wrong an= d= how to fix it=3F 

 

I appreciate it!!!

 

 

Regards,

 

Anurag G.

American Express made the following annotations on Sun Apr 01 2012= = 18:37:49
***************************************************************************= ***=
"This message and any attachments are solely for the intended recipient and= = may contain confidential or privileged information. If you are not the = intended recipient, any disclosure, copying, use, or distribution of the = information included in this message and any attachments is prohibited. If = = you have received this communication in error, please notify us by reply = e-mail and immediately and permanently delete this message and any = attachments. Thank you."
American Express a ajout=E9 le commentaire suivant le Sun Apr 01 2012 = 18:37:49
Ce courrier et toute pi=E8ce jointe qu'il contient sont r=E9serv=E9s au seu= l= destinataire indiqu=E9 et peuvent renfermer des renseignements = confidentiels et privil=E9gi=E9s. Si vous n'=EAtes pas le destinataire = pr=E9vu, toute divulgation, duplication, utilisation ou distribution du = courrier ou de toute pi=E8ce jointe est interdite. Si vous avez re=E7u cett= e= communication par erreur, veuillez nous en aviser par courrier et = d=E9truire imm=E9diatement le courrier et les pi=E8ces jointes. Merci.
***************************************************************************= ***


--------------------------------------------------------------= -----------------

--_000_DC0D0310B270F94D8A472D21CAE8B794A9F0EE024SN1MPN1012024d_--