Date: Fri, 13 Jun 2014 17:03:52 +0530
Subject: Re: Loading xml to hive and fetching unbounded tags
From: harish tangella
To: user@hive.apache.org
Hi,

We are trying to get the data as rows rather than columns. We are able to get partial data by implementing a RecordReader. The logic we applied is to read the XML using 'Row' as the start and end tag; as a result we get only the second row, whereas we expect 2 rows.

Referring to the XML below, the expected result is:

<Row><APPLICATION_ID>1</APPLICATION_ID><AppDetails><AppDetail><APPLICATION_CODE>1</APPLICATION_CODE></AppDetail></AppDetails></Row>

<Row><APPLICATION_ID>1</APPLICATION_ID><AppDetails><AppDetail><APPLICATION_CODE>2</APPLICATION_CODE></AppDetail></AppDetails></Row>

If we use XPath instead, we get the data column-wise: when we run select APPLICATION_ID, APPLICATION_CODE from the table, we get 1,["1","2"]
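
The row-wise result described above amounts to repeating the parent APPLICATION_ID once for every AppDetail child. The following is only a minimal sketch of that flattening logic in Python, using the standard library's xml.etree.ElementTree; it is not Hive code and not the RecordReader mentioned above. The names SAMPLE_XML and flatten_rows are hypothetical, and the element names are assumed to match the sample XML quoted in this thread.

```python
import xml.etree.ElementTree as ET

# Sample document, assumed to match the XML quoted in this thread.
SAMPLE_XML = """
<Rows>
  <Row>
    <APPLICATION_ID>1</APPLICATION_ID>
    <AppDetails>
      <AppDetail><APPLICATION_CODE>1</APPLICATION_CODE></AppDetail>
      <AppDetail><APPLICATION_CODE>2</APPLICATION_CODE></AppDetail>
    </AppDetails>
  </Row>
</Rows>
"""


def flatten_rows(xml_text):
    """Return one (APPLICATION_ID, APPLICATION_CODE) tuple per AppDetail."""
    root = ET.fromstring(xml_text.strip())  # root is the <Rows> element
    flat = []
    for row in root.findall("Row"):
        app_id = row.findtext("APPLICATION_ID")
        # Repeat the parent ID for every repeated AppDetail child.
        for detail in row.findall("AppDetails/AppDetail"):
            flat.append((app_id, detail.findtext("APPLICATION_CODE")))
    return flat


if __name__ == "__main__":
    for app_id, code in flatten_rows(SAMPLE_XML):
        print(app_id, code)
```

For the sample document this produces two tuples, ("1", "1") and ("1", "2"), which is the row-wise shape we expect instead of 1,["1","2"].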

On Fri, Jun 13, 2014 at 4:01 PM, Knowledge gatherer <knowledge.gatherer.007@gmail.com> wrote:
> Are you trying to capture this data in one column and use XPATH with UDF to get the data?
>
> On Wed, Jun 11, 2014 at 11:12 AM, harish tangella <harish.tangella@gmail.com> wrote:
>> Hi,
>>
>> Request you to help.
>>
>> Fetching unbounded tags from the XML in Hive
>>
>> We tried with XPath but were unable to get all the unbounded tags.
>>
>> A sample XML file is:
>>
>> <Rows>
>> <Row>
>> <APPLICATION_ID>1</APPLICATION_ID>
>> <AppDetails>
>> <AppDetail>
>> <APPLICATION_CODE>1</APPLICATION_CODE>
>> </AppDetail>
>> <AppDetail>
>> <APPLICATION_CODE>2</APPLICATION_CODE>
>> </AppDetail>
>> </AppDetails>
>> </Row>
>> </Rows>
>>
>> We are able to get the application code by giving [1] in AppDetail. Request for help to get all the AppDetail tags.
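
In XPath, a positional predicate such as [1] selects only the first matching sibling, so a path without the predicate selects every AppDetail. The sketch below illustrates that difference with Python's xml.etree.ElementTree, which supports a limited XPath subset; it is an illustration of the path semantics only, not Hive's XPath functions. The names SAMPLE_XML, FIRST_ONLY, ALL_DETAILS and application_codes are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample document, assumed to match the XML quoted above.
SAMPLE_XML = """
<Rows>
  <Row>
    <APPLICATION_ID>1</APPLICATION_ID>
    <AppDetails>
      <AppDetail><APPLICATION_CODE>1</APPLICATION_CODE></AppDetail>
      <AppDetail><APPLICATION_CODE>2</APPLICATION_CODE></AppDetail>
    </AppDetails>
  </Row>
</Rows>
"""

# Paths are relative to the <Rows> root element.
FIRST_ONLY = "Row/AppDetails/AppDetail[1]/APPLICATION_CODE"  # first AppDetail only
ALL_DETAILS = "Row/AppDetails/AppDetail/APPLICATION_CODE"    # every AppDetail


def application_codes(xml_text, path):
    """Return the text of every element matched by the given path."""
    root = ET.fromstring(xml_text.strip())
    return [element.text for element in root.findall(path)]


if __name__ == "__main__":
    print(application_codes(SAMPLE_XML, FIRST_ONLY))
    print(application_codes(SAMPLE_XML, ALL_DETAILS))
```

With the index the result holds a single code; without it, both codes come back as a list, which then still has to be expanded into one row per code.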

