From: Namit Jain
To: "hive-user@hadoop.apache.org"
Date: Mon, 29 Jun 2009 12:03:15 -0700
Subject: RE: getting the field types of a query result
In-Reply-To: <68B7689C98024D43B4C2709456F0B5200A109C9B25@SC-MBXC1.TheFacebook.com>

Had a discussion with Raghu/Zheng offline.

Basically, genFileSinkPlan is not passing the type information. That will be added, and then LazySerDe will support serializing into JSON even if the types are provided.

From: Ashish Thusoo [mailto:athusoo@facebook.com]
Sent: Monday, June 29, 2009 11:06 AM
To: hive-user@hadoop.apache.org
Subject: RE: getting the field types of a query result

Not sure what the issue is here, but SemanticAnalyzer does infer types, including those of UDFs. I think this is not exposed through an API, but that can easily be added.

Ashish

________________________________
From: Prasad Chakka [mailto:pchakka@facebook.com]
Sent: Monday, June 29, 2009 10:47 AM
To: hive-user@hadoop.apache.org
Subject: Re: getting the field types of a query result

SemanticAnalyzer should set the correct schema of the output result during compilation of the query. The approach suggested below is not the right way to do it. If SemanticAnalyzer is not doing this correctly, then we should fix it.

________________________________
From: David Lerman <dlerman@videoegg.com>
Reply-To: <hive-user@hadoop.apache.org>
Date: Mon, 29 Jun 2009 08:34:25 -0700
To: <hive-user@hadoop.apache.org>
Subject: Re: getting the field types of a query result

If I'm following Min's Jira, the challenge is that I would need to parse the query, find the selected fields and look up their types via the metaserver, then look at all the UDFs to determine their output types given the inputs, which is a bit redundant since we already did all that work in executing the query. Min, have you settled on an approach for this in the JDBC driver?

> From: He Yongqiang <heyongqiang@software.ict.ac.cn>
> Date: Mon, 29 Jun 2009 04:33:59 -0700
> To: <hive-user@hadoop.apache.org>
> Subject: Re: getting the field types of a query result
>
> If I understand correctly, I think what Prasad means is that the type
> info of each column is stored in Hive metadata, and you can fetch that
> information through HiveServer or a JDBC client (you need to set up a
> remote Hive metaserver).
>
> Yongqiang
>
> On 09-6-29 6:42 PM, "Min Zhou" <coderplay@gmail.com> wrote:
>> Hi,
>>
>> I came across the same problem when developing JDBC for Hive
>> (https://issues.apache.org/jira/browse/HIVE-576). It had nothing to do
>> with HiveServer and JDBC. I think there is currently no good way of
>> solving it. SemanticAnalyzer needs to get the result type returned by
>> a UDF/UDAF to build the schema you mentioned. We should also consider
>> user-defined types.
>>
>> Regards,
>> Min

On 6/29/09 11:27 AM, "hive-user-help@hadoop.apache.org" <hive-user-help@hadoop.apache.org> wrote:

> hive-user Digest of: get
>
> Topics (messages 947 through 950):
>
> getting the field types of a query result
>         947 by: David Lerman
>         948 by: Prasad Chakka
>         949 by: Min Zhou
>         950 by: He Yongqiang
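For readers following the thread, the client-side approach David Lerman describes (look up each selected column's type in the metadata, then work out each UDF's output type from its input types) can be sketched roughly as below. This is only an illustration of the idea, not Hive's SemanticAnalyzer or any real Hive API: the `TABLE_SCHEMAS` dictionary stands in for the metastore, `UDF_RETURN_TYPES` is a made-up registry, and the `Column`/`Call` classes stand in for an already-parsed select list. All of these names are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Hypothetical stand-in for the metastore: table name -> {column name: type}.
TABLE_SCHEMAS = {
    "page_views": {"user_id": "bigint", "url": "string", "duration": "double"},
}

# Hypothetical UDF registry: function name -> rule mapping argument types
# to a return type. The rules here are illustrative only.
UDF_RETURN_TYPES = {
    "length": lambda arg_types: "int",
    "concat": lambda arg_types: "string",
    "sum": lambda arg_types: "bigint" if arg_types == ["bigint"] else "double",
}


@dataclass(frozen=True)
class Column:
    """A reference to a table column in the select list."""
    name: str


@dataclass(frozen=True)
class Call:
    """A UDF/UDAF call whose arguments are themselves expressions."""
    func: str
    args: Tuple["Expr", ...]


Expr = Union[Column, Call]


def infer_type(expr: Expr, table: str) -> str:
    """Resolve an expression's type bottom-up: columns from the schema,
    function calls from the registry applied to the argument types."""
    if isinstance(expr, Column):
        return TABLE_SCHEMAS[table][expr.name]
    arg_types = [infer_type(arg, table) for arg in expr.args]
    return UDF_RETURN_TYPES[expr.func](arg_types)


def result_schema(select_list: List[Tuple[str, Expr]], table: str) -> List[Tuple[str, str]]:
    """Return (alias, type) pairs for every expression in the select list."""
    return [(alias, infer_type(expr, table)) for alias, expr in select_list]


if __name__ == "__main__":
    schema = result_schema(
        [
            ("user_id", Column("user_id")),
            ("url_len", Call("length", (Column("url"),))),
            ("total", Call("sum", (Column("duration"),))),
        ],
        "page_views",
    )
    print(schema)
```

The sketch also shows why the thread prefers having the compiler expose the schema: the client would be repeating the same column lookups and UDF resolution that query compilation has already performed.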