Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 3F974115D1 for ; Thu, 24 Jul 2014 05:35:20 +0000 (UTC) Received: (qmail 82526 invoked by uid 500); 24 Jul 2014 05:35:18 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 82453 invoked by uid 500); 24 Jul 2014 05:35:18 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 82443 invoked by uid 99); 24 Jul 2014 05:35:18 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 24 Jul 2014 05:35:18 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of dingguitao@baixing.com designates 209.85.216.54 as permitted sender) Received: from [209.85.216.54] (HELO mail-qa0-f54.google.com) (209.85.216.54) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 24 Jul 2014 05:35:14 +0000 Received: by mail-qa0-f54.google.com with SMTP id k15so2404258qaq.41 for ; Wed, 23 Jul 2014 22:34:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:content-type; bh=zvTBj0Jnd7QoH3jyN46AGPZQt6BZ1ZI8EOHQK4GBC/8=; b=HO3CA1Iw42fIca1xL5e6aKlG/JvJHyXc/d/GHQRH7w6wGsPPSQMMCoj1EBlfqn/UM0 JPchGRl78hZVFRzYblsemQl4/v1eoWKN6CSOz92baNayB7oyAeNQ/jyit9II3I3mij0y 8ednOJ6GEcCcg35CrJ7NhQ00mzymBQP5l0Z6g0drSwE8FCxLIHHExkgBA+v5wVcbNiSE Kc3jz9B7AkLRNGitFRhPEw7Z0It/OU8xMMAupVtj6pSXpMShA73US1HiibNap5gLvsii NHZCKP6FjSb+kG5ai985vfhbUsLSHvhkdqL6BO9M1r/k+xPZPir697uTn7O6hZ9KgogY 9yrw== X-Gm-Message-State: ALoCoQnnX/LFoykzr4nuyu1EsjdZL9bWLdjuq5stF5JKBext+1beJy23QcaFeWZX/F4pcHm049dH X-Received: by 10.140.86.147 with SMTP id p19mr10451962qgd.66.1406180092910; Wed, 23 Jul 2014 22:34:52 -0700 (PDT) MIME-Version: 1.0 Received: by 10.96.92.103 with HTTP; Wed, 23 Jul 2014 22:34:32 -0700 (PDT) In-Reply-To: References: From: =?UTF-8?B?5LiB5qGC5rab77yI5qGC6Iqx77yJ?= Date: Thu, 24 Jul 2014 13:34:32 +0800 Message-ID: Subject: Re: Hive UDF gives duplicate result regardless of parameters, when nested in a subquery To: user@hive.apache.org Content-Type: multipart/alternative; boundary=001a11c12750edba7804fee9d0a7 X-Virus-Checked: Checked by ClamAV on apache.org --001a11c12750edba7804fee9d0a7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Yes. The output is correct: ["tp","p","sp"]. I developed the UDF using JAVA in eclipse and exported the jar file into the auxlib directory of hive. Then add the following line into the ~/.hiverc file. create temporary function getad as 'xxxxxxx'; The hive version is 0.12.0. Perhaps the problem resulted from the mis-optimization of hive. On Thu, Jul 24, 2014 at 1:11 PM, Jie Jin wrote: > Have you tried this query without UDF, say: > > select > array(tp, p, sp) as ps > from > ( > select > 'tp' as tp, > 'p' as p, > 'sp' as sp > from > table_name > where > id =3D xxxx > ) t; > > > =E2=80=8BAnd how you implement the UDF?=E2=80=8B > > > =E8=B0=A2=E8=B0=A2 > =E9=87=91=E6=9D=B0 (Jie Jin) > > > On Wed, Jul 23, 2014 at 1:34 PM, =E4=B8=81=E6=A1=82=E6=B6=9B=EF=BC=88=E6= =A1=82=E8=8A=B1=EF=BC=89 wrote: > >> Recently I developed a Hive Generic UDF *getad*. It accepts a map type >> and a string type parameter and outputs a string value. But I found the = UDF >> output really confusing in different conditions. >> >> Condition A: >> >> select >> getad(map_col, 'tp') as tp, >> getad(map_col, 'p') as p, >> getad(map_col, 'sp') as sp >> from >> table_name >> where >> id =3D xxxx; >> >> The output is right: 'tp', 'p', 'sp'. >> >> Condition B: >> >> select >> array(tp, p, sp) as ps >> from >> ( >> select >> getad(map_col, 'tp') as tp, >> getad(map_col, 'p') as p, >> getad(map_col, 'sp') as sp >> from >> table_name >> where >> id =3D xxxx >> ) t; >> >> The output is wrong: 'tp', 'tp', 'tp'. And the following query outputs >> the same result: >> >> select >> array( >> getad(map_col, 'tp'), >> getad(map_col, 'p'), >> getad(map_col, 'sp') >> ) as ps >> from >> table_name >> where >> id =3D xxxx; >> >> Could you please provide me some hints on this? Thanks! >> >> -- >> =E4=B8=81=E6=A1=82=E6=B6=9B >> > > --=20 =E4=B8=81=E6=A1=82=E6=B6=9B --001a11c12750edba7804fee9d0a7 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Yes. The output is correct:=C2=A0["tp","p&q= uot;,"sp"].

I developed the UDF using JAVA in = eclipse and exported the jar file into the auxlib directory of hive. Then a= dd the following line into the ~/.hiverc file.

create temporary function getad as 'xxxxxxx';

The hive version is 0.12.0. Perhaps the problem res= ulted from the mis-optimization of hive.


On Thu, Jul 24, 2014 at 1:11 PM, Jie Jin= <hellojinjie@gmail.com> wrote:
Have you tried this= query without UDF, say:


select
  array(tp, p, sp) as ps
from
  (
  select
    'tp' as tp,
    'p' as p,
    'sp' as sp
  from
    table_name
  where
    id =3D xxxx
  ) t;

=E2=80=8BAnd how you i= mplement the UDF?=E2=80=8B


=E8=B0=A2=E8=B0=A2
<= div>=E9=87=91= =E6=9D=B0 (Jie Jin)


On Wed, Jul 23, 2014 at 1:34 PM, =E4=B8= =81=E6=A1=82=E6=B6=9B=EF=BC=88=E6=A1=82=E8=8A=B1=EF=BC=89 <dingguitao= @baixing.com> wrote:

Recently I developed a Hive Generic UDF=C2=A0getad= . It accepts a map type and a string type parameter and outputs a string va= lue. But I found the UDF output really confusing in different conditions.

Condition A:


select
  getad(map_col, 'tp') as tp,
  getad(map_col, 'p') as p,
  getad(map_col, 'sp') as sp
from
  table_name
where
  id =3D xxxx;

The output is right: 'tp', 'p', 'sp'.

Condition B:


select
  array(tp, p, sp) as ps
from
  (
  select
    getad(map_col, 'tp') as tp,
    getad(map_col, 'p') as p,
    getad(map_col, 'sp') as sp
  from
    table_name
  where
    id =3D xxxx
  ) t;

The output is wrong: 'tp', 'tp', 'tp'. And the foll= owing query outputs the same result:


select
  array(
    getad(map_col, 'tp'),
    getad(map_col, 'p'),
    getad(map_col, 'sp')
  ) as ps
from
  table_name
where
  id =3D xxxx;

Could you please provide me some hints on this? Thanks!


--
=E4=B8=81=E6=A1=82=E6=B6=9B




--
=E4=B8=81=E6=A1=82=E6=B6=9B
--001a11c12750edba7804fee9d0a7--