Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 92DE1C5CD for ; Mon, 11 Jun 2012 20:19:10 +0000 (UTC) Received: (qmail 46530 invoked by uid 500); 11 Jun 2012 20:19:09 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 46306 invoked by uid 500); 11 Jun 2012 20:19:09 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 46298 invoked by uid 99); 11 Jun 2012 20:19:09 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 11 Jun 2012 20:19:09 +0000 X-ASF-Spam-Status: No, hits=2.0 required=5.0 tests=FSL_RCVD_USER,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS,URI_NOVOWEL,WEIRD_PORT X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of snaik@attributor.com designates 209.85.215.48 as permitted sender) Received: from [209.85.215.48] (HELO mail-lpp01m010-f48.google.com) (209.85.215.48) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 11 Jun 2012 20:19:04 +0000 Received: by lagz14 with SMTP id z14so3834604lag.35 for ; Mon, 11 Jun 2012 13:18:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type:x-gm-message-state; bh=Ihft4LJdjvR1+f1x+KJoZVdBxg2MDg33jI3ckPk5HaQ=; b=XHzdsIlQ7TgZJuFvSIZrYKbvO76XKEgXVfJKL4aFeOLW5rcp8kmiQwkjl087o32w8i PGGd3PXeMpG+1CzlEO/+kzkCXrLq1aZZR0Q4C36TUVVwTOv2O953m5YfXfYjaW9Y/htT HOqNfgczaIt/0RnC2X3juGrWCUYjFryJWFL5VPdb0k3PozGd5pB0SHLJlT2ep+w3d8R0 1FDFrrvPYC/zwcfKeWCvQFKU8gQSAhlpP3l+OfttyK1Xg1deFKKUWCVcz85vtai4KEvr Qqmmy7oPaZPaDenJiWyc93NXSKzTsDiG0Iv3rOMJYJekFJkA8vABU6omV89T1iubM6lJ iudA== Received: by 10.112.83.229 with SMTP id t5mr3742793lby.8.1339445922268; Mon, 11 Jun 2012 13:18:42 -0700 (PDT) MIME-Version: 1.0 Received: by 10.112.102.65 with HTTP; Mon, 11 Jun 2012 13:18:21 -0700 (PDT) In-Reply-To: References: From: sagar naik Date: Mon, 11 Jun 2012 13:18:21 -0700 Message-ID: Subject: Re: Querying HBase Records with null valued-columns using hive To: user@hive.apache.org Content-Type: multipart/alternative; boundary=f46d04016d9d64893204c2380ecd X-Gm-Message-State: ALoCoQkzmKVJoKqSohHOJYKR8GuRn8KqH6rC6Ya/8atseAT+7A7nryzpA+WJYmbOCYNUQZswkpSo X-Virus-Checked: Checked by ClamAV on apache.org --f46d04016d9d64893204c2380ecd Content-Type: text/plain; charset=ISO-8859-1 Sorry for the late reply, Ashutosh. Thanks for the pointers. I will soon try it out with the hive version -Sagar On Sat, Jun 2, 2012 at 10:30 AM, Ashutosh Chauhan wrote: > Hey Sagar, > > Seems like you have inserted data in your hbase table directly through > hbase client and not through hive client. If so, you need > https://issues.apache.org/jira/browse/HIVE-1634 to correctly read > pre-existing data in hbase. Hive-1634 is available as a part of 0.9 > release. So, upgrade to 0.9 and your problem should go away. > > Hope it helps, > Ashutosh > > On Fri, Jun 1, 2012 at 9:31 PM, sagar naik wrote: > >> I am using hive-0.7-cdh3u0 >> >> Thanks Again >> >> On Fri, Jun 1, 2012 at 9:20 PM, sagar naik wrote: >> > Hi , >> > I am seeing a very weird hive-hbase query behaviour. >> > I have an externally mounted hbase table in hive >> > >> > >> > select creation_ts, length(url), isnull(url), ! (isnull(url)) from >> > task_table limit 10; >> > Total MapReduce jobs = 1 >> > Launching Job 1 out of 1 >> > Number of reduce tasks is set to 0 since there's no reduce operator >> > Starting Job = job_201206011557_0023, Tracking URL = >> > http://xxxxxxxx:50030/jobdetails.jsp?jobid=job_201206011557_0023 >> > Kill Command = /xxxxxx/xxxxx/xxxxx/../bin/hadoop job >> > -Dmapred.job.tracker=xxxxxx:54311 -kill job_201206011557_0023 >> > 2012-06-01 20:37:09,878 Stage-1 map = 0%, reduce = 0% >> > 2012-06-01 20:37:15,920 Stage-1 map = 100%, reduce = 0% >> > 2012-06-01 20:37:16,929 Stage-1 map = 100%, reduce = 100% >> > Ended Job = job_201206011557_0023 >> > OK >> > 1337061992484 NULL false true >> > 1334307650105 184 false true >> > 1336532379103 229 false true >> > 1335226875331 NULL false true >> > 1335746654565 NULL false true >> > 1335400140889 NULL false true >> > 1338419117954 NULL false true >> > 1338425256315 NULL false true >> > 1336554120401 NULL false true >> > 1338002526497 NULL false true >> > Time taken: 10.528 seconds >> > >> > Notice that isnull(url) is false for all strings even if length >> reported is null >> > >> > My ultimate aim is to get the number of records where url is null and >> > join those records with another table >> > >> > >> > >> > I noticed that FilterOPerator passes (returns TRUE) >> > however, when it is forwarded (forward (...,...) ) it returns FALSE :O >> > >> > Any pointers / help is highly appreciated. >> > >> > >> > -Sagar >> > > --f46d04016d9d64893204c2380ecd Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Sorry for the late reply, Ashutosh.

Thanks for the point= ers.
I will soon try it out with the hive version

<= /div>
-Sagar

On Sat, Jun 2, 2012 at 1= 0:30 AM, Ashutosh Chauhan <hashutosh@apache.org> wrote:
Hey Sagar,

Seems li= ke you have inserted data in your hbase table directly through hbase client= and not through hive client. If so, you need=A0https://issues.apache.or= g/jira/browse/HIVE-1634 to correctly read pre-existing data in hbase. H= ive-1634 is available as a part of 0.9 release. So, upgrade to 0.9 and your= problem should go away.

Hope it helps,
Ashutosh

On Fri, Jun 1, 2012 at 9:31 PM, sagar naik <= snaik@attributor.= com> wrote:
I am using hive-0.7-cdh3u0

Thanks Again

On Fri, Jun 1, 2012 at 9:20 PM, sagar naik <snaik@attributor.com> wrote:
> Hi ,
> I am seeing a very weird hive-hbase query behaviour.
> I have an externally mounted hbase table in hive
>
>
> select creation_ts, length(url), isnull(url), ! (isnull(url)) from
> task_table limit 10;
> Total MapReduce jobs =3D 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operato= r
> Starting Job =3D job_201206011557_0023, Tracking URL =3D
> http://xxxxxxxx:50030/jobdetails.jsp?jobid=3Djob= _201206011557_0023
> Kill Command =3D /xxxxxx/xxxxx/xxxxx/../bin/hadoop job
> -Dmapred.job.tracker=3Dxxxxxx:54311 -kill job_201206011557_0023
> 2012-06-01 20:37:09,878 Stage-1 map =3D 0%, =A0reduce =3D 0%
> 2012-06-01 20:37:15,920 Stage-1 map =3D 100%, =A0reduce =3D 0%
> 2012-06-01 20:37:16,929 Stage-1 map =3D 100%, =A0reduce =3D 100%
> Ended Job =3D job_201206011557_0023
> OK
> 1337061992484 NULL false true
> 1334307650105 184 false true
> 1336532379103 229 false true
> 1335226875331 NULL false true
> 1335746654565 NULL false true
> 1335400140889 NULL false true
> 1338419117954 NULL false true
> 1338425256315 NULL false true
> 1336554120401 NULL false true
> 1338002526497 NULL false true
> Time taken: 10.528 seconds
>
> Notice that isnull(url) is false for all strings even if length report= ed is null
>
> My ultimate aim is to get the number of records where url is null and<= br> > join those records with another table
>
>
>
> I noticed that FilterOPerator passes (returns TRUE)
> however, when it is forwarded (forward (...,...) ) it returns FALSE :O=
>
> Any pointers / help is highly appreciated.
>
>
> -Sagar


--f46d04016d9d64893204c2380ecd--