From: Ashutosh Chauhan
Date: Sat, 2 Jun 2012 10:30:52 -0700
Subject: Re: Querying HBase Records with null valued-columns using hive
To: user@hive.apache.org

Hey Sagar,

It seems you inserted data into your HBase table directly through the HBase client and not through the Hive client. If so, you need https://issues.apache.org/jira/browse/HIVE-1634 to correctly read pre-existing data in HBase. HIVE-1634 is available as part of the 0.9 release, so upgrade to 0.9 and your problem should go away.

Hope it helps,
Ashutosh

On Fri, Jun 1, 2012 at 9:31 PM, sagar naik wrote:
> I am using hive-0.7-cdh3u0
>
> Thanks Again
>
> On Fri, Jun 1, 2012 at 9:20 PM, sagar naik wrote:
> > Hi,
> > I am seeing a very weird hive-hbase query behaviour.
> > I have an externally mounted hbase table in hive
> >
> >
> > select creation_ts, length(url), isnull(url), ! (isnull(url)) from
> > task_table limit 10;
> > Total MapReduce jobs = 1
> > Launching Job 1 out of 1
> > Number of reduce tasks is set to 0 since there's no reduce operator
> > Starting Job = job_201206011557_0023, Tracking URL =
> > http://xxxxxxxx:50030/jobdetails.jsp?jobid=job_201206011557_0023
> > Kill Command = /xxxxxx/xxxxx/xxxxx/../bin/hadoop job
> > -Dmapred.job.tracker=xxxxxx:54311 -kill job_201206011557_0023
> > 2012-06-01 20:37:09,878 Stage-1 map = 0%, reduce = 0%
> > 2012-06-01 20:37:15,920 Stage-1 map = 100%, reduce = 0%
> > 2012-06-01 20:37:16,929 Stage-1 map = 100%, reduce = 100%
> > Ended Job = job_201206011557_0023
> > OK
> > 1337061992484 NULL false true
> > 1334307650105 184 false true
> > 1336532379103 229 false true
> > 1335226875331 NULL false true
> > 1335746654565 NULL false true
> > 1335400140889 NULL false true
> > 1338419117954 NULL false true
> > 1338425256315 NULL false true
> > 1336554120401 NULL false true
> > 1338002526497 NULL false true
> > Time taken: 10.528 seconds
> >
> > Notice that isnull(url) is false for all rows, even where the reported
> > length is NULL.
> >
> > My ultimate aim is to get the number of records where url is null and
> > join those records with another table
> >
> >
> >
> > I noticed that FilterOperator passes (returns TRUE)
> > however, when it is forwarded (forward (...,...) ) it returns FALSE :O
> >
> > Any pointers / help is highly appreciated.
> >
> >
> > -Sagar
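
For the stated goal (count the records where url is null, then join them with another table), a minimal HiveQL sketch follows. It assumes that, after the upgrade to 0.9, url is read back as a real NULL for the rows in question. other_table and the join key task_id are hypothetical names used only for illustration; they do not appear in the thread.

```sql
-- Count the rows whose url column is NULL.
SELECT COUNT(*)
FROM task_table
WHERE url IS NULL;

-- Join the NULL-url rows with another table.
-- other_table and task_id are hypothetical placeholders.
SELECT t.creation_ts, o.*
FROM task_table t
JOIN other_table o ON (t.task_id = o.task_id)
WHERE t.url IS NULL;
```

If the count still comes back as 0 while length(url) prints NULL, the column is probably still not being deserialized as NULL, and the table's HBase column mapping is the next place to look.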