hbase-dev mailing list archives

From Nadya Privalova <nprival...@mirantis.com>
Subject Re: SingleColumnValueFilter with setFilterIfMissing=True behaviour
Date Fri, 11 Apr 2014 09:52:59 GMT
I tried 0.96 and 0.98, both built for Hadoop 1. The behaviour is the same.
The HBase configuration is the simplest possible one, taken from the Quick
Start guide.

Nadya


On Fri, Apr 11, 2014 at 1:43 PM, Anoop John <anoop.hbase@gmail.com> wrote:

> Which version are you using?
>
> -Anoop-
>
> On Fri, Apr 11, 2014 at 2:42 PM, Nadya Privalova <nprivalova@mirantis.com> wrote:
>
> > Hello folks,
> > I have seen a lot of questions about SingleColumnValueFilter, but I have one
> > more. It is about the setFilterIfMissing value and its behaviour with
> > other column families.
> > I have the following in my db:
> > hbase(main):004:0> scan 'c59d09d425244b9bb216a229c2441819_resource'
> > ROW              COLUMN+CELL
> >  resource-id     column=f:project_id, timestamp=1397138905401, value="project-id"
> >  resource-id     column=f:s_test-1, timestamp=1397138905401, value="1"
> >  resource-id     column=m:9222030811254775807+test-1+instance!cumulative!, timestamp=1397138905401, value={"$date": 1341225600000}
> >  resource-id     column=m:9222030811314775807+test-1+instance!cumulative!, timestamp=1397138905377, value={"$date": 1341225540000}
> >  resource-id-2   column=f:project_id, timestamp=1397138905422, value="project-id-2"
> >  resource-id-2   column=f:s_test, timestamp=1397138905422, value="1"
> >  resource-id-2   column=m:9222030811134775807+test+instance!cumulative!, timestamp=1397138905422, value={"$date": 1341225720000}
> >
> > After applying the filter I see the following:
> > hbase(main):005:0> scan 'c59d09d425244b9bb216a229c2441819_resource',
> > {FILTER => "(SingleColumnValueFilter ('f', 's_test-1', =, 'binary:\"1\"',
> > true, false))"}
> > ROW              COLUMN+CELL
> >  resource-id     column=f:project_id, timestamp=1397138905401, value="project-id"
> >  resource-id     column=f:s_test-1, timestamp=1397138905401, value="1"
> >  resource-id     column=m:9222030811254775807+test-1+instance!cumulative!, timestamp=1397138905401, value={"$date": 1341225600000}
> >  resource-id     column=m:9222030811314775807+test-1+instance!cumulative!, timestamp=1397138905377, value={"$date": 1341225540000}
> >  resource-id-2   column=m:9222030811134775807+test+instance!cumulative!, timestamp=1397138905422, value={"$date": 1341225720000}
> >
> > Why do I see resource-id-2 in the output even with
> > setFilterIfMissing==True? The row with id 'resource-id-2' doesn't contain
> > "f:s_test-1"; it contains only "f:s_test". The docs for
> > setFilterIfMissing say: "If true, the entire row will be skipped if the
> > column is not found."
> > Column 's_test-1' is not found in resource-id-2, yet I still see this
> > row in the output (though only its 'm' CF).
> > Of course I could specify COLUMNS=['f'], and then resource-id-2 would not
> > be shown. But I can't do that because I need the values from 'm'.
> > Could you please comment on this behaviour?
> >
> > Thanks,
> > Nadya
> > (From OpenStack Ceilometer team)
> >
>
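
For reference, the result expected in the quoted question can be modelled in a few lines of plain Python. This is only a sketch of what the quoted documentation sentence describes ("the entire row will be skipped if the column is not found"); it is not HBase code and does not explain the behaviour actually observed. The `rows` dict and the `single_column_value_filter` function with its parameters are hypothetical names invented for illustration; the data is copied from the scan output above.

```python
# Toy model of the documented filterIfMissing semantics; this is NOT HBase code.
# Each row maps "family:qualifier" -> value, as shown in the scan output.
rows = {
    "resource-id": {
        "f:project_id": '"project-id"',
        "f:s_test-1": '"1"',
        "m:9222030811254775807+test-1+instance!cumulative!": '{"$date": 1341225600000}',
        "m:9222030811314775807+test-1+instance!cumulative!": '{"$date": 1341225540000}',
    },
    "resource-id-2": {
        "f:project_id": '"project-id-2"',
        "f:s_test": '"1"',
        "m:9222030811134775807+test+instance!cumulative!": '{"$date": 1341225720000}',
    },
}


def single_column_value_filter(rows, column, value, filter_if_missing):
    """Hypothetical helper: keep whole rows that pass an equality test on one column."""
    kept = {}
    for key, cells in rows.items():
        if column not in cells:
            # Column absent: drop the entire row only when filter_if_missing is set.
            if filter_if_missing:
                continue
        elif cells[column] != value:
            # Column present but the value differs: drop the row.
            continue
        # The row passed, so all of its cells (both 'f' and 'm') are kept.
        kept[key] = cells
    return kept


expected = single_column_value_filter(rows, "f:s_test-1", '"1"', filter_if_missing=True)
print(sorted(expected))
```

Under this model only 'resource-id' is returned, with both its 'f' and 'm' cells, which is the outcome the question expects from the scan.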
