hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Re: How to quickly count the rows that meet several conditions using hbase coprocessor
Date Mon, 20 Jan 2014 07:41:00 GMT
The real fix is in the parent (HBASE-9428), though.

-- Lars



________________________________
 From: Ted Yu <yuzhihong@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org> 
Sent: Sunday, January 19, 2014 9:22 PM
Subject: Re: Re: How to quickly count the rows that meet several conditions using hbase coprocessor
 

bq. The HBase version 0.94.6-cdh4.3.1

That explains it :-)

See HBASE-9711

Please upgrade your HBase.

Cheers



On Sun, Jan 19, 2014 at 9:08 PM, leiwangouc@gmail.com
<leiwangouc@gmail.com> wrote:

>
> It makes no difference even if I change it to a single character, "A".
>
> Thanks,
> Lei
>
>
>
>
> leiwangouc@gmail.com
>
> From: Ted Yu
> Date: 2014-01-18 14:28
> To: user@hbase.apache.org
> CC: user; lars hofhansl
> Subject: Re: How to quickly count the rows that meet several conditions
> using hbase coprocessor
> Can you use another string for the fake value?
> DOESNOTEXIST is a bit long. It shouldn't be difficult to come up with a
> single-character string that doesn't appear in the first two columns.
>
> Cheers
>
> On Jan 17, 2014,  at 8:34 PM, "leiwangouc@gmail.com" <leiwangouc@gmail.com>
> wrote:
>
> > Hi Lars,
> >
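> > import java.util.ArrayList;
> > import java.util.List;
> >
> > import org.apache.hadoop.conf.Configuration;
> > import org.apache.hadoop.hbase.HBaseConfiguration;
> > import org.apache.hadoop.hbase.client.Scan;
> > import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
> > import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
> > import org.apache.hadoop.hbase.filter.Filter;
> > import org.apache.hadoop.hbase.filter.FilterList;
> > import org.apache.hadoop.hbase.filter.RegexStringComparator;
> > import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
> > import org.apache.hadoop.hbase.util.Bytes;
> >
> > public class AggregationCountForMultiFilter {
> >
> >     private static final byte[] TABLE_NAME = Bytes.toBytes("userdigest");
> >     private static final byte[] CF = Bytes.toBytes("cf");
> >     // Placeholder value assumed never to occur in the first two columns.
> >     private static final byte[] FAKE_VALUE = Bytes.toBytes("DOESNOTEXIST");
> >
> >     public static void main(String[] args) {
> >
> >         Configuration conf = new Configuration();
> >         Configuration configuration = HBaseConfiguration.create(conf);
> >         AggregationClient aggregationClient = new AggregationClient(configuration);
> >
> >         byte[] colA = Bytes.toBytes("tags");
> >         byte[] colB = Bytes.toBytes("googleid");
> >         byte[] colC = Bytes.toBytes("createtime");
> >
> >         List<Filter> filters = new ArrayList<Filter>();
> >
> >         // NOT_EQUAL against a value that never occurs, combined with
> >         // setFilterIfMissing(true), keeps only rows that have the column.
> >         SingleColumnValueFilter filter1 = new SingleColumnValueFilter(CF, colA,
> >                 CompareOp.NOT_EQUAL, FAKE_VALUE);
> >         filter1.setFilterIfMissing(true);
> >         filters.add(filter1);
> >
> >         SingleColumnValueFilter filter2 = new SingleColumnValueFilter(CF, colB,
> >                 CompareOp.NOT_EQUAL, FAKE_VALUE);
> >         filter2.setFilterIfMissing(true);
> >         filters.add(filter2);
> >
> >         // The third column must exist and its value must match the regex.
> >         SingleColumnValueFilter filter3 = new SingleColumnValueFilter(CF, colC,
> >                 CompareOp.EQUAL, new RegexStringComparator("^2014-01-15"));
> >         filter3.setFilterIfMissing(true);
> >         filters.add(filter3);
> >
> >         // A row is counted only if it passes all three filters.
> >         FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL, filters);
> >
> >         Scan scan = new Scan();
> >         scan.addFamily(CF);
> >         scan.setFilter(filterList);
> >
> >         long rowCount = 0;
> >         try {
> >             rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan);
> >         } catch (Throwable e) {
> >             e.printStackTrace();
> >         }
> >         System.out.println("rowCount: " + rowCount);
> >     }
> > }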
> >
> > The HBase version 0.94.6-cdh4.3.1
> >
> > Thanks,
> > Lei
> >
> >
> >
> > leiwangouc@gmail.com
> >
> > From: lars hofhansl
> > Date: 2014-01-18 11:18
> > To: user@hbase.apache.org
> > Subject: Re: Re: How to quickly count the rows that meet several
> conditions using hbase coprocessor
> > Offhand there is no reason for that.
> > If you send some sample code that can seed the data and then run the
> filter that shows the problem, I'll offer to do some profiling.
> >
> > Which version of HBase are you using?
> >
> > -- Lars
> >
> >
> > ________________________________
> > From: "leiwangouc@gmail.com" <leiwangouc@gmail.com>
> > To: user <user@hbase.apache.org>
> > Cc: user <user@hbase.apache.org>
> > Sent: Friday, January 17, 2014 5:24 PM
> > Subject: Re: Re: How to quickly count the rows that meet several
> conditions using hbase coprocessor
> >
> > Hi,
> >
> > I have tried that.
> > For a table with about 600 million row keys, passing just a single
> > QualifierFilter returns the result quickly.
> > But when I add the SingleColumnValueFilters in a FilterList, it becomes
> > too slow to be usable.
> >
> > I think I can write my own custom aggregation client. Is there any
> > example or user guide on how to write a custom aggregation client using
> > a coprocessor?
> >
> > Thanks,
> > Lei
> >
> >
> >
> >
> > leiwangouc@gmail.com
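
On the question of a custom aggregation client: the general pattern is that each region computes a partial count on the server side and the client adds the partial counts together. The sketch below models only that split using in-memory lists; it does not touch the HBase API. RegionCountSketch, countInRegion, aggregate and sampleRegions are hypothetical names, and each "region" is just a list of createtime strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class RegionCountSketch {

    // Region-side step (hypothetical): count the rows of one region that pass the filter.
    static long countInRegion(List<String> regionRows, Predicate<String> rowFilter) {
        long count = 0;
        for (String value : regionRows) {
            if (rowFilter.test(value)) {
                count++;
            }
        }
        return count;
    }

    // Client-side step (hypothetical): sum the partial counts returned by each region.
    static long aggregate(List<List<String>> regions, Predicate<String> rowFilter) {
        long total = 0;
        for (List<String> region : regions) {
            total += countInRegion(region, rowFilter);
        }
        return total;
    }

    // Made-up sample data: three "regions" holding createtime values.
    static List<List<String>> sampleRegions() {
        return Arrays.asList(
                Arrays.asList("2014-01-15 08:00:00", "2014-01-14 10:00:00"),
                Arrays.asList("2014-01-15 12:30:00", "2014-01-15 23:59:59"),
                Arrays.asList("2014-01-16 00:00:01"));
    }

    public static void main(String[] args) {
        long total = aggregate(sampleRegions(), v -> v.startsWith("2014-01-15"));
        System.out.println("rowCount: " + total); // prints "rowCount: 3"
    }
}
```

In a real deployment the region-side step would run inside a coprocessor endpoint and only the partial counts would cross the network, which is what makes this approach cheaper than scanning from the client.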
> >
> > From: Ted Yu
> > Date: 2014-01-17 18:03
> > To: user@hbase.apache.org
> > CC: user
> > Subject: Re: How to quickly count the rows that meet several conditions
> using hbase coprocessor
> > Take a look at
> http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.html#rowCount(byte[],%20org.apache.hadoop.hbase.coprocessor.ColumnInterpreter,%20org.apache.hadoop.hbase.client.Scan)
> >
> > You can pass a custom filter through the Scan parameter.
> >
> > Cheers
> >
> > On Jan 16, 2014, at 11:58 PM, "leiwangouc@gmail.com" <
> leiwangouc@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> I know that an HBase coprocessor provides a quick way to count the rows of
> >> a table.
> >> But how can I count the rows that meet several conditions?
> >>
> >> Take this for example.
> >> I have an HBase table with one column family and several columns. I want to
> >> calculate the number of rows that meet 3 conditions:
> >> has column1
> >> has column2
> >> has column3, and the value of column3 satisfies a regular expression
> >>
> >> Thanks,
> >> Lei
> >>
> >>
> >>
> >> leiwangouc@gmail.com
>
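
For reference, the three conditions from the original question (has column1, has column2, has column3 with a value matching a regular expression) can be modeled without a cluster. The following is a minimal sketch that assumes each row is a plain map from column qualifier to value. MultiConditionCountSketch, countMatching, row and sampleRows are hypothetical names and the sample data is made up. The regex is applied with find(), on the assumption that this mirrors how the regex comparator in the thread treats "^2014-01-15".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class MultiConditionCountSketch {

    // Counts rows that have colA, have colB, and whose colC value matches the regex.
    static long countMatching(List<Map<String, String>> rows,
                              String colA, String colB, String colC, Pattern regex) {
        long count = 0;
        for (Map<String, String> row : rows) {
            String c = row.get(colC);
            boolean hasA = row.containsKey(colA);
            boolean hasB = row.containsKey(colB);
            // A missing colC fails the condition, like setFilterIfMissing(true).
            boolean cMatches = c != null && regex.matcher(c).find();
            if (hasA && hasB && cMatches) {
                count++;
            }
        }
        return count;
    }

    // Builds a row from alternating qualifier/value arguments.
    static Map<String, String> row(String... kv) {
        Map<String, String> r = new HashMap<String, String>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            r.put(kv[i], kv[i + 1]);
        }
        return r;
    }

    // Made-up sample rows; only the first and fourth satisfy all three conditions.
    static List<Map<String, String>> sampleRows() {
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        rows.add(row("tags", "a", "googleid", "g1", "createtime", "2014-01-15 08:00:00"));
        rows.add(row("tags", "b", "createtime", "2014-01-15 09:00:00"));
        rows.add(row("tags", "c", "googleid", "g3", "createtime", "2014-01-16 00:00:00"));
        rows.add(row("tags", "d", "googleid", "g4", "createtime", "2014-01-15 23:59:59"));
        rows.add(row("tags", "e", "googleid", "g5"));
        return rows;
    }

    public static void main(String[] args) {
        long n = countMatching(sampleRows(), "tags", "googleid", "createtime",
                Pattern.compile("^2014-01-15"));
        System.out.println("rowCount: " + n); // prints "rowCount: 2"
    }
}
```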