From: Dalia Sobhy
To: "user@hbase.apache.org"
Subject: RE: Hbase Count Aggregate Function
Date: Tue, 25 Dec 2012 18:42:52 +0200

Do you mean I should implement a new rowCount method in the
AggregationClient class? I don't follow; could you illustrate with a code
sample, Ram?

> > Date: Tue, 25 Dec 2012 00:21:14 +0530
> > Subject: Re: Hbase Count Aggregate Function
> > From: ramkrishna.s.vasudevan@gmail.com
> > To: user@hbase.apache.org
> >
> > Hi
> > You could implement a custom filter that is similar to
> > FirstKeyOnlyFilter.
> > Implement the filterKeyValue method so that it matches your KeyValue
> > (the specific qualifier that you are looking for).
> >
> > Deploy it in your cluster. It should work.
> >
> > Regards
> > Ram
> >
> > On Mon, Dec 24, 2012 at 10:35 PM, Dalia Sobhy wrote:
> >
> > >
> > > So do you have a suggestion for how to make the filter work?
> > >
> > > > Date: Mon, 24 Dec 2012 22:22:49 +0530
> > > > Subject: Re: Hbase Count Aggregate Function
> > > > From: ramkrishna.s.vasudevan@gmail.com
> > > > To: user@hbase.apache.org
> > > >
> > > > Okay, looking at the shell script and the code, I think that when
> > > > you use this counter, the user's filter is not taken into account.
> > > > It adds a FirstKeyOnlyFilter and proceeds with the scan. :(
> > > >
> > > > Regards
> > > > Ram
> > > >
> > > > On Mon, Dec 24, 2012 at 10:11 PM, Dalia Sobhy
> > > > <dalia.mohsobhy@hotmail.com> wrote:
> > > >
> > > > >
> > > > > Yeah, scan gives the correct number of rows, while count returns
> > > > > the total number of rows.
> > > > >
> > > > > Both use the same filter. I even tried it with the Java API,
> > > > > using the rowCount method:
> > > > >
> > > > > rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan);
> > > > >
> > > > > I get the total number of rows, not the number of filtered rows.
> > > > >
> > > > > So any idea?
> > > > >
> > > > > Thanks Ram :)
> > > > >
> > > > > > Date: Mon, 24 Dec 2012 21:57:54 +0530
> > > > > > Subject: Re: Hbase Count Aggregate Function
> > > > > > From: ramkrishna.s.vasudevan@gmail.com
> > > > > > To: user@hbase.apache.org
> > > > > >
> > > > > > So you find that scan with a filter and count with the same
> > > > > > filter are giving you different results?
> > > > > >
> > > > > > Regards
> > > > > > Ram
> > > > > >
> > > > > > On Mon, Dec 24, 2012 at 8:33 PM, Dalia Sobhy
> > > > > > <dalia.mohsobhy@hotmail.com> wrote:
> > > > > >
> > > > > > >
> > > > > > > Dear all,
> > > > > > >
> > > > > > > I have 50,000 rows with diagnosis qualifier = "cardiac", and
> > > > > > > another 50,000 rows with "renal".
> > > > > > >
> > > > > > > When I type this in the HBase shell:
> > > > > > >
> > > > > > > import org.apache.hadoop.hbase.filter.CompareFilter
> > > > > > > import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > > > > > > import org.apache.hadoop.hbase.filter.SubstringComparator
> > > > > > > import org.apache.hadoop.hbase.util.Bytes
> > > > > > >
> > > > > > > scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > > > > > >     SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > > > > > >     Bytes.toBytes('diagnosis'),
> > > > > > >     CompareFilter::CompareOp.valueOf('EQUAL'),
> > > > > > >     SubstringComparator.new('cardiac'))}
> > > > > > >
> > > > > > > Output = 50,000 rows
> > > > > > >
> > > > > > > import org.apache.hadoop.hbase.filter.CompareFilter
> > > > > > > import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > > > > > > import org.apache.hadoop.hbase.filter.SubstringComparator
> > > > > > > import org.apache.hadoop.hbase.util.Bytes
> > > > > > >
> > > > > > > count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > > > > > >     SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > > > > > >     Bytes.toBytes('diagnosis'),
> > > > > > >     CompareFilter::CompareOp.valueOf('EQUAL'),
> > > > > > >     SubstringComparator.new('cardiac'))}
> > > > > > >
> > > > > > > Output = 100,000 rows
> > > > > > >
> > > > > > > I even tried it using the HBase Java API with an
> > > > > > > AggregationClient instance, and I enabled the aggregation
> > > > > > > coprocessor for the table:
> > > > > > >
> > > > > > > rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
> > > > > > >
> > > > > > > Also, when measuring the performance after adding more nodes,
> > > > > > > the operation takes the same time.
> > > > > > >
> > > > > > > So any advice, please?
> > > > > > >
> > > > > > > I have been stuck in this mess for a couple of weeks.
> > > > > > >
> > > > > > > Thanks,
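
For illustration, the difference discussed in the thread (a counter that accepts the first KeyValue of every row versus a filter that matches on the specific qualifier) can be modelled without a cluster. This is only a minimal plain-Java sketch and not HBase API code: the class FilteredCountSketch, its methods, and the four-row sample table are hypothetical, and a real custom filter would still have to be written against the HBase Filter API and deployed as Ram describes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of the two counting strategies from the thread.
// Nothing here is HBase API; all names are hypothetical.
class FilteredCountSketch {

    // Models a count that accepts the first cell of every row
    // (FirstKeyOnlyFilter-style): every non-empty row is counted.
    static long countAllRows(Map<String, Map<String, String>> table) {
        long count = 0;
        for (Map<String, String> row : table.values()) {
            if (!row.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Models the suggested custom filter: a row is counted only when
    // the given qualifier exists and its value contains the substring.
    static long countMatching(Map<String, Map<String, String>> table,
                              String qualifier, String substring) {
        long count = 0;
        for (Map<String, String> row : table.values()) {
            String value = row.get(qualifier);
            if (value != null && value.contains(substring)) {
                count++;
            }
        }
        return count;
    }

    // Adds one row holding a single info:diagnosis cell.
    static void putRow(Map<String, Map<String, String>> table,
                       String rowKey, String diagnosis) {
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("info:diagnosis", diagnosis);
        table.put(rowKey, cells);
    }

    // Tiny stand-in for the 'patient' table: 2 cardiac rows, 2 renal rows.
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        putRow(table, "row1", "cardiac");
        putRow(table, "row2", "renal");
        putRow(table, "row3", "cardiac");
        putRow(table, "row4", "renal");
        return table;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> patients = sampleTable();
        // Counts every row, like the 100,000 result in the thread.
        System.out.println("all rows: " + countAllRows(patients));
        // Counts only matching rows, like the 50,000 scan result.
        System.out.println("cardiac rows: "
                + countMatching(patients, "info:diagnosis", "cardiac"));
    }
}
```

On the sample table the first count is 4 and the second is 2, which mirrors the 100,000 versus 50,000 discrepancy reported above on a small scale.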