Date: Fri, 9 Mar 2012 14:48:00 -0500
Subject: Re: filter on value ranges
From: Keith Turner
To: accumulo-user@incubator.apache.org

The WholeRowIterator can filter rows: just extend it and override the
filter method. Also new in 1.4 is
org.apache.accumulo.core.iterators.user.RowFilter. It provides similar
functionality, but does not require reading the entire row into memory.

Keith

On Fri, Mar 9, 2012 at 1:11 PM, Kini, Ameet M. wrote:
>
> Thanks for the comments.
>
> I'm ok with rolling my own iterator/filter but not sure how to go about
> doing it (see next para), so it'd be great to get pointers on it. I'd
> prefer keeping the schema to how it is today where each employee is
> represented by a row in the table with a properties cf containing name and
> salary cq. Here's how it looks today
>
> rowID  colfam      colqual  value
>
> abc    properties  name     john
> abc    properties  salary   10000
> def    properties  name     alice
> def    properties  salary   20000
>
> Part of my confusion lies in not knowing how to implement this range filter
> class, because my query needs to get both the name as well as salary based
> on a particular salary.
> What I would like to do is something like a Filter
> equivalent to WholeRowIterator, say WholeRowFilter, whose accept(Key k, Value
> v) was provided the entire row in the Value argument along with appropriate
> encodeRow/decodeRow as in WholeRowIterator. If the accept method returns
> true, the whole row is returned to the client. Then I could extend this
> class by writing a MyRangeFilter which would look inside the row and make
> row-level accept/reject decisions based on values of particular cq.
>
> Maybe this WholeRowFilter is already there in some form?
>
> -Ameet Kini
>
> From: Aaron Cordova [mailto:aaron@cordovas.org]
> Sent: Friday, March 09, 2012 9:20 AM
> To: accumulo-user@incubator.apache.org
> Subject: Re: filter on value ranges
>
> To answer your question, I would not use built-in iterators for this.
>
> But if you were determined, you could use what is known as 'document
> sharding' as opposed to 'term sharding' and use an intersecting iterator.
>
> Instructions on how to do this should be added to the manual ...
>
> On Mar 9, 2012, at 9:07 AM, Kini, Ameet M. wrote:
>
> In 1.4, is there a way to use built-in iterators to run the following query:
>
>   "get the name and salary of all employees where the salary is between X
>   and Y"
>
> Assuming a straightforward schema where name and salary are both cq.
>
> I'd like both the cq restriction and the range predicate applied on the
> tservers.
>
> I see that Scanner.setColumnQualifierRegex would take care of the cq
> restriction. But I don't know of a built-in iterator for the range predicate
> and I don't know how to compose those two iterators.
>
> Thanks,
> -Ameet Kini
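
For illustration, here is a minimal sketch of the row-level decision discussed in this thread: accept a whole row (name and salary together) only when its salary falls between X and Y. It is plain Java with no Accumulo dependency, so it is not the RowFilter or WholeRowIterator API itself. The class and method names (SalaryRangeSketch, acceptRow, rowsInRange) are hypothetical, and it assumes salaries are stored as decimal strings and that the range is inclusive. In Accumulo 1.4 the same check would presumably live in the row-level hook of a RowFilter subclass or in an overridden WholeRowIterator filter method, reading the values from the row's key/value pairs instead of a Map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of a row-level salary range check; no Accumulo classes are used.
// All names here (SalaryRangeSketch, acceptRow, rowsInRange) are hypothetical.
class SalaryRangeSketch {

    // Decide on one whole row, given its column qualifiers mapped to values.
    // Rows with a missing or non-numeric salary are rejected.
    static boolean acceptRow(Map<String, String> row, long min, long max) {
        String salary = row.get("salary");
        if (salary == null) {
            return false;
        }
        try {
            long s = Long.parseLong(salary.trim());
            return s >= min && s <= max; // assumed inclusive on both ends
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Return the IDs of rows that pass acceptRow, in table iteration order.
    static List<String> rowsInRange(Map<String, Map<String, String>> table,
                                    long min, long max) {
        List<String> accepted = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : table.entrySet()) {
            if (acceptRow(e.getValue(), min, max)) {
                accepted.add(e.getKey());
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // The two example rows from the thread.
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        Map<String, String> abc = new LinkedHashMap<>();
        abc.put("name", "john");
        abc.put("salary", "10000");
        table.put("abc", abc);
        Map<String, String> def = new LinkedHashMap<>();
        def.put("name", "alice");
        def.put("salary", "20000");
        table.put("def", def);

        // Print every accepted row with all of its columns.
        for (String rowId : rowsInRange(table, 15000, 25000)) {
            System.out.println(rowId + " " + table.get(rowId));
        }
    }
}
```

Because the decision is made per row rather than per key/value pair, an accepted row comes back with both its name and its salary, which is the behaviour the original question asks for.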