Message-ID: <2127150406.179081265810429371.JavaMail.jira@brutus.apache.org>
Date: Wed, 10 Feb 2010 14:00:29 +0000 (UTC)
From: "Ferdy (JIRA)"
To: hbase-dev@hadoop.apache.org
Reply-To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-2198) SingleColumnValueFilter should be able to find the column value even when it's not specifically added as input on the scan.
In-Reply-To: <352179872.146211265710527893.JavaMail.jira@brutus.apache.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HBASE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831990#action_12831990 ]

Ferdy commented on HBASE-2198:
------------------------------

Created a new issue for the new Filter: https://issues.apache.org/jira/browse/HBASE-2211

> SingleColumnValueFilter should be able to find the column value even when it's not specifically added as input on the scan.
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2198
>                 URL: https://issues.apache.org/jira/browse/HBASE-2198
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: filters
>    Affects Versions: 0.20.3
>            Reporter: Ferdy
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2198.patch
>
>
> When a SingleColumnValueFilter is applied to a Scan that has specific columns as its input (but not the column checked by the Filter), the Filter cannot find the value it is supposed to check.
> For example, let's say we want to do a scan, but we only need the COLUMN_2 columns. Furthermore, we only want rows that have a specific value for COLUMN_1. The following code won't do the trick:
>   Scan scan = new Scan();
>   scan.addColumn(FAMILY, COLUMN_2);
>   SingleColumnValueFilter filter = new SingleColumnValueFilter(FAMILY, COLUMN_1, CompareOp.EQUAL, TEST_VALUE);
>   filter.setFilterIfMissing(true);
>   scan.setFilter(filter);
> However, we can make it work by also explicitly adding the tested column as an input column:
>   scan.addColumn(FAMILY, COLUMN_1);
> Is this by design?
> Personally I think that adding a filter with column tests should not require the user to check that the tested column is also on the input. It is prone to bugs.
> I suggest one of three solutions:
> A) Update the Javadoc of Filter / SingleColumnValueFilter / possibly other affected Filters to indicate this behaviour.
> B) Fix the problem client-side (i.e. prior to using a Scan object, it should check that the corresponding inputs for filters are set, but only if the user has configured specific input columns in the first place). This is perhaps inefficient performance-wise, because unnecessary input columns are returned to the user (inputs that are only needed for filtering).
> C) Fix the problem server-side. This would be the most efficient, because the input column would only be read to do the filtering at the regionserver.
> What do you think?
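
To illustrate option B from the quoted issue, here is a minimal sketch in plain Java. It is a toy model and does not use the HBase API: the class FilterColumnSketch and the methods withFilterColumn and scanRow are hypothetical names invented for this illustration, a row is modelled as a map from column name to value, and an empty input set is assumed to mean "read all columns".

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model only (not HBase): shows why the filter misses an unselected
// column, and how a client-side fix (option B) could widen the inputs.
public class FilterColumnSketch {

    // Option B: if specific input columns were configured, also add the
    // column the filter checks. An empty set is assumed to mean "all columns".
    static Set<String> withFilterColumn(Set<String> inputColumns, String filterColumn) {
        if (inputColumns.isEmpty()) {
            return inputColumns; // every column is already read
        }
        Set<String> widened = new LinkedHashSet<>(inputColumns);
        widened.add(filterColumn);
        return widened;
    }

    // Projects the row onto the input columns, then applies an equality
    // filter that drops the row when the column is missing (the analogue of
    // setFilterIfMissing(true)). Returns null when the row is filtered out.
    static Map<String, String> scanRow(Map<String, String> row, Set<String> inputColumns,
                                       String filterColumn, String expected) {
        Map<String, String> visible = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (inputColumns.isEmpty() || inputColumns.contains(e.getKey())) {
                visible.put(e.getKey(), e.getValue());
            }
        }
        // The filter only sees the projected columns.
        if (!expected.equals(visible.get(filterColumn))) {
            return null;
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("COLUMN_1", "TEST_VALUE");
        row.put("COLUMN_2", "payload");
        Set<String> inputs = new LinkedHashSet<>(Collections.singletonList("COLUMN_2"));

        // Only COLUMN_2 selected: the filter cannot see COLUMN_1.
        System.out.println(scanRow(row, inputs, "COLUMN_1", "TEST_VALUE"));
        // Inputs widened client-side: the row now passes the filter.
        System.out.println(scanRow(row, withFilterColumn(inputs, "COLUMN_1"), "COLUMN_1", "TEST_VALUE"));
    }
}
```

In this model the first call drops the row because the filter never sees COLUMN_1, while the second call keeps it. The second result also contains COLUMN_1, which is the overhead the issue attributes to option B: a column needed only for filtering is still returned to the caller.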