Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id 2A40ED0A6
	for ; Tue, 2 Oct 2012 21:25:09 +0000 (UTC)
Received: (qmail 44247 invoked by uid 500); 2 Oct 2012 21:25:08 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 44204 invoked by uid 500); 2 Oct 2012 21:25:08 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 44133 invoked by uid 99); 2 Oct 2012 21:25:08 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
	by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 02 Oct 2012 21:25:08 +0000
Date: Wed, 3 Oct 2012 08:25:08 +1100 (NCT)
From: "Gary Helmling (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1174327539.156389.1349213108662.JavaMail.jiratomcat@arcas>
In-Reply-To: <1509615013.91956.1347970447665.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HBASE-6805) Extend co-processor framework to provide observers for filter operations
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468086#comment-13468086 ]

Gary Helmling commented on HBASE-6805:
--------------------------------------

Looking at the encryption example here, it seems like you could provide that with the existing coprocessor hooks:

Option A:
# In {{EncryptingRegionObserver.preScannerOpen()}}, if any Filter is set on the Scan object, wrap it in a custom {{DecryptingFilterWrapper}}. This would just decrypt the KVs before passing them on to the client-provided Filter, essentially doing the same work your example preFilterXXX methods are doing.
# In {{EncryptingRegionObserver.postScannerNext()}}, again decrypt the final KVs being returned to the client, same as your example.

The duplicate decryption here seems unnecessary, but it should give you the same results as your provided example, without the need to add a batch of pre/postFilterXXX hooks to RegionObservers.

Option B:
# In {{EncryptingRegionObserver.preStoreScannerOpen()}}, return a custom KeyValueScanner implementation that extends or wraps the default StoreScanner implementation.

Note that this would still be a little tricky, since filters are applied down in ScanQueryMatcher. For decryption, what you would really want is to hook in above the StoreFileScanners and MemStoreScanners used internally by StoreScanner, but below the ScanQueryMatcher operations, so that you can decrypt each KV once as it's read. It seems like that would currently require duplicating a fair amount of StoreScanner functionality. Maybe something needs to be added to better hook into this data reading layer?

The main issue I see is that the added hooks blur the line between Filters and RegionObservers and their areas of responsibility. It doesn't seem like we should really need pre/postFilterXXX hooks, because that's what filters are supposed to provide. And of course, adding more Observer hooks does have a cost: it increases the complexity of the coprocessor interfaces and adds overhead (especially in hot code paths).

Are there really cases that require the pre/postFilter hooks that can't be accomplished by having a RegionObserver wrap gets/scans with its own Filter implementation that coordinates with the RegionObserver instance?

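
To make the Option A wrapping idea concrete, here is a minimal, self-contained sketch. It deliberately does not use the HBase API: {{ValueFilter}}, {{wrapIfPresent()}}, {{scan()}} and the single-byte XOR "cipher" are all hypothetical stand-ins invented for illustration, and only the name {{DecryptingFilterWrapper}} comes from the comment above. A real implementation would presumably extend an HBase Filter class and be installed from the observer's {{preScannerOpen()}}.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class FilterWrappingSketch {

    /** Hypothetical stand-in for an HBase Filter: decides whether a value is returned. */
    interface ValueFilter {
        boolean include(byte[] value);
    }

    /** Toy reversible "cipher" (XOR with one byte); a placeholder for real decryption. */
    static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    /** Wraps the client-provided filter so that it sees plaintext instead of ciphertext. */
    static final class DecryptingFilterWrapper implements ValueFilter {
        private final ValueFilter delegate;
        private final byte key;

        DecryptingFilterWrapper(ValueFilter delegate, byte key) {
            this.delegate = delegate;
            this.key = key;
        }

        @Override
        public boolean include(byte[] encrypted) {
            return delegate.include(xor(encrypted, key));
        }
    }

    /** Rough analogue of step 1: wrap the filter only if the scan carries one. */
    static ValueFilter wrapIfPresent(ValueFilter clientFilter, byte key) {
        return clientFilter == null ? null : new DecryptingFilterWrapper(clientFilter, key);
    }

    /** Rough analogue of the scan plus step 2: filter stored values, then decrypt the survivors. */
    static List<String> scan(List<byte[]> stored, ValueFilter clientFilter, byte key) {
        ValueFilter effective = wrapIfPresent(clientFilter, key);
        List<String> results = new ArrayList<>();
        for (byte[] encrypted : stored) {
            if (effective == null || effective.include(encrypted)) {
                // Second decryption of the same value: the duplicate work noted for Option A.
                results.add(new String(xor(encrypted, key), StandardCharsets.UTF_8));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        byte key = 0x5A;
        List<byte[]> stored = new ArrayList<>();
        for (String s : new String[] {"alice", "bob", "alan"}) {
            stored.add(xor(s.getBytes(StandardCharsets.UTF_8), key));
        }
        // The client filter is written against plaintext and knows nothing about encryption.
        ValueFilter startsWithA = value -> value.length > 0 && value[0] == 'a';
        System.out.println(scan(stored, startsWithA, key)); // prints [alice, alan]
    }
}
```

The client filter stays unaware of encryption, which is the point of the wrapper. The cost is that each surviving value is decrypted twice, which is what Option B would try to avoid by decrypting below the filtering layer.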
> Extend co-processor framework to provide observers for filter operations
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6805
>                 URL: https://issues.apache.org/jira/browse/HBASE-6805
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>    Affects Versions: 0.96.0
>            Reporter: Jason Dai
>         Attachments: extend_coprocessor.patch
>
>
> There are several filter operations (e.g., filterKeyValue, filterRow, transform, etc.) at the region server side that either exclude KVs from the returned results, or transform the returned KV. We need to provide observers (e.g., preFilterKeyValue and postFilterKeyValue) for these operations in the same way as the observers for other data access operations (e.g., preGet and postGet). This extension is needed to support DOT (e.g., extracting individual fields from the document in the observers before passing them to the related filter operations)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira