Return-Path:
X-Original-To: apmail-accumulo-notifications-archive@minotaur.apache.org
Delivered-To: apmail-accumulo-notifications-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id DCFA8F20D
	for ; Wed, 3 Apr 2013 20:39:15 +0000 (UTC)
Received: (qmail 30482 invoked by uid 500); 3 Apr 2013 20:39:15 -0000
Delivered-To: apmail-accumulo-notifications-archive@accumulo.apache.org
Received: (qmail 30455 invoked by uid 500); 3 Apr 2013 20:39:15 -0000
Mailing-List: contact notifications-help@accumulo.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: jira@apache.org
Delivered-To: mailing list notifications@accumulo.apache.org
Received: (qmail 30410 invoked by uid 99); 3 Apr 2013 20:39:15 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
	by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 03 Apr 2013 20:39:15 +0000
Date: Wed, 3 Apr 2013 20:39:15 +0000 (UTC)
From: "Christopher Tubbs (JIRA)"
To: notifications@accumulo.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (ACCUMULO-1228) Allow clients to disable column families and locality groups
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/ACCUMULO-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621297#comment-13621297 ]

Christopher Tubbs commented on ACCUMULO-1228:
---------------------------------------------

Locality groups are an admin feature, for optimizing the data storage for more efficient queries, similar to setting the compression for a table. There is no parallel between setting locality groups and the query API, so there's no opportunity to be inconsistent.

The RFile remembers the locality groups that were configured on the table at the time that RFile was written, so that if it changes, the RFile will still be readable correctly. Locality groups can be changed at any time, though they may not have an effect on query until a full major compaction causes all the data to be re-written with the new organization (just like compression). When you query a file, you query for column families. The TServer checks the file's metadata to determine which locality groups, as defined in *that* file, hold the specified column families, and then reads the blocks for those localities. This is simply an optimization to avoid loading blocks that aren't needed (because locality groups are used to isolate data into separate sets of blocks for different localities). This optimization will work on *every* query, without the user having to know anything about whether their column families were stored in one locality group by themselves, or another, or in the default locality group with all the other data.

> Allow clients to disable column families and locality groups
> ------------------------------------------------------------
>
>                 Key: ACCUMULO-1228
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1228
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client, tserver
>    Affects Versions: 1.5.0
>            Reporter: William Slacum
>            Priority: Minor
>             Fix For: 1.6.0
>
>
> There's an inconsistency between what a server is capable of and what a client can tell it to do with respect to fetching column families.
> Currently, a user can tell a {{Scanner}} to fetch some set of column families. The iterators support not only this, but also the converse where a user does not want to retrieve column families. An iterator implementation can do this by hand, but a client cannot specifically tell a Scanner to not return data from a set of column families. Clients should be able to specify this option.
> There also seems to be an inconsistency with how locality groups are defined and then utilized. If I want to specify a set of column families as being part of a locality group, I have to provide a mapping of locality group name to a list of column families. If I want to fetch a locality group, I have to get the mapping first, rather than just set which locality group I want to use. It'd be more convenient to tell the scanner just to fetch which locality groups I want, and have the server know which column families that means.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
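
As a rough illustration of the lookup described in the comment above (a query names column families, and the groups recorded in each file decide which blocks get read), here is a minimal sketch. It is a toy model and not Accumulo's RFile or TServer code: the class LocalityGroupLookup, the method groupsToRead, and the "<default>" label are all hypothetical names made up for this example. It assumes each file carries a map of locality group name to column families, and that any family not listed in a named group lives in the default group.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of the per-file lookup; not Accumulo's actual code.
class LocalityGroupLookup {
    // Assumed label for the group holding families not assigned to a named group.
    static final String DEFAULT_GROUP = "<default>";

    // fileGroups: locality group name -> column families, as recorded in ONE file.
    // requestedFamilies: the column families the query asked for.
    // Returns the names of the groups whose blocks would need to be read.
    static Set<String> groupsToRead(Map<String, Set<String>> fileGroups,
                                    Set<String> requestedFamilies) {
        Set<String> toRead = new TreeSet<>();
        for (String family : requestedFamilies) {
            // Families not found in any named group fall back to the default group.
            String owner = DEFAULT_GROUP;
            for (Map.Entry<String, Set<String>> group : fileGroups.entrySet()) {
                if (group.getValue().contains(family)) {
                    owner = group.getKey();
                    break;
                }
            }
            toRead.add(owner);
        }
        return toRead;
    }
}
```

In this model the caller only ever supplies column families. Because the map comes from the file being read, two files written under different table configurations can resolve the same request to different groups, which is the situation the comment describes between a configuration change and the next full major compaction.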