Date: Thu, 23 Feb 2012 20:10:50 +0000 (UTC)
From: "He Yongqiang (Commented) (JIRA)"
To: issues@hbase.apache.org
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Message-ID: <821629245.11361.1330027850128.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <219814322.6418.1329947988689.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5457) add inline index in data block for data which are not clustered together
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[ 
https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215002#comment-13215002 ]

He Yongqiang commented on HBASE-5457:
-------------------------------------

@lars, in today's implementation we actually create another column family and reorganize the column name to be 'ts and string', so the data is sorted by ts in this new column family. We then redirect the query to the second column family. But this approach duplicates data.

Without the second column family, we can do a search once we have found the row, but that requires searching all data under the target row key, which hurts CPU.

> add inline index in data block for data which are not clustered together
> ------------------------------------------------------------------------
>
>                 Key: HBASE-5457
>                 URL: https://issues.apache.org/jira/browse/HBASE-5457
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>
> As we went through our data schema, we found that we have one large column family that just duplicates data from another column family. It is a re-org of the data that clusters it differently from the original column family in order to serve another type of query efficiently.
> If we compare this second column family with the similar situation in MySQL, it is like an index in MySQL. So if we can add an inline block index on the required columns, the second column family is no longer needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
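
The trade-off described in the comment can be sketched with plain sorted lists. This is only an illustration, not HBase code: the cell layout, the sample data, and the function names (scan_primary, build_secondary, scan_secondary) are all hypothetical, and timestamps are assumed to be integers. It models one row whose cells are sorted by column name in the original column family, and a duplicated copy re-keyed as (ts, name) in the second one.

```python
from bisect import bisect_left

# Hypothetical cells for a single row: (column name, ts, value).
# The original column family keeps them sorted by column name.
cells = [
    ("alpha", 30, "v1"),
    ("bravo", 10, "v2"),
    ("charlie", 40, "v3"),
    ("delta", 20, "v4"),
]


def scan_primary(cells, ts_lo, ts_hi):
    """No second column family: every cell in the row must be examined."""
    examined = 0
    hits = []
    for name, ts, value in cells:
        examined += 1
        if ts_lo <= ts <= ts_hi:
            hits.append((ts, name, value))
    return sorted(hits), examined


def build_secondary(cells):
    """Second column family: the same data duplicated, re-keyed by (ts, name)."""
    return sorted((ts, name, value) for name, ts, value in cells)


def scan_secondary(secondary, ts_lo, ts_hi):
    """Binary-search the ts-sorted copy; only cells in the range are touched."""
    # (ts,) sorts before any (ts, name, value) tuple with the same ts.
    lo = bisect_left(secondary, (ts_lo,))
    hi = bisect_left(secondary, (ts_hi + 1,))  # assumes integer timestamps
    return secondary[lo:hi], hi - lo


if __name__ == "__main__":
    by_scan, examined = scan_primary(cells, 15, 35)
    by_index, touched = scan_secondary(build_secondary(cells), 15, 35)
    print(by_scan == by_index, examined, touched)
```

Both paths return the same cells, but the full scan examines every cell in the row while the ts-sorted copy reads only the matching range, at the cost of storing the data twice. The inline block index proposed in the issue aims to avoid that duplication; how it would be laid out is not specified in this thread, so it is not modeled here.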