Delivered-To: apmail-hadoop-hbase-dev-archive@minotaur.apache.org
Received: (qmail 52084 invoked from network); 29 Mar 2009 19:43:12 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
    by minotaur.apache.org with SMTP; 29 Mar 2009 19:43:12 -0000
Received: (qmail 53575 invoked by uid 500); 29 Mar 2009 19:43:11 -0000
Delivered-To: apmail-hadoop-hbase-dev-archive@hadoop.apache.org
Received: (qmail 53506 invoked by uid 500); 29 Mar 2009 19:43:11 -0000
Mailing-List: contact hbase-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hbase-dev@hadoop.apache.org
Delivered-To: mailing list hbase-dev@hadoop.apache.org
Received: (qmail 53494 invoked by uid 99); 29 Mar 2009 19:43:11 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 29 Mar 2009 19:43:11 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 29 Mar 2009 19:43:10 +0000
Received: from brutus (localhost [127.0.0.1])
    by brutus.apache.org (Postfix) with ESMTP id 91381234C003
    for ; Sun, 29 Mar 2009 12:42:50 -0700 (PDT)
Message-ID: <1749659089.1238355770579.JavaMail.jira@brutus>
Date: Sun, 29 Mar 2009 12:42:50 -0700 (PDT)
From: "Jonathan Gray (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-1249) Rearchitecting of server, client, API, key format, etc for 0.20
In-Reply-To: <728871099.1236620030451.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

[
https://issues.apache.org/jira/browse/HBASE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693609#action_12693609 ]

Jonathan Gray commented on HBASE-1249:
--------------------------------------

Ah, so not setting it and performing deletes on the memcache means that reading a deletefamily means everything in prior storefiles is deleted for that row.

I guess I just don't agree with that kind of selective restriction for performance unless we're going to make a conscious and logical design decision.

There's a very clear and logical argument for disallowing the manual setting of timestamps. However, this ability is part of the BigTable spec and there are numerous use cases for it (including pset). Allowing it closes the door on potential optimizations for those of us who have no need to set timestamps manually, but it's not terrible to allow it as long as they're only in the past.

The same argument can be applied to this and a bunch of other issues we've been tossing back and forth. Let's not make these kinds of decisions without deciding what our requirements are. Either timestamp is a user-settable attribute, or it isn't. I think it should be.

Part of the issue with the current API is that you can do certain things in one part of the API that aren't supported in another. Scanning and versions don't play nice even though we could logically support it. There shouldn't be caveats like: you can insert at any time in the past, but if you want to delete a row, you can only delete every version or particular versions of particular columns, not all versions older than a specified stamp.

Erik's digging has shown numerous potential optimizations for the future, very good stuff. BUT let's not alter our requirements or the properties of HBase in significant ways in the name of minor optimization of edge cases.

If I understand correctly, even with #2, if you do a deleteFamily and specify NOW, it would have the same early-out possibility as with #1.
I see a DeleteFamily with a stamp that is newer than the latest stamp in the next storefile. I know all columns are deleted, so I do nothing. Enforcing the deletes in memcache means you tuck it away until the next storefile anyway. So the implementation is identical with #2 if it is used in the way #1 forces you to use it. But you remove the user's ability to put a past stamp in. And this just adds additional caveats instead of keeping it simple.

If a user does a deletefamily with a past stamp, then read queries would need to open additional stores. That's required for correctness of the query; this is not an inefficiency, it is what the user wants to happen if he uses puts and deletes in this way.

> Rearchitecting of server, client, API, key format, etc for 0.20
> ---------------------------------------------------------------
>
>                 Key: HBASE-1249
>                 URL: https://issues.apache.org/jira/browse/HBASE-1249
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HBASE-1249-Example-v1.pdf, HBASE-1249-Example-v2.pdf, HBASE-1249-GetQuery-v1.pdf, HBASE-1249-GetQuery-v2.pdf, HBASE-1249-GetQuery-v3.pdf, HBASE-1249-StoreFile-v1.pdf
>
>
> To discuss all the new and potential issues coming out of the change in key format (HBASE-1234): zero-copy reads, client binary protocol, update of API (HBASE-880), server optimizations, etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.