Date: Tue, 16 Mar 2010 06:32:27 +0000 (UTC)
From: "Todd Lipcon (JIRA)"
To: hbase-issues@hadoop.apache.org
Subject: [jira] Commented: (HBASE-2294) Enumerate ACID properties of HBase in a well defined spec

    [ https://issues.apache.org/jira/browse/HBASE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845736#action_12845736 ]

Todd Lipcon commented on HBASE-2294:
------------------------------------

bq. IMHO having the scanner stay 'up to date' as much as possible is a nice-to-have, definitely not important enough to hurt performance.

I think I agree with you. I don't want to sidetrack this particular JIRA towards implementation details, so I'll leave it at that. Without regard to the specifics of the other JIRA, it seems likely to me that the "as up to date as possible" can often be implemented _more_ efficiently than the "snapshot iterator". The current implementation may not be up to snuff, so I'll leave it at this: I think the scanner semantics should be as loose as possible to achieve the maximum speed, and I view "up to date" as _looser_ than snapshot.

bq. I would think that clients which do 'lengthy scans' don't particularly care about performance

I disagree - MR jobs are a typical "lengthy scan" application and throughput is certainly important. Especially important is the ability to have the bulk (MR) jobs coexist with high concurrent live load on the table.

> Enumerate ACID properties of HBase in a well defined spec
> ---------------------------------------------------------
>
>                 Key: HBASE-2294
>                 URL: https://issues.apache.org/jira/browse/HBASE-2294
>             Project: Hadoop HBase
>          Issue Type: Task
>          Components: documentation
>            Reporter: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.20.4, 0.21.0
>
>
> It's not written down anywhere what the guarantees are for each operation in HBase with regard to the various ACID properties. I think the developers know the answers to these questions, but we need a clear spec for people building systems on top of HBase. Here are a few sample questions we should endeavor to answer:
> - For a multicell put within a CF, is the update made durable atomically?
> - For a put across CFs, is the update made durable atomically?
> - Can a read see a row that hasn't been sync()ed to the HLog?
> - What isolation do scanners have? Somewhere between snapshot isolation and no isolation?
> - After a client receives a "success" for a write operation, is that operation guaranteed to be visible to all other clients?
> etc
> I see this JIRA as having several points of discussion:
> - Evaluation of what the current state of affairs is
> - Evaluate whether we currently provide any guarantees that aren't useful to users of the system (perhaps we can drop in exchange for performance)
> - Evaluate whether we are missing any guarantees that would be useful to users of the system