Message-ID: <1328999696.65225.YahooMailNeo@web121703.mail.ne1.yahoo.com>
Date: Sat, 11 Feb 2012 14:34:56 -0800 (PST)
From: lars hofhansl
Subject: Re: Limited cross row transactions
To: "dev@hbase.apache.org"
In-Reply-To: <1326934920.33092.YahooMailNeo@web121705.mail.ne1.yahoo.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="1001534069-540856092-1328999696=:65225"

--1001534069-540856092-1328999696=:65225
Content-Type: text/plain; charset=us-ascii

A quick followup...

This is now committed as a "developer feature". That means it is not exposed via HTable or any other standard client interface. Instead it is exposed as a general hook that coprocessor endpoints can call.
Coprocessor endpoints are (1) already region aware and (2) also had the ability to modify multiple rows (but weren't able to do so efficiently or correctly; said hook just makes it possible).

Along with HBASE-5304, which makes RegionSplitPolicy more accessible, this provides all the parts for cheap *local* transactions.

A sample KeyPrefixRegionSplitPolicy and a MultiRowMutationEndpoint are shipped with HBase.

This blog post explains how one could use this: http://hadoop-hbase.blogspot.com/2012/02/limited-cross-row-transactions-in-hbase.html

Now, let's add 2nd'ary indexes and global transactions :)

-- Lars

________________________________
From: lars hofhansl
To: "dev@hbase.apache.org"
Sent: Wednesday, January 18, 2012 5:02 PM
Subject: Re: Limited cross row transactions

Filed https://issues.apache.org/jira/browse/HBASE-5229 for further discussion, attached a patch that does this.

As for your point...
The below is one way to define limited groups of rows that can participate in transactions (I should not have named it parent/child, that just confuses my point).
Your scenario calls for a global transaction (unless you have some other approach to limit the scope of rows that could participate in your FK transactions to something less than the entire database).

If every transaction is a global transaction the database will not scale.
See http://www.julianbrowne.com/article/viewer/brewers-cap-theorem and http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/
Also check out two phase commit failure and blocking scenarios, and Paxos' conditions for termination.

-- Lars

----- Original Message -----
From: Mikael Sitruk
To: dev@hbase.apache.org; lars hofhansl
Cc:
Sent: Wednesday, January 18, 2012 12:01 AM
Subject: Re: Limited cross row transactions

This is for parent child relationship, but what if there is no parent child relationship, but more a foreign key like relationship? Using this model you do a full scan to get all the index (since you don't know the parent, you just know the "secondary index").
Or will you use a group ID as a prefix of parent key and "child" key? In this case splitting according to group may be more difficult, (due to different growth of groups).
Doing this aren't we back in the headache of sharding in rdbms?

Mikael.S

On Wed, Jan 18, 2012 at 7:45 AM, lars hofhansl wrote:

> This thread is probably getting too long...
>
> In HBase we have to let go of "global stuff". I submit that global
> transactions across 1000's of nodes that can fail will never work
> adequately.
> For that kind of consistency you will be hit in availability.
>
> Like Megastore the trick is in creating a local grouping of entities that
> can participate in local transactions.
> If you limit the (consistent) index to child entities of parent entity you
> can form your index like this:
> parentKey1...
> parentKey1.childTableName1.indexedField1
> parentKey1.childTableName1.indexedField2
> ...
> parentKey1.childTableName2.indexedField1
> parentKey1.childTableName2.indexedField2
> ...
> (assuming . cannot be in any parent key or child table name here, but you
> get the idea).
>
>
> When scanning the parent you'd have to skip the index rows with a filter.
> Within a parentKey you can find childKeys efficiently by scanning the
> index rows.
>
> Since the parent and the index entries would sort together the table can
> be pre-split (or one could have a simple prefix based balancer).
>
> -- Lars
>
> ----- Original Message -----
> From: Mikael Sitruk
> To: dev@hbase.apache.org
> Cc:
> Sent: Tuesday, January 17, 2012 3:07 PM
> Subject: Re: Limited cross row transactions
>
> Well I understand the limitation now, asking to be in the same region is
> a really hard constraint.
> Even if this is on the same RS this is not enough, because after a restart,
> regions may be allocated differently and now part of the data may be in one
> region under server A and the other part under server B.
>
> Well perhaps we need a use case for better understanding, and perhaps for
> finding an alternative.
>
> The first use case I was thinking of is as follows -
> I need to insert data with different access criteria, but the data inserted
> should be inserted in an atomic way.
> In RDBMS I would have two tables, insert data in the first one with key #1
> and then in the second one with key #2 then commit.
> In HBase I need to use different column families with key #1 (for atomicity)
> then to manage a kind of secondary index to map key #2 to key #1 (perhaps
> via co-processor) to have quick access to the data of key #2.
> Having cross row trx, I would think of using different keys under the same
> table (and probably different cf too), without the need to have a secondary
> index, but again with the limitation it does not seem to be easily
> feasible.
>
> Mik.
>
> On Wed, Jan 18, 2012 at 12:22 AM, Ted Yu wrote:
>
> > People rely on RDBMS for the transaction support.
> >
> > Consider the following example:
> > A highly de-normalized schema puts related users in the same region where
> > this 'limited cross row transactions' works.
> > After some time, the region has to be split (maybe due to good business
> > condition).
> > What should the HBase user do now ?
> >
> > Cheers
> >
> > On Tue, Jan 17, 2012 at 2:13 PM, Mikael Sitruk wrote:
> >
> > > Ted - My 2 cents as a user.
> > > The user should know what he is doing, this is like a 'delete' operation,
> > > this is less intuitive than the original delete in RDBMS, so the same will
> > > be for this light transaction.
> > > If the transaction fails because of cross region server then the design of
> > > the user was wrong
> > > if the transaction fails because of concurrent access, then he should be
> > > able to re-read and reprocess his request.
> > > The only problem is how to make sure in advance that the different rows
> > > will be in the same RS?
> > >
> > > Lars - is the limitation at the region or at the region server? It was
> > > not so clear.
> > >
> > > Mikael.S
> > >
> > > On Tue, Jan 17, 2012 at 11:52 PM, Ted Yu wrote:
> > >
> > > > Back to original proposal:
> > > > If client side grouping reveals that the batch of operations cannot be
> > > > supported by 'limited cross row transactions', what should the user do ?
> > > >
> > > > Cheers
> > > >
> > > > On Tue, Jan 17, 2012 at 1:49 PM, Ted Yu wrote:
> > > >
> > > > > Whether Omid fits the bill is open to discussion.
> > > > >
> > > > > We should revisit HBASE-2315 and provide the support Flavio, et al need.
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On Tue, Jan 17, 2012 at 1:41 PM, Lars George <lars.george@gmail.com> wrote:
> > > > >
> > > > >> Hi Ted,
> > > > >>
> > > > >> Wouldn't Omid (https://github.com/yahoo/omid) help there? Or is that too
> > > > >> broad? Just curious.
> > > > >>
> > > > >> Lars
> > > > >>
> > > > >> On Jan 17, 2012, at 4:36 PM, Ted Yu wrote:
> > > > >>
> > > > >> > Can we collect use cases for 'limited cross row transactions' first ?
> > > > >> >
> > > > >> > I have been thinking about (unlimited) multi-row transaction support in
> > > > >> > HBase. It may not be a one-man task.
> > > > >> > But we should definitely implement it
> > > > >> > someday.
> > > > >> >
> > > > >> > Cheers
> > > > >> >
> > > > >> > On Tue, Jan 17, 2012 at 1:27 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
> > > > >> >
> > > > >> >> I just committed HBASE-5203 (together with HBASE-3584 this implements
> > > > >> >> atomic row operations).
> > > > >> >> Although a relatively small patch it lays the groundwork for heterogeneous
> > > > >> >> operations in a single WALEdit.
> > > > >> >>
> > > > >> >> The interesting part is that even though the code enforced the atomic
> > > > >> >> operation to be for a single row, this is not required.
> > > > >> >> It is enough if all involved KVs reside in the same region.
> > > > >> >>
> > > > >> >> I am not saying that we should add any high level concept to HBase (such
> > > > >> >> as the EntityGroups of Megastore).
> > > > >> >>
> > > > >> >> But, with a slight addition to the API (allowing a grouping of multiple
> > > > >> >> row operations) client applications have all the building blocks to do
> > > > >> >> limited cross row atomic operations.
> > > > >> >> The client application would be responsible for either correctly
> > > > >> >> pre-splitting the table, or a custom balancer has to be provided.
> > > > >> >>
> > > > >> >> The operation would fail if the regionserver determines that it would need
> > > > >> >> data from multiple region servers.
> > > > >> >>
> > > > >> >> I think this needs at least minimal support from HBase and cannot be done
> > > > >> >> (efficiently or without adding more moving parts) by a client API only.
> > > > >> >>
> > > > >> >>
> > > > >> >> Comments? Is this worth pursuing? If so, I'll file a jira and provide a
> > > > >> >> patch.
> > > > >> >>
> > > > >> >> Thanks.
> > > > >> >>
> > > > >> >>
> > > > >> >> -- Lars
> > > > >> >>
> > >
> > > --
> > > Mikael.S
> > >
>
> --
> Mikael.S
>

--
Mikael.S

--1001534069-540856092-1328999696=:65225--
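
The key layout and the "all rows must live in the same region" rule discussed in this thread can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes fixed-length parent keys, a `.` separator that never occurs in parent keys or child table names, and a split policy that never splits inside a key prefix. The names `index_key`, `group_of`, `check_same_group` and `is_index_row` are hypothetical helpers and are not part of any HBase API.

```python
# Hypothetical sketch of prefix-grouped row keys; none of this is HBase API.

SEP = "."  # assumed separator; must not occur in parent keys or table names


def index_key(parent_key, child_table, indexed_value):
    """Build an index row key that sorts right after its parent row."""
    for part in (parent_key, child_table):
        if SEP in part:
            raise ValueError(f"{part!r} must not contain {SEP!r}")
    return SEP.join((parent_key, child_table, indexed_value))


def group_of(row_key, prefix_len):
    """Return the fixed-length prefix that decides a row's group."""
    return row_key[:prefix_len]


def check_same_group(row_keys, prefix_len):
    """Return the shared prefix, or raise if the batch spans several groups.

    This mirrors the rule from the thread: a multi-row operation is
    rejected unless all involved rows fall into the same group (region).
    """
    groups = {group_of(key, prefix_len) for key in row_keys}
    if len(groups) != 1:
        raise ValueError(f"rows span {len(groups)} groups: {sorted(groups)}")
    return groups.pop()


def is_index_row(row_key):
    """Index rows contain the separator; plain parent rows do not."""
    return SEP in row_key
```

For example, with 8-character parent keys, the rows `user0001` and `index_key("user0001", "orders", "2012-02-11")` share the prefix `user0001` and would pass `check_same_group(..., 8)`, while a batch that also touches `user0002` would be rejected. A scan over the parent rows could use `is_index_row` to skip index entries, much like the filter mentioned above.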