From: lars hofhansl
To: "dev@hbase.apache.org"
Subject: Re: Improving Coprocessor postSplit/postOpen synchronization
Date: Tue, 28 Aug 2012 11:12:32 -0700 (PDT)
Message-ID: <1346177552.57605.YahooMailNeo@web121702.mail.ne1.yahoo.com>

That approach sounds good to me.

----- Original Message -----
From: Andrew Purtell
To: dev@hbase.apache.org
Sent: Tuesday, August 28, 2012 3:05 AM
Subject: Re: Improving Coprocessor postSplit/postOpen synchronization

Never mind, I went to look at the code.
Should have done that first.

Looking at 0.94 sources, in SplitTransaction, first we notify the master
that the split has happened, and wait for the master to process it (which
opens daughters), and then call up to the CP with the daughter regions as
arguments.

I seem to remember that in my prototype patch for the CP framework,
postSplit notification let the CP know the split took place and allow it to
take actions before the master opened the daughters. In any event that's
not the code now, so it seems what you need here is for us to move the
postSplit upcall up prior to master notification or add another hook at
that location.

On Tue, Aug 28, 2012 at 12:53 PM, Andrew Purtell wrote:

> (from postSplit)
>
>
> On Tue, Aug 28, 2012 at 12:53 PM, Andrew Purtell wrote:
>
>> What about writing a marker (a file) into the region at split (from
>> preSplit) which is then existence checked and read at open (postOpen)? This
>> file would contain whatever indexing metadata is required.
>>
>> Also, splits are nearly instant because the daughters are created with
>> reference files to the parent, until a later compaction brings the data
>> from the parent over. Can you do the same with your indexes?
>> Reason I ask is this notion of "ignoring" new data until indexes are
>> available seems undesirable.
>>
>>
>> On Mon, Aug 27, 2012 at 11:29 PM, Kevin Shin <kevin.shin@thinkbiganalytics.com> wrote:
>>
>>> Hi everyone,
>>>
>>> A colleague and I were working with HBase coprocessors for secondary
>>> indexes and ran into an interesting problem regarding splits
>>> and synchronizing the corresponding parent/daughter regions.
>>>
>>> The goal with splits is to create two new daughter regions with the
>>> corresponding splits of the secondary indexes and lock these regions such
>>> that Puts/Deletes that occur while postSplit is in progress will be
>>> queued up so we don't run into consistency issues. IE, if a delete gets
>>> called before a daughter region receives the split index, that delete
>>> would essentially have been ignored, so we would want to wait until
>>> postSplit is finished before running any new Puts/Deletes on the split
>>> regions.
>>>
>>> As of right now, the HBase coprocessors do not easily support a way to
>>> achieve this level of consistency in that there is no way to distinguish
>>> a region being opened from a split or a regular open. If we could
>>> distinguish, we could open up the correct index from the start and stall
>>> until postSplit is finished in the background in the event of a split. I
>>> would thus like to propose a way to "lock" the daughter regions when
>>> postSplit is called. That is, when we open a daughter region from a
>>> split, we can pass in the parent region name alongside it (or Null if
>>> there is no parent) to distinguish a region being opened from a split or
>>> open. I am thinking about submitting a patch into JIRA but would greatly
>>> appreciate any thoughts or suggestions for another solution to the
>>> problem or perhaps a better patch.
>>> I am using HBase 0.92 for development at this moment.
>>>
>>> Best,
>>> Kevin
>>>
>>
>>
>> --
>> Best regards,
>>
>>    - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
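
For illustration, the marker-file idea from the thread (write a small file
into the region at split, then check for it and read it at open) might look
roughly like the sketch below. This is only a sketch under stated
assumptions: the class SplitMarker, the file name, and both method names are
hypothetical and are not part of HBase, and java.nio on a local directory
stands in for the region directory that a real coprocessor would reach
through the Hadoop FileSystem API. As noted in the thread, in the 0.94
sources postSplit runs after the master has opened the daughters, so the
write would have to happen from a hook that runs before that point.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper for this sketch; not an HBase class. */
class SplitMarker {
    // Hypothetical marker file name chosen for this example.
    static final String MARKER_NAME = ".index-split-marker";

    /**
     * Split side: record the parent region name and whatever index
     * metadata is needed inside the daughter's directory.
     */
    static void writeMarker(Path daughterDir, String parentRegionName,
                            String indexMetadata) throws IOException {
        String body = parentRegionName + "\n" + indexMetadata;
        Files.write(daughterDir.resolve(MARKER_NAME),
                    body.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Open side: an empty result means a regular open; otherwise the
     * array holds {parentRegionName, indexMetadata}.
     */
    static Optional<String[]> readMarker(Path regionDir) throws IOException {
        Path marker = regionDir.resolve(MARKER_NAME);
        if (!Files.exists(marker)) {
            return Optional.empty();
        }
        String body = new String(Files.readAllBytes(marker),
                                 StandardCharsets.UTF_8);
        // Limit 2 keeps any newlines inside the metadata intact.
        return Optional.of(body.split("\n", 2));
    }

    public static void main(String[] args) throws IOException {
        Path daughter = Files.createTempDirectory("daughter-region");
        System.out.println("opened from split? "
                + readMarker(daughter).isPresent());
        writeMarker(daughter, "parent-region", "index-meta");
        System.out.println("opened from split? "
                + readMarker(daughter).isPresent());
    }
}
```

With something like this, the open path can tell a daughter created by a
split from a regular open without any change to the open signature, which is
the distinction the original proposal wanted to get by passing the parent
region name.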