Date: Sat, 20 Oct 2012 08:38:44 +0530
Subject: Re: Where is code in hbase that physically delete a record?
From: Anoop John
To: user@hbase.apache.org

Yes, the KVs coming out of your delegate scanner will be in sorted order,
and all the other logic (removing TTL-expired data, handling max versions,
etc.) will already have been applied.

Thanks for updating.

-Anoop-

On Sat, Oct 20, 2012 at 1:11 AM, PG wrote:

> Hi, Anoop and Ram,
> I have now coded the idea, and the detailed instructions were very helpful.
> One minor thing to add: the KeyValues coming out of the scanner are already
> sorted by column qualifier and timestamp. I did not find this mentioned in
> the Javadoc, but I found it a very useful feature for filtering.
>
> Thanks.
> Yun
>
> On Oct 18, 2012, at 12:20 AM, "Ramkrishna.S.Vasudevan" <
> ramkrishna.vasudevan@huawei.com> wrote:
>
> > Hi Yun
> >
> > Hope Anoop's clear explanation will help you.
> > Just to add on: after you wrap the StoreScanner in your custom scanner
> > impl, you will invoke next(List<KeyValue>) on the delegator (here the
> > delegator is the actual StoreScanner).
> > The delegator will give you the KV list that it has fetched from the
> > underlying scanners (MemStore and StoreFileScanner).
> > Now, on the returned KVs you can do a check: say, if a KV has column C1
> > and its value is 'a', just skip it, so that this scanner does not pass
> > that KV on to the compaction code that reads from your custom scanner.
> >
> > The code may look like this:
> >
> > import java.io.IOException;
> > import java.util.Iterator;
> > import java.util.List;
> > import org.apache.hadoop.hbase.KeyValue;
> > import org.apache.hadoop.hbase.regionserver.InternalScanner;
> >
> > class CustomScanner implements InternalScanner {
> >   // The actual scanner (the StoreScanner) that does the real scan.
> >   private final InternalScanner delegate;
> >
> >   public CustomScanner(InternalScanner delegate) {
> >     this.delegate = delegate;
> >   }
> >
> >   public boolean next(List<KeyValue> kvs) throws IOException {
> >     boolean more = delegate.next(kvs);
> >     for (Iterator<KeyValue> it = kvs.iterator(); it.hasNext();) {
> >       KeyValue kv = it.next();
> >       // Do the necessary filtering here, e.g. it.remove().
> >     }
> >     return more;
> >   }
> >   // The other next() overloads and close() delegate in the same way.
> > }
> >
> > Regards
> > Ram
> >
> >> -----Original Message-----
> >> From: Anoop Sam John [mailto:anoopsj@huawei.com]
> >> Sent: Thursday, October 18, 2012 9:02 AM
> >> To: user@hbase.apache.org
> >> Subject: RE: Where is code in hbase that physically delete a record?
> >>
> >> Hi Yun,
> >> We have the preCompactScannerOpen() and preCompact() hooks.
> >> As we said, for a compaction a scanner that reads all the corresponding
> >> HFiles (all HFiles, in a major compaction) is created, and the scan runs
> >> through that scanner by calling its next() methods. That is how the
> >> kernel does it.
> >> Using these hooks you can create a wrapper over the actual scanner. In
> >> fact you can use the preCompact() hook (I think that is fine for you).
> >> By the time it is called, the actual scanner has been created and that
> >> object is passed to your hook. You can create a custom scanner impl,
> >> wrap the actual scanner within it, and return the new wrapper scanner
> >> from your hook. [Yes, its return type is InternalScanner.] The actual
> >> scanner then serves as the delegate that does the real scanning.
> >> Now all the KVs (which the underlying scanner passed) will flow via
> >> your new wrapper scanner, where you can drop certain KVs based on your
> >> condition or logic.
> >>
> >> Core                  WrapperScannerImpl           Actual Scanner
> >>                                                    (created by core)
> >>
> >>   -> next(List<KeyValue>)     -> next(List<KeyValue>)
> >>                                                    Do the real scan
> >>                                                    from HFiles
> >>                               <-
> >>                       See the List of KVs and
> >>                       remove those you don't want
> >>   <-
> >> Only the passed KVs
> >> come in the final
> >> merged file
> >>
> >> Hope I make it clear for you :)
> >>
> >> Note: preCompactScannerOpen() is called before the actual scanner is
> >> even created, while preCompact() is called after the scanner has been
> >> created. You can see the code in Store#compactStore().
> >>
> >> -Anoop-
> >> ________________________________________
> >> From: yun peng [pengyunmomo@gmail.com]
> >> Sent: Wednesday, October 17, 2012 9:04 PM
> >> To: user@hbase.apache.org
> >> Subject: Re: Where is code in hbase that physically delete a record?
> >>
> >> Hi, Ram and Anoop, thanks for the nice reference to the Java file, which
> >> I will check through.
> >>
> >> It is interesting to know about the recent preCompactScannerOpen() hook.
> >> Ram, it would be nice to know how to specify conditions like c1 = 'a'.
> >> I have also checked the example code in the hbase 6496 link, which
> >> shows how to delete data before a given time, as in an on-demand
> >> specification...
> >> Cheers,
> >> Yun
> >>
> >> On Wed, Oct 17, 2012 at 8:46 AM, Ramkrishna.S.Vasudevan <
> >> ramkrishna.vasudevan@huawei.com> wrote:
> >>
> >>> Also, to see the code where the delete happens, please refer to
> >>> StoreScanner.java and to how ScanQueryMatcher.match() works.
> >>>
> >>> That is where we decide whether a KV has to be skipped because of an
> >>> existing delete tombstone marker.
> >>>
> >>> Forgot to tell you about this.
> >>>
> >>> Regards
> >>> Ram
> >>>
> >>>> -----Original Message-----
> >>>> From: yun peng [mailto:pengyunmomo@gmail.com]
> >>>> Sent: Wednesday, October 17, 2012 5:54 PM
> >>>> To: user@hbase.apache.org
> >>>> Subject: Where is code in hbase that physically delete a record?
> >>>>
> >>>> Hi, All,
> >>>> I want to find the internal code in hbase where the physical deletion
> >>>> of a record occurs.
> >>>>
> >>>> - Some of my understanding
> >>>> Correct me if I am wrong. (It is largely based on my experience and
> >>>> even speculation.) Logically deleting a KeyValue in hbase is done by
> >>>> writing a tombstone marker (by Delete(), per record) or by setting
> >>>> TTL/max_version (per Store). After these actions, however, the
> >>>> physical data is still there, somewhere in the system. Physical
> >>>> deletion of a record in hbase is realised by *a scanner discarding
> >>>> the KeyValue record* during major_compact.
> >>>>
> >>>> - What I need
> >>>> I want to extend hbase to associate some actions with the physical
> >>>> deletion of a record. Does hbase provide a hook (or coprocessor API)
> >>>> to inject code for each KV record that is skipped by the hbase
> >>>> StoreScanner in major_compact? If not, does anyone know where I
> >>>> should look in hbase (-0.94.2) for such a code modification?
> >>>>
> >>>> Thanks.
> >>>> Yun
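
The wrapper-scanner idea discussed in this thread can be tried out without an HBase dependency. The sketch below is only an illustration under stated assumptions, not HBase code: Cell, BatchScanner, ListBackedScanner, FilteringScanner, drain and demo are all hypothetical stand-ins (roughly for KeyValue, InternalScanner, the scanner created by the core, the custom wrapper, and the compaction loop), and the drop condition (column C1 with value 'a') is the example condition from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-alone sketch of the wrapper-scanner pattern. None of these types are
// HBase classes; they are hypothetical stand-ins so the example runs alone.
final class FilteringScannerSketch {

    // Stand-in for KeyValue: just a row, a column qualifier and a value.
    static final class Cell {
        final String row, qualifier, value;
        Cell(String row, String qualifier, String value) {
            this.row = row; this.qualifier = qualifier; this.value = value;
        }
        @Override public String toString() { return row + "/" + qualifier + "=" + value; }
    }

    // Stand-in for InternalScanner: appends one batch to 'out' and returns
    // true while more batches remain.
    interface BatchScanner {
        boolean next(List<Cell> out);
    }

    // Plays the role of the actual scanner created by the core.
    static final class ListBackedScanner implements BatchScanner {
        private final Iterator<List<Cell>> batches;
        ListBackedScanner(List<List<Cell>> batches) { this.batches = batches.iterator(); }
        @Override public boolean next(List<Cell> out) {
            if (batches.hasNext()) out.addAll(batches.next());
            return batches.hasNext();
        }
    }

    // The wrapper: lets the delegate do the real scan, then forwards only
    // the cells that do not match the drop condition (C1 == "a").
    static final class FilteringScanner implements BatchScanner {
        private final BatchScanner delegate;
        FilteringScanner(BatchScanner delegate) { this.delegate = delegate; }
        @Override public boolean next(List<Cell> out) {
            List<Cell> fetched = new ArrayList<>();
            boolean more = delegate.next(fetched);
            for (Cell cell : fetched) {
                boolean drop = "C1".equals(cell.qualifier) && "a".equals(cell.value);
                if (!drop) out.add(cell);
            }
            return more;
        }
    }

    // Reads the scanner to the end, the way a compaction loop would.
    static List<String> drain(BatchScanner scanner) {
        List<Cell> cells = new ArrayList<>();
        boolean more;
        do { more = scanner.next(cells); } while (more);
        List<String> out = new ArrayList<>();
        for (Cell c : cells) out.add(c.toString());
        return out;
    }

    // Builds two sample batches and returns what survives the wrapper.
    static List<String> demo() {
        List<List<Cell>> batches = Arrays.asList(
            Arrays.asList(new Cell("r1", "C1", "a"), new Cell("r1", "C2", "a")),
            Arrays.asList(new Cell("r2", "C1", "b")));
        return drain(new FilteringScanner(new ListBackedScanner(batches)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [r1/C2=a, r2/C1=b]
    }
}
```

Only the cell with qualifier C1 and value 'a' is dropped; the relative order of the surviving cells is whatever the delegate produced. In HBase itself the equivalent wrapper would be returned from the preCompact() hook, as described in the thread.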