Date: Fri, 19 Jul 2013 20:42:21 +0530
Subject: Re: DISCUSS : HFile V3 proposal for tags in 0.96
From: Anoop John
To: dev@hbase.apache.org

It should be, Ted. Tags will be present in the KV (Cell), so whichever part deals with KVs (Cells) can use the tags and do something with them: for example, do some checks in a Filter and filter out KVs, or access them in a CP, etc.

-Anoop-

On Fri, Jul 19, 2013 at 7:48 PM, Ted Yu wrote:

> Would tags be visible to methods of BaseRegionObserver, other than AccessController?
>
> Meaning, would other (non-secure) components of HBase be able to use cell tagging to store certain information?
>
> Please clarify.
>
> Thanks
>
> On Fri, Jul 19, 2013 at 6:09 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>
> > Thanks Ram and Anoop for those details again. I don't think there is a need to be able to revert from V3 to V2. And a 1 byte overhead on an HFile is not really an overhead. As Anoop proposed, if there is a way to de-activate the tags feature when all the KVs in a file have a tag length of zero, then it's all good!
> >
> > Looking forward to testing that!
> >
> > JM
> >
> > 2013/7/19 ramkrishna vasudevan
> >
> > > But I am afraid that once the user switches to V3 with tags, he cannot come back to V2. If this scenario is possible, do we need to find a workaround for it?
> > > In particular, if the user has written tags and tries to read them back with V2, it would not work.
> > >
> > > If the user switches to V3 but does not write any tags, and we go with the option of making tags optional using the FileInfo, then at least after compaction is done the HFile could also be read with the V2 reader. But I don't think the user would intend to do this, given that he needs tags for his use case.
> > >
> > > Regards
> > > Ram
> > >
> > > On Fri, Jul 19, 2013 at 5:21 PM, Anoop John wrote:
> > >
> > > > Jean,
> > > > When V2 is used there won't be any extra bytes, and so no overhead in the write or read paths.
> > > > When V3 is used and there are no tags present at all, we will have extra bytes for writing the tag length. We are trying to write the tag length as a VInt so that this will be 1 byte only. Then, using file info, we can avoid the overhead.
> > > >
> > > > Say all the KVs in a file have a tag length of zero (a file trailer indicates this): during read we can avoid the read and decode of the tag length, and just skip the one byte of tag length.
> > > >
> > > > Regarding avoiding the tag length entirely (even the 1 byte), maybe it should be possible during compaction. But I am wondering whether it is really needed. The user can select V3 when there is a need for tags.
> > > >
> > > > -Anoop-
> > > >
> > > > On Fri, Jul 19, 2013 at 4:53 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
> > > >
> > > > > Thanks Ram.
> > > > >
> > > > > One last thing, space-wise. If I understand correctly, between V2 and V3, when tags are de-activated, there will be only a 1 bit difference, so the same storage space is used. If tags are activated but empty, is it going to be the same thing? Or are we going to have all the tags overhead? For example, can we have a byte to say "no tags in that file" in addition to "tags are activated for that file"?
> > > > >
> > > > > So 2 questions.
> > > > >
> > > > > 1) What is the overhead on disk space from the tags?
> > > > > 2) Should we have a flag (bit) per file to say "no tags" even if activated, to limit this overhead and let people activate it for future uses?
> > > > >
> > > > > JMS
> > > > >
> > > > > On 2013-07-19 07:11, "ramkrishna vasudevan" <ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > >
> > > > > > >>Based on your details, I think it will be, but very minimal, or almost invisible, correct?
> > > > > > Yes, of course.
> > > > > > Regarding migration, any file written with V2 would still be read with HFileReaderV2, and the new files will be written with V3. So there should not be any problem here. We are anyway testing these things to make sure we don't break anything. Thanks Jean for the interest.
> > > > > >
> > > > > > @Stack
> > > > > > I will write up the changes foreseen in the Codec to support RPC and HFileV3.
> > > > > > Discussing with Anoop, we have some benefits when the tags are written as a byte array and when tags are in memory. Anyway, I will write that up in a separate thread, also considering the inputs on the way the current patch has been made.
> > > > > >
> > > > > > Regards
> > > > > > Ram
> > > > > >
> > > > > > On Fri, Jul 19, 2013 at 4:32 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
> > > > > >
> > > > > > > Like Ted and St.Ack, I read all of this with great interest and everything looked good to me.
> > > > > > >
> > > > > > > My only concern is performance-wise. Even if tags are disabled, do you foresee some performance impact because everything will now need to be tag aware? Based on your details, I think there will be, but very minimal, or almost invisible, correct?
> > > > > > >
> > > > > > > Also, for migrations from v2 to v3, if v3 is activated, that will simply be done when HFiles are written, correct? So no real migration process is required?
> > > > > > >
> > > > > > > JM
> > > > > > >
> > > > > > > On 2013-07-19 01:13, "Stack" wrote:
> > > > > > >
> > > > > > > > On Thu, Jul 18, 2013 at 10:14 AM, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > > > > > ...
> > > > > > > >
> > > > > > > > > We can avoid several problems with HFile V2 internals, and backwards compatibility concerns, and allow for working tags support with no performance impact and low risk to all HBase users who do not want tag support, while still allowing for inline tags capabilities in a shipping version of HBase, by introducing this in a new V3 version for HFile.
> > > > > > > >
> > > > > > > > This seems like a good tactic to me. HFileV2 has the current KV format hard-coded all over, and trying to 'fix' this would probably take a bunch of effort and would jeopardize current workings.
> > > > > > > >
> > > > > > > > ....
> > > > > > > >
> > > > > > > > > We have been working on this and will have a clean patch with a good amount of testing in time for 0.96.
> > > > > > > >
> > > > > > > > I'd think that your moving into a green field by doing an hfilev3 would make it so your work could run independent of the 0.96 timeline; i.e. it could come in post 0.96?
> > > > > > > >
> > > > > > > > What sort of changes do you foresee necessary in core to support cell codecs? Between rpc and hfilev3?
> > > > > > > >
> > > > > > > > Thanks Ram,
> > > > > > > > St.Ack
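
The space question raised in this thread (a tag length written as a VInt costs one byte when a cell has no tags, and a per-file "no tags" indication lets the reader skip that byte without decoding it) can be illustrated with a small sketch. This is not HBase code. It assumes a generic LEB128-style varint as a stand-in for the VInt mentioned above, uses varints for the key and value lengths as a simplification, and the names `encode_varint`, `decode_varint`, `write_cell`, `read_cell` and `file_has_tags` are hypothetical.

```python
import io


def encode_varint(n: int) -> bytes:
    """LEB128-style unsigned varint (assumed stand-in for the thread's VInt)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def decode_varint(buf: io.BytesIO) -> int:
    """Read one varint from the current position of buf."""
    result = 0
    shift = 0
    while True:
        raw = buf.read(1)
        if not raw:
            raise EOFError("truncated varint")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_cell(out: io.BytesIO, key: bytes, value: bytes, tags: bytes = b"") -> None:
    """Hypothetical cell layout: key, value, then a tag length and the tags."""
    for part in (key, value):
        out.write(encode_varint(len(part)))
        out.write(part)
    out.write(encode_varint(len(tags)))  # exactly 1 byte when there are no tags
    out.write(tags)


def read_cell(buf: io.BytesIO, file_has_tags: bool):
    """file_has_tags models the per-file flag discussed in the thread."""
    key = buf.read(decode_varint(buf))
    value = buf.read(decode_varint(buf))
    if file_has_tags:
        tags = buf.read(decode_varint(buf))
    else:
        buf.seek(1, io.SEEK_CUR)  # skip the single zero byte without decoding it
        tags = b""
    return key, value, tags
```

Under these assumptions a tagless cell grows by exactly one byte compared with a layout that has no tag length at all, and a reader that knows from file-level metadata that no cell carries tags never has to decode that byte.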