hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: DISCUSS : HFile V3 proposal for tags in 0.96
Date Fri, 19 Jul 2013 14:18:27 GMT
Would tags be visible to methods of BaseRegionObserver, other than
AccessController?

Meaning, would other (non-secure) components of HBase be able to use cell
tagging to store certain information?

Please clarify.

Thanks

On Fri, Jul 19, 2013 at 6:09 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Thanks Ram and Anoop for those details again. I don't think there is a need
> to be able to revert from V3 to V2. And 1 byte overhead on an HFile is not
> really an overhead. As Anoop proposed, if there is a way to de-activate the
> tags feature when all the KVs in a file have a tag length of zero, then
> it's all good!
>
> Looking forward to testing that!
>
> JM
>
> 2013/7/19 ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com>
>
> > But I am afraid that once the user switches to V3 with tags he cannot come
> > back to V2.  If this scenario is possible, do we need to find a
> > workaround for that?
> > In particular, if the user has written tags and tries to read them back
> > with V2, it would not work.
> >
> > If the user switches to V3 but does not write any tags, and we go with the
> > option of making tags optional using the FileInfo, then at least after the
> > compaction is done the HFile could be read with the V2 reader also.  But I
> > don't think the user would intend to do this, given that he needs
> > tags for his use case.
> >
> > Regards
> > Ram
> >
> >
> > On Fri, Jul 19, 2013 at 5:21 PM, Anoop John <anoop.hbase@gmail.com> wrote:
> >
> > > Jean
> > >         When V2 is used there won't be any extra bytes and so no
> > > overhead in the write or read paths.
> > > When V3 is used, and there are no tags present at all, we will have extra
> > > bytes for writing the tag length.  We are trying to write the tag length
> > > as a VInt so that this will be 1 byte only.  Then using file info we can
> > > avoid the overhead.
> > >
> > > Say when all the KVs in a file have a tag length of zero (a file
> > > trailer indicates this), during read we can avoid the read and decode of
> > > the tag length. Just skip the one byte of tag length.
> > >
> > > Regarding avoiding the tag length entirely (even the 1 byte), maybe it
> > > should be possible during compaction. But I am wondering whether it is
> > > really needed.
> > > User can select V3 when there is a need for Tags.
> > >
> > > -Anoop-
> > >
> > > On Fri, Jul 19, 2013 at 4:53 PM, Jean-Marc Spaggiari <
> > > jean-marc@spaggiari.org> wrote:
> > >
> > > > Thanks Ram.
> > > >
> > > > One last thing, space-wise. If I understand correctly, between V2 and
> > > > V3, when tags are de-activated, there will be only a 1 bit difference,
> > > > so the same storage space is used. If tags are activated but empty, is
> > > > it going to be the same thing? Or are we going to have all the tags
> > > > overhead? For example, can we have a byte to say "no tags in that file"
> > > > in addition to "tags are activated for that file"?
> > > >
> > > > So 2 questions.
> > > >
> > > > 1) What is the overhead on disk space from the tags?
> > > > 2) Should we have a flag (bit) per file to say no tags even if
> > > > activated, to limit this overhead and let people activate it for future
> > > > uses?
> > > >
> > > > JMS
> > > > On 2013-07-19 07:11, "ramkrishna vasudevan" <
> > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > >
> > > > > >>Based on your details, I think it will be, but very minimal, or
> > > > > almost invisible, correct?
> > > > > Yes of course.
> > > > > Regarding migration, any file written with V2 would still be read with
> > > > > HFileReaderV2 and the new files will be written with V3.  So there
> > > > > should not be any problem here.  We are anyway testing these things to
> > > > > make sure we don't break anything.  Thanks Jean for the interest.
> > > > >
> > > > > @Stack
> > > > > I will write up the changes foreseen for the Codec to support
> > > > > RPC and HFileV3.
> > > > > From discussing with Anoop, there are some benefits when the Tags are
> > > > > written as a byte array and when tags are in memory.  Anyway, I will
> > > > > write that up in a separate thread, also considering the inputs on the
> > > > > way the current patch has been made.
> > > > >
> > > > > Regards
> > > > > Ram
> > > > >
> > > > >
> > > > > On Fri, Jul 19, 2013 at 4:32 PM, Jean-Marc Spaggiari <
> > > > > jean-marc@spaggiari.org> wrote:
> > > > >
> > > > > > Like Ted and St.Ack, I read all of this with great interest and
> > > > > > everything looked good to me.
> > > > > >
> > > > > > My only concern is performance-wise.  Even if tags are disabled, do
> > > > > > you foresee some performance impact because everything will now need
> > > > > > to be tag aware? Based on your details, I think it will be, but very
> > > > > > minimal, or almost invisible, correct?
> > > > > >
> > > > > > Also, for migrations from v2 to v3, if v3 is activated, that will
> > > > > > simply be done when HFiles are written, correct? So no real
> > > > > > migration process is required?
> > > > > >
> > > > > > JM
> > > > > > On 2013-07-19 01:13, "Stack" <stack@duboce.net> wrote:
> > > > > >
> > > > > > > > On Thu, Jul 18, 2013 at 10:14 AM, ramkrishna vasudevan <
> > > > > > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > > > > ...
> > > > > > >
> > > > > > > > >  We can avoid several problems with HFile V2 internals, and
> > > > > > > > > backwards compatibility concerns, and allow for working tags
> > > > > > > > > support with no performance impact and low risk to all HBase
> > > > > > > > > users who do not want tag support, while still allowing for
> > > > > > > > > inline tags capabilities in a shipping version of HBase, by
> > > > > > > > > introducing this in a new V3 version for HFile.
> > > > > > > >
> > > > > > > >
> > > > > > > > This seems like a good tactic to me.  HFileV2 has the current KV
> > > > > > > > format hard-coded all over and trying to 'fix' this would
> > > > > > > > probably take a bunch of effort and would jeopardize current
> > > > > > > > workings.
> > > > > > >
> > > > > > > ....
> > > > > > >
> > > > > > > >
> > > > > > > > >  We have been working on this and will have a clean patch with
> > > > > > > > > a good amount of testing in time for 0.96.
> > > > > > > >
> > > > > > > >
> > > > > > > > I'd think that your moving into a green field by doing an
> > > > > > > > hfilev3 would make it so your work could run independent of the
> > > > > > > > 0.96 timeline; i.e. it could come in post 0.96?
> > > > > > > >
> > > > > > > > What sort of changes do you foresee necessary in core to support
> > > > > > > > cell codecs?  Between rpc and hfilev3?
> > > > > > >
> > > > > > > Thanks Ram,
> > > > > > > St.Ack
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
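
For readers following the overhead discussion quoted above, here is a minimal sketch of the idea Anoop describes: a variable-length tag length that costs one byte when a cell has no tags, plus a per-file "no tags" flag that lets the reader skip that byte without decoding it. It is not the HFile V3 format or the actual patch. The class name `TagLengthSketch`, the cell layout (4-byte value length, value, tag length, tags), the `fileHasNoTags` flag, and the 7-bits-per-byte encoding are all assumptions made for illustration; Hadoop's own VInt encoding differs in detail, although it also stores small values in a single byte.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class TagLengthSketch {

    // Illustrative varint: 7 payload bits per byte, high bit set when more bytes follow.
    static void putVarInt(ByteBuffer buf, int value) {
        while ((value & ~0x7F) != 0) {
            buf.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        buf.put((byte) value);
    }

    static int getVarInt(ByteBuffer buf) {
        int result = 0;
        int shift = 0;
        while (true) {
            byte b = buf.get();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Hypothetical cell layout: [4-byte value length][value][varint tag length][tags].
    static byte[] writeCell(byte[] value, byte[] tags) {
        // 5 bytes is the most a 32-bit varint can occupy.
        ByteBuffer buf = ByteBuffer.allocate(4 + value.length + 5 + tags.length);
        buf.putInt(value.length);
        buf.put(value);
        putVarInt(buf, tags.length);
        buf.put(tags);
        return Arrays.copyOf(buf.array(), buf.position());
    }

    // fileHasNoTags stands in for a per-file flag (for example, kept in file info).
    static byte[] readTags(byte[] cell, boolean fileHasNoTags) {
        ByteBuffer buf = ByteBuffer.wrap(cell);
        int valueLength = buf.getInt();
        buf.position(buf.position() + valueLength); // step over the value
        if (fileHasNoTags) {
            // Every tag length in the file is zero: skip the single byte, no decode.
            buf.position(buf.position() + 1);
            return new byte[0];
        }
        byte[] tags = new byte[getVarInt(buf)];
        buf.get(tags);
        return tags;
    }

    public static void main(String[] args) {
        byte[] value = {1, 2, 3};
        byte[] untagged = writeCell(value, new byte[0]);
        byte[] tagged = writeCell(value, new byte[] {9, 9});
        System.out.println("untagged cell bytes: " + untagged.length);
        System.out.println("tagged cell bytes:   " + tagged.length);
        System.out.println("tags read back:      " + readTags(tagged, false).length);
        System.out.println("skip path tags:      " + readTags(untagged, true).length);
    }
}
```

In this sketch the per-cell cost of enabling tags without using them is the single zero-length byte, which matches the "1 byte only" figure in the thread; removing even that byte would need a different cell layout, as the compaction remark suggests.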
