hbase-dev mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: DISCUSS : HFile V3 proposal for tags in 0.96
Date Fri, 19 Jul 2013 13:09:34 GMT
Thanks Ram and Anoop for those details again. I don't think there is a need
to be able to revert from V3 to V2. And a 1-byte overhead on an HFile is not
really an overhead. As Anoop proposed, if there is a way to deactivate the
tags feature when all the KVs in a file have a tag length of zero, then
it's all good!

Looking forward to testing that!

JM
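
For illustration, here is a minimal sketch of the layout discussed in this thread: a V3-style cell carries a variable-length tag length after the value, so a cell with no tags costs exactly one extra byte, and a per-file "no tags" flag lets the reader skip that byte without decoding it. Everything here is an assumption made for the example: the function names, the `file_has_no_tags` flag, the simplified cell layout (4-byte key length, 4-byte value length, key, value), and the generic LEB128-style varint, which is not the actual HBase or Hadoop VInt encoding.

```python
import io
import struct


def encode_varint(n: int) -> bytes:
    """Generic unsigned varint (7 bits per byte); 0..127 fit in one byte."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def decode_varint(buf: io.BytesIO) -> int:
    """Inverse of encode_varint, reading from the current position."""
    shift = 0
    result = 0
    while True:
        b = buf.read(1)[0]
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7


def write_cell_v2(key: bytes, value: bytes) -> bytes:
    """Simplified V2-style cell: no tag section at all."""
    return struct.pack(">II", len(key), len(value)) + key + value


def write_cell_v3(key: bytes, value: bytes, tags: bytes = b"") -> bytes:
    """Hypothetical V3-style cell: V2 layout plus varint tag length and tags."""
    return write_cell_v2(key, value) + encode_varint(len(tags)) + tags


def read_cell_v3(buf: io.BytesIO, file_has_no_tags: bool):
    """Read one cell; with the per-file flag set, skip the length byte."""
    key_len, value_len = struct.unpack(">II", buf.read(8))
    key = buf.read(key_len)
    value = buf.read(value_len)
    if file_has_no_tags:
        buf.seek(1, io.SEEK_CUR)  # known to be a single zero byte: no decode
        return key, value, b""
    tag_len = decode_varint(buf)
    return key, value, buf.read(tag_len)
```

With these definitions, `len(write_cell_v3(k, v)) - len(write_cell_v2(k, v))` is 1 for any untagged cell, which is the per-cell overhead the thread refers to. Removing even that byte would require rewriting the cells, for example during compaction.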

2013/7/19 ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com>

> But I am afraid that once the user switches to V3 with tags he cannot come
> back to V2.  If this scenario is possible then do we need to see a
> workaround for that?
> In particular, if the user has written tags and tries to read them back
> with V2, it would not work.
>
> If the user switches to V3 but does not write any tags, and we go with the
> option of making tags optional using the FileInfo, then at least after the
> compaction is done the HFile could also be read with the V2 reader.  But I
> don't think the user would intend to do this given the fact that he needs
> tags for his use case.
>
> Regards
> Ram
>
>
> On Fri, Jul 19, 2013 at 5:21 PM, Anoop John <anoop.hbase@gmail.com> wrote:
>
> > Jean
> >         When V2 is used there won't be any extra bytes and so no overhead
> > in the write or read paths.
> > When V3 is used, and there are no tags present at all, we will have extra
> > bytes for writing the tag length.  We are trying to write the tag length
> > as a VInt so that this will be 1 byte only.  Then using FileInfo we can
> > avoid the overhead.
> >
> > Say all the KVs in a file have a tag length of zero (a file trailer
> > indicates this); during read we can avoid the read and decode of the
> > tag length and just skip the one byte of tag length.
> >
> > Regarding avoiding the tag length entirely (even the 1 byte), maybe it
> > should be possible during compaction.  But I am wondering whether it is
> > really needed.  The user can select V3 when there is a need for tags.
> >
> > -Anoop-
> >
> > On Fri, Jul 19, 2013 at 4:53 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> > > Thanks Ram.
> > >
> > > One last thing, space-wise. If I understand correctly, between V2 and V3,
> > > when tags are deactivated, there will be only a 1 bit difference, so the
> > > same storage space is used. If tags are activated but empty, is it going
> > > to be the same thing? Or are we going to have all the tags overhead? For
> > > example, can we have a byte to say "no tags in that file" in addition to
> > > "tags are activated for that file"?
> > >
> > > So 2 questions.
> > >
> > > 1) What is the overhead on disk space from the tags?
> > > 2) Should we have a flag (bit) per file to say no tags even if activated,
> > > to limit this overhead and let people activate it for future use?
> > >
> > > JMS
> > > On 2013-07-19 07:11, "ramkrishna vasudevan" <
> > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > >
> > > > >>Based on your details, I think it will be, but very minimal, or
> > > > almost invisible, correct?
> > > > Yes of course.
> > > > Regarding migration, any file written with V2 would still be read with
> > > > HFileReaderV2 and the new files will be written with V3.  So there
> > > > should not be any problem here.  We are anyway testing these things to
> > > > make sure we don't break anything.  Thanks Jean for the interest.
> > > >
> > > > @Stack
> > > > I will write up the changes foreseen in the Codec to support
> > > > RPC and HFileV3.
> > > > Discussing with Anoop, we have some benefits when the tags are written
> > > > as a byte array and when tags are in memory.  Anyway, I will write that
> > > > up in a separate thread, also considering the inputs on the way the
> > > > patch is currently made.
> > > >
> > > > Regards
> > > > Ram
> > > >
> > > >
> > > > On Fri, Jul 19, 2013 at 4:32 PM, Jean-Marc Spaggiari <
> > > > jean-marc@spaggiari.org> wrote:
> > > >
> > > > > Like Ted and St.Ack, I read all of this with great interest and
> > > > > everything looked good to me.
> > > > >
> > > > > My only concern is performance. Even if tags are disabled, do you
> > > > > foresee some performance impact because everything will now need to
> > > > > be tag aware? Based on your details, I think it will be, but very
> > > > > minimal, or almost invisible, correct?
> > > > >
> > > > > Also, for migrations from V2 to V3, if V3 is activated, that will
> > > > > simply be done when HFiles are written, correct? So no real migration
> > > > > process is required?
> > > > >
> > > > > JM
> > > > > On 2013-07-19 01:13, "Stack" <stack@duboce.net> wrote:
> > > > >
> > > > > > On Thu, Jul 18, 2013 at 10:14 AM, ramkrishna vasudevan <
> > > > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > > > ...
> > > > > >
> > > > > > > >  We can avoid several problems with HFile V2 internals, and
> > > > > > > > backwards compatibility concerns, and allow for working tags
> > > > > > > > support with no performance impact and low risk to all HBase
> > > > > > > > users who do not want tag support, while still allowing for
> > > > > > > > inline tags capabilities in a shipping version of HBase, by
> > > > > > > > introducing this in a new V3 version for HFile.
> > > > > > >
> > > > > > >
> > > > > > > This seems like a good tactic to me.  HFileV2 has the current KV
> > > > > > > format hard-coded all over and trying to 'fix' this would probably
> > > > > > > take a bunch of effort and would jeopardize current workings.
> > > > > >
> > > > > > ....
> > > > > >
> > > > > > >
> > > > > > > >  We have been working on this and will have a clean patch with
> > > > > > > > good amount of testing in time for 0.96.
> > > > > > >
> > > > > > >
> > > > > > > I'd think that your moving into a green field by doing an hfilev3
> > > > > > > would make it so your work could run independent of 0.96 timeline;
> > > > > > > i.e. it could come in post 0.96?
> > > > > > >
> > > > > > > What sort of changes do you foresee necessary in core to support
> > > > > > > cell codecs?  Between rpc and hfilev3?
> > > > > >
> > > > > > Thanks Ram,
> > > > > > St.Ack
> > > > > >
> > > > >
> > > >
> > >
> >
>
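
The migration point made in this thread (files written with V2 are still read by a V2 reader, new files are written as V3, and a V3 file cannot be opened by a deployment that only knows V2) can be sketched as a version dispatch. This is only an illustration under stated assumptions: the 4-byte trailer holding a major version, the reader functions, and the `open_file` helper are all hypothetical and do not reproduce the real HFile trailer format or reader classes.

```python
import struct


def with_trailer(body: bytes, major_version: int) -> bytes:
    """Append a hypothetical 4-byte big-endian major version trailer."""
    return body + struct.pack(">I", major_version)


def read_v2(body: bytes):
    """Stand-in for a V2 reader."""
    return ("v2", body)


def read_v3(body: bytes):
    """Stand-in for a V3 reader."""
    return ("v3", body)


# A cluster that understands both formats versus one that only knows V2.
READERS_V3_ENABLED = {2: read_v2, 3: read_v3}
READERS_V2_ONLY = {2: read_v2}


def open_file(data: bytes, readers=READERS_V3_ENABLED):
    """Pick the reader from the version stored in the file's trailer."""
    body, trailer = data[:-4], data[-4:]
    (major,) = struct.unpack(">I", trailer)
    if major not in readers:
        raise ValueError(f"unsupported major version {major}")
    return readers[major](body)
```

Because the reader is chosen per file, old and new files can coexist and no separate migration step is needed in this sketch; passing `READERS_V2_ONLY` shows why going back from V3 to V2 fails once V3 files exist.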
