hbase-dev mailing list archives

From Anoop John <anoop.hb...@gmail.com>
Subject Re: DISCUSS : HFile V3 proposal for tags in 0.96
Date Fri, 19 Jul 2013 11:51:54 GMT
Jean,
        When V2 is used there won't be any extra bytes, and so no overhead
in the write or read paths.
When V3 is used and there are no tags present at all, we will have extra
bytes for writing the tag length.  We are trying to store the tag length as
a VInt so that it will be only 1 byte.  Then, using the file info, we can
avoid the overhead on read.

Say all the KVs in a file have a tag length of zero (a file trailer entry
indicates this); then during read we can avoid reading and decoding the
tag length and just skip the one tag-length byte.
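
To illustrate the idea, here is a minimal Java sketch. It is not HBase code
and not the actual V3 on-disk layout: the class TagLengthSketch, its methods,
the simple 7-bits-per-byte VInt encoding and the fileHasNoTags flag (standing
in for whatever the file trailer / file info would record) are all
assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch only; not the real HFile V3 cell layout.
class TagLengthSketch {

    // Write an int as a variable-length integer: 7 payload bits per byte,
    // high bit set while more bytes follow. Zero takes exactly one byte.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Decode a VInt written by writeVInt, advancing the buffer position.
    static int readVInt(ByteBuffer buf) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // Tag section of one cell: VInt tag length followed by the tag bytes.
    static byte[] encodeTags(byte[] tags) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, tags.length);
        out.write(tags, 0, tags.length);
        return out.toByteArray();
    }

    // Read the tag section. If the file-level flag says no cell in the file
    // has tags, skip the single zero-length byte without decoding it.
    static byte[] readTags(ByteBuffer buf, boolean fileHasNoTags) {
        if (fileHasNoTags) {
            buf.position(buf.position() + 1);
            return new byte[0];
        }
        byte[] tags = new byte[readVInt(buf)];
        buf.get(tags);
        return tags;
    }

    public static void main(String[] args) {
        byte[] noTags = encodeTags(new byte[0]);
        System.out.println("bytes written for empty tags: " + noTags.length);

        ByteBuffer buf = ByteBuffer.wrap(noTags);
        readTags(buf, true);
        System.out.println("buffer position after skip: " + buf.position());
    }
}
```

With this encoding a cell without tags costs one extra byte on disk, and the
reader only pays for a position bump when the file-level flag is set.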

Regarding avoiding the tag length entirely (even that 1 byte), maybe it
should be possible during compaction. But I am wondering whether it is
really needed. Users can select V3 when there is a need for tags.

-Anoop-

On Fri, Jul 19, 2013 at 4:53 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Thanks Ram.
>
> One last. Space wise. If I understand correctly, between V2 and V3, when
> tags are de-activated, there will be only a 1 bit difference, so same
> storage space used. If tags are activated but empty, is it going to be the
> same thing? Or are we going to have all the tags overhead? Like can we have
> a byte to say "no tags in that file" in addition to "tags are activated for
> that file"?
>
> So 2 questions.
>
> 1) What is the overhead on disk space from the tags?
> 2) Should we have a flag (bit) per file to say no tags even if activated, to
> limit this overhead and let people activate it for future use?
>
> JMS
> On 2013-07-19 07:11, "ramkrishna vasudevan" <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > >>Based on your details, I think it will be, but very minimal, or
> > almost invisible, correct?
> > Yes of course.
> > Regarding migration, any file written with V2 would still be read with
> > HFileReaderV2 and the new files will be written with V3.  So there should
> > not be any problem here.  We are anyway testing these things to make sure
> > we don't break anywhere.  Thanks Jean for the interest.
> >
> > @Stack
> > I will write up the changes foreseen in the Codec to support
> > RPC and HFileV3.
> > Discussing with Anoop, we have some benefits when the Tags are written as
> > the byte array and when tags are in memory.  Anyway, I will write that up
> > in a separate thread, also considering the inputs on the current way the
> > patch has been made.
> >
> > Regards
> > Ram
> >
> >
> > On Fri, Jul 19, 2013 at 4:32 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> > > Like Ted and St.Ack, I read all of this with a great interest and
> > > everything looked good to me.
> > >
> > > My only concern is performance.  Even if tags are disabled, do
> > > you foresee some performance impact because everything will now need to
> > > be tag aware? Based on your details, I think it will be, but very minimal,
> > > or almost invisible, correct?
> > >
> > > Also, for migrations from v2 to v3, if v3 is activated, that will simply
> > > be done when HFiles are written, correct? So not really any migration
> > > process required?
> > >
> > > JM
> > > On 2013-07-19 01:13, "Stack" <stack@duboce.net> wrote:
> > >
> > > > On Thu, Jul 18, 2013 at 10:14 AM, ramkrishna vasudevan <
> > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > ...
> > > >
> > > > >  We can avoid several problems with HFile V2 internals, and backwards
> > > > > compatibility concerns, and allow for working tags support with no
> > > > > performance impact and low risk to all HBase users who do not want tag
> > > > > support, while still allowing for inline tags capabilities in a
> > > > > shipping version of HBase, by introducing this in a new V3 version for
> > > > > HFile.
> > > > >
> > > > >
> > > > This seems like a good tactic to me.  HFileV2 has the current KV format
> > > > hard-coded all over and trying to 'fix' this would probably take a bunch
> > > > of effort and would jeopardize current workings.
> > > >
> > > > ....
> > > >
> > > > >
> > > > >  We have been working on this and will have a clean patch with a good
> > > > > amount of testing in time for 0.96.
> > > > >
> > > > >
> > > > I'd think that your moving into a green field by doing an hfilev3 would
> > > > make it so your work could run independent of 0.96 timeline; i.e. it
> > > > could come in post 0.96?
> > > >
> > > > What sort of changes do you foresee necessary in core to support cell
> > > > codecs?  Between rpc and hfilev3?
> > > >
> > > > Thanks Ram,
> > > > St.Ack
> > > >
> > >
> >
>
