hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: [DISCUSS] Move Type out of KeyValue
Date Tue, 03 Oct 2017 16:42:08 GMT
On Mon, Oct 2, 2017 at 11:46 PM, Chia-Ping Tsai <chia7712@apache.org> wrote:

> ...
>
> > We'd still be stuck at read time when all we had was Cell#getTypeByte
> > returning a byte.
> >
> > We could add a CellType with the KV#Type static utility codeToType but
> > would be sweet if we could do without having to explain the byte to
> > users.
> It seems to me that Cell is a general interface for the kv format and others
> (none exist yet), but we designed Cell's APIs around the KV format.
> The cell's type should not be exposed to the end user - I mean
> Cell#getTypeByte should be removed - because it is hard to say that other
> formats will have a type field.



Or they may choose to implement typing differently; i.e. not via a dedicated
byte.

So, yes, Cell#getTypeByte is problematic in our Cell interface. We should
at least note its problems in the Interface so some future implementer who
has not seen this conversation doesn't get hung up on it.
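
To make the CellType idea from earlier in the thread concrete, here is a rough sketch in plain Java. It is not HBase code: CellTypeSketch, SketchCellBuilder, fromCode and buildTypeByte are made-up names, and the byte codes are assumed to mirror what KeyValue#Type uses today (Put = 4, Delete = 8, and so on). It shows a build-time setter that takes the enum, plus a read-time lookup that returns empty for any byte outside the public subset.

```java
import java.util.Optional;

public class CellTypeSketch {

    /** Hypothetical public subset of KeyValue#Type; the codes are an assumption. */
    public enum CellType {
        PUT((byte) 4),
        DELETE((byte) 8),
        DELETE_FAMILY_VERSION((byte) 10),
        DELETE_COLUMN((byte) 12),
        DELETE_FAMILY((byte) 14);

        private final byte code;

        CellType(byte code) {
            this.code = code;
        }

        public byte getCode() {
            return code;
        }

        /** Read-time mapping from a raw type byte; empty for internal or unknown codes. */
        public static Optional<CellType> fromCode(byte code) {
            for (CellType t : values()) {
                if (t.code == code) {
                    return Optional.of(t);
                }
            }
            return Optional.empty();
        }
    }

    /** Hypothetical stand-in for a builder; this is not the real CellBuilder. */
    public static final class SketchCellBuilder {
        private CellType type;

        public SketchCellBuilder setType(CellType type) {
            this.type = type;
            return this;
        }

        /** Returns only the type byte, which is all this sketch models. */
        public byte buildTypeByte() {
            if (type == null) {
                throw new IllegalStateException("type not set");
            }
            return type.getCode();
        }
    }

    public static void main(String[] args) {
        // Build time: the caller never touches a raw byte.
        byte raw = new SketchCellBuilder().setType(CellType.PUT).buildTypeByte();
        // Read time: the byte is translated back, or reported as not public.
        System.out.println(CellType.fromCode(raw));
        System.out.println(CellType.fromCode((byte) 0));
    }
}
```

The Optional return is one way to deal with internal codes at read time without explaining the byte to users; a Cell implementation that does not use a type byte at all would need a different answer.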


S


> The KV should have its own interface, which is used in HRegion/HStore/HStoreFile.
> ExtendedCell is a good candidate because I don't think we need to mix
> the kv format and other formats in the same table/region/file. For
> normal users, all cells they get via Scan/Get are real Cells (not Delete markers
> or other types), so the type API is redundant. Advanced users can
> get the IA.LimitedPrivate Cell, such as KVInterface, via raw scan or other
> advanced operations.
>
> Cell (IA.Public) <---- KVInterface (IA.LimitedPrivate) <--- KeyValue, Bufferedxxx (IA.Private)
>                  <---- XXXInterface (IA.LimitedPrivate) <--- xxx cell (IA.Private)
>
>
>
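
The layering proposed above could look roughly like the following. This is a sketch only: PublicCell, KvCell, SimpleKv and describe are hypothetical stand-ins for Cell, the proposed KVInterface, and a KeyValue-like class. None of them are real HBase types.

```java
import java.nio.charset.StandardCharsets;

public class LayeredCellSketch {

    /** Stand-in for the public Cell: no type accessor at all. */
    public interface PublicCell {
        byte[] getRow();
        byte[] getValue();
    }

    /** Stand-in for the proposed KV-specific interface (IA.LimitedPrivate in the proposal). */
    public interface KvCell extends PublicCell {
        byte getTypeByte();
    }

    /** Minimal KeyValue-like implementation that carries a type byte. */
    public static final class SimpleKv implements KvCell {
        private final byte[] row;
        private final byte[] value;
        private final byte typeByte;

        public SimpleKv(byte[] row, byte[] value, byte typeByte) {
            this.row = row;
            this.value = value;
            this.typeByte = typeByte;
        }

        @Override
        public byte[] getRow() {
            return row;
        }

        @Override
        public byte[] getValue() {
            return value;
        }

        @Override
        public byte getTypeByte() {
            return typeByte;
        }
    }

    /** Normal callers see only PublicCell; advanced callers can check for the KV layer. */
    public static String describe(PublicCell cell) {
        if (cell instanceof KvCell) {
            return "kv cell, type byte " + ((KvCell) cell).getTypeByte();
        }
        return "plain cell";
    }

    public static void main(String[] args) {
        PublicCell cell = new SimpleKv(
            "r1".getBytes(StandardCharsets.UTF_8),
            "v1".getBytes(StandardCharsets.UTF_8),
            (byte) 4); // 4 is assumed here to be the Put code
        System.out.println(describe(cell));
    }
}
```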





> On 2017-10-03 13:29, Stack <stack@duboce.net> wrote:
> > On Mon, Oct 2, 2017 at 1:54 AM, Chia-Ping Tsai <chia7712@apache.org> wrote:
> >
> > > How about introducing a new enum "CellType" which is a subset of
> > > KeyValue#Type? It will be exposed as IA.Public to end users to help
> > > build custom cells (via CellBuilder). The types which "CellType" should
> > > have are shown below.
> > > 1) Put
> > > 2) Delete
> > > 3) DeleteFamilyVersion
> > > 4) DeleteColumn
> > > 5) DeleteFamily
> > > Hence, CellBuilder#setType(byte) will be replaced by
> > > CellBuilder#setType(CellType). Our internal code would still reference
> > > KeyValue#Type.
> > >
> > >
> > There is a 'hole' in our Cell Interface; we have a getTypeByte but expose
> > no means of signaling what the byte stands for.
> >
> > You could add to CellBuilder methods to do setPutType, setDeleteType, etc.,
> > which would work at build time, allowing you to keep the Type byte private.
> > This would seem to be enough for your case, Chia-Ping?
> >
> > We'd still be stuck at read time when all we had was Cell#getTypeByte
> > returning a byte.
> >
> > We could add a CellType with the KV#Type static utility codeToType but
> > would be sweet if we could do without having to explain the byte to
> > users.
> >
> > S
> >
> >
> >
> >
> >
> >
> >
> > >
> > > On 2017-09-29 18:39, Anoop John <anoop.hbase@gmail.com> wrote:
> > > > Ya as Chia-Ping said, the problem he is trying to solve is a very basic
> > > > one. As long as we allow custom Cell creation (via the CellBuilder API)
> > > > and allow Mutations to be added with Cells and passed from client
> > > > side APIs, we have to make the Type publicly accessible.
> > > > Or else the Cell building APIs should not be taking in a type byte.
> > > > We have to allow the user some way to make put/delete cells etc.
> > > >
> > > > Is type bound only to KV?   We have getType in Cell also, right?
> > > > The type in the full form we have in KV now may be confusing us
> > > > here?  As Ram said, it contains some internal types also, which the user
> > > > never has to know about.   Pls correct me if I am saying this wrong.
> > > >
> > > > Good that Chia-Ping brought this out here.   We have to solve it either
> > > > way and make the public API fully public.
> > > >
> > > > -Anoop-
> > > >
> > > > On Fri, Sep 29, 2017 at 2:27 PM, ramkrishna vasudevan
> > > > <ramkrishna.s.vasudevan@gmail.com> wrote:
> > > > > Even if we are trying to move it out, I think only a few of the types are
> > > > > really user readable. So we should be very careful here. Since we have the
> > > > > CellBuilder way, it is better we check what types of cells a user can build.
> > > > > I think for now the CellBuilder is not client exposed?
> > > > > But again, moving to Cell means it becomes public, which is not right IMO, and
> > > > > I think others here also agree.
> > > > >
> > > > > Regards
> > > > > Ram
> > > > >
> > > > > On Fri, Sep 29, 2017 at 10:50 AM, Chia-Ping Tsai <chia7712@apache.org>
> > > > > wrote:
> > > > >
> > > > >> Thanks for all comments.
> > > > >>
> > > > >> The problem I want to resolve is that the valid code should be exposed as
> > > > >> IA.Public. Otherwise, end users have to access the IA.Private class to build
> > > > >> the custom cell.
> > > > >>
> > > > >> For example, I have a use case which plays a streaming role in our
> > > > >> application. It
> > > > >> applies the CellBuilder (HBASE-18519) to build custom cells. These cells
> > > > >> have many identical fields, so they are put in shared memory to avoid GC
> > > > >> pauses. Everything is wonderful. However, we have to access the IA.Private
> > > > >> class - KeyValue#Type - to get the valid code of Put.
> > > > >>
> > > > >> I believe there are many use cases for custom cells, and consequently it is
> > > > >> worth adding a way to get the valid type via an IA.Public class. Otherwise, it
> > > > >> may imply that the custom cell is built on an unstable foundation, because the
> > > > >> related code can be changed at any time.
> > > > >> --
> > > > >> Chia-Ping
> > > > >>
> > > > >> On 2017-09-29 00:49, Andrew Purtell <apurtell@apache.org> wrote:
> > > > >> > I agree with Stack. Was typing up a reply to Anoop but let me move it down
> > > > >> > here.
> > > > >> >
> > > > >> > The type code exposes some low level details of how our current stores are
> > > > >> > architected. But what if in the future you could swap out HStore implements
> > > > >> > Store with PStore implements Store, where HStore is backed by HFiles and
> > > > >> > PStore is backed by Parquet? Just as a hypothetical example. I know there
> > > > >> > would be larger issues if this were actually attempted. Bear with me. You
> > > > >> > can imagine some different new Store implementation that has some
> > > > >> > advantages but is not a design derived from the log structured merge tree
> > > > >> > if you like. Most values from a new Cell.Type based on KeyValue.Type
> > > > >> > wouldn't apply to cells from such a thing because they are particular to
> > > > >> > how LSMs work. I'm sure such a project if attempted would make a number of
> > > > >> > changes requiring a major version increment and low level details could be
> > > > >> > unwound from Cell then, but if we could avoid doing it in the first place,
> > > > >> > I think it would be better for maintainability.
> > > > >> >
> > > > >> >
> > > > >> > On Thu, Sep 28, 2017 at 9:39 AM, Stack <stack@duboce.net> wrote:
> > > > >> >
> > > > >> > > On Thu, Sep 28, 2017 at 2:25 AM, Chia-Ping Tsai <chia7712@apache.org>
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > hi folks,
> > > > >> > > >
> > > > >> > > > User is allowed to create custom cell but the valid code of type -
> > > > >> > > > KeyValue#Type - is declared as IA.Private. As I see it, we should expose
> > > > >> > > > KeyValue#Type as Public Client. Three possible ways are shown below:
> > > > >> > > > 1) Change declaration of KeyValue#Type from IA.Private to IA.Public
> > > > >> > > > 2) Move KeyValue#Type into Cell.
> > > > >> > > > 3) Move KeyValue#Type to upper level
> > > > >> > > >
> > > > >> > > > Any suggestions?
> > > > >> > > >
> > > > >> > > >
> > > > >> > > What is the problem that we are trying to solve Chia-Ping? You want to make
> > > > >> > > Cells of a new Type?
> > > > >> > >
> > > > >> > > My first reaction is that KV#Type is particular to the KV implementation.
> > > > >> > > Any new Cell implementation should not have to adopt the KeyValue typing
> > > > >> > > mechanism.
> > > > >> > >
> > > > >> > > S
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > > > --
> > > > >> > > > Chia-Ping
> > > > >> > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > --
> > > > >> > Best regards,
> > > > >> > Andrew
> > > > >> >
> > > > >> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > > > >> > decrepit hands
> > > > >> >    - A23, Crosstalk
> > > > >> >
> > > > >>
> > > >
> > >
> >
>
