hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Yu <yuzhih...@gmail.com>
Subject Re: Struggles around Cell#getType()
Date Fri, 27 Oct 2017 09:20:53 GMT
Sergey:

Here is the DataType I was talking about:
https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/
apache/hadoop/hbase/CellBuilder.java#L33


bq. missed in doing DataType is not having a type state and make it align
with those in KeyValue#Type

My earlier comment was in line with Anoop's.
The alignment can start with aligning the ordinals of CellBuilder#DataType
to those of KeyValue#Type

>From Josh's example:

+            cellBuilder.setType(DataType.Put);

This is what we have in the CellBuilder :

  CellBuilder setType(final DataType type);
If a variant of setType() is added which accepts byte (say, the return
from Cell#getType()),
the example would be closer to real life scenario.

bq. move Type out of KeyValue and keep it as a part of public API

The reason we may not want to do the above is that KeyValue#Type has
several internal values such as Minimum.
Not exposing more than CellBuilder#DataType gives hbase freedom in future
enhancements.

Cheers

On Fri, Oct 27, 2017 at 1:35 AM, Sergey Soldatov <sergeysoldatov@gmail.com>
wrote:

> .bq Also we can have DataType#toType(Cell) or so for the conversion
> purpose.
>
> Let me repeat. DataType is the serialization interface for the values and
> has no relations to the type of KV.
>
> .bq This would imply having as many isXX() methods as the number of
> elements
> in CellBuilder#DataType
>
> Have I missed something, but I don't see any DataType nor Type in
> CellBuilder (checked branch-2 and master). The only place where we define
> the mutations type is KeyValue#Type.
> One more note. The public API for Cell#getTypeByte explicitly says "The
> byte representation of the KeyValue.TYPE of this cell".  For me, it sounds
> like we need to move Type out of KeyValue and keep it as a part of public
> API instead of adding checkers for all types to CellUtil. Any other ideas?
>
> Thanks,
> Sergey
>
> On Fri, Oct 27, 2017 at 12:30 AM, Anoop John <anoop.hbase@gmail.com>
> wrote:
>
> > I think what we missed in doing DataType is not having a type state
> > and make it align with those in KeyValue#Type.  Relying on ordinal
> > might be problematic.   Also we can have DataType#toType(Cell) or so
> > for the conversion purpose.  This is needed for CPs as noted by Josh's
> > CP eg:s.  Thanks for the nice find Josh.  Doing more and more CP eg:s
> > reveal this kind of misses.
> >
> > -Anoop-
> >
> > On Fri, Oct 27, 2017 at 10:17 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > bq. you may need CellUtil#isPut(Cell) sort of API
> > >
> > > This would imply having as many isXX() methods as the number of
> > > elements in CellBuilder#DataType
> > > , right ?
> > >
> > > On Thu, Oct 26, 2017 at 9:29 PM, ramkrishna vasudevan <
> > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > >
> > >> Sorry just to clarify I mean deprecating the getType in Cell can we
> try
> > >> doing it in 2.0-alpha 4.
> > >>
> > >> On Fri, Oct 27, 2017 at 9:45 AM, ramkrishna vasudevan <
> > >> ramkrishna.s.vasudevan@gmail.com> wrote:
> > >>
> > >> > bq.Cell#getType()
> > >> > We had this discussion. So getType should only be used for user
> > exposed
> > >> > types like Put and Deletes. All others are internal. So having it
in
> > >> public
> > >> > interface may not be needed. Shall we do this in 2.0 alpha-4? Am +1
> > to do
> > >> > this.
> > >> >
> > >> > How ever to solve your problem I think you may need
> > CellUtil#isPut(Cell)
> > >> > sort of API in CellUtl like you already have isDelete(Cell).
> > >> >
> > >> > Regards
> > >> > Ram
> > >> >
> > >> > On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yuzhihong@gmail.com>
> wrote:
> > >> >
> > >> >> There is also CellBuilder#DataType which is public. However, the
> > >> ordinals
> > >> >> of CellBuilder#DataType are different from KeyValue.Type .
> > >> >>
> > >> >> What if we align the ordinals of CellBuilder#DataType to be the
> same
> > as
> > >> >> those from KeyValue.Type ?
> > >> >>
> > >> >> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <
> > >> >> sergeysoldatov@gmail.com>
> > >> >> wrote:
> > >> >>
> > >> >> > DataType class was introduced as part of HBASE-8693 which
is more
> > >> about
> > >> >> the
> > >> >> > type of data in the cell rather than the type of mutation.
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Sergey
> > >> >> >
> > >> >> > On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <elserj@apache.org>
> > >> wrote:
> > >> >> >
> > >> >> > > Hiya,
> > >> >> > >
> > >> >> > > (Background: see HBASE-19002)
> > >> >> > >
> > >> >> > > In trying to write some example Observers, I found myself
in a
> > >> pickle:
> > >> >> > how
> > >> >> > > do I tell if a Cell is a Put?
> > >> >> > >
> > >> >> > > * Cell#getType() returns a byte which corresponds to
a
> > KeyValue.Type
> > >> >> > > * KeyValue.Type has API to convert a byte to Type
> > >> >> > > * KeyValue (and thus KeyValue.Type) is IA.Private
> > >> >> > > * DataType o.a.h.h.typesDataType _appears to me_ to
be the
> > >> replacement
> > >> >> > for
> > >> >> > > the KeyValue.Type
> > >> >> > >
> > >> >> > > Best as I can tell, Cell#getType() should be deprecated
and we
> > >> should
> > >> >> > have
> > >> >> > > some kind of API (method on Cell or CellUtil) which
returns a
> > >> DataType
> > >> >> > > instead of Type. The details of the byte and the KeyValue.Type
> > >> should
> > >> >> be
> > >> >> > > hidden inside the implementation.
> > >> >> > >
> > >> >> > > My hunch is that this is an accidental omission, but
Stack
> > >> recommended
> > >> >> > > that I "ask the class" ;). What have I missed? I think
this is
> > >> >> trivial to
> > >> >> > > fix; obviously, I don't want to make a fix if I just
didn't
> look
> > >> hard
> > >> >> > > enough.
> > >> >> > >
> > >> >> > > Thanks!
> > >> >> > >
> > >> >> > > - Josh
> > >> >> > >
> > >> >> >
> > >> >>
> > >> >
> > >> >
> > >>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message