hbase-dev mailing list archives

From Sergey Soldatov <sergeysolda...@gmail.com>
Subject Re: Struggles around Cell#getType()
Date Sat, 28 Oct 2017 00:21:31 GMT
bq. Here is the DataType I was talking about:

Ah, thanks, Ted! I forgot to pull. HBASE-18927, along with the discussion
that preceded it, addresses exactly the topic we are discussing. But it
didn't actually solve the problem; it just created more ambiguity. We have a
CellBuilder that requires a CellBuilder#DataType and produces a Cell whose
getTypeByte method returns "The byte representation of the KeyValue.TYPE".

bq. I more think that maybe getType() was misinterpreted from what Nick
originally meant it to be. Maybe intentional, maybe not.

If I remember correctly, HBASE-8693 was about encodings for commonly used
data types that keep them sorted in their native order, similar to the PData
types in Phoenix.

bq. You agree with Ram's suggestion for helper methods as a way forward?

Well, we already have helpers for all types except Put/Minimum. Adding one
more is not a big deal. But deprecating the getters sounds like a bad idea.
For example, the timestamp is used by many third parties to implement their
own transactional/versioning support, and it is actually part of the public
API. If we allow users to specify the timestamp for a cell, why should we
restrict them from reading it? The other fields may be useful for creating
modified copies of a KV, as we do in our custom StoreFileReader for local
indexes.
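
To make the helper idea concrete, here is a minimal sketch of what an isPut
check could look like next to a delete check. It is self-contained rather
than written against HBase: TypedCell is a hypothetical stand-in for Cell,
CellTypeSketch is an invented name, and the type codes are copied from
KeyValue.Type as I understand them (Put = 4, Delete = 8 through
DeleteFamily = 14), so treat them as assumptions rather than as the real
implementation.

```java
public final class CellTypeSketch {

    // Assumed to mirror the codes in KeyValue.Type; verify against your HBase version.
    static final byte TYPE_PUT = 4;
    static final byte TYPE_DELETE = 8;
    static final byte TYPE_DELETE_FAMILY_VERSION = 10;
    static final byte TYPE_DELETE_COLUMN = 12;
    static final byte TYPE_DELETE_FAMILY = 14;

    /** Hypothetical stand-in for Cell: exposes only the raw type byte. */
    interface TypedCell {
        byte getTypeByte();
    }

    /** True if the cell's type byte is the Put code. */
    static boolean isPut(TypedCell cell) {
        return cell.getTypeByte() == TYPE_PUT;
    }

    /** True if the cell's type byte falls in the Delete..DeleteFamily range. */
    static boolean isDelete(TypedCell cell) {
        byte type = cell.getTypeByte();
        return TYPE_DELETE <= type && type <= TYPE_DELETE_FAMILY;
    }

    public static void main(String[] args) {
        TypedCell put = () -> TYPE_PUT;
        TypedCell deleteColumn = () -> TYPE_DELETE_COLUMN;
        System.out.println("put is a Put: " + isPut(put));
        System.out.println("deleteColumn is a Put: " + isPut(deleteColumn));
        System.out.println("deleteColumn is a Delete: " + isDelete(deleteColumn));
    }
}
```

The point of the sketch is that callers such as Observers only ever ask a
yes/no question; the byte and its mapping stay hidden behind the helper.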

Thanks,
Sergey

On Fri, Oct 27, 2017 at 10:40 AM, Chia-Ping Tsai <chia7712@apache.org>
wrote:

> bq. You agree with Ram's suggestion for helper methods as a way forward?
> Adding CellUtil#isPut() is OK with me, as PUT is a basic operation in
> HBase.
>
> On 2017-10-28 00:58, Josh Elser <elserj@apache.org> wrote:
> > Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey
> > pointed out, I more think that maybe getType() was misinterpreted
> > from what Nick originally meant it to be. Maybe intentional, maybe not.
> >
> > I don't think getTimestamp() should be removed -- when we store multiple
> > versions of a Key, users should be able to reconcile the Cells client
> > side (e.g. consider a CP which performs some custom merging logic).
> >
> > getSequenceId() I'd agree probably doesn't belong. getTag() I'll hold
> > off judgement because I'm constantly biased into thinking the feature is
> > something that it isn't :)
> >
> > You agree with Ram's suggestion for helper methods as a way forward?
> >
> > On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
> > > The CellBuilder#DataType was introduced to make sure all components used
> > > to build a cell are IA.Public.
> > >
> > > bq. Best as I can tell, Cell#getType() should be deprecated
> > > As I see it, Cell#getType, #getTimestamp, #getSequenceId, and #getTag
> > > should be deprecated, as these methods expose internal details of the
> > > storage engine. For a key-value store, a key consisting of row, family,
> > > and qualifier is enough for general purposes. The other fields belong to
> > > the specific storage engine, and they should not be in Cell, which is our
> > > "frontline" data interface.
> > >
> > >
> > > On 2017-10-27 06:40, Josh Elser <elserj@apache.org> wrote:
> > >> Hiya,
> > >>
> > >> (Background: see HBASE-19002)
> > >>
> > >> In trying to write some example Observers, I found myself in a pickle:
> > >> how do I tell if a Cell is a Put?
> > >>
> > >> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> > >> * KeyValue.Type has API to convert a byte to Type
> > >> * KeyValue (and thus KeyValue.Type) is IA.Private
> > >> * o.a.h.h.types.DataType _appears to me_ to be the replacement
> > >> for the KeyValue.Type
> > >>
> > >> Best as I can tell, Cell#getType() should be deprecated and we should
> > >> have some kind of API (method on Cell or CellUtil) which returns a
> > >> DataType instead of Type. The details of the byte and the
> > >> KeyValue.Type should be hidden inside the implementation.
> > >>
> > >> My hunch is that this is an accidental omission, but Stack recommended
> > >> that I "ask the class" ;). What have I missed? I think this is trivial
> > >> to fix; obviously, I don't want to make a fix if I just didn't look
> > >> hard enough.
> > >>
> > >> Thanks!
> > >>
> > >> - Josh
> > >>
> >
>
