hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13754) Allow non KeyValue Cell types also to oswrite
Date Wed, 27 May 2015 03:22:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560338#comment-14560338

Anoop Sam John commented on HBASE-13754:

Yes it is like the PB writeTo(CodedOutputStream)
bq.You could just add it to the Cell Interface? Would that be obnoxious? It is not too much
to expect that every Cell be able to serialize itself on to a Stream like the pb writeTo for
I agree. It is OK to expect every Cell to have an impl for writing itself to an OutputStream.
But we need to consider BC (backward compatibility) issues: users might have their own impls
of Cell. If we add a new API, is that OK? If that level is fine, I am OK to add it. I selected this path to be on the safe side
bq.Don't you also want the opposite, for a Cell being able to deserialize itself from a Stream?
I think that is the correct path, especially if we add the method to the Cell interface itself.
All the Cell impls would then need a default constructor. The Codec will create a Cell object of its
choice and ask the object to build its data from the stream. But with the new interface type
I thought to just leave it out, as we are not sure whether all Cells will impl the new interface.
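
A minimal sketch of the deserialization idea described above: the codec creates a cell through a no-arg constructor and then asks it to fill itself from the stream. Every name here (SelfReadingCell, LengthPrefixedCell, SketchDecoder) is hypothetical, and the length-prefixed layout is an assumption made only for this illustration, not the HBase wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical interface: a cell that can populate itself from a stream.
interface SelfReadingCell {
  void readFrom(InputStream in) throws IOException;
  byte[] row();
  byte[] value();
}

// Hypothetical cell impl with the default constructor the codec relies on.
final class LengthPrefixedCell implements SelfReadingCell {
  private byte[] row = new byte[0];
  private byte[] value = new byte[0];

  LengthPrefixedCell() {}

  @Override
  public void readFrom(InputStream in) throws IOException {
    // Assumed layout: int row length, row bytes, int value length, value bytes.
    DataInputStream data = new DataInputStream(in);
    row = new byte[data.readInt()];
    data.readFully(row);
    value = new byte[data.readInt()];
    data.readFully(value);
  }

  @Override public byte[] row() { return row; }
  @Override public byte[] value() { return value; }
}

// Hypothetical codec-side helper: picks the cell type, then delegates the parsing to it.
final class SketchDecoder {
  static <C extends SelfReadingCell> C decode(Supplier<C> factory, byte[] encoded) {
    C cell = factory.get(); // the codec creates a cell object of its choice
    try {
      cell.readFrom(new ByteArrayInputStream(encoded));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return cell;
  }

  // Produces bytes in the assumed layout so the sketch can be exercised.
  static byte[] encode(byte[] row, byte[] value) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(row.length);
      out.write(row);
      out.writeInt(value.length);
      out.write(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }
}
```

For example, `SketchDecoder.decode(LengthPrefixedCell::new, bytes)` returns a populated cell. The catch is the one raised above: this only works if every cell type agrees to implement the read method.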
bq.Do we need to flag tags? Can we do away w/ this flag being every place, especially in hbase
I fear we cannot. We have Codecs with and without Tags. But when it comes to serializing
on the wire, the decision to include or exclude Tags is based on the user/scenario. That is
one more reason to keep the API away from the public Cell. The new interface can be Private audience
and we are free to change it when we have a better solution for avoiding sensitive tags
being sent back to the client. Ideally we would like to use Cell Tags at the client level itself. Users
are free to add Tags of their own. Also the system adds certain Tags at the server. The
system-added Tags should not get sent to the client, but the others have to be. When we can do this, all these
withTags flags can be removed.

Considering these, I still feel it is safe NOT to add the new method to the Cell interface but to put
it in a new interface (which extends Cell). The impls that we have, i.e. KeyValue, NoTagsKeyValue,
ClonedSeekerState (we need a better name), etc. can impl the new interface. wdyt?
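
To make the proposal concrete, here is a rough sketch of a separate interface that extends the cell type and adds a write-to-stream method with a withTags flag, plus a writer that falls back to component-wise writes for cells that do not implement it. All names (SketchCell, StreamWritableCell, ContiguousCell, SketchCellWriter) are hypothetical stand-ins, not the HBase classes or the attached patch, and the byte layout is simplified (no length prefixes).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Stand-in for the public Cell interface (hypothetical, reduced to two accessors).
interface SketchCell {
  byte[] rowBytes();
  byte[] valueBytes();
}

// Hypothetical internal interface: extends the cell type and adds stream writing.
interface StreamWritableCell extends SketchCell {
  // Writes this cell to out and returns the number of bytes written.
  int write(OutputStream out, boolean withTags) throws IOException;
}

// A cell whose row, value and tags sit in one contiguous byte[].
final class ContiguousCell implements StreamWritableCell {
  private final byte[] buf;
  private final int rowLen;
  private final int valueLen;

  ContiguousCell(byte[] row, byte[] value, byte[] tags) {
    this.rowLen = row.length;
    this.valueLen = value.length;
    this.buf = new byte[row.length + value.length + tags.length];
    System.arraycopy(row, 0, buf, 0, row.length);
    System.arraycopy(value, 0, buf, rowLen, value.length);
    System.arraycopy(tags, 0, buf, rowLen + valueLen, tags.length);
  }

  @Override public byte[] rowBytes() { return Arrays.copyOfRange(buf, 0, rowLen); }
  @Override public byte[] valueBytes() { return Arrays.copyOfRange(buf, rowLen, rowLen + valueLen); }

  @Override
  public int write(OutputStream out, boolean withTags) throws IOException {
    // Tags are the trailing bytes, so excluding them only shortens the single write.
    int len = withTags ? buf.length : rowLen + valueLen;
    out.write(buf, 0, len);
    return len;
  }
}

final class SketchCellWriter {
  static int writeCell(SketchCell cell, OutputStream out, boolean withTags) throws IOException {
    if (cell instanceof StreamWritableCell) {
      return ((StreamWritableCell) cell).write(out, withTags);
    }
    // Fallback for user-supplied cells: write each component separately.
    // In this sketch such cells expose no tags, so withTags has no effect here.
    byte[] row = cell.rowBytes();
    byte[] value = cell.valueBytes();
    out.write(row);
    out.write(value);
    return row.length + value.length;
  }

  // Convenience wrapper that collects the written bytes in memory.
  static byte[] toBytes(SketchCell cell, boolean withTags) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      writeCell(cell, out, withTags);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }
}
```

With this shape, a user's own cell impl keeps compiling unchanged because the public interface gains no method; only the internal impls opt in to the faster path.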

> Allow non KeyValue Cell types also to oswrite
> ---------------------------------------------
>                 Key: HBASE-13754
>                 URL: https://issues.apache.org/jira/browse/HBASE-13754
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Scanners
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>         Attachments: HBASE-13754.patch
> While making the cellblock for returning data to the client, we have to write the cell data
into an OutputStream. KeyValue has a static oswrite() method with which it can write the data
in one go (KeyValue components are in a single byte[]). For other cell implementations, we
call getXXXLength() and getXXXArray() and write each component one after the other. This
is not as efficient as the KeyValue way. In fact, other cell impls may also have one contiguous
byte[] backing, at least for keys (see ClonedSeekerState). We can optimize for such Cells also.
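
The difference described in the issue can be sketched as follows: when all components live in one backing byte[], a single bulk write yields the same bytes as writing each component in turn. BackedCell, CountingStream and WriteComparison are hypothetical names used only for this illustration, and length prefixes are left out for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

// Hypothetical stream that records how many bulk write calls it receives.
final class CountingStream extends OutputStream {
  final ByteArrayOutputStream sink = new ByteArrayOutputStream();
  int bulkWrites = 0;

  @Override public void write(int b) { sink.write(b); }

  @Override public void write(byte[] b, int off, int len) {
    bulkWrites++;
    sink.write(b, off, len);
  }
}

// Hypothetical cell: row, family, qualifier and value stored back to back in one array.
final class BackedCell {
  final byte[] backing;
  final int rowLen;
  final int familyLen;
  final int qualifierLen;
  final int valueLen;

  BackedCell(byte[] backing, int rowLen, int familyLen, int qualifierLen, int valueLen) {
    this.backing = backing;
    this.rowLen = rowLen;
    this.familyLen = familyLen;
    this.qualifierLen = qualifierLen;
    this.valueLen = valueLen;
  }
}

final class WriteComparison {
  // Generic path: one write per component, as done for non-KeyValue cells.
  static void writePerComponent(BackedCell c, CountingStream out) {
    int off = 0;
    out.write(c.backing, off, c.rowLen);
    off += c.rowLen;
    out.write(c.backing, off, c.familyLen);
    off += c.familyLen;
    out.write(c.backing, off, c.qualifierLen);
    off += c.qualifierLen;
    out.write(c.backing, off, c.valueLen);
  }

  // Optimized path: the components are contiguous, so one write covers them all.
  static void writeInOneGo(BackedCell c, CountingStream out) {
    int total = c.rowLen + c.familyLen + c.qualifierLen + c.valueLen;
    out.write(c.backing, 0, total);
  }
}
```

Both paths leave identical bytes in the stream; the second simply issues one call instead of four.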

This message was sent by Atlassian JIRA
