hbase-dev mailing list archives

From Andrew Purtell <andrew.purt...@gmail.com>
Subject Re: [DISCUSS] Move Type out of KeyValue
Date Sun, 01 Oct 2017 17:19:42 GMT
Ok, thanks. I understand now.

+1

> On Sep 30, 2017, at 9:28 PM, Chia-Ping Tsai <chia7712@apache.org> wrote:
> 
> A "custom cell type" was never part of the story. (Sorry for misleading you.)
> 
> Here is the story. I add some custom cells (to save memory) to a Put via Put#add(Cell). The pseudocode of the custom cell is shown below.
> 
> {code}
> class MyObject {
>   Cell toCell() {
>     return CellBuilderFactory.newBuilder(SHALLOW_COPY)
>         // the row bytes stay in the shared buffer; only offset and length are passed
>         .setRow(sharedBuffer, myRowOffset, myRowLength)
>         // we have to call the IA.Private class to get the valid code of Put
>         .setType(KeyValue.Type.Put.getCode())
>         // set other fields
>         .build();
>   }
> }
> 
> put.add(myObject.toCell());
> {code}
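
The shared-buffer idea behind the pseudocode can be sketched without HBase on the classpath. This is a minimal illustration under stated assumptions, not the poster's code: SharedRowView and SharedRowViewDemo are hypothetical names, and only the accessor names are modelled on Cell#getRowArray, Cell#getRowOffset and Cell#getRowLength.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a cell whose row bytes live in a shared buffer. */
final class SharedRowView {
    private final byte[] buffer; // shared with other views, never copied here
    private final int offset;
    private final int length;

    SharedRowView(byte[] buffer, int offset, int length) {
        // Reject ranges that fall outside the shared buffer.
        if (offset < 0 || length < 0 || offset > buffer.length - length) {
            throw new IndexOutOfBoundsException(
                "bad range: offset=" + offset + ", length=" + length);
        }
        this.buffer = buffer;
        this.offset = offset;
        this.length = length;
    }

    byte[] getRowArray() { return buffer; }
    int getRowOffset() { return offset; }
    int getRowLength() { return length; }

    /** Copies the row out of the shared buffer; only needed when a standalone array is required. */
    byte[] copyRow() {
        return Arrays.copyOfRange(buffer, offset, offset + length);
    }
}

final class SharedRowViewDemo {
    public static void main(String[] args) {
        // Two rows packed back to back in one buffer.
        byte[] shared = "row-1row-22".getBytes(StandardCharsets.US_ASCII);
        SharedRowView first = new SharedRowView(shared, 0, 5);
        SharedRowView second = new SharedRowView(shared, 5, 6);
        System.out.println(new String(first.copyRow(), StandardCharsets.US_ASCII));
        System.out.println(new String(second.copyRow(), StandardCharsets.US_ASCII));
    }
}
```

Each view holds only a reference plus two ints, which is why a plain byte[] with no offset and length cannot express the same thing without copying.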
> 
> Then I noticed that Put#add is not optimized for our heavy table (many cells in a single row), so I also extended Put with some #add methods that avoid resizing the collection.
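
A minimal sketch of the "avoid resizing the collection" point, assuming the cells for one row are kept in an ArrayList. BulkRowMutation is a hypothetical class for illustration only; it is not the extended Put described in this message and does not reflect how Put stores cells internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a mutation that holds many cells for one row. */
final class BulkRowMutation<C> {
    private final ArrayList<C> cells;

    /** Sizes the backing list once, up front, when the cell count is known. */
    BulkRowMutation(int expectedCells) {
        this.cells = new ArrayList<>(expectedCells);
    }

    /** Adds one cell; no resize happens while size stays within expectedCells. */
    void add(C cell) {
        cells.add(cell);
    }

    /** Adds a whole batch, growing the backing array at most once. */
    void addAll(List<? extends C> batch) {
        cells.ensureCapacity(cells.size() + batch.size());
        cells.addAll(batch);
    }

    int size() {
        return cells.size();
    }
}

final class BulkRowMutationDemo {
    public static void main(String[] args) {
        BulkRowMutation<String> mutation = new BulkRowMutation<>(3);
        mutation.add("cell-a");
        mutation.addAll(List.of("cell-b", "cell-c"));
        System.out.println(mutation.size());
    }
}
```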
> 
> That is the story: I am trying to reduce the cost of converting our objects to Put/Cell. The other story I mentioned, building a custom write path via an Endpoint, is unrelated to this topic.
> 
> All the classes we use are shown below:
> 1) Cell -> IA.Public
> 2) CellBuilder -> IA.Public
> 3) CellBuilderFactory -> IA.Public
> 4) Put -> IA.Public
> 5) Put#add(Cell) -> IA.Public
> 6) KeyValue#Type -> IA.Private
> 
> That is why I want to make KeyValue#Type IA.Public.
> 
> --
> Chia-Ping
> 
>> On 2017-10-01 00:34, Andrew Purtell <andrew.purtell@gmail.com> wrote: 
>> Thanks for sharing these details. They are intriguing. If possible could you explain why the custom type is needed?
>> 
>> Something has to be deployed on the server or the custom cell type isn’t guaranteed to be handled correctly. It may work now by accident. I’m a little surprised a custom cell type doesn’t cause an abort. Did you patch the code to handle it?
>> 
>> 
>>> On Sep 30, 2017, at 1:06 AM, Chia-Ping Tsai <chia7712@apache.org> wrote:
>>> 
>>> Thanks for the nice suggestions, Andrew. Sorry for the delayed response; I was busy today.
>>> 
>>> The root reason we must build our own Cell on the client side is that the data are located in shared memory, which is similar to MSLAB.
>>> 
>>> You are right. We could use an attribute to carry our data, but byte[] is not acceptable because we can’t specify the offset and length. In fact, an endpoint is a better fit for our case because our object can be converted directly to a PB object, and it is easy to use shared memory to manage our objects. However, it is easier and more readable to follow the regular Put operation. All we have to do is build our own cell and extend Put. Nothing has to be deployed on the server.
>>> 
>>> I agree the custom cell is a low-level thing, and it should be used by advanced users. My concern is that the classes related to the custom Cell have different IA declarations. I am fine with making them all IA.Private, but building a custom cell may be a common case.
>>> 
>>> --
>>> Chia-Ping
>>> 
>>>> On 2017-09-30 06:05, Andrew Purtell <apurtell@apache.org> wrote: 
>>>> Construct a normal put or delete or batch mutation, add whatever extra
>>>> state you need in one or more operation attributes, and use a
>>>> regionobserver to extend normal processing to handle the extra state. I'm
>>>> curious what dispatching to extension code because of a custom cell type
>>>> buys you over dispatching to extension code because of the presence of an
>>>> attribute (or cell tag). For example, in security coprocessors we take
>>>> attribute data and attach it to the cell using cell tags. Later we check
>>>> for cell tag(s) to determine if we have to take special action when the
>>>> cell is accessed by a scanner, or during some operations (e.g. appends or
>>>> increments have to do extra handling for cell security tags).
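
The attribute-based dispatch described above can be sketched in plain Java. In HBase the attribute would be set with setAttribute(String, byte[]) on a Put (inherited from OperationWithAttributes) and read in a RegionObserver hook such as prePut; the classes below are hypothetical stand-ins so the sketch runs without HBase, and the attribute name is made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an operation that carries named byte[] attributes. */
final class AttributedOperation {
    private final Map<String, byte[]> attributes = new HashMap<>();

    AttributedOperation setAttribute(String name, byte[] value) {
        attributes.put(name, value);
        return this;
    }

    /** Returns the attribute value, or null when the attribute is absent. */
    byte[] getAttribute(String name) {
        return attributes.get(name);
    }
}

/** Hypothetical server-side hook: extra handling is keyed on the attribute, not on a cell type. */
final class AttributeObserver {
    static final String EXTRA_STATE = "example.extra-state"; // made-up attribute name

    static String prePut(AttributedOperation op) {
        byte[] extra = op.getAttribute(EXTRA_STATE);
        if (extra == null) {
            return "normal"; // no attribute: regular processing
        }
        return "extended:" + new String(extra, StandardCharsets.UTF_8);
    }
}

final class AttributeObserverDemo {
    public static void main(String[] args) {
        AttributedOperation plain = new AttributedOperation();
        AttributedOperation tagged = new AttributedOperation()
            .setAttribute(AttributeObserver.EXTRA_STATE, "v1".getBytes(StandardCharsets.UTF_8));
        System.out.println(AttributeObserver.prePut(plain));
        System.out.println(AttributeObserver.prePut(tagged));
    }
}
```

The cell itself stays an ordinary Put cell in this arrangement; only the presence of the attribute changes the processing path.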
>>>> 
>>>> 
>>>> On Fri, Sep 29, 2017 at 2:43 PM, Chia-Ping Tsai <chia7712@apache.org> wrote:
>>>> 
>>>>>> Instead of a custom cell, could you use a regular cell with a custom
>>>>>> operation attribute (see OperationWithAttributes).
>>>>> Pardon me, I didn't get what you said.
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 2017-09-30 04:31, Andrew Purtell <apurtell@apache.org> wrote:
>>>>>> Instead of a custom cell, could you use a regular cell with a custom
>>>>>> operation attribute (see OperationWithAttributes).
>>>>>> 
>>>>>> On Fri, Sep 29, 2017 at 1:28 PM, Chia-Ping Tsai <chia7712@apache.org> wrote:
>>>>>> 
>>>>>>> The custom cell helps us reduce memory consumption. We don't have our
>>>>>>> own serialization/deserialization mechanism, so transferring data from
>>>>>>> client to server needs many conversion phases (user data -> Put/Cell ->
>>>>>>> pb object). The cost of conversion is large when transferring bulk
>>>>>>> data. In fact, we also have a custom mutation to manage the memory
>>>>>>> usage of the inner cell collection.
>>>>>>> 
>>>>>>>> On 2017-09-30 02:43, Andrew Purtell <apurtell@apache.org> wrote:
>>>>>>>> What are the use cases for a custom cell? It seems a dangerously low
>>>>>>>> level thing to attempt and perhaps we should unwind support for it.
>>>>>>>> But perhaps there is a compelling justification.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Sep 28, 2017 at 10:20 PM, Chia-Ping Tsai <chia7712@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> Thanks for all the comments.
>>>>>>>>> 
>>>>>>>>> The problem I want to resolve is that the valid type code should be
>>>>>>>>> exposed as IA.Public. Otherwise, end users have to access an
>>>>>>>>> IA.Private class to build a custom cell.
>>>>>>>>> 
>>>>>>>>> For example, I have a use case that plays a streaming role in our
>>>>>>>>> application. It uses the CellBuilder (HBASE-18519) to build custom
>>>>>>>>> cells. These cells share many fields, so they are put in shared
>>>>>>>>> memory to avoid GC pauses. Everything is wonderful. However, we have
>>>>>>>>> to access the IA.Private class - KeyValue#Type - to get the valid
>>>>>>>>> code of Put.
>>>>>>>>> 
>>>>>>>>> I believe there are many use cases for custom cells, and
>>>>>>>>> consequently it is worth adding a way to get the valid type via an
>>>>>>>>> IA.Public class. Otherwise, it may imply that the custom cell is
>>>>>>>>> built on an unstable foundation, because the related code can change
>>>>>>>>> at any time.
>>>>>>>>> --
>>>>>>>>> Chia-Ping
>>>>>>>>> 
>>>>>>>>>> On 2017-09-29 00:49, Andrew Purtell <apurtell@apache.org> wrote:
>>>>>>>>>> I agree with Stack. Was typing up a reply to Anoop but let me move
>>>>>>>>>> it down here.
>>>>>>>>>> 
>>>>>>>>>> The type code exposes some low level details of how our current
>>>>>>>>>> stores are architected. But what if in the future you could swap out
>>>>>>>>>> HStore implements Store with PStore implements Store, where HStore
>>>>>>>>>> is backed by HFiles and PStore is backed by Parquet? Just as a
>>>>>>>>>> hypothetical example. I know there would be larger issues if this
>>>>>>>>>> were actually attempted. Bear with me. You can imagine some
>>>>>>>>>> different new Store implementation that has some advantages but is
>>>>>>>>>> not a design derived from the log structured merge tree if you like.
>>>>>>>>>> Most values from a new Cell.Type based on KeyValue.Type wouldn't
>>>>>>>>>> apply to cells from such a thing because they are particular to how
>>>>>>>>>> LSMs work. I'm sure such a project if attempted would make a number
>>>>>>>>>> of changes requiring a major version increment and low level details
>>>>>>>>>> could be unwound from Cell then, but if we could avoid doing it in
>>>>>>>>>> the first place, I think it would be better for maintainability.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Thu, Sep 28, 2017 at 9:39 AM, Stack <stack@duboce.net> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Sep 28, 2017 at 2:25 AM, Chia-Ping Tsai <chia7712@apache.org> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> hi folks,
>>>>>>>>>>>> 
>>>>>>>>>>>> Users are allowed to create custom cells, but the valid type code -
>>>>>>>>>>>> KeyValue#Type - is declared as IA.Private. As I see it, we should
>>>>>>>>>>>> expose KeyValue#Type as Public Client. Three possible ways are shown
>>>>>>>>>>>> below:
>>>>>>>>>>>> 1) Change declaration of KeyValue#Type from IA.Private to IA.Public
>>>>>>>>>>>> 2) Move KeyValue#Type into Cell.
>>>>>>>>>>>> 3) Move KeyValue#Type to upper level
>>>>>>>>>>>> 
>>>>>>>>>>>> Any suggestions?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> What is the problem that we are trying to solve Chia-Ping? You want
>>>>>>>>>>> to make Cells of a new Type?
>>>>>>>>>>> 
>>>>>>>>>>> My first reaction is that KV#Type is particular to the KV
>>>>>>>>>>> implementation. Any new Cell implementation should not have to adopt
>>>>>>>>>>> the KeyValue typing mechanism.
>>>>>>>>>>> 
>>>>>>>>>>> S
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Best regards,
>>>>>>>>>> Andrew
>>>>>>>>>> 
>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>>>>>>>> decrepit hands
>>>>>>>>>>  - A23, Crosstalk
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>> Andrew
>>>>>>>> 
>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>>>>>> decrepit hands
>>>>>>>>  - A23, Crosstalk
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Best regards,
>>>>>> Andrew
>>>>>> 
>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>>>> decrepit hands
>>>>>>  - A23, Crosstalk
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Best regards,
>>>> Andrew
>>>> 
>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>> decrepit hands
>>>>  - A23, Crosstalk
>>>> 
>> 
