hbase-dev mailing list archives

From Vrushali C <vrushalic2...@gmail.com>
Subject Re: [DISCUSS] Increase stability on o.a.h.h.Tag?
Date Fri, 22 Sep 2017 19:27:44 GMT
Thanks everyone.

bq. Thanks for the context in this thread, Vrushali. The information you've
provided has been helpful.

Sure, happy to chat further if anyone wants.

Here are the two files which actually operate on Tags in YARN Timeline
service. We are using HBase 1.2.6 at present.

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunCoprocessor.java

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowScanner.java

There are some utility functions in
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/HBaseTimelineStorageUtils.java

which are called from the FlowScanner and FlowRunCoprocessor classes.

thanks
Vrushali


On Fri, Sep 22, 2017 at 10:52 AM, Josh Elser <elserj@apache.org> wrote:

> Sounds like we have some consensus to make a switch to
> LimitedPrivate(COPROC)+Evolving on Tag (and CellUtil methods?) for 2.0? I
> don't think we have to make API changes (to mitigate Sean's concerns of
> slowing 2.0). We can treat this as a "bigger" promise moving forward.
>
> I can open a JIRA issue to hash out the rest of the specifics if we're in
> agreement.
>
> Thanks for the context in this thread, Vrushali. The information you've
> provided has been helpful.
>
>
> On 9/22/17 1:09 PM, Andrew Purtell wrote:
>
>> In my opinion that's a valid use case we should support with appropriate
>> changes to interface annotations and, therefore, stability. I don't believe
>> the interfaces have been changing much, so this shouldn't represent a
>> problem other than maybe we want to review what we have before promoting
>> them. The security coprocessors do the same: they use tags to add special
>> metadata to cells, then apply additional logic/filtering while overriding
>> some scanner behavior.
>>
>>
>> On Fri, Sep 22, 2017 at 9:54 AM, Vrushali C <vrushalic2016@gmail.com>
>> wrote:
>>
>>> For what it's worth, Yarn Timeline Service v2 makes use of Tags only in
>>> the coprocessor code in the custom Scanner that is invoked during
>>> get/scan/compact and at the PrePut step.
>>>
>>> thanks
>>> Vrushali
>>>
>>>
>>> On Fri, Sep 22, 2017 at 9:20 AM, Andrew Purtell <andrew.purtell@gmail.com>
>>> wrote:
>>>
>>>> I think not making the relevant APIs LP(Coprocessor) was an oversight. In
>>>> my opinion we should do that. I'm not sure about Public. We could do that
>>>> too but somewhere we need to call out that coprocessors have access to
>>>> tags, but not clients. (Tags are removed at RPC except for replication.)
>>>> LP doesn't imply what Public might.
>>>>
>>>> On Sep 22, 2017, at 9:11 AM, Andrew Purtell <andrew.purtell@gmail.com>
>>>> wrote:
>>>>
>>>>> Tags are server side internal metadata. Some carry sensitive information
>>>>> like labels. I guess this could appear odd if not around for discussion
>>>>> when they were introduced. So what documentation can be improved to lessen
>>>>> the surprise? Javadoc? Online book? A JIRA with suggestions welcome.
>>>>
>>>>>
>>>>>
>>>>> On Sep 22, 2017, at 9:07 AM, Josh Elser <elserj@apache.org> wrote:
>>>>>
>>>>>> I can appreciate how we've gotten to this point, it just struck me
>>>>>> extremely odd that the contents of a Tag weren't expected to be accessed
>>>>>> by users. "Arbitrary metadata that rides along with a cell, you just can't
>>>>>> see that metadata" ;)
>>>>>>
>>>>>> I totally understand not wanting to let another thing come into 2.0.
>>>>>> Like MikeD said, let's hope for a faster 3.0 and we can slate this for
>>>>>> that time.
>>>>>>
>>>>>> Thanks for entertaining the discussion. We'll just deal with the
>>>>>> "downstream pain" for 2.0.
>>>>
>>>>>
>>>>>> On 9/22/17 1:32 AM, ramkrishna vasudevan wrote:
>>>>>>> CellUtil  similar type of methods. Coming to Tags yes there are not
>>>>>>> much cases where clients can directly set Tags. And I think we don't
>>>>>>> expose any APIs which allow you to use mutations with Tags. So probably
>>>>>>> moving to LimitedPrivate is better and mark with Evolving if there are
>>>>>>> some users depending on the internals of Tags and its impl. But this
>>>>>>> will be a one-off case.
>>>>>>> And also since Tags are internal ideally the CellUtil#getTags() should
>>>>>>> have been in another Util method that is exposed with LimitedPrivate and
>>>>>>> also Tags if tags should be made LimitedPrivate. So this may help in not
>>>>>>> having a Private interface like Tag in a public CellUtil class.
>>>>>>> 3.0 is fine but need some clean up in 2.0? Indicating what could happen
>>>>>>> going forward from 2.0?
>>>>>>> Regards
>>>>>>> Ram
>>>>>>>
>>>>>>> On Fri, Sep 22, 2017 at 2:59 AM, Sean Busbey <busbey@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yeah. I mean, I think we should improve  the situation. Just think
>>>>>>>> it's too much to bite off at this stage of 2.0, we can aim for 3.0
>>>>>>>> and start working in some tooling to help us.
>>>>>>>>
>>>>>>>> On Thu, Sep 21, 2017 at 3:35 PM, Josh Elser <elserj@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> That really makes me groan (we have downstream users depending on
>>>>>>>>> code we've explicitly said "don't use"), but if that's what it is
>>>>>>>>> given the current state, so be it. My complaining won't fix it.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9/21/17 4:25 PM, Sean Busbey wrote:
>>>>>>>>>>
>>>>>>>>>> We have lots of examples of including non-Public stuff in Public
>>>>>>>>>> APIs. we have docs that advise folks to be wary on relying on them
>>>>>>>>>> beyond opaque symbols.
>>>>>>>>>>
>>>>>>>>>> ref: http://hbase.apache.org/book.html#hbase.client.api.surface
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 21, 2017 at 3:21 PM, Josh Elser <elserj@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I was going to suggest LimitedPrivate in my original, but this
>>>>>>>>>>> doesn't make sense as we're exposing Public API via CellUtil.
>>>>>>>>>>>
>>>>>>>>>>> It seems odd to me that we wouldn't treat the cell tags as a
>>>>>>>>>>> supported API call. However, I'm happy to remain "confused" if the
>>>>>>>>>>> rest of folks don't consider tags to be intended for users :)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 9/21/17 3:15 PM, Ted Yu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Can we mark Tag LimitedPrivate ?
>>>>>>>>>>>>
>>>>>>>>>>>> We know how ATS uses Tags so it should be straight forward to
>>>>>>>>>>>> keep their usage intact.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 21, 2017 at 12:03 PM, Josh Elser <elserj@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hiya,
>>>>>>>>>>>>>
>>>>>>>>>>>>> (Background, I'm starting what is likely to be an onerous task of
>>>>>>>>>>>>> looking through downstream components and seeing what is broken
>>>>>>>>>>>>> with the latest hbase-2.0.0*)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looking at YARN's use of HBase for the Application TimelineServer,
>>>>>>>>>>>>> I see that they're relying on the Tag interface.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Presently, Tag is marked as Private, yet we expose it via the
>>>>>>>>>>>>> Public CellUtil.
>>>>>>>>>>>>>
>>>>>>>>>>>>> My gut reaction is that we should bump Tag up Public since the
>>>>>>>>>>>>> intent is for downstream users to, ya know, use those Tags. Any
>>>>>>>>>>>>> objections?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If we don't want to expose Tag, we should make a pass over the
>>>>>>>>>>>>> Public methods and mark them as Private (so not as to provide a
>>>>>>>>>>>>> Public method with Private objects). CellUtil#getTag(Cell, byte)
>>>>>>>>>>>>> would be one such example.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Josh
