phoenix-dev mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: Moving Phoenix master to Hbase 2.2
Date Wed, 15 Jan 2020 00:46:26 GMT
Still not having looked at what Tephra does -- I'm intrigued by what 
Istvan has in-progress. Waiting to see what he comes up with would be my 
suggestion :)

On 1/14/20 1:12 PM, larsh@apache.org wrote:
>   Does somebody volunteer to take this up?
> I can see whether I can get a resource where I work, but it's highly uncertain.
> It would need a bit of digging and design work to see how we would abstract the HBase
> interface in the most effective way.
> As mentioned below, Tephra did a good job at this and could serve as an example here.
> (Not dinging OMID, OMID does most of its work client side and doesn't need these abstractions.)
> -- Lars
> 
>      On Tuesday, January 14, 2020, 01:13:36 AM PST, István Tóth <stoty@cloudera.com.invalid> wrote:
>   
>   Yes, the HBase API signatures change between versions, so we need to
> compile each compat module against a specific HBase.
> 
> Whether I can define an internal compatibility API that is switchable at
> run (startup) time without a performance hit remains to be seen.
> 
> István
> 
> On Tue, Jan 14, 2020 at 3:21 AM Josh Elser <elserj@apache.org> wrote:
> 
>> Agree that trying to wrangle branches is just too frustrating and
>> error-prone.
>>
>> It would also be great if we could have a single Phoenix jar that works
>> across HBase versions, but would not die on that hill :)
>>
>> On 12/20/19 5:04 AM, larsh@apache.org wrote:
>>>    I said _provided_ they can be isolated easily :) (I meant it in the
>>> sense of assuming it's easy).
>>> As I said though, Tephra has a similar problem and they did a really
>>> good job isolating HBase versions. We can learn from them. Sometimes they
>>> isolate the change only, and sometimes the class needs to be copied, but
>>> even then it's the one class that is copied, not another branch that needs
>>> to be kept in sync.
>>>
>>> This may also drive the desperately necessary refactoring of Phoenix to
>>> make these things easier to isolate, or to reduce the copying to a minimum.
>>> And we'd need to think through testing carefully.
>>>
>>> The branch per Phoenix and HBase version is too complex, IMHO. And the
>>> complex branch to HBase version mapping that Istvan outlines below confirms
>>> that.
>>>
>>> We should all take a brief look at the Tephra solution and see whether
>>> we can apply that. (And since Tephra is part of the fold now, perhaps
>>> someone can help there...?)
>>> Cheers.
>>> -- Lars
>>>
>>>        On Thursday, December 19, 2019, 8:34:15 PM GMT+1, Geoffrey Jacoby <gjacoby@gmail.com> wrote:
>>>
>>>    Lars,
>>>
>>> I'm curious why you say the differences are easily isolated -- many of the
>>> core classes of Phoenix either directly inherit HBase classes or implement
>>> HBase interfaces, and those can vary between minor versions. (See my above
>>> example of a new coprocessor hook on BaseRegionObserver.)
>>>
>>> Geoffrey
>>>
>>> On Thu, Dec 19, 2019 at 10:54 AM larsh@apache.org <larsh@apache.org> wrote:
>>>
>>>>      Yep. The differences are pretty minimal - provided they can be isolated
>>>> easily.
>>>> Tephra might be a pretty good model. It supports various versions of HBase
>>>> in a single branch and has similar issues as Phoenix (coprocessors, etc).
>>>> -- Lars
>>>>        On Thursday, December 19, 2019, 7:07:51 PM GMT+1, Josh Elser <elserj@apache.org> wrote:
>>>>
>>>>      To clarify, you think that compat modules are better than that
>>>> separate-branches model in 4.x?
>>>>
>>>> On 12/18/19 11:29 AM, larsh@apache.org wrote:
>>>>> This is really hard to follow.
>>>>>
>>>>> I think we should do the same with HBase dependencies in Phoenix that
>>>>> HBase does with Hadoop dependencies.
>>>>>
>>>>> That is:  We could have a maven module with the specific HBase version
>>>>> dependent code.
>>>>> Btw. Tephra does the same... A module for HBase version specific code.
>>>>> -- Lars
>>>>>
>>>>>          On Tuesday, December 17, 2019, 10:00:31 AM GMT+1, Istvan Toth <stoty@apache.org> wrote:
>>>>>
>>>>>      What do you think about tying the minor releases to HBase minor releases
>>>>> (not necessarily one-to-one)?
>>>>>
>>>>> for example (provided 5.1 is 2020H1)
>>>>>
>>>>> 5.0.0 -> HB 2.0
>>>>> 5.1.0 -> HB 2.2.2 (and whatever 2.1 is API compatible with it)
>>>>> 5.1.x -> HB 2.2.x (treat as maintenance branch, no major new features)
>>>>> 5.2.0 -> HB 2.3.0 (if released by that time)
>>>>> 5.2.x -> HB 2.3.x (treat as maintenance branch, no major new features)
>>>>> 5.3.0 -> HB 2.3.x (if there is no new major/minor HBase release)
>>>>> master -> latest released HBase version
>>>>>
>>>>> Alternatively, we could stick with the same HBase version for patch
>>>>> releases that we used for the first minor release.
>>>>>
>>>>> This would limit the number of branches that we have to maintain in
>>>>> parallel, while providing maintenance branches for older releases, and
>>>>> timely-ish Phoenix releases.
>>>>>
>>>>> The drawback is that users of old HBase versions won't get the latest
>>>>> features, on the other hand they can expect more polish.
>>>>>
>>>>> Istvan
>>>>>
>>>>> On Thu, Dec 12, 2019 at 8:05 PM Geoffrey Jacoby <gjacoby@apache.org> wrote:
>>>>>
>>>>>> Since HBase 2.0 is EOM'ed, I'm +1 for not worrying about 2.0.x
>>>>>> compatibility with the 5.x branch going forward.
>>>>>>
>>>>>> Given how coupled Phoenix is to the implementation details of HBase though,
>>>>>> I'm not sure trying to abstract those away to keep one Phoenix branch per
>>>>>> HBase major version is practical. At the least, it would be really
>>>>>> complex.
>>>>>>
>>>>>> For example, in the new year I plan to return to working on the change data
>>>>>> capture and Phoenix-level replication features, both of which depend on
>>>>>> WALKey interface changes and a new RegionObserver coprocessor hook
>>>>>> introduced in HBASE-22622 and HBASE-22623. This was released in HBase 1.5
>>>>>> and will be in the forthcoming HBase 2.3. While the HBase community is
>>>>>> discussing EOMing 1.3 right now, and maybe 1.4 will go in the medium term,
>>>>>> I don't see all pre-2.3 branch-2's getting deprecated anytime soon.
>>>>>>
>>>>>> So there will be at least two significant features that can only exist in
>>>>>> some but not all of our 4.x and 5.x branches.
>>>>>>
>>>>>> Geoffrey
>>>>>>
>>>>>> On Thu, Dec 12, 2019 at 8:21 AM Josh Elser <elserj@apache.org> wrote:
>>>>>>
>>>>>>> As much as possible, I'd like to avoid us getting into another situation
>>>>>>> with 5.x where we have multiple branches. My hope was/is that we can
>>>>>>> keep one Phoenix5 branch that works against an acceptable set of HBase
>>>>>>> branches.
>>>>>>>
>>>>>>> To me, that acceptable set of HBase branches is _a_ 2.1 and 2.2 release.
>>>>>>> I don't think we need to support all 2.1.x or 2.2.x, nor do I think we
>>>>>>> need to keep trying to maintain 2.0.x as it's already end of support by
>>>>>>> the HBase community.
>>>>>>>
>>>>>>> Thanks for updating your PR. I'll add this to my review queue.
>>>>>>>
>>>>>>> On 12/12/19 1:52 AM, Istvan Toth wrote:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> I'd like to start a conversation about supporting HBase 2.2 in the
>>>>>>>> master branch.
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/PHOENIX-5268 has a slightly out
>>>>>>>> of date, but functional PR for HBase 2.2 support on master. (Please review
>>>>>>>> and comment if you have the time, I'll try to update the PR in the next
>>>>>>>> few days)
>>>>>>>>
>>>>>>>> The reason that it is not a straightforward decision to merge it is that
>>>>>>>> applying that patch breaks compatibility with HBase 2.0.1, the current
>>>>>>>> base.
>>>>>>>>
>>>>>>>> I can see the following outcomes:
>>>>>>>>
>>>>>>>> - Do nothing
>>>>>>>> - Move master to HBase 2.2.2
>>>>>>>> - Fork master to Hbase-2.0 and Hbase-2.2 branches
>>>>>>>> - Build time compatibility modules
>>>>>>>> - Run time compatibility modules
>>>>>>>> - Something that I haven't thought of
>>>>>>>>
>>>>>>>>
>>>>>>>> Doing nothing is obviously not a long term solution, as the current
>>>>>>>> master doesn't work with any of the currently supported HBase branches,
>>>>>>>> but we may postpone the inevitable.
>>>>>>>>
>>>>>>>> Simply moving master to HBase 2.2 is the most attractive solution from a
>>>>>>>> pure developer POV, but there may be other considerations.
>>>>>>>>
>>>>>>>> Having multiple masters for 2.0 and 2.2 is simple from a code
>>>>>>>> perspective, but maintaining two branches is a non-trivial amount of
>>>>>>>> additional work. (See the 4.x situation)
>>>>>>>>
>>>>>>>> Moving the HBase version dependent stuff into a separate module, and
>>>>>>>> choosing at build time is not pretty from a code POV, but saves us the
>>>>>>>> hassle of maintaining multiple branches, while maintaining compatibility
>>>>>>>> with multiple HBase versions, and can handle future API changes as well
>>>>>>>> from a single branch. Doing something like this could have saved us the
>>>>>>>> effort of maintaining three separate 4.x branches.
>>>>>>>>
>>>>>>>> I feel that since Phoenix is closely tied to HBase, and requires
>>>>>>>> cluster-wide HBase configuration to work anyway, handling the different
>>>>>>>> HBase versions from the same binary/JAR is not worth the effort.
>>>>>>>>
>>>>>>>> Please share your thoughts!
>>>>>>>>
>>>>>>>> regards
>>>>>>>> Istvan
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
> 
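As a rough illustration of the "run (startup) time compatibility modules" option discussed in this thread, here is a minimal Java sketch. It is not taken from Phoenix or Tephra: the names HBaseCompat, Compat20, Compat22 and CompatLoader are hypothetical, and the sketch assumes that the caller supplies the HBase version string and that there is one shim per HBase minor release line. In a real build, each shim would presumably live in its own Maven module compiled against its specific HBase version.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical internal compatibility API; illustrative only. */
interface HBaseCompat {
    /** The HBase minor release line this shim was compiled against, e.g. "2.2". */
    String targetLine();
}

/** Hypothetical shim for the HBase 2.0 line. */
final class Compat20 implements HBaseCompat {
    public String targetLine() { return "2.0"; }
}

/** Hypothetical shim for the HBase 2.2 line. */
final class Compat22 implements HBaseCompat {
    public String targetLine() { return "2.2"; }
}

/** Picks one shim at startup, based on the HBase version string it is given. */
final class CompatLoader {
    private static final List<HBaseCompat> SHIMS =
        Arrays.asList(new Compat20(), new Compat22());

    /** Reduces a version such as "2.2.2" to its release line "2.2". */
    static String releaseLine(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    /** Returns the shim matching the given HBase version, or fails fast. */
    static HBaseCompat forVersion(String hbaseVersion) {
        String line = releaseLine(hbaseVersion);
        for (HBaseCompat shim : SHIMS) {
            if (shim.targetLine().equals(line)) {
                return shim;
            }
        }
        throw new IllegalStateException("No compat module for HBase " + hbaseVersion);
    }
}
```

The lookup happens once at startup, so the rest of the code would call through the selected HBaseCompat instance. Whether that indirection can be made free of a performance hit is the open question István raises above.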
