hbase-dev mailing list archives

From Enis Söztutar <enis....@gmail.com>
Subject Re: Handling protocol versions
Date Fri, 28 Dec 2012 01:37:33 GMT
I think what Devaraj describes is a valid use case, and I am sure we will
need it a few times. However, I suspect each of these might be unique, and
we have to deal with how to handle backwards-forwards compat from the
client differently (imagine META moving to zk, after 0.96). So we cannot
easily generalize, and we may still have to drop support for features
gradually.

If we still keep the version, do we bump it every time a parameter is added
to a method, or only when a new method is added? It does not sound very
maintainable.

Not knowing much about the recent changes, why don't we go full PB, and
define actual rpc methods as services? (as in
https://developers.google.com/protocol-buffers/docs/proto#services)


On Thu, Dec 27, 2012 at 1:13 PM, Jimmy Xiang <jxiang@cloudera.com> wrote:

> +1 for removing VersionedProtocol and SignatureProtocol
> +0 for VersionedService/ProtocolDescriptor
>
> If we do have VersionedService/ProtocolDescriptor, it will most likely be
> used in some mixed environment (most likely a new client and mixed
> versions of HBase servers, since an old client doesn't know any new
> feature, and an old client doesn't assume an existing feature will be
> gone in the future either).
>
> With PB, I think we are going to support a rolling-upgrade path.  That
> means some mixed versions of HBase servers can be compatible. For
> enterprise, I think it is not that hard to maintain compatible HBase
> clusters.  So I don't think it is absolutely needed.
>
> Thanks,
> Jimmy
>
> On Thu, Dec 27, 2012 at 12:05 PM, Stack <stack@duboce.net> wrote:
>
> > So, picking up this thread again because I'm working on
> > https://issues.apache.org/jira/browse/HBASE-6521 "Address the handling
> > of multiple versions of a protocol", the original question was
> > two-fold as I read it.
> >
> > 1. Should we keep VersionedProtocol?
> > 2. How does a client figure if a server supports a particular capability?
> >
> > On question 1:
> >
> > VersionedProtocol [1] does two things.  It returns the server version of
> > the protocol and separately, a "ProtocolSignature" Writable which allows
> > you to get a 'hash' of the server's protocol method signatures.  There is
> > an implication that the server will give out different versions of the
> > protocol dependent on what version the client volunteers (not the case)
> > and it is implied that the client does something with these method hash
> > signatures.  It doesn't.
> >
> > So, VP is a Writable that returns Writables we don't make use of,
> > implying a functionality unrealized.
> >
> > That's how I read it.  Objections? [3]
> >
> > It sounds like at least ProtocolSignature can go.  If we did want to go
> > the route ProtocolSignature implies, we should probably do the native
> > protobuf thing and make use of ServiceDescriptors, protobuf descriptions
> > of what a protobuf Service exposes [2].
> >
> > That leaves the VP's return of the server protocol version as all that
> > remains 'useful'.
> >
> > But is it? Is version going to be useful going forward?  If we lean on
> > version, clients will have to keep a registry of versions to available
> > methods.  Or ask the server what it has and somehow sort through the
> > return to figure what it can and cannot make sense of by method.  Sounds
> > like a bunch of work.
> >
> > At a minimum, VP will have to be protobuf'd so it is going to have to
> > change.  And we should probably add a bit more info to the return since
> > we are going to the trouble of an RPC anyways.
> >
> > This serves as a lead in to question 2:
> >
> > Protobuf as is helps in the case where an ipc takes an extra parameter or
> > adds extra info to the return; the majority of the evolutions that will
> > be happening in the ipc interface.  But what to do about the scenario
> > Devaraj outlines at the head of the thread where we have shipped a method
> > that causes the server to OOME in production or we add a method to the
> > server that runs ten times faster than the old one?  Or probably more
> > likely, the server has a whole new 'feature' (as Todd calls it) orthogonal
> > to the set the protocol version implies?  How does the client figure the
> > new feature is available?
> >
> > We could have the client try the invocation -- as Jimmy suggests -- and
> > if it fails, register the failure in a client-wide map so we avoid
> > retrying on each invocation (We should just do this anyways).  The client
> > could go back to the server and do the above suggested query of server
> > capabilities and then adjust the call accordingly or, since we are doing
> > an ipc setup call anyways, we could have the server return the list of
> > capabilities at this time.  The client could cache what is available or
> > not and just ask the server when convenient for it.
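
A rough sketch of the "try it, and remember the failure in a client-wide map" idea described above. The class name, the key format, and the use of UnsupportedOperationException to signal "server has no such method" are all assumptions made for illustration; none of this is existing HBase code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Client-wide record of calls a given server version has already rejected. */
final class UnsupportedMethodCache {
    // Keys look like "<serverVersion>#<methodName>"; the set is safe to share across threads.
    private final Set<String> unsupported = ConcurrentHashMap.newKeySet();

    /**
     * Tries the preferred call until a (version, method) pair is rejected once.
     * After a rejection it remembers the pair and goes straight to the fallback.
     */
    <T> T call(String serverVersion, String method,
               Supplier<T> preferred, Supplier<T> fallback) {
        String key = serverVersion + "#" + method;
        if (unsupported.contains(key)) {
            return fallback.get();
        }
        try {
            return preferred.get();
        } catch (UnsupportedOperationException e) {
            // Assumption: the RPC layer surfaces "no such method" as this exception.
            unsupported.add(key);
            return fallback.get();
        }
    }

    /** True if an earlier call already found this method missing on this version. */
    boolean isKnownUnsupported(String serverVersion, String method) {
        return unsupported.contains(serverVersion + "#" + method);
    }
}
```

With something like this, a second call against the same server version skips the failing invocation and goes directly to the old method, so the retry cost is paid once per version rather than once per call.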
> >
> > Using the bitmap shorthand describing what is available seems like it
> > would be less work to do than implementing protobuf service
> > description/interrogation and then dynamically composing method calls.
> >
> > Proposal:
> >
> > + Remove VersionedProtocol and SignatureProtocol
> > + Instead of VP, add a new Interface called VersionedService or probably
> > better, ProtocolDescriptor, that all RPC Protocols implement.  It has
> > methods (getDescriptor) to return a pb Message that has the server
> > version of the protocol and a bitmap of features the server implements.
> > This is the call we will make when we set up the ipc proxy.  Clients can
> > cache the result.  Every time we change a Service/Protocol, we set a
> > particular bit in the Service/Protocol bitmap.  This new Interface might
> > also return the long form pb ServiceDescriptors (the pb
> > getDescriptorForType from Service Interface).  It could be useful for
> > debugging.
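
One way the proposed descriptor could look on the client side, as a minimal sketch only. A plain Java class stands in for the pb Message, and the feature names and bit positions are invented for the example; the proposal itself does not define any of them.

```java
import java.util.EnumSet;

/** Hypothetical feature flags; each one owns a fixed bit position that is never reused. */
enum Feature {
    IMPROVED_FOO(0),
    ITERATOR_LISTING(1),
    BULK_DELETE(2);

    final int bit;

    Feature(int bit) {
        this.bit = bit;
    }
}

/** Stand-in for the pb Message a getDescriptor() call might return. */
final class ProtocolDescription {
    final long protocolVersion;
    final long featureBits;

    ProtocolDescription(long protocolVersion, EnumSet<Feature> features) {
        this.protocolVersion = protocolVersion;
        long bits = 0L;
        for (Feature f : features) {
            bits |= 1L << f.bit;
        }
        this.featureBits = bits;
    }

    /** True if the server advertised this feature in its bitmap. */
    boolean supports(Feature f) {
        return (featureBits & (1L << f.bit)) != 0;
    }
}
```

A client would fetch this once at proxy setup, cache it, and call supports(...) before choosing between an old and a new method.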
> >
> > What you lot think?
> >
> > St.Ack
> >
> > 1. http://svn.apache.org/viewvc/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java?view=markup
> > 2. https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/Service
> > 3. We have VP and PS because, as I understand it, we once thought that we
> > would support choosing between protocol and protocol versions and that
> > we'd support both protobufs and Writables.  This is no longer wanted.
> >
> > On Fri, Aug 3, 2012 at 11:40 AM, Devaraj Das <ddas@hortonworks.com> wrote:
> >
> > > Responses inline..
> > >
> > > > On Wed, Aug 1, 2012 at 11:04 AM, Todd Lipcon <todd@cloudera.com> wrote:
> > > >> One possibility:
> > > >>
> > > >> During the IPC handshake, we could send the full version string /
> > > >> source checksum. Then, have a client-wide map which caches which
> > > >> methods have been found to be supported or not supported for an
> > > >> individual version. So, we don't need to maintain the mapping
> > > >> ourselves, but we also wouldn't need to do the full retry every time.
> > > >>
> > >
> > > Yeah this is what I was thinking as the alternate to the current
> > > approach of using VersionedProtocol.
> > >
> > > >> A different idea would be to introduce a call like
> > > >> "getServerCapabilities()" which returns a bitmap, and define a bit
> > > >> each time that we add a new feature.
> > > >>
> > > >> The advantage of these approaches vs a single increasing version
> > > >> number is that we sometimes want to backport a new IPC to an older
> > > >> version, but not backport all of the intervening IPCs. Having a
> > > >> bitmap allows us to "pick and choose" on backports without having to
> > > >> pull in a bunch of things we didn't necessarily want.
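
To make the backport point concrete, here is a small sketch contrasting the two models. The IPC names and numbering are hypothetical, and "version implies every earlier IPC" is an assumed reading of how a single increasing version number would be interpreted by clients.

```java
import java.util.BitSet;

final class BackportCheck {
    // Hypothetical bit positions, assigned in the order the IPCs were added on trunk.
    static final int IPC_A = 0;
    static final int IPC_B = 1;
    static final int IPC_C = 2;

    /** Version-number model: a server at version v is assumed to have every IPC numbered <= v. */
    static boolean supportsByVersion(int serverVersion, int ipc) {
        return ipc <= serverVersion;
    }

    /** Bitmap model: each IPC is advertised independently of the others. */
    static boolean supportsByBitmap(BitSet capabilities, int ipc) {
        return capabilities.get(ipc);
    }

    /** An older release that has IPC_A, never got IPC_B, and received a backport of IPC_C. */
    static BitSet olderReleaseWithBackport() {
        BitSet caps = new BitSet();
        caps.set(IPC_A);
        caps.set(IPC_C);
        return caps;
    }
}
```

Under the version model the older release has two bad options: report version 0 and hide the backported IPC_C, or report version 2 and wrongly imply IPC_B. The bitmap describes it exactly.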
> > > >>
> > >
> > > Good point.
> > >
> > > >> On Wed, Aug 1, 2012 at 1:41 AM, Stack <stack@duboce.net> wrote:
> > > >>> On Tue, Jul 31, 2012 at 1:47 AM, Devaraj Das <ddas@hortonworks.com> wrote:
> > > >>>> Wondering whether we should retain the VersionedProtocol now that we
> > > >>>> have protobuf implementation for most (all?) of the protocols. I think
> > > >>>> we still need the version checks and do them when we need to. Take
> > > >>>> this case:
> > > >>>> 1. Protocol Foo has as one of the methods FooMethod(FooMethodRequest).
> > > >>>> 2. Protocol Foo evolves over time, and the FooMethod(FooMethodRequest)
> > > >>>> now has a better implementation called
> > > >>>> FooMethod_improved(FooMethodRequest).
> > > >>>> 3. HBase installations have happened with both the protocol
> > > >>>> implementations.
> > > >>>> 4. Clients should be able to talk to both old and new servers (and
> > > >>>> invoke the newer implementation of FooMethod if the protocol
> > > >>>> implements it).
> > > >>>>
> > > >>>> (4) is possible when the getProtocolVersion is implemented by the
> > > >>>> protocol at the server. The client could check what the version of the
> > > >>>> protocol was (assuming VersionedProtocol semantics where the protocol
> > > >>>> version number is upgraded for such significant changes) and depending
> > > >>>> on that invoke the appropriate method...
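
The version check in (4) might look roughly like the following. The FooProtocol interface, the Java method names, and the threshold value are placeholders modelled on the Foo example above; they are assumptions, not real HBase types.

```java
final class FooClient {
    /** Hypothetical protocol version at which FooMethod_improved first appeared. */
    static final long IMPROVED_SINCE_VERSION = 2L;

    /** Minimal stand-in for the server-side protocol; not an HBase interface. */
    interface FooProtocol {
        long getProtocolVersion();

        String fooMethod(String request);

        String fooMethodImproved(String request);
    }

    /** Picks the improved method only when the reported version says it exists. */
    static String invoke(FooProtocol server, String request) {
        if (server.getProtocolVersion() >= IMPROVED_SINCE_VERSION) {
            return server.fooMethodImproved(request);
        }
        return server.fooMethod(request);
    }
}
```

The client has to carry the IMPROVED_SINCE_VERSION constant itself, which is the version-to-method mapping the next paragraph calls arcane.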
> > > >>>>
> > > >>>> Having to map version-numbers of protocols to the methods-supported
> > > >>>> is probably arcane IMO but works..
> > > >>>>
> > > >>>> The other approach (that wouldn't require the version#) is to do
> > > >>>> something like - On the client side, get the protocol methods
> > > >>>> supported at the server (and cache it) and then look this map up
> > > >>>> whenever needed to decide which method to invoke.
> > > >>>>
> > > >>>> Any thoughts on whether we should invest time in the second approach
> > > >>>> yet?
> > > >>>>
> > > >>>
> > > >>> The VersionedProtocol w/ client being able to interrogate what
> > > >>> methods a server supports strikes me as a facility that will be
> > > >>> rarely used if at all and bringing it along, keeping up the directory
> > > >>> of supported methods, will take a load of work on our part that we'll
> > > >>> do less than perfectly so should it ever be needed, it won't work
> > > >>> because we let it go stale.
> > > >>>
> > >
> > > Yeah, this won't be a common case. It'd (hopefully) be rare. The
> > > directory of methods would be the methods in the protocol-interface at
> > > the server that could be figured by invoking reflection (and hence
> > > staleness issue shouldn't happen).
> > >
> > > >>> What do you reckon?
> > > >>>
> > > >>> The above painted scenario too is a little on the exotic side.  We
> > > >>> can do something like Jimmy suggests in those rare cases we need to
> > > >>> add a new method because there is insufficient wiggle-room w/i the
> > > >>> particular PB method call (If we get into the issue Ted raises where
> > > >>> we'd have to go back to the server twice because there is a third new
> > > >>> method call, we're doing our API wrong).
> > > >>>
> > >
> > > Agree that the exception handling hack can be played here.. In general,
> > > having some solution around this might be really helpful *if* we get
> > > some API wrong (e.g., indirect implication on memory by the API
> > > semantics) and we need to fix it without breaking compatibility.. In
> > > HDFS, listFile proved to be a memory killer for extremely large
> > > directories and people implemented the iterator version of the same.
> > >
> > > >>> The protocol needs a version though.  We'll be still sending that
> > > >>> 'hrpc' long in the header preamble?  Should we add a version long
> > > >>> after the 'hrpc' long?
> > > >>>
> > >
> > > The version in "hrpc" is the RPC version (as opposed to protocol
> > > version). I think that's orthogonal to this discussion..
> > >
> > > >>> As to a directory of supported methods, do we need this in the
> > > >>> protocol at all?  Can't this be knowledge kept outside of the
> > > >>> on-the-wire back and forth?
> > > >>>
> > > >>> St.Ack
> > > >>
> > > >>
> > >
> > > As I answered above, and as Todd also says, it probably makes sense to
> > > have a client wide cache for protocol<->supported-methods .. and look
> > > up the cache when and if the client needs to decide between different
> > > versions of a method, or picking a new method, based on the server it is
> > > talking to...
> >
>
