geode-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Stolz <mst...@pivotal.io>
Subject Re: DISCUSS: Deprecating and replacing current serializable string encodings
Date Tue, 05 Dec 2017 04:45:37 GMT
Anything that breaks data on disk is also a big PITA. This change would
break data on disk.

--
Mike Stolz
Principal Engineer, GemFire Product Lead
Mobile: +1-631-835-4771

On Mon, Dec 4, 2017 at 1:52 PM, Dan Smith <dsmith@pivotal.io> wrote:

> The new protocol is currently translating from PDX->JSON before sending
> results to the clients so the client doesn't have to understand PDX or
> DataSerializable.
>
> There is a lot more to DataSerializable than just how a String is
> serialized. And it's not even documented that I am aware of. Just tweaking
> the string format is not going to make that much better. Your hypothetical
> ruby developer is in trouble with or without this proposed change.
>
> Breaking compatibility is a huge PITA for our users. We should do that when
> we are actually giving them real benefits. In this case if we were
> switching to some newer PDX format that was actually easy to implement
> deserialization logic I could see the argument for breaking compatibility.
> Just changing the string format without fixing the rest of the issues
> around DataSerializable isn't providing real benefits.
>
> You can't assume that a client in one language will only be serializing
> > strings for it's own consumption.
> >
>
> I wasn't making that assumption. The suggestion is that the C++ client
> would have to deserialize all 4 valid formats, but it could just always
> serialize data using the valid UTF-16 format. All other clients should be
> able to deserialize that.
>
> -Dan
>
> On Fri, Dec 1, 2017 at 5:35 PM, Jacob Barrett <jbarrett@pivotal.io> wrote:
>
> > On Fri, Dec 1, 2017 at 4:59 PM Dan Smith <dsmith@pivotal.io> wrote:
> >
> > > I think I'm kinda with Mike on this one. The existing string format
> does
> > > seem pretty gnarly. But the complexity of implementing and testing all
> of
> > > the backwards compatibility transcoding that would be required in order
> > to
> > > move to the new proposed format seems to be way more work with much
> more
> > > possibility for errors. Do we really expect people to be writing new
> > > clients that use DataSerializable? It hasn't happened yet, and we're
> > > working on a new protocol that uses protobuf right now.
> > >
> >
> > Consider that any new clients written would have to implement all these
> > encodings. This is going to make writing new clients using the upcoming
> new
> > protocol laborious. The new protocol does not define object encoding, it
> > strictly defines message encoding. Objects sent over the protocol will
> have
> > to be serialized in some format, like PDX or data serializer. We could
> > alway develop a better serialization format than what we have now. If we
> > don't develop something new then we have to use the old. Wouldn't it be
> > nice if the new clients didn't have to deal with legacy encodings?
> >
> > If the issue is really the complexity of serialization from the C++
> client,
> > > maybe the C++ client could always write UTF-16 strings?
> > >
> >
> > You can't assume that a client in one language will only be serializing
> > strings for it's own consumption. We have many people using strings in
> PDX
> > to transform between C++, .NET and Java.
> >
> > The risk is high not to remove this debt. If I am developing a new Ruby
> > client I am forced to deal with all 4 of these encodings. Am I really
> going
> > to want to build a Ruby client for Geode, am I going to get these
> encodings
> > correct? I can tell you that getting them correct may be a challenge if
> the
> > current C++ client is any indication, it has a few incorrect assumptions
> in
> > its encoding of ASCII and modified UTF-8.
> >
> > I am fine with a compromise that deprecates but doesn't remove the old
> > encodings for a few releases. This would give time for users to update.
> New
> > clients written would not be be able to read this old data but could read
> > and write new data.
> >
> >
> >
> > -Jake
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message