accumulo-dev mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: Java (eventually) dropping Serialization
Date Wed, 30 May 2018 19:17:41 GMT
On Wed, May 30, 2018 at 1:26 PM Josh Elser <josh.elser@gmail.com> wrote:

>
>
> On 5/30/18 12:41 PM, Christopher wrote:
> > On Wed, May 30, 2018 at 11:59 AM Josh Elser <josh.elser@gmail.com> wrote:
> >
> >> On 5/30/18 9:08 AM, Keith Turner wrote:
> >>> On Wed, May 30, 2018 at 12:16 AM, Christopher <ctubbsii@apache.org> wrote:
> >>>> I thought this was interesting:
> >>>>
> >>>> https://www.infoworld.com/article/3275924/java/oracle-plans-to-dump-risky-java-serialization.html
> >>>>
> >>>> If the long-term plan is to remove serialization from Java classes (in
> >>>> favor of a lightweight, possibly pluggable, "Records" serialization
> >>>> framework), we should begin thinking about how we use serialization in
> >>>> Accumulo's code today. At the very least, we should try to avoid any
> >>>> reliance on it in any future persistence of objects in Accumulo. If we
> >>>> see an opportunity to remove it in our current code anywhere, it might
> >>>> be worth spending the time to follow through with such a change.
> >>>>
> >>>> Of course, this is probably going to be a *very* long time before it is
> >>>> actually dropped from Java, but it's not going to hurt to start thinking
> >>>> about it now.
> >>>>
> >>>> (Accumulo uses Java serialization for storing FaTE transaction
> >>>> information, and perhaps elsewhere.)
> >>>
> >>> We currently do not support FaTE transactions across minor versions.
> >>> The upgrade code checks for any outstanding FaTE transactions.  So
> >>> this makes it easier to upgrade on a minor version.  I would like to
> >>> see FaTE use a human-readable format like JSON because it would make
> >>> debugging easier.
> >>
> >> I'd strongly suggest against using JSON as it forces the application
> >> to know how to handle drift in "schema". It would be nice to avoid the
> >> need to flush the outstanding fate txns on upgrade.
> >>
> >> If you just want a JSON-ish way to look at the data, I'd suggest moving
> >> over to protobuf3 and checking out the support they have around JSON.
> >>
> >> https://developers.google.com/protocol-buffers/docs/proto3#json
> >
> >
> > Protobuf certainly has better support for schemas... but I like the
> > simplicity of using JSON directly and managing our own schema for FaTE to
> > reduce dependencies. (Also, protobuf does not have a native Java compiler,
> > AFAICT, which makes it a pain, similar to thrift, for portable code
> > generation.) Whichever we choose, though, we've got plenty of time to
> > hammer out these pros and cons, and experiment.
>
> Actually, you don't need to do a custom compiler installation for
> Protobuf3 on the majority of arches as there are compilers available via
> Maven central for protobuf on x86/64 and ppc. This is a non-issue for
> the majority of platforms.
>
>
I wasn't aware they were publishing pre-built binaries for various
platforms to Maven Central. That could be quite useful if we could
automatically download the correct one during the Maven build, and use that
to generate the code. It could still be problematic if they are dynamically
linked to specific version ranges of system libraries, but I'd be
interested in trying. Do you know if that tooling already exists as a Maven
plugin or similar?



> http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22protoc%22
>
> Managing your own schema is silly when there are tools whose specific
> purpose in creation was "[schema management is hard on its own and we
> can make it easier with guiderails]". Smells like "not-invented-here" to
> me.
>

I can see how you got there from what I wrote, but I wasn't trying to argue
in favor of doing it ourselves. I was just trying to point out that it
might have some pros of its own. At this point, I'm open to all options.
Actually, I'm very interested in the new "Records" serialization stuff that
was mentioned in the article, as a possible option (but it's also the least
mature option right now).
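
To make the "managing our own schema" trade-off in this thread concrete, here is a minimal sketch of what handling schema drift by hand could look like. It is an illustration only, not code from Accumulo: the class name, the field names, and the version numbers are all hypothetical, and it swaps in the JDK's java.util.Properties key=value text format in place of JSON so that it needs no external library. The point it shows is the one raised above: with a self-managed format, the application itself has to carry a version marker and decide what to do with records written by older or newer code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical transaction record; NOT Accumulo's actual FaTE data model. */
class VersionedTxnRecord {
    // Assumed schema history: version 1 had txId and status, version 2 added owner.
    static final int CURRENT_VERSION = 2;

    final long txId;
    final String status;
    final String owner;

    VersionedTxnRecord(long txId, String status, String owner) {
        this.txId = txId;
        this.status = status;
        this.owner = owner;
    }

    /** Writes a human-readable key=value form that always carries a version marker. */
    String encode() {
        Properties p = new Properties();
        p.setProperty("schemaVersion", Integer.toString(CURRENT_VERSION));
        p.setProperty("txId", Long.toString(txId));
        p.setProperty("status", status);
        p.setProperty("owner", owner);
        StringWriter out = new StringWriter();
        try {
            p.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Reads a record, tolerating older versions and rejecting newer ones. */
    static VersionedTxnRecord decode(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Records without a marker are treated as version 1 (an assumption of this sketch).
        int version = Integer.parseInt(p.getProperty("schemaVersion", "1"));
        if (version > CURRENT_VERSION) {
            throw new IllegalStateException("Record written by a newer schema version: " + version);
        }
        // Drift handling is manual: a field missing from old records gets a default.
        String owner = p.getProperty("owner", "unknown");
        return new VersionedTxnRecord(
            Long.parseLong(p.getProperty("txId")), p.getProperty("status"), owner);
    }
}
```

Every field added later needs its own default or migration rule in decode(), which is the bookkeeping that schema-aware tools such as protobuf are designed to take over.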
