orc-dev mailing list archives

From Gopal Vijayaraghavan <gop...@apache.org>
Subject Re: [DISCUSS] ORC 2.0
Date Tue, 08 Aug 2017 07:13:27 GMT

> > Let me make sure I have the backwards compatibility straight.  If a user
> > switches to ORC 2.0, he could choose to continue writing in older formats
> > so that his old tools could read it
>    Yes, exactly.

To chime in on Owen's point, the development process has a slight wrinkle in it, which we
avoided in the 0.11 -> 0.12 migration due to ORC being embedded in Hive.

Adding a feature has two sides. On the write side, the new features are available only when a
user flips the writer version.

There is no feature flag for reader versions, so the readers have to keep up to date with
the writer changes (or just fail for the "blackholed" ones, with good errors).
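
A minimal sketch of that reader-side behaviour, assuming the reader can see the writer version
recorded in the file. Every name here (SUPPORTED_WRITER_VERSIONS, UnsupportedWriterVersion,
check_writer_version) is hypothetical and not part of the ORC API; the version strings are only
the ones mentioned in this thread.

```python
# Hypothetical illustration only; none of these names exist in ORC.
# Writer versions this imaginary reader knows how to decode.
SUPPORTED_WRITER_VERSIONS = frozenset({"0.11", "0.12", "2.0"})


class UnsupportedWriterVersion(Exception):
    """Raised when a file was written with a version the reader cannot decode."""


def check_writer_version(file_version, supported=SUPPORTED_WRITER_VERSIONS):
    """Return file_version if readable, otherwise fail with a descriptive error.

    There is no reader-side feature flag, so the only choices are
    "keep up with the writer" or "refuse the file and say why".
    """
    if file_version not in supported:
        raise UnsupportedWriterVersion(
            f"file was written with writer version {file_version!r}; "
            f"this reader supports {sorted(supported)}"
        )
    return file_version
```

With this shape, a file written by an experimental ("blackholed") version such as 1.5 is rejected
up front with a message naming both the file's version and the supported ones, instead of
failing somewhere deep in the decoder.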

Due to the split between projects, I expect to see a two-step development cycle, to clean
up the integration pathways before the ABI is frozen in 2.0.

The entire process can be gated on the writer version - during the development process, there
will be an experimental version (1.5?) and a stable version.

I have no interest in ever supporting data actually written as version 1.5 in ORC, but for the
sake of integration testing, the 1.5 -> 2.0 writer versions are extremely useful stepping
stones for a multi-project dependency like ORC.

Once the integrations are all complete and the format can be frozen, ORC 2.0 releases can
still hold the default writer version back from being upgraded for another stable release.

After the ecosystem has had all its upgrades, the default version gets flipped to 2.0. The
ability to write 0.12 files will still remain as an option, but all intermediate reader
versions will get dropped.
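
The phases described above can be sketched as a small lookup table. This is an illustration
under assumptions, not a proposal for the actual configuration: the phase names, the ROLLOUT
table, and pick_writer_version are all hypothetical, "1.5" stands in for the experimental
version, and the sketch assumes the experimental version stops being writable once the format is
frozen.

```python
# Hypothetical rollout table; phase names and keys are invented for illustration.
ROLLOUT = {
    # Format still changing: an experimental version exists next to the stable one.
    "development": {"default": "0.12", "writable": ("0.12", "1.5"),
                    "reads_intermediate": True},
    # Format frozen in 2.0, but the default is held back for a stable release.
    "frozen": {"default": "0.12", "writable": ("0.12", "2.0"),
               "reads_intermediate": True},
    # Ecosystem upgraded: default flips, 0.12 stays optional,
    # intermediate reader versions are dropped.
    "flipped": {"default": "2.0", "writable": ("0.12", "2.0"),
                "reads_intermediate": False},
}


def pick_writer_version(phase, requested=None):
    """Return the writer version to use in a phase.

    With no explicit request the phase default applies; an explicit
    request must be one of the versions writable in that phase.
    """
    cfg = ROLLOUT[phase]
    if requested is None:
        return cfg["default"]
    if requested not in cfg["writable"]:
        raise ValueError(
            f"writer version {requested!r} is not available in phase {phase!r}; "
            f"choose one of {list(cfg['writable'])}"
        )
    return requested
```

In this sketch a user only gets 2.0 files before the flip by asking for them explicitly, and can
still ask for 0.12 after it, which matches the backwards-compatibility point quoted at the top.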

That's a bit more complicated than being part of Hive and sync'ing releases, but I think this
gives ORC the flexibility to accept contributions from a wide community and to support
multi-project release timelines, without leaving the implementation full of reader
implementations for many writer versions.
