spark-reviews mailing list archives

From yhuai
Subject [GitHub] spark pull request: [SPARK-9876][SQL]: Update Parquet to 1.8.1.
Date Wed, 01 Jun 2016 06:37:36 GMT
Github user yhuai commented on the pull request:
    Hello @rdblue, we are quite late in this release cycle. I am afraid that we cannot
upgrade Parquet to 1.8.1, for the following two reasons:
    1. Since this change was merged so late, I am not sure it can be thoroughly
tested. All of our previous testing was based on Parquet 1.7.0. Also, some of our internal
jobs started to fail after this upgrade.
    2. Parquet 1.8.1 potentially introduces a performance regression. I am not sure we can
upgrade Parquet before we finish investigating this regression.
    So, I'd like to propose reverting this upgrade. We can try to upgrade Parquet early in the
development cycle of 2.1 (assuming we have figured out the regression by then), so that we have more
time to test this change.
