spark-dev mailing list archives

From Michael Allman <mich...@videoamp.com>
Subject Re: [VOTE] Apache Spark 2.1.1 (RC3)
Date Mon, 24 Apr 2017 17:33:27 GMT
The trouble we ran into is that this upgrade was blocking access to our tables, and we didn't
know why. This sounds like a kind of migration operation, but it was not apparent that this
was the case. It took an expert examining a stack trace and source code to figure this out.
Would a more naive end user be able to debug this issue? Maybe we're an unusual case, but
our particular experience was pretty bad. I have my doubts that the schema inference on our
largest tables would ever complete without throwing some kind of timeout (which we were in
fact receiving) or the end user just giving up and killing our job. We ended up doing a rollback
while we investigated the source of the issue. In our case, NEVER_INFER is clearly the best
configuration. We're going to add that to our default configuration files.
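
As a rough illustration of adding that setting to a defaults file, here is a minimal sketch that sets the key in the text of a spark-defaults.conf-style file, replacing an existing entry or appending a new one. The helper name `upsert_conf` is hypothetical, and the sketch assumes one key/value pair per line separated by whitespace (or `=`), with `#` starting a comment line.

```python
import re

# Conf key named in this thread.
KEY = "spark.sql.hive.caseSensitiveInferenceMode"


def upsert_conf(text: str, key: str, value: str) -> str:
    """Return conf text with `key` set to `value` (hypothetical helper).

    An existing entry for `key` is replaced in place; duplicates are dropped;
    if the key is absent, a new line is appended at the end.
    """
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        is_entry = bool(stripped) and not stripped.startswith("#")
        # First token of the line, split on whitespace or '='.
        name = re.split(r"[\s=]+", stripped, maxsplit=1)[0] if is_entry else None
        if name == key:
            if not found:
                out.append(f"{key} {value}")
                found = True
            # Later duplicates of the same key are skipped.
        else:
            out.append(line)
    if not found:
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# site defaults\nspark.master yarn\n"
    print(upsert_conf(sample, KEY, "NEVER_INFER"), end="")
```

Comment lines and unrelated settings pass through unchanged, so the function can be run repeatedly on the same file contents without accumulating duplicate entries.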

My expectation is that a minor point release is a pretty safe bug fix release. We were a bit
hasty in not doing better due diligence pre-upgrade.

One suggestion the Spark team might consider is releasing 2.1.1 with NEVER_INFER and 2.2.0
with INFER_AND_SAVE. Clearly some kind of up-front migration notes would help in identifying
this new behavior in 2.2.

Thanks,

Michael


> On Apr 24, 2017, at 2:09 AM, Wenchen Fan <wenchen@databricks.com> wrote:
> 
> see https://issues.apache.org/jira/browse/SPARK-19611
> 
> On Mon, Apr 24, 2017 at 2:22 PM, Holden Karau <holden@pigscanfly.ca> wrote:
> What's the regression this fixed in 2.1 from 2.0?
> 
> On Fri, Apr 21, 2017 at 7:45 PM, Wenchen Fan <wenchen@databricks.com> wrote:
> IIRC, the new "spark.sql.hive.caseSensitiveInferenceMode" stuff will scan all table files only once, and write back the inferred schema to the metastore so that we don't need to do the schema inference again.
> 
> So technically this will introduce a performance regression for the first query, but compared to branch-2.0, it's not a performance regression. And this patch fixed a regression in branch-2.1, which can run in branch-2.0. Personally, I think we should keep INFER_AND_SAVE as the default mode.
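
To make the cost argument above concrete, here is a toy model (not Spark's implementation) of the two modes named in this thread. It only counts full file scans under the behavior described here: INFER_AND_SAVE scans on first access and then reuses a saved schema, while NEVER_INFER never scans. The names `ToyTable`, `resolve_schema`, and `scans_after` are hypothetical, and each "query" is assumed to be the first access from a fresh session.

```python
from dataclasses import dataclass

# The two modes discussed in this thread.
MODES = ("INFER_AND_SAVE", "NEVER_INFER")


@dataclass
class ToyTable:
    """Stand-in for a Hive table; not a Spark class."""
    schema_saved: bool = False  # models an inferred schema written to the metastore
    scans: int = 0              # number of full scans of the table files


def resolve_schema(table: ToyTable, mode: str) -> None:
    """Apply one query's worth of schema resolution to `table`."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "NEVER_INFER" or table.schema_saved:
        return  # no scan: inference is disabled or a saved schema is reused
    table.scans += 1            # first access under INFER_AND_SAVE scans the files
    table.schema_saved = True   # ...and the result is saved for later queries


def scans_after(mode: str, queries: int) -> int:
    """Count full scans after `queries` fresh-session queries on one table."""
    table = ToyTable()
    for _ in range(queries):
        resolve_schema(table, mode)
    return table.scans


if __name__ == "__main__":
    for mode in MODES:
        print(mode, scans_after(mode, queries=5))
```

In this model the one-time scan is the whole cost of INFER_AND_SAVE, which is why the size of the table being scanned dominates how painful the first query is.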
> 
> + [Eric], what do you think?
> 
> On Sat, Apr 22, 2017 at 1:37 AM, Michael Armbrust <michael@databricks.com> wrote:
> Thanks for pointing this out, Michael. Based on the conversation on the PR <https://github.com/apache/spark/pull/16944#issuecomment-285529275>, this seems like a risky change to include in a release branch with a default other than NEVER_INFER.
> 
> +Wenchen?  What do you think?
> 
> On Thu, Apr 20, 2017 at 4:14 PM, Michael Allman <michael@videoamp.com> wrote:
> We've identified the cause of the change in behavior. It is related to the SQL conf key "spark.sql.hive.caseSensitiveInferenceMode". This key and its related functionality were absent from our previous build. The default setting in the current build was causing Spark to attempt to scan all table files during query analysis. Changing this setting to NEVER_INFER disabled this operation and resolved the issue we had.
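
For a per-job override rather than a site-wide default, one option is to pass the setting through spark-submit's `--conf key=value` flag. The sketch below only builds the argument list and does not launch anything; the helper name `build_submit_cmd` and the application file `my_job.py` are hypothetical.

```python
import shlex

# Conf key named in this thread.
KEY = "spark.sql.hive.caseSensitiveInferenceMode"


def build_submit_cmd(app, overrides, app_args=()):
    """Return a spark-submit argument list with one --conf pair per override.

    Options are placed before the application file so that spark-submit
    reads them as its own options rather than as application arguments.
    """
    cmd = ["spark-submit"]
    for key, value in overrides.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    cmd.extend(app_args)
    return cmd


if __name__ == "__main__":
    # my_job.py is a placeholder application name.
    cmd = build_submit_cmd("my_job.py", {KEY: "NEVER_INFER"})
    print(" ".join(shlex.quote(part) for part in cmd))
```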
> 
> Michael
> 
> 
>> On Apr 20, 2017, at 3:42 PM, Michael Allman <michael@videoamp.com> wrote:
>> 
>> I want to caution that in testing a build from this morning's branch-2.1 we found that Hive partition pruning was not working. We found that Spark SQL was fetching all Hive table partitions for a very simple query whereas in a build from several weeks ago it was fetching only the required partitions. I cannot currently think of a reason for the regression outside of some difference between branch-2.1 from our previous build and branch-2.1 from this morning.
>> 
>> That's all I know right now. We are actively investigating to find the root cause of this problem, and specifically whether this is a problem in the Spark codebase or not. I will report back when I have an answer to that question.
>> 
>> Michael
>> 
>> 
>>> On Apr 18, 2017, at 11:59 AM, Michael Armbrust <michael@databricks.com> wrote:
>>> 
>>> Please vote on releasing the following candidate as Apache Spark version 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and passes if a majority of at least 3 +1 PMC votes are cast.
>>> 
>>> [ ] +1 Release this package as Apache Spark 2.1.1
>>> [ ] -1 Do not release this package because ...
>>> 
>>> 
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>> 
>>> The tag to be voted on is v2.1.1-rc3 <https://github.com/apache/spark/tree/v2.1.1-rc3> (2ed19cff2f6ab79a718526e5d16633412d8c4dd4)
>>> 
>>> List of JIRA tickets resolved can be found with this filter <https://issues.apache.org/jira/browse/SPARK-20134?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.1>.
>>> 
>>> The release files, including signatures, digests, etc. can be found at:
>>> http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/
>>> 
>>> Release artifacts are signed with the following key:
>>> https://people.apache.org/keys/committer/pwendell.asc
>>> 
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1230/
>>> 
>>> The documentation corresponding to this release can be found at:
>>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/
>>> 
>>> 
>>> FAQ
>>> 
>>> How can I help test this release?
>>> 
>>> If you are a Spark user, you can help us test this release by taking an existing Spark workload and running it on this release candidate, then reporting any regressions.
>>> 
>>> What should happen to JIRA tickets still targeting 2.1.1?
>>> 
>>> Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else please retarget to 2.1.2 or 2.2.0.
>>> 
>>> But my bug isn't fixed!??!
>>> 
>>> In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from 2.1.0.
>>> 
>>> What happened to RC1?
>>> 
>>> There were issues with the release packaging, and as a result it was skipped.
>> 
> 
> 
> 
> 
> 
> 
> -- 
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau

