From: GitBox
To: reviews@spark.apache.org
Subject: [GitHub] [spark] gatorsmile commented on a change in pull request #24682: [SPARK-27838][SQL] Support user provided non-nullable avro schema for nullable catalyst schema without any null record
Message-ID: <158130237484.4605.15605265262538327048.gitbox@gitbox.apache.org>
Date: Mon, 10 Feb 2020 02:39:34 -0000

gatorsmile commented on a change in pull request #24682: [SPARK-27838][SQL] Support user provided non-nullable avro schema for nullable catalyst schema without any null record
URL: https://github.com/apache/spark/pull/24682#discussion_r376848162

##########
File path: docs/sql-migration-guide-upgrade.md
##########
@@ -132,6 +132,10 @@ license: |
   - Since Spark 3.0, Spark will cast `String` to `Date/TimeStamp` in binary comparisons with dates/timestamps. The previous behaviour of casting `Date/Timestamp` to `String` can be restored by setting `spark.sql.legacy.typeCoercion.datetimeToString` to `true`.
+  - Since Spark 3.0, when Avro files are written with user provided schema, the fields will be matched by field names between catalyst schema and avro schema instead of positions.
+
+  - Since Spark 3.0, when Avro files are written with user provided non-nullable schema, even the catalyst schema is nullable, Spark is still able to write the files. However, Spark will throw runtime NPE if any of the records contains null.

Review comment:
Let us add a legacy conf and throw an exception by default.
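
For readers of this thread, the two behaviours described in the quoted migration note (matching fields by name rather than position, and failing when a null reaches a non-nullable field) can be sketched in plain Python. This is an illustration only and is not Spark's Avro writer: the function `align_record_to_schema`, the dictionary-based schema, and the use of `ValueError` as a stand-in for the runtime NPE are all assumptions made for the example.

```python
from typing import Any, Dict, List


def align_record_to_schema(
    record: Dict[str, Any], target_fields: List[Dict[str, Any]]
) -> List[Any]:
    """Order a record's values to follow the target schema, matching by field name.

    Hypothetical helper: raises ValueError (standing in for the runtime NPE
    mentioned in the note) when a null reaches a non-nullable field.
    """
    values = []
    for field in target_fields:
        name = field["name"]
        if name not in record:
            raise KeyError(f"field {name!r} missing from record")
        value = record[name]
        if value is None and not field["nullable"]:
            raise ValueError(f"null value for non-nullable field {name!r}")
        values.append(value)
    return values


# Hypothetical user-provided schema: both fields are declared non-nullable,
# and the field order differs from the record below.
user_schema = [
    {"name": "id", "nullable": False},
    {"name": "name", "nullable": False},
]

# The record lists "name" first; name-based matching still pairs each value
# with the right schema field.
row = {"name": "alice", "id": 1}
aligned = align_record_to_schema(row, user_schema)
```

A record such as `{"id": None, "name": "bob"}` would raise in this sketch, which mirrors the case the review comment is concerned about: the write is accepted up front and only fails once a null record is actually encountered.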