From: MaxGekk
To: reviews@spark.apache.org
Reply-To: reviews@spark.apache.org
Mailing-List: contact reviews-help@spark.apache.org; run by ezmlm
Subject: [GitHub] spark pull request #21999: [WIP][SQL] Flattening nested structures
Content-Type: text/plain
Date: Sat, 4 Aug 2018 20:55:18 +0000 (UTC)

GitHub user MaxGekk opened a pull request:

https://github.com/apache/spark/pull/21999

[WIP][SQL] Flattening nested structures

## What changes were proposed in this pull request?

In this PR, I propose a new unary expression, `StructFlatten`, for flattening nested structures. For example, given a dataset with the schema:

```
root
 |-- st: struct (nullable = false)
 |    |-- col1: long (nullable = false)
 |    |-- col2: struct (nullable = false)
 |    |    |-- col3: long (nullable = false)
```

applying `struct_flatten(st)` transforms it to:

```
root
 |-- structflatten(st): struct (nullable = false)
 |    |-- col1: long (nullable = false)
 |    |-- col2_col3: long (nullable = false)
```

## How was this patch tested?

Added new tests to `CollectionExpressionsSuite` to check flattening of 2-3 nested structures, and negative tests to make sure that `struct_flatten` doesn't affect other types.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 struct_flatten

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21999.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21999

----
commit 5603918ae963f78aafb2d1f4f2bd9d566870495b
Author: Maxim Gekk
Date:   2018-08-04T13:38:08Z

    Initial implementation

commit 0be0d059b8bf571068226c515888a64093468cff
Author: Maxim Gekk
Date:   2018-08-04T16:07:45Z

    Making the depth and delimiter as parameters

commit 5666ec372a4b79f6161120584abc0c312b111bfb
Author: Maxim Gekk
Date:   2018-08-04T18:04:23Z

    Test for depth = 0

commit cd88a2125ba6932ba1fdceca1a24d57124a23afa
Author: Maxim Gekk
Date:   2018-08-04T18:21:19Z

    Test for depth = 1

commit b0da02d37ac6db38f63bac95dc295ac37fe4a692
Author: Maxim Gekk
Date:   2018-08-04T18:30:18Z

    Renaming st to struct

commit ec361791b83d71f29823157a2c2b49162ddb5901
Author: Maxim Gekk
Date:   2018-08-04T19:24:37Z

    Negative tests

commit ced63d7f093c168e2bc9457b6c08b87bfe6c0751
Author: Maxim Gekk
Date:   2018-08-04T20:10:00Z

    Register struct_flatten

commit 5b568c67951f6f620cd0d549fdbd0c25f819fe43
Author: Maxim Gekk
Date:   2018-08-04T20:42:00Z

    Merge remote-tracking branch 'origin/master' into struct_flatten

    # Conflicts:
    #	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala

----
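
For readers skimming the archive, the renaming rule shown in the pull request description (`col2.col3` becoming `col2_col3`) can be illustrated with a minimal sketch in plain Python. This is not the `StructFlatten` implementation from the PR: the dict-based schema model and the `flatten_schema` helper are hypothetical stand-ins, and the `depth` parameter mentioned in the commit log is left out because its semantics are not described in this message.

```python
from typing import Dict, Union

# Hypothetical schema model: field name -> type name, or a nested dict for a
# struct. This is an illustrative stand-in, not Spark's StructType.
Schema = Dict[str, Union[str, "Schema"]]


def flatten_schema(schema: Schema, delimiter: str = "_", prefix: str = "") -> Dict[str, str]:
    """Flatten nested structs, joining field names with `delimiter`."""
    flat: Dict[str, str] = {}
    for name, field_type in schema.items():
        full_name = f"{prefix}{delimiter}{name}" if prefix else name
        if isinstance(field_type, dict):
            # Nested struct: recurse, carrying the joined name as the prefix.
            flat.update(flatten_schema(field_type, delimiter, full_name))
        else:
            # Leaf field: keep its type under the joined name.
            flat[full_name] = field_type
    return flat


# The inner fields of `st` from the example in the pull request description.
st_fields: Schema = {"col1": "long", "col2": {"col3": "long"}}
flattened = flatten_schema(st_fields)
print(flattened)  # {'col1': 'long', 'col2_col3': 'long'}
```

The `"_"` default mirrors the `col2_col3` name in the example schema; whether the PR uses the same default delimiter is an assumption.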