spark-reviews mailing list archives

From mallman <...@git.apache.org>
Subject [GitHub] spark pull request #16578: [SPARK-4502][SQL] Parquet nested column pruning
Date Sun, 15 Apr 2018 09:08:55 GMT
Github user mallman commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16578#discussion_r181575614
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -151,6 +151,9 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
         // The following batch should be executed after batch "Join Reorder" and "LocalRelation".
         Batch("Check Cartesian Products", Once,
           CheckCartesianProducts) :+
    +    Batch("Field Extraction Pushdown", fixedPoint,
    +      AggregateFieldExtractionPushdown,
    +      JoinFieldExtractionPushdown) :+
    --- End diff --
    
    Hi @gatorsmile.
    
    Given the scope of your request, can I ask you to provide a reason for it? What you ask
would invalidate some of the existing conversation and review of this PR. It would also substantially
restrict the practical usability of this patch.
    
    I believe I've written this patch with a logical separation of concerns along the lines
you've requested. As a compromise, would you consider an incremental review, starting with
the basic projection/filter functionality and then proceeding to the optimizer rules that
follow it?
    
    BTW I'm traveling for a few weeks, and I'm spending most of my time away from work. If
I'm delayed in responding, that's the reason. I'll still keep up, but at a slower pace.
    
    Thanks.

