spark-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From marmbrus <...@git.apache.org>
Subject [GitHub] spark pull request: [WIP][Spark-SQL] Optimize the Constant Folding...
Date Wed, 30 Apr 2014 00:02:10 GMT
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/482#issuecomment-41746809
  
    I made some more comments about the correctness here.  I'm pretty worried about the fact
that our tests did not catch any of these regressions.  I think we need to add a new set of
tests before merging the PR.  I've sketched them out [here](https://github.com/chenghao-intel/spark/pull/1)
and I made a PR against your branch.  In addition to this change, I think you should also
create evaluation tests for each new optimization case.
    
    Testing is a little harder for aggregate expressions, so how about instead we add a HiveQueryTest
for each of these.  That way we make sure we are obeying the correct semantics (i.e. `COUNT(null)
=> 0`).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Mime
View raw message