spark-reviews mailing list archives

From crackcell <...@git.apache.org>
Subject [GitHub] spark pull request #17123: [SPARK-19781][ML] Handle NULLs as well as NaNs in...
Date Mon, 22 Jan 2018 13:28:29 GMT
Github user crackcell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17123#discussion_r162935486
  
    --- Diff: docs/ml-guide.md ---
    @@ -122,6 +122,8 @@ There are no deprecations.
     * [SPARK-21027](https://issues.apache.org/jira/browse/SPARK-21027):
      We are now setting the default parallelism used in `OneVsRest` to be 1 (i.e. serial), in 2.2 and earlier version,
      the `OneVsRest` parallelism would be parallelism of the default threadpool in scala.
    +* [SPARK-19781](https://issues.apache.org/jira/browse/SPARK-19781):
    + `Bucketizer` handles NULL values the same way as NaN when handleInvalid is skip or keep.
    --- End diff --
    
    Yep, you are right. :-p
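
    For readers of the archive, the behaviour described in the added doc line can be modelled roughly as below. This is a plain-Python sketch, not Spark code: the function name `bucketize` is hypothetical, and raising on NULL in "error" mode is an assumption, since the doc line only covers skip and keep. Invalid values (None or NaN) are dropped under "skip" and placed in one extra bucket after the regular ones under "keep".

```python
import math
from bisect import bisect_right


def bucketize(values, splits, handle_invalid="error"):
    """Hypothetical model of bucketing with None treated like NaN.

    splits must be strictly increasing; bucket i covers [splits[i], splits[i+1]),
    except the last regular bucket, which also includes its upper bound.
    """
    num_buckets = len(splits) - 1
    result = []
    for v in values:
        # None (NULL) and NaN are both considered invalid input.
        if v is None or math.isnan(v):
            if handle_invalid == "skip":
                continue  # drop the row
            if handle_invalid == "keep":
                # extra bucket placed after the regular ones
                result.append(float(num_buckets))
                continue
            # Assumption: "error" mode rejects NULL as well as NaN.
            raise ValueError("invalid value: %r" % (v,))
        if v == splits[-1]:
            result.append(float(num_buckets - 1))
        elif splits[0] <= v < splits[-1]:
            result.append(float(bisect_right(splits, v) - 1))
        else:
            raise ValueError("value out of split bounds: %r" % (v,))
    return result


splits = [0.0, 10.0, 20.0]
values = [5.0, None, float("nan"), 15.0]
skipped = bucketize(values, splits, "skip")
kept = bucketize(values, splits, "keep")
```

    With these inputs, "skip" leaves only the two valid rows, while "keep" assigns both the None and the NaN row to the same extra bucket.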

