spark-reviews mailing list archives

From imatiach-msft <...@git.apache.org>
Subject [GitHub] spark pull request #17059: [SPARK-19733][ML]Removed unnecessary castings and...
Date Tue, 28 Feb 2017 05:10:59 GMT
Github user imatiach-msft commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17059#discussion_r103378267
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala ---
    @@ -82,12 +82,20 @@ private[recommendation] trait ALSModelParams extends Params with HasPredictionCo
        * Attempts to safely cast a user/item id to an Int. Throws an exception if the value is
        * out of integer range.
        */
    -  protected val checkedCast = udf { (n: Double) =>
    -    if (n > Int.MaxValue || n < Int.MinValue) {
    -      throw new IllegalArgumentException(s"ALS only supports values in Integer range for columns " +
    -        s"${$(userCol)} and ${$(itemCol)}. Value $n was out of Integer range.")
    -    } else {
    -      n.toInt
    +  protected val checkedCast = udf { (n: Any) =>
    +    n match {
    +      case v: Int => v // Avoid unnecessary casting
    +      case v: Number =>
    +        val intV = v.intValue()
    +        // Checks if number within Int range and has no fractional part.
    +        if (v.doubleValue == intV) {
    --- End diff --
    
    Sorry, I'm not sure this is a good idea because of floating-point precision. The code
    above doesn't do this check; it just calls toInt. If the check is absolutely necessary,
    I would hope we could give the user some way to specify the Int range or precision.
    Also, if we go ahead with this change, we should add tests that verify the exception is
    thrown. Without some ability to specify the precision, though, I'm not sure this is the
    correct thing to do (?).
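
    To make the precision concern concrete, here is a minimal sketch in plain Scala (no
    Spark dependency) of a checked cast that accepts a tolerance. The names
    CheckedCastSketch, checkedCastWithTolerance and tol are hypothetical; they are not
    part of this PR or of Spark, and this is only one way the idea could look.

```scala
// Hypothetical illustration only: neither this object nor the `tol` parameter
// exists in Spark or in the pull request under review.
object CheckedCastSketch {

  /** Casts `n` to Int, rejecting values outside Int range or further than
    * `tol` from a whole number. With the default `tol = 0.0` the check is exact.
    */
  def checkedCastWithTolerance(n: Any, tol: Double = 0.0): Int = n match {
    case v: Int => v // already an Int, nothing to check
    case v: Number =>
      val d = v.doubleValue
      val rounded = math.rint(d) // nearest whole number, still a Double
      // Written as a negated conjunction so that NaN (for which every
      // comparison is false) is rejected instead of slipping through.
      val ok = rounded >= Int.MinValue && rounded <= Int.MaxValue &&
        math.abs(d - rounded) <= tol
      if (!ok) {
        throw new IllegalArgumentException(
          s"Value $n is out of Integer range or has a fractional part larger than $tol.")
      }
      rounded.toInt
    case _ =>
      throw new IllegalArgumentException(s"Value $n is not numeric.")
  }
}
```

    With the default tolerance, a value such as (0.1 + 0.2) * 10 is rejected because
    rounding error leaves it slightly above 3.0, while passing tol = 1e-9 accepts it as 3.
    Tests for the exception path could assert on exactly these two cases.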

