spark-issues mailing list archives

From "Apache Spark (JIRA)" <>
Subject [jira] [Commented] (SPARK-27621) Calling transform() method on a LinearRegressionModel throws NoSuchElementException
Date Thu, 02 May 2019 09:25:00 GMT


Apache Spark commented on SPARK-27621:

User 'ancasarb' has created a pull request for this issue:

> Calling transform() method on a LinearRegressionModel throws NoSuchElementException
> -----------------------------------------------------------------------------------
>                 Key: SPARK-27621
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2
>            Reporter: Anca Sarb
>            Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
> When the transform(...) method is called on a LinearRegressionModel constructed directly
from its coefficients and intercept, the following exception is thrown.
> {code:java}
> java.util.NoSuchElementException: Failed to find a default value for loss
	at org.apache.spark.ml.param.Params$$anonfun$getOrDefault$2.apply(params.scala:780)
	at org.apache.spark.ml.param.Params$$anonfun$getOrDefault$2.apply(params.scala:780)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.ml.param.Params$class.getOrDefault(params.scala:779)
	at ...
	at org.apache.spark.ml.param.Params$class.$(params.scala:786)
	at ...$(Pipeline.scala:42)
	at ...
	at ...$$anonfun$transformSchema$5.apply(Pipeline.scala:311)
	at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57)
	at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66)
	at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:186)
	at ...
> {code}
> This is because validateAndTransformSchema() is called during both the training and scoring
phases, but the checks against training-only params such as loss should really be performed
during the training phase only, I think; please correct me if I'm missing anything.
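> The failure mechanism can be sketched in plain Scala with no Spark dependency. The names
below (Param, ParamHolder, getOrDefault) are simplified stand-ins for the real
org.apache.spark.ml.param.Params machinery, not the actual Spark source: a param lookup falls
through the explicitly-set map, then the defaults map, and throws NoSuchElementException when
both miss, which is exactly the state of training-only params like loss on a model built
directly from coefficients.
> {code:java}
// Simplified stand-in for org.apache.spark.ml.param.Params (not Spark source).
case class Param(name: String)

class ParamHolder {
  private var paramMap = Map.empty[Param, Any]        // explicitly set values
  private var defaultParamMap = Map.empty[Param, Any] // defaults, normally set at construction

  def set(p: Param, v: Any): this.type = { paramMap += (p -> v); this }
  def setDefault(p: Param, v: Any): this.type = { defaultParamMap += (p -> v); this }

  // Mirrors the lookup order that produces the exception above:
  // set value first, then default, then NoSuchElementException.
  def getOrDefault[T](p: Param): T =
    paramMap.get(p).orElse(defaultParamMap.get(p))
      .getOrElse(throw new NoSuchElementException(
        s"Failed to find a default value for ${p.name}"))
      .asInstanceOf[T]
}

object Repro extends App {
  val loss = Param("loss")
  // A model created "directly" never runs the training code path that
  // would have set a default for `loss`, so the lookup throws.
  val model = new ParamHolder
  try model.getOrDefault[String](loss)
  catch { case e: NoSuchElementException => println(e.getMessage) }
}
{code}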
> This issue was first reported for mleap ([combust/mleap#455|]),
because when we serialize Spark transformers for mleap, we only serialize the params that
are relevant for scoring. We do have the option to de-serialize the serialized transformers
back into Spark for scoring again, but in that case we no longer have all the training
params.
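> The direction suggested above, keeping training-only checks out of the scoring path, can be
sketched as follows. This is a hedged, simplified stand-in, not the actual
LinearRegressionParams code: Spark's validateAndTransformSchema does receive a fitting flag,
but the map-based params and the concrete checks here are illustrative only.
> {code:java}
// Hedged sketch of the proposed fix direction (not actual Spark source):
// confine reads of training-only params such as `loss` to the fitting branch.
object SchemaValidation {
  def getOrDefault(name: String,
                   set: Map[String, String],
                   defaults: Map[String, String]): String =
    set.get(name).orElse(defaults.get(name)).getOrElse(
      throw new NoSuchElementException(s"Failed to find a default value for $name"))

  def validateAndTransformSchema(fitting: Boolean,
                                 set: Map[String, String],
                                 defaults: Map[String, String]): Unit = {
    if (fitting) {
      // Training path only: `loss` must resolve to a supported value.
      val loss = getOrDefault("loss", set, defaults)
      require(loss == "squaredError" || loss == "huber", s"Unsupported loss: $loss")
    }
    // Scoring path: only input/output column checks would run here (elided),
    // so a model rebuilt from just coefficients and intercept can still score.
  }
}
{code}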
> Test to reproduce in PR: []

This message was sent by Atlassian JIRA

