spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-19234) AFTSurvivalRegression chokes silently or with confusing errors when any labels are zero
Date Tue, 17 Jan 2017 08:41:26 GMT

    [ https://issues.apache.org/jira/browse/SPARK-19234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825651#comment-15825651 ]

Sean Owen commented on SPARK-19234:
-----------------------------------

I agree that you could change the implementation to throw an explicit exception for nonpositive
values. It could be worth double-checking with the author first.
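
The explicit check suggested above could look roughly like the following. This is a minimal sketch in plain Python, not Spark's implementation: the function name `validate_survival_labels` and the error wording are hypothetical, and it assumes the labels are available as an ordinary iterable of numbers rather than a DataFrame column.

```python
import math
from typing import Iterable, List


def validate_survival_labels(labels: Iterable[float]) -> List[float]:
    """Return the labels as a list, or raise ValueError on the first bad one.

    Hypothetical helper for illustration; not part of Spark or pyspark.
    """
    checked = []
    for i, label in enumerate(labels):
        # AFT works with log(label), so zero, negative, NaN and infinite
        # labels are all rejected up front.
        if not (math.isfinite(label) and label > 0.0):
            raise ValueError(
                f"label at index {i} is {label!r}; AFT survival regression "
                "requires strictly positive, finite labels"
            )
        checked.append(float(label))
    return checked
```

Running a check like this before fitting would turn the silent all-zero coefficients into an immediate error that names the offending label.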

> AFTSurvivalRegression chokes silently or with confusing errors when any labels are zero
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-19234
>                 URL: https://issues.apache.org/jira/browse/SPARK-19234
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.1.0
>         Environment: spark-shell or pyspark
>            Reporter: Andrew MacKinlay
>            Priority: Minor
>         Attachments: spark-aft-failure.txt
>
>
> If you try to use AFTSurvivalRegression and any label in your input data is 0.0, you get
> coefficients of 0.0 returned and, in many cases, errors like this:
> {{17/01/16 15:10:50 ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to NaN}}
> Zero should, I think, be an allowed value for survival analysis. I don't know whether this
> is a pathological case for AFT specifically, as I don't know enough about it, but this behaviour
> is clearly undesirable. If you have any labels of 0.0, you get either a) obscure error messages
> that give no indication of the cause, along with coefficients that are all zero, or b) no error
> messages at all and coefficients of zero (arguably worse, since you don't even have console output
> to tell you something's gone awry). If AFT doesn't work with zero-valued labels, Spark should
> fail fast and let the developer know why. If it does, we should get results here.
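
One plausible reason zero labels are a problem: an accelerated failure time model is fit on the logarithm of the survival time, and log(0) is negative infinity under IEEE floating-point rules (which the JVM follows). The sketch below only illustrates how that value can turn into NaN in ordinary arithmetic; it is not Spark's actual code path, and the helper names `ieee_log` and `standardized_log_residual` are made up for this example.

```python
import math


def ieee_log(x: float) -> float:
    """Natural log following the IEEE/JVM convention: log(0.0) is -inf.

    Python's math.log raises ValueError for 0.0, so zero is special-cased.
    """
    return -math.inf if x == 0.0 else math.log(x)


def standardized_log_residual(label: float, linear_predictor: float, scale: float) -> float:
    # epsilon = (log(t) - x.beta) / sigma, the quantity an AFT likelihood is built on.
    return (ieee_log(label) - linear_predictor) / scale


eps = standardized_log_residual(0.0, linear_predictor=1.0, scale=1.0)
# eps is -inf, exp(eps) is 0.0, and a product such as eps * exp(eps)
# evaluates to -inf * 0.0, which is NaN.
term = eps * math.exp(eps)
```

Once a NaN like `term` reaches an optimizer's objective or gradient, a message about bad values in function evaluation is the kind of symptom one would expect.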



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org
