spark-issues mailing list archives

From "Joseph K. Bradley (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-14567) Add instrumentation logs to MLlib training algorithms
Date Tue, 17 Jan 2017 23:41:26 GMT

     [ https://issues.apache.org/jira/browse/SPARK-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley resolved SPARK-14567.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0

> Add instrumentation logs to MLlib training algorithms
> -----------------------------------------------------
>
>                 Key: SPARK-14567
>                 URL: https://issues.apache.org/jira/browse/SPARK-14567
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, MLlib
>            Reporter: Timothy Hunter
>            Assignee: Timothy Hunter
>             Fix For: 2.2.0
>
>
> To debug performance issues when training MLlib algorithms, it is useful
> to log some metrics about the training dataset, the training parameters, etc.
> This ticket is an umbrella for adding some simple logging messages to the
> most common MLlib estimators. There should be no performance impact on the
> current implementation, and the output is simply printed in the logs.
> Here are some values that are of interest when debugging training tasks:
> * number of features
> * number of instances
> * number of partitions
> * number of classes
> * input RDD/DF cache level
> * hyper-parameters
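
The values listed above could be gathered and logged roughly as in the sketch below. It is illustrative only and is not the code added to Spark under this ticket. The function name `log_training_input`, the `Instance` type, the logger name, and the key names in the summary are all hypothetical, and plain Python lists stand in for a partitioned RDD/DataFrame so the example runs without Spark.

```python
import logging
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical logger name; any logger would do.
logger = logging.getLogger("training.instrumentation")

# One labeled instance: (class label, feature vector).
# A stand-in for a row of the training dataset, not a Spark type.
Instance = Tuple[int, Sequence[float]]


def log_training_input(
    partitions: List[List[Instance]],
    params: Dict[str, Any],
    cache_level: str = "NONE",
) -> Dict[str, Any]:
    """Log the values of interest from the ticket and return them as a dict."""
    # Flatten the partitions to count instances and distinct labels.
    instances = [row for part in partitions for row in part]
    summary = {
        "numFeatures": len(instances[0][1]) if instances else 0,
        "numInstances": len(instances),
        "numPartitions": len(partitions),
        "numClasses": len({label for label, _ in instances}),
        "cacheLevel": cache_level,
        "params": dict(params),  # hyper-parameters, copied so the caller's dict is untouched
    }
    logger.info("training input: %s", summary)
    return summary


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Two partitions holding three instances with two features each.
    data = [
        [(0, [1.0, 2.0]), (1, [0.5, 0.1])],
        [(1, [3.0, 4.0])],
    ]
    log_training_input(data, {"maxIter": 10, "regParam": 0.1})
```

Note that this sketch scans the data to count instances and classes. On a real distributed dataset that would be an extra pass, so to meet the "no performance impact" goal an implementation would presumably reuse counts and metadata the estimator already computes.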



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


