spark-issues mailing list archives

From "Patrick Wendell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-1622) Expose input split(s) accessed by a task in UI or logs
Date Sun, 27 Apr 2014 02:12:16 GMT

    [ https://issues.apache.org/jira/browse/SPARK-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982163#comment-13982163 ]

Patrick Wendell edited comment on SPARK-1622 at 4/27/14 2:12 AM:
-----------------------------------------------------------------

I think it would be good to have a general mechanism that lets RDD implementations attach
contextual information to each task, so that it ends up in `TaskInfo` and ultimately in the UI.
This could include values pinned to a specific task as well as counters that can be aggregated
across all the tasks in a stage.
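
To make the two kinds of information concrete, here is a rough sketch in plain Java. It is not Spark code: `TaskDetails`, `StageSummary`, the `pin`/`increment` methods, the key names and the example split strings are all hypothetical names made up for illustration, and nothing here reflects how `TaskInfo` is actually structured.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-task record; an assumption for illustration, not Spark's TaskInfo. */
final class TaskDetails {
    final int taskId;
    // Values pinned to this one task, e.g. a description of the input split it read.
    final Map<String, String> pinned = new LinkedHashMap<>();
    // Numeric counters that can be summed across all tasks of a stage.
    final Map<String, Long> counters = new LinkedHashMap<>();

    TaskDetails(int taskId) {
        this.taskId = taskId;
    }

    /** Records a task-specific value; a later call with the same key overwrites it. */
    void pin(String key, String value) {
        pinned.put(key, value);
    }

    /** Adds delta to the named counter, creating the counter on first use. */
    void increment(String name, long delta) {
        counters.merge(name, delta, Long::sum);
    }
}

/** Hypothetical stage-level view: sums each counter over the stage's tasks. */
final class StageSummary {
    static Map<String, Long> aggregate(List<TaskDetails> tasks) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (TaskDetails task : tasks) {
            task.counters.forEach((name, value) -> totals.merge(name, value, Long::sum));
        }
        return totals;
    }
}

final class TaskDetailsDemo {
    public static void main(String[] args) {
        // The split descriptions below are invented example strings.
        TaskDetails first = new TaskDetails(0);
        first.pin("inputSplit", "part-00000:0+1024");
        first.increment("recordsRead", 5);

        TaskDetails second = new TaskDetails(1);
        second.pin("inputSplit", "part-00001:0+2048");
        second.increment("recordsRead", 5);
        second.increment("badRecords", 1);

        List<TaskDetails> stage = new ArrayList<>();
        stage.add(first);
        stage.add(second);

        // Pinned values stay per task; counters roll up to the stage.
        for (TaskDetails task : stage) {
            System.out.println("task " + task.taskId + " -> " + task.pinned);
        }
        System.out.println("stage totals -> " + StageSummary.aggregate(stage));
    }
}
```

The point of keeping the two maps separate is that pinned values only make sense next to their own task (which split did task 1 read?), while counters are only interesting once summed, which is what a stage-level view in the UI would presumably show.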


was (Author: pwendell):
I think it would be good to have a general mechanism for RDD implementations to add contextual
information for each task that ends up in `TaskInfo`. This could include values pinned to a specific
task but also counters which can be aggregated across all the tasks in a stage.

> Expose input split(s) accessed by a task in UI or logs
> ------------------------------------------------------
>
>                 Key: SPARK-1622
>                 URL: https://issues.apache.org/jira/browse/SPARK-1622
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Matei Zaharia
>
> Right now it's hard to debug which input files or blocks therein have invalid data. The
> InputSplit for a HadoopRDD is not even exposed programmatically in Scala/Java (it's private[spark]).



--
This message was sent by Atlassian JIRA
(v6.2#6252)