spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-21882) OutputMetrics doesn't count written bytes correctly in the saveAsHadoopDataset function
Date Mon, 04 Sep 2017 03:55:01 GMT

     [ https://issues.apache.org/jira/browse/SPARK-21882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-21882:
------------------------------------

    Assignee: Apache Spark

> OutputMetrics doesn't count written bytes correctly in the saveAsHadoopDataset function
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-21882
>                 URL: https://issues.apache.org/jira/browse/SPARK-21882
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.1, 2.2.0
>            Reporter: linxiaojun
>            Assignee: Apache Spark
>            Priority: Minor
>         Attachments: SPARK-21882.patch
>
>
> The first job called from saveAsHadoopDataset, running in each executor, does not calculate
> the writtenBytes of OutputMetrics correctly (writtenBytes is 0). The reason is that the
> callback function used to find the bytes written is not initialized at the right point.
> The statisticsTable, which records statistics in a FileSystem, must be initialized first
> (this is triggered when SparkHadoopWriter is opened). The solution for this issue is to
> adjust the order of callback function initialization.
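
The ordering problem described in the issue can be illustrated with a small, self-contained Java sketch. It is not Spark or Hadoop code: the class CallbackOrderSketch, the Stats type, the statisticsTable list, and the openWriter, bytesWrittenCallback and runTask helpers are all hypothetical stand-ins, and the sketch assumes that the callback only sees statistics entries that already exist when it is created.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class CallbackOrderSketch {
    // Hypothetical stand-in for one file system's statistics entry.
    static final class Stats {
        long bytesWritten;
    }

    // Hypothetical stand-in for the statistics table; it stays empty
    // until a writer is opened.
    static final List<Stats> statisticsTable = new ArrayList<>();

    // Opening a writer registers its statistics entry in the table.
    static Stats openWriter() {
        Stats stats = new Stats();
        statisticsTable.add(stats);
        return stats;
    }

    // Builds a callback over the entries that exist at creation time
    // (assumption: entries registered later are never seen by it).
    static LongSupplier bytesWrittenCallback() {
        List<Stats> snapshot = new ArrayList<>(statisticsTable);
        long baseline = snapshot.stream().mapToLong(s -> s.bytesWritten).sum();
        return () -> snapshot.stream().mapToLong(s -> s.bytesWritten).sum() - baseline;
    }

    // Simulates one task that writes 1024 bytes and returns what the
    // callback reports, for either initialization order.
    public static long runTask(boolean callbackBeforeOpen) {
        statisticsTable.clear();
        LongSupplier callback = null;
        if (callbackBeforeOpen) {
            callback = bytesWrittenCallback(); // table is still empty here
        }
        Stats stats = openWriter();
        if (!callbackBeforeOpen) {
            callback = bytesWrittenCallback(); // table already has the entry
        }
        stats.bytesWritten += 1024;
        return callback.getAsLong();
    }

    public static void main(String[] args) {
        System.out.println("callback created before open: " + runTask(true));
        System.out.println("callback created after open:  " + runTask(false));
    }
}
```

In this model, creating the callback before the writer is opened reports 0 bytes, while creating it afterwards reports the 1024 bytes that were written, which mirrors the reordering the issue proposes.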



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

