spark-issues mailing list archives

From "linxiaojun (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-21882) OutputMetrics doesn't count written bytes correctly in the saveAsHadoopDataset function
Date Thu, 31 Aug 2017 07:39:00 GMT
linxiaojun created SPARK-21882:
----------------------------------

             Summary: OutputMetrics doesn't count written bytes correctly in the saveAsHadoopDataset function
                 Key: SPARK-21882
                 URL: https://issues.apache.org/jira/browse/SPARK-21882
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.2.0, 1.6.1
            Reporter: linxiaojun
            Priority: Minor


The first job that saveAsHadoopDataset runs on each executor does not compute the written
bytes in OutputMetrics correctly. The cause is that the callback function used to look up
the bytes written is initialized at the wrong point. The statisticsTable, which records
statistics for a FileSystem, must be initialized first (this happens when SparkHadoopWriter
is opened). The fix is to reorder the initialization so that the callback function is set
up only after the statisticsTable has been initialized.
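
The ordering problem described above can be illustrated with a small toy model. This is
only a sketch, not Spark or Hadoop code: the names statistics_table, Writer,
make_bytes_written_callback and run_task are hypothetical, and it assumes the callback
captures whatever statistics entries exist at the moment it is created.

```python
# Toy model only: these names are hypothetical and are not Spark or Hadoop APIs.

class Statistics:
    """Byte counter standing in for one entry of a statistics table."""
    def __init__(self):
        self.bytes_written = 0

# Global table of statistics entries; empty until a writer is opened.
statistics_table = []

def make_bytes_written_callback():
    """Snapshot the table now; return a function reporting bytes written since."""
    snapshot = list(statistics_table)  # entries registered later are not seen
    baseline = sum(s.bytes_written for s in snapshot)
    return lambda: sum(s.bytes_written for s in snapshot) - baseline

class Writer:
    def open(self):
        # Assumption: opening the writer is what registers the statistics entry.
        self.stats = Statistics()
        statistics_table.append(self.stats)

    def write(self, data: bytes):
        self.stats.bytes_written += len(data)

def run_task(callback_first: bool) -> int:
    """Write 11 bytes and return what the callback reports."""
    statistics_table.clear()
    writer = Writer()
    if callback_first:
        callback = make_bytes_written_callback()  # table is still empty here
        writer.open()
    else:
        writer.open()
        callback = make_bytes_written_callback()  # table is already populated
    writer.write(b"hello world")
    return callback()

buggy = run_task(callback_first=True)
fixed = run_task(callback_first=False)
print(buggy, fixed)
```

In this model, the callback created before the writer is opened reports no bytes, while the
one created afterwards reports the 11 bytes actually written. Whether the real callback
behaves the same way depends on the Spark and Hadoop internals, which this sketch does not
reproduce.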



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


