spark-issues mailing list archives

From "Jialin LIu (JIRA)" <>
Subject [jira] [Created] (SPARK-26261) Spark does not check completeness temporary file
Date Tue, 04 Dec 2018 03:00:01 GMT
Jialin LIu created SPARK-26261:

             Summary: Spark does not check completeness temporary file 
                 Key: SPARK-26261
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.3.2
            Reporter: Jialin LIu

Spark does not check the completeness of its temporary files. When persisting to disk is enabled
for some RDDs, a number of temporary files are created in the blockmgr folder. The block manager
can detect missing blocks, but it cannot detect file content being modified during

Our initial test shows that if we truncate a block file before it is used by executors,
the program finishes without reporting any error, but the results are completely wrong.

We believe every RDD block file should carry a checksum, so that these files are
protected against undetected modification.
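
To illustrate the kind of check proposed here, the following is a minimal sketch in plain
Python. It is not Spark's block manager code. The header layout (a CRC32 plus the payload
length), the function names write_block and read_block, and the CorruptBlockError type are
all hypothetical, chosen only for this example. The sketch writes a block file with a small
integrity header, then shows that both truncation and an in-place byte change are rejected
on read instead of silently producing wrong data.

```python
import os
import struct
import tempfile
import zlib

# Hypothetical header: 4-byte CRC32 of the payload, then 8-byte payload length.
_HEADER = struct.Struct(">IQ")


class CorruptBlockError(Exception):
    """Raised when a block file fails its integrity check."""


def write_block(path, payload):
    """Write an integrity header followed by the payload bytes."""
    with open(path, "wb") as f:
        f.write(_HEADER.pack(zlib.crc32(payload), len(payload)))
        f.write(payload)


def read_block(path):
    """Return the payload, or raise CorruptBlockError if the file was altered."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < _HEADER.size:
        raise CorruptBlockError(path + ": file is shorter than the header")
    crc, length = _HEADER.unpack_from(data)
    payload = data[_HEADER.size:]
    if len(payload) != length:
        raise CorruptBlockError(path + ": length mismatch (truncated or extended)")
    if zlib.crc32(payload) != crc:
        raise CorruptBlockError(path + ": checksum mismatch")
    return payload


def is_detected(corrupt):
    """Write a block, apply corrupt(path), and report whether the read fails."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rdd_0_0")  # hypothetical block file name
        write_block(path, b"x" * 1024)
        corrupt(path)
        try:
            read_block(path)
        except CorruptBlockError:
            return True
        return False


def truncate_file(path):
    """Cut the file short, as in the truncation test described above."""
    with open(path, "r+b") as f:
        f.truncate(512)


def flip_byte(path):
    """Change one payload byte in place without changing the file size."""
    with open(path, "r+b") as f:
        f.seek(_HEADER.size + 10)
        f.write(b"y")
```

Calling is_detected(truncate_file) or is_detected(flip_byte) returns True in this sketch,
while a corruption function that leaves the file untouched returns False. Storing the length
next to the checksum lets truncation be caught without relying on the CRC alone.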

This message was sent by Atlassian JIRA
