spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Raymond Liu (JIRA)" <>
Subject [jira] [Created] (SPARK-2275) More general Storage Interface for Shuffle / Spill etc.
Date Wed, 25 Jun 2014 07:26:24 GMT
Raymond Liu created SPARK-2275:

             Summary: More general Storage Interface for Shuffle / Spill etc.
                 Key: SPARK-2275
             Project: Spark
          Issue Type: Improvement
          Components: Block Manager, Shuffle
            Reporter: Raymond Liu

Problem 1:

In the current design, when shuffle / spill is involved, A File based interface is assumed
for various classes. While this might not been true when we have disk store implemented base
on something else other than FileSystem ( e.g. an kv/object based NVM device) . And also if
we try to utilize memory or off heap store for shuffle, the File interface also do not work.

Possible approaching : 

So my general idea here is to hide the File Interface, instead using a general ObjectId to
represent the object been written to external store, And pass around this ObjectId to various
class to access the data. 


For Write path: DiskBlockObjectWritter( could be rename to FileBlockObjectWritter) now take
in ObjectId instead of File and do mapping to File internally

For Read path , A InputStream Interface is suppose to be able to retrieved from the ObjectId
by specific store, Thus  various read operation do not need to rely on the assumption that
the lower level storage is a filesystem and rely on File to build their own FileInputStream

In this way, the current File base diskStore could still using File to implement it's internal
 storage, while other solution could be easily been plug in with other low level implementation
and just mapping to the ObjectId for other module to interact with.

Problem 2 :

At present, In shuffle write path, the shuffle block manager manage the mapping from some
blockID to a FileSegment for the benefit of consolidate shuffle, this way it bypass the block
store's blockId based access mode. Then in the read path, when read a shuffle block data,
disk store query shuffleBlockManager to hack the normal blockId to file mapping in order to
correctly read data from file. This really rend to a lot of bi-directional dependencies between
modules and the code logic is some how messed up. None of the shuffle block manager and blockManager/Disk
Store fully control the read path. They are tightly coupled in low level code modules. And
it make it hard to implement other shuffle manager logics. e.g. a sort based shuffle which
might merge all output from one map partition to a single file. This will need to hack more
into the diskStore/diskBlockManager etc to find out the right data to be read.

Possible approaching:

So I think it might be better that we expose an object + offset based read interface  for
BlockStore, ( or at least for DiskStore), e.g.  a getObjectData(objectId, offset, length)
 in addition to the current blockID based interface.

Then those mapping blockId to object and offset code logic can all reside in the specific
shuffle manager, if they do need to merge data into one single object(File here in current
diskStore implementation) they take care of the mapping logic in both read/write path and
take the responsibility of read / write shuffle data ( since they already take care of write
data,  then read data also go through them instead of go through blockmanager is also reasonable,
they can further use the object+offset based read interface for actual read work )

The BlockStore itself should just take care of read/write as required, it should not involve
into the data mapping logic at all. This might make the interface between modules more clear
and decouple each other in a more clean way. 

This message was sent by Atlassian JIRA

View raw message