spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jin xing (JIRA)" <>
Subject [jira] [Commented] (SPARK-19659) Fetch big blocks to disk when shuffle-read
Date Tue, 11 Apr 2017 16:23:41 GMT


jin xing commented on SPARK-19659:

*bytesShuffleToMemory* is different from *bytesInFlight*. *bytesInFlight* is for only one
*ShuffleBlockFetcherIterator* and get decreased when the remote blocks is received. *bytesShuffleToMemory*
is for all ShuffleBlockFetcherIterators and get decreased only when reference count of ByteBuf
is zero(though the memory maybe still inside cache and not really destroyed).

If *spark.reducer.maxReqsInFlight* is only for memory control, I think *spark.reducer.maxBytesShuffleToMemory*
is an improvement. In the current PR, I want to simplify the logic and the memory is tracked
by *bytesShuffleToMemory* and memory usage is not tracked by MemoryManager.

> Fetch big blocks to disk when shuffle-read
> ------------------------------------------
>                 Key: SPARK-19659
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 2.1.0
>            Reporter: jin xing
>         Attachments: SPARK-19659-design-v1.pdf, SPARK-19659-design-v2.pdf
> Currently the whole block is fetched into memory(offheap by default) when shuffle-read.
A block is defined by (shuffleId, mapId, reduceId). Thus it can be large when skew situations.
If OOM happens during shuffle read, job will be killed and users will be notified to "Consider
boosting spark.yarn.executor.memoryOverhead". Adjusting parameter and allocating more memory
can resolve the OOM. However the approach is not perfectly suitable for production environment,
especially for data warehouse.
> Using Spark SQL as data engine in warehouse, users hope to have a unified parameter(e.g.
memory) but less resource wasted(resource is allocated but not used),
> It's not always easy to predict skew situations, when happen, it make sense to fetch
remote blocks to disk for shuffle-read, rather than
> kill the job because of OOM. This approach is mentioned during the discussion in SPARK-3019,
by [~sandyr] and [~mridulm80]

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message