hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1115) Support hierarchical pools of directories to use for intermediate MapReduce data files (for SSD drives)
Date Thu, 15 Oct 2009 08:04:31 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765957#action_12765957 ]

Vinod K V commented on MAPREDUCE-1115:
--------------------------------------

As of now, any file written via _LocalDirAllocator_ is written in its entirety to a single chosen
disk. We first select the location for a file path and then simply write there. There
is no hopping across disks while writing a single file if we find halfway through that not
enough space is left.
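
To make the "choose once, then write there" model concrete, here is a minimal sketch in plain Java.
It is not the actual _LocalDirAllocator_ code: the class name, the method name and the first-fit
scan are all assumptions made for illustration, and the real selection policy may differ. The only
point it shows is that a file goes to exactly one directory or fails, even when the directories
together would have enough room.

```java
// Hypothetical sketch; not the real org.apache.hadoop.fs.LocalDirAllocator logic.
class SingleDiskChooser {
    /**
     * Returns the index of the one directory that will hold the entire file,
     * or -1 if no single directory has enough room. freeBytes[i] is the free
     * space of the i-th configured directory (assumed to be known up front).
     */
    static int chooseDir(long[] freeBytes, long fileSize) {
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] >= fileSize) {
                return i; // commit to this directory; the file is never split
            }
        }
        return -1; // no spilling across directories
    }

    public static void main(String[] args) {
        long[] free = {40L, 40L};
        System.out.println(chooseDir(free, 30L)); // prints 0
        // 80 bytes are free in total, but no single directory can take 60.
        System.out.println(chooseDir(free, 60L)); // prints -1
    }
}
```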

So, with the current model, we cannot spill a file across disks. If you are suggesting that
we use the configured SSDs first, before the normal disks, that can already be done with the
current configuration: simply put the SSD directories at the beginning of _mapred.local.dir_.
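
As a rough illustration of that ordering, the sketch below builds a comma-separated value for
_mapred.local.dir_ with the SSD directories listed ahead of the normal disks. The helper class and
all directory paths are hypothetical; in a real cluster the resulting string would simply be set
as the property value in the site configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the paths used in main() are made up for illustration.
class LocalDirOrder {
    /** Joins the directories into one comma-separated value, SSD directories first. */
    static String localDirValue(List<String> ssdDirs, List<String> hddDirs) {
        List<String> ordered = new ArrayList<>(ssdDirs);
        ordered.addAll(hddDirs); // normal disks follow the SSD entries
        return String.join(",", ordered);
    }

    public static void main(String[] args) {
        String value = localDirValue(
                Arrays.asList("/ssd0/mapred/local"),
                Arrays.asList("/disk0/mapred/local", "/disk1/mapred/local"));
        // prints /ssd0/mapred/local,/disk0/mapred/local,/disk1/mapred/local
        System.out.println(value);
    }
}
```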

> Support hierarchical pools of directories to use for intermediate MapReduce data files (for SSD drives)
> -------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1115
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1115
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>            Reporter: Amr Awadallah
>            Priority: Minor
>
> Some initial benchmarking shows that SSDs can help a lot for local data files (for shuffle
> and other intermediate files).
> Currently mapred.local.dir just round-robins over the provided directories. It would
> be nice to allocate a set of SSD directories to round-robin across first, then spill over
> to normal drives if the SSD directories are full.
> -- amr

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

