hadoop-common-issues mailing list archives

From "Jim Brennan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15548) Randomize local dirs
Date Thu, 28 Jun 2018 22:16:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526850#comment-16526850 ]

Jim Brennan commented on HADOOP-15548:
--------------------------------------

[~eepayne] thanks for the review! I've uploaded a new patch that adds a check to ensure
we are not always selecting the next dir, which is what the code used to do.

 

> Randomize local dirs
> --------------------
>
>                 Key: HADOOP-15548
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15548
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Minor
>         Attachments: HADOOP-15548.001.patch, HADOOP-15548.002.patch
>
>
> Shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching a container. Some applications
> process these in exactly the same way in every container (e.g. round-robin), which can
> cause disks to become unnecessarily overloaded (e.g. one output file written to the first
> entry specified in the environment variable).
> There are two paths for local dir allocation, depending on whether the size is unknown
> or known. The unknown path already uses a random algorithm. The known path initializes
> with a random starting point and then goes round-robin after that. When selecting a dir,
> it increments the last-used index by one and then checks sequentially until it finds a dir
> that satisfies the request. The proposal is to increment by a random value between 1 and
> num_dirs - 1, and then check sequentially from there. This should result in a more random
> selection in all cases.
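
For illustration only, here is a minimal sketch of the proposed known-size selection as described above. It is not the code from the attached patches; the class name RandomizedDirPicker, the freeBytes array and the pick() method are hypothetical names made up for this example, and real free-space checks are replaced by a simple array lookup.

```java
import java.util.Random;

/** Hypothetical illustration of the proposal; not the actual Hadoop code. */
final class RandomizedDirPicker {
    private final long[] freeBytes; // assumed input: free space per local dir
    private final Random random;
    private int lastUsed;           // index of the most recently selected dir

    RandomizedDirPicker(long[] freeBytes, Random random) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("need at least one dir");
        }
        this.freeBytes = freeBytes.clone();
        this.random = random;
        // The known-size path starts from a random position.
        this.lastUsed = random.nextInt(freeBytes.length);
    }

    /** Returns the index of a dir with at least size free bytes, or -1 if none has room. */
    int pick(long size) {
        int n = freeBytes.length;
        // Random increment in [1, n - 1]; with a single dir there is nothing to vary.
        int step = (n > 1) ? 1 + random.nextInt(n - 1) : 0;
        int start = (lastUsed + step) % n;
        // Check sequentially from the randomized starting point.
        for (int i = 0; i < n; i++) {
            int candidate = (start + i) % n;
            if (freeBytes[candidate] >= size) {
                lastUsed = candidate;
                return candidate;
            }
        }
        return -1;
    }
}
```

Because the increment is never 0 or num_dirs, the search in this sketch never starts at the dir that was used last, so two consecutive requests land on different dirs whenever more than one dir has room.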


