From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6690) Limit the number of resources a single map reduce job can submit for localization
Date Tue, 24 May 2016 21:46:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298990#comment-15298990 ]

Jason Lowe commented on MAPREDUCE-6690:
---------------------------------------

Thanks for the patch, Chris!  Initial comments:

Is this intended to apply to all distributed cache items or only those that need to be uploaded
during job submission?  Some comments in the JIRA and the property descriptions imply it should
also apply to items in the distributed cache that already reside in HDFS, but it doesn't look
like the patch does that.  The changes are to JobResourceUploader, which AFAIK only gets
involved with files that potentially need to be copied to the staging area before job
submission.  I'm not seeing how this affects items that are already in HDFS before job
submission (i.e., items already in mapreduce.job.cache.*).
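
For reference, the distinction I'm drawing (the paths/URIs below are hypothetical):

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CachePathsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "cache-paths-example");
    // Already resides in HDFS: this lands in mapreduce.job.cache.files and
    // is never copied by JobResourceUploader at submit time.
    job.addCacheFile(new URI("hdfs://nn:8020/shared/lookup.dat"));
    // By contrast, "-files file:///local/lookup.dat" on the command line is
    // uploaded to the staging area by JobResourceUploader during submission.
  }
}
{code}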

Speaking of mapreduce.job.cache.*, it would be nice if the properties used that same prefix,
since they relate to the distributed cache.  Also, I'd personally prefer something like mapreduce.job.cache.limit.max-files,
mapreduce.job.cache.limit.max-file-mb, and mapreduce.job.cache.limit.max-total-mb if it's
supposed to apply to the entire distributed cache.
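
For example, with those names a client could cap things like this (the keys are just my
suggestion above, not existing properties; the values are illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class LimitConfExample {
  public static Configuration limitedConf() {
    Configuration conf = new Configuration();
    // Hypothetical keys from the naming suggestion above -- not existing properties
    conf.setInt("mapreduce.job.cache.limit.max-files", 1000);      // max number of files
    conf.setLong("mapreduce.job.cache.limit.max-file-mb", 512L);   // largest single file, MB
    conf.setLong("mapreduce.job.cache.limit.max-total-mb", 4096L); // total size, MB
    return conf;
  }
}
{code}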

The TotalNumberOfFilesAndSize API is verbose and error-prone -- is there ever a valid reason
to call incrementTotalSize without also calling incrementTotalNumberOfFiles and findMaxFileSize?
It probably does the wrong thing if the client doesn't call all of them for each file.  IMHO
there should just be two APIs, addFile(long filesize) and checkLimit().  Or maybe just one,
if it's OK to throw from addFile() directly.

Suggestion: TotalNumberOfFilesAndSize might be easier to comprehend (and type) if named something
like LimitsChecker.  Also, its constructor can just be passed a Configuration.  Then it can
hide all the confs and other implementation details related to the dist cache limits, and
a predicate function like hasLimits() can be used to do the early-out checks.  Or maybe we
could just pass it the files directly and let it decide internally whether to visit the paths
or early-out.

I think it would be very helpful if the file path were shown in the error message when something
exceeds the single-file limit; otherwise the user has to manually track it down among all
the files involved.
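
Putting those suggestions together, here's a rough sketch of the shape I'm picturing (class
name, method names, and conf keys are all placeholders, not anything in the patch; I also gave
addFile() a Path parameter so the single-file error can name the offending file):

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

class LimitsChecker {
  private final int maxFiles;
  private final long maxFileBytes;
  private final long maxTotalBytes;
  private int numFiles;
  private long totalBytes;
  private long largestSize;
  private Path largestFile;

  LimitsChecker(Configuration conf) {
    // -1 means unlimited; keys are the hypothetical ones suggested above
    maxFiles = conf.getInt("mapreduce.job.cache.limit.max-files", -1);
    maxFileBytes = conf.getLong("mapreduce.job.cache.limit.max-file-mb", -1) * 1024 * 1024;
    maxTotalBytes = conf.getLong("mapreduce.job.cache.limit.max-total-mb", -1) * 1024 * 1024;
  }

  // Early-out predicate so callers can skip visiting paths entirely
  boolean hasLimits() {
    return maxFiles >= 0 || maxFileBytes >= 0 || maxTotalBytes >= 0;
  }

  // One call per file keeps the counters consistent
  void addFile(Path path, long size) {
    numFiles++;
    totalBytes += size;
    if (size > largestSize) {
      largestSize = size;
      largestFile = path;
    }
  }

  void checkLimit() throws IOException {
    if (maxFiles >= 0 && numFiles > maxFiles) {
      throw new IOException("Too many distributed cache files: "
          + numFiles + " > " + maxFiles);
    }
    if (maxFileBytes >= 0 && largestSize > maxFileBytes) {
      // Include the offending path so the user doesn't have to hunt for it
      throw new IOException("File " + largestFile + " is too large: "
          + largestSize + " bytes > " + maxFileBytes);
    }
    if (maxTotalBytes >= 0 && totalBytes > maxTotalBytes) {
      throw new IOException("Total cache size too large: "
          + totalBytes + " bytes > " + maxTotalBytes);
    }
  }
}
{code}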

Nit: Javadoc that lists the parameters to a method but gives no description for any of those
parameters isn't useful.
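
i.e., something like this is worth the space, where each @param actually says what the
argument means (using the hypothetical addFile() from the sketch above):

{code:java}
/**
 * Records one submitted resource against the configured limits.
 *
 * @param path the resource being recorded; reported back to the user if it
 *             turns out to exceed the single-file limit
 * @param size the size of the resource in bytes
 */
void addFile(Path path, long size) {
  // body as in the sketch above
}
{code}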


> Limit the number of resources a single map reduce job can submit for localization
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6690
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6690
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>         Attachments: MAPREDUCE-6690-trunk-v1.patch, MAPREDUCE-6690-trunk-v2.patch
>
>
> Users will sometimes submit a large number of resources to be localized as part of a single MapReduce job. This can cause issues with YARN localization that destabilize the cluster and potentially impact other user jobs. These resources are specified via the files, libjars, archives, and jobjar command-line arguments or directly through the configuration (i.e., the distributed cache API). The resources specified could be too large in multiple dimensions:
> # Total size
> # Number of files
> # Size of an individual resource (i.e. a large fat jar)
> We would like to encourage good behavior on the client side by having the option of enforcing resource limits along the above dimensions.
> There should be a separate effort to enforce limits at the YARN layer on the server side, but this JIRA only covers the MapReduce layer on the client side. In practice, having these client-side limits will get us a long way towards preventing these localization anti-patterns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


