hadoop-mapreduce-issues mailing list archives

From "M. C. Srivas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
Date Tue, 17 Aug 2010 06:16:22 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899296#action_12899296 ]

M. C. Srivas commented on MAPREDUCE-1901:
-----------------------------------------

> [ from dhruba ]
> It means that a central authority stores the mapping of active callbacks and their
> associated clients (and files). If a client dies prematurely, the central authority should
> have the option to recover that callback and hand it over to a newly requesting client.
> Are you proposing that the NN and/or JT be this central authority?

Well, the mtime is a poor man's version number that is checked on every access to see if
the file at the server is newer.  Adding a callback should reduce this load significantly.
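
For concreteness, this is the shape of that per-access check against the standard FileSystem
API (the class around it is hypothetical); a callback would let the client skip this round
trip until notified:

    import java.io.IOException;

    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MtimeCheck {
      // Returns true if the locally cached copy is still current.  Without
      // a callback, this costs one NameNode round trip on every access.
      static boolean cacheStillValid(FileSystem fs, Path p, long cachedMtime)
          throws IOException {
        FileStatus st = fs.getFileStatus(p);
        return st.getModificationTime() == cachedMtime;
      }
    }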

To the point of the question, yes, the NN should be able to revoke the callback whenever it
feels like, at which point the client should get it back before reusing items in its cache.
The client, on reboot (of itself or of the NN), must re-establish the callbacks it cares about.
Note that the callback is not a lock, but a notification mechanism -- many clients can hold
callbacks on the same file -- so it is not necessary for the NN to revoke a callback from
one client in order to hand out a callback for the same file to another client.  When a file
changes, all outstanding callbacks for it are revoked so clients can discard/refresh their
caches.
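
To make those semantics concrete, here is a minimal sketch of what such an NN-side registry
might look like; nothing below exists in HDFS today, and the class and RPC names are made up:

    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class CallbackRegistry {
      private final Map<String, Set<String>> holders = new ConcurrentHashMap<>();

      // A callback is a notification, not a lock: any number of clients
      // may register on the same path concurrently.
      public void register(String path, String clientId) {
        holders.computeIfAbsent(path, k -> ConcurrentHashMap.newKeySet())
               .add(clientId);
      }

      // When a file changes, revoke every outstanding callback so the
      // holders discard or refresh their cached copies.
      public void fileChanged(String path) {
        Set<String> clients = holders.remove(path);
        if (clients != null) {
          clients.forEach(c -> notifyRevoked(c, path));
        }
      }

      private void notifyRevoked(String clientId, String path) {
        // hypothetical RPC back to the client; after a reboot (of the
        // client or of the NN) the client re-registers what it cares about
      }
    }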

But the above is moot. Why does a "bulk-mtime" not work, especially given the manner in which
the "bulk-get-md5-signatures" is supposed to work in Joydeep's proposal? They seem to be
equally onerous (or not).
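
Put side by side, the two hypothetical calls have the same cost profile: one batched NameNode
round trip, one answer per path.  Both the interface and the method names below are invented
for illustration:

    import java.util.List;
    import java.util.Map;

    import org.apache.hadoop.fs.Path;

    // Neither RPC exists in HDFS; each proposal is a single batched
    // NameNode round trip returning one value per path.
    interface BulkMetadata {
      Map<Path, byte[]> bulkGetMd5Signatures(List<Path> paths); // Joydeep's proposal
      Map<Path, Long> bulkGetMtimes(List<Path> paths);          // the alternative above
    }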

> Jobs should not submit the same jar files over and over again
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-1901
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Joydeep Sen Sarma
>         Attachments: 1901.PATCH
>
>
> Currently each Hadoop job uploads the required resources (jars/files/archives) to a new
> location in HDFS. Map-reduce nodes involved in executing this job would then download these
> resources onto local disk.
> In an environment where most of the users are using a standard set of jars and files
> (because they are using a framework like Hive/Pig), the same jars keep getting uploaded
> and downloaded repeatedly. The overhead of this protocol (primarily in terms of end-user
> latency) is significant when:
> - the jobs are small (and, conversely, large in number)
> - the Namenode is under load (meaning HDFS latencies are high and made worse, in part, by
>   this protocol)
> Hadoop should provide a way for jobs in a cooperative environment to not submit the same
> files over and over again. Identifying and caching execution resources by a content
> signature (md5/sha) would be a good alternative to have available (sketched below).
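
As a rough illustration of the signature idea in the description above, computing such a key
on the client could look like this; the shared-cache path layout in the comment is an
assumption, not part of the attached patch:

    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.security.DigestInputStream;
    import java.security.MessageDigest;

    public class ResourceSignature {
      // Content signature of a local jar: identical jars hash to the same
      // key, so a later job could reuse a copy already in HDFS (e.g. under
      // a hypothetical shared path like /cache/<md5>/job.jar) instead of
      // re-uploading it.
      static String md5Of(String localJar) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = new DigestInputStream(
                 Files.newInputStream(Paths.get(localJar)), md)) {
          byte[] buf = new byte[8192];
          while (in.read(buf) != -1) {
            // reading drives the digest; nothing else to do
          }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
          hex.append(String.format("%02x", b));
        }
        return hex.toString();
      }
    }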

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

