Subject [jira] Commented: (HADOOP-3578) mapred.system.dir should be accessible only to hadoop daemons
Date Wed, 15 Apr 2009 12:37:15 GMT

Amar Kamat commented on HADOOP-3578:

Some more details
# The jobclient requests the jobtracker for a new job id
# Along with the libs/archives, the jobclient also uploads the job.jar to the DistributedCache
and creates a symlink to it (here the TaskRunner will localize the jars). With HADOOP-4990
(and security in distributed cache), the taskrunner will run with the user's permissions and
hence will be able to securely localize the job jar.
# The jobclient now starts the transaction with the jobtracker by passing the jobconf to the
jobtracker. We expect the jobconf to be lightweight and hence pass it completely over the
  ## If the job (jobconf) fails the checks (acls etc) at the jobtracker, this job is ignored
  ## The jobtracker now maintains the jobid to user mapping for this job. This is done to
make sure that only the user who owns the job can upload/add the splits
  ## Finally, the jt localizes the job to system-dir/jobid/job.xml so that the tasks are able
to load the conf.
# The jobclient now uploads the job splits (in chunks of 1000 splits) to the jobtracker
  ## The jobtracker will check if the user is the owner of the job
  ## The jobtracker will maintain a mapping from jobid to the (split) file handle for that
  ## This split file is opened as system-dir/jobid/job.split
  ## The jobtracker will stream all the splits passed by the client to this file
# The jobclient now finishes the transaction by invoking submitJob().
  ## The jobtracker will first close the open file handle for the jobsplit 
  ## The jt will clean up the structures maintained for the transaction
  ## Do what is done today upon job submission (note that by now job.split and job.jar are both
present in the system dir)
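
To make the flow above concrete, here is a rough in-memory sketch of the jobtracker side of the transaction. It is only an illustration under assumptions: the class and method names (SubmissionSketch, startSubmission, addSplits) are hypothetical and not existing Hadoop APIs, the acl check is reduced to a boolean, and a ByteArrayOutputStream stands in for the open handle on system-dir/jobid/job.split.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the jobtracker side of the submission transaction. */
class SubmissionSketch {
    // jobid -> user who started the transaction
    private final Map<String, String> owners = new HashMap<>();
    // jobid -> open split "file" (system-dir/jobid/job.split in the proposal)
    private final Map<String, ByteArrayOutputStream> openSplits = new HashMap<>();

    /** Start the transaction; a job that fails the checks is ignored. */
    synchronized boolean startSubmission(String jobId, String user, boolean passesChecks) {
        if (!passesChecks) {
            return false;
        }
        owners.put(jobId, user);
        openSplits.put(jobId, new ByteArrayOutputStream());
        return true;
    }

    /** Stream one chunk of splits to the open file, after the ownership check. */
    synchronized void addSplits(String jobId, String user, List<String> chunk) {
        checkOwner(jobId, user);
        ByteArrayOutputStream out = openSplits.get(jobId);
        for (String split : chunk) {
            byte[] bytes = (split + "\n").getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
    }

    /** Finish the transaction: close the handle and clean up the structures. */
    synchronized int submitJob(String jobId, String user) {
        checkOwner(jobId, user);
        ByteArrayOutputStream out = openSplits.remove(jobId);
        owners.remove(jobId);
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The regular submit-job logic would run here; return the split size for illustration.
        return out.size();
    }

    private void checkOwner(String jobId, String user) {
        if (!user.equals(owners.get(jobId))) {
            throw new SecurityException(user + " does not own " + jobId);
        }
    }
}
```

In this sketch a client that is not the recorded owner gets a SecurityException from addSplits() or submitJob(), and any call made after submitJob() fails the same way because the jobid to user mapping has already been removed.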

Questions :
# What if the jobconf is of large size? Do we need to page it too?
# How many files (job-split) to support in parallel (as number of open file handles can lead
 to issues)?
   ## One way to do it would be to cap it at 200 uploads in parallel
# How to take care of dead jobclients?
  ## Start an expiry thread that will clean up dead/hung job submissions (every 5 mins)
# How to prevent the jobclients from passing too many splits (say 1,00,000 splits) in one rpc?
  ## Looks like this should be capped at the rpc level. I am not sure if there is any provision
for something like this. For now we can leave it as it is.
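
As a rough illustration of the parallel-upload cap and the expiry of dead/hung submissions discussed above, something along these lines might work. The class and method names (PendingSubmissions, tryStart, touch, expire) are hypothetical, and the limits are constructor parameters rather than settled values. The current time is passed in explicitly so the logic can be exercised without a real timer.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical tracker for in-flight job submissions on the jobtracker. */
class PendingSubmissions {
    private final int maxParallel;     // e.g. 200 uploads in parallel
    private final long expiryMillis;   // e.g. 5 minutes
    // jobid -> time of the last rpc seen for this submission
    private final Map<String, Long> lastActivity = new HashMap<>();

    PendingSubmissions(int maxParallel, long expiryMillis) {
        this.maxParallel = maxParallel;
        this.expiryMillis = expiryMillis;
    }

    /** Admit a new submission only if the cap on open uploads is not reached. */
    synchronized boolean tryStart(String jobId, long nowMillis) {
        if (lastActivity.size() >= maxParallel) {
            return false;
        }
        lastActivity.put(jobId, nowMillis);
        return true;
    }

    /** Record activity (for example a chunk of splits) for a known submission. */
    synchronized void touch(String jobId, long nowMillis) {
        if (lastActivity.containsKey(jobId)) {
            lastActivity.put(jobId, nowMillis);
        }
    }

    /** Drop a submission that finished normally. */
    synchronized void finish(String jobId) {
        lastActivity.remove(jobId);
    }

    /** Remove submissions idle for longer than the expiry; returns how many were dropped. */
    synchronized int expire(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastActivity.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > expiryMillis) {
                it.remove();  // a real version would also close the open split file here
                removed++;
            }
        }
        return removed;
    }
}
```

The expiry thread would then just call expire(System.currentTimeMillis()) periodically, and the rpc handlers would call tryStart(), touch() and finish() at the start, during and at the end of the transaction.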


> mapred.system.dir should be accessible only to hadoop daemons 
> --------------------------------------------------------------
>                 Key: HADOOP-3578
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3578
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
> Currently the jobclient accesses the {{mapred.system.dir}} to add job details. Hence
the {{mapred.system.dir}} has the permissions of {{rwx-wx-wx}}. This could be a security loophole
where the job files might get overwritten/tampered after the job submission. 

