hadoop-mapreduce-issues mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1543) Log messages of JobACLsManager should use security logging of HADOOP-6586
Date Thu, 11 Mar 2010 10:08:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843992#action_12843992
] 

Amar Kamat commented on MAPREDUCE-1543:
---------------------------------------

bq. Is the format also similar to HDFS audit logs?
The audit log format used by HDFS is specific to HDFS. It logs the following fields:
{noformat}
ugi
remote IP
command
src path
dst path (optional)
permissions (optional)
{noformat}

We might try to come up with a common model that both can use (and add it to commons). Here
is how the mapping from the HDFS audit log format to the MapReduce audit log format might look:
||hdfs||mapreduce||
|ugi|agent|
|remote-ip|-|
|command|operation|
|src-path|target|
|dst-path|-|
|permissions|-|
|-|result|
|-|reason|

A straightforward merge would be:
{noformat}
<agent> <remote-ip> <operation> <target> <permissions*> <result*>
<reason*>
* means optional
{noformat}

For HDFS, target would be src-path:dst-path. For MapReduce, we could skip permissions
or print the ACLs. The one thing that doesn't fit this model for MapReduce is the job-initialization
event: what should the value of remote-ip be for job initialization?
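
As a rough illustration of the merged format above, here is a minimal Java sketch of how such a line could be assembled. The class name, the method names, and the use of "-" for a missing remote-ip are all assumptions made for this example; they are not part of HADOOP-6586 or of any existing Hadoop API, and the question of what remote-ip should be at job initialization remains open.

```java
final class AuditLineSketch {
    /** Placeholder for a mandatory field with no value (an assumption of this sketch). */
    static final String ABSENT = "-";

    private AuditLineSketch() {}

    /** HDFS-style target: "src-path:dst-path", or only the source when there is no destination. */
    static String hdfsTarget(String srcPath, String dstPath) {
        return dstPath == null ? srcPath : srcPath + ":" + dstPath;
    }

    /**
     * Builds "<agent> <remote-ip> <operation> <target> <permissions*> <result*> <reason*>".
     * Pass null for any optional field that should be left out.
     */
    static String format(String agent, String remoteIp, String operation, String target,
                         String permissions, String result, String reason) {
        StringBuilder line = new StringBuilder();
        // Mandatory fields are always written, with a placeholder when unknown.
        line.append(orAbsent(agent)).append(' ')
            .append(orAbsent(remoteIp)).append(' ')
            .append(orAbsent(operation)).append(' ')
            .append(orAbsent(target));
        // Optional fields are appended only when present.
        for (String optional : new String[] {permissions, result, reason}) {
            if (optional != null) {
                line.append(' ').append(optional);
            }
        }
        return line.toString();
    }

    private static String orAbsent(String value) {
        return value == null ? ABSENT : value;
    }

    public static void main(String[] args) {
        // HDFS-style event: permissions present, no result or reason.
        System.out.println(format("alice", "10.0.0.1", "rename",
                hdfsTarget("/a", "/b"), "rw-r--r--", null, null));
        // MapReduce-style event with no known remote IP (hypothetical operation name).
        System.out.println(format("bob", null, "KILL_JOB",
                "job_0001", null, "FAILURE", "not-in-acl"));
    }
}
```

One thing the sketch makes visible: because the optional fields are positional, a reader of the log cannot tell a skipped permissions field from a skipped result, which may be another argument for keeping separate models.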

I am not sure whether we are overfitting here. For now, I think we can keep it simple and have
separate models for HDFS and MapReduce.

bq. Do we need to include the host IP of the requestor? I don't even know if it is possible to
get this information, though.
I am not sure how that would help; I think the username should suffice. It is possible to get the
IP of the caller using _o.a.h.ipc.Server.getRemoteIp()_.

bq. One concern with the implementation: if some of this logging happens under the jobtracker
lock, it might hurt performance. Can we plan to handle this?
The idea here is to replace LOG.* statements with AUDIT_LOG.*, so the logging overhead
should stay the same. In my initial implementation exercise, I have not seen a case where I had
to add extra log lines. Let me check whether this needs to be addressed.
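
To make the LOG.* to AUDIT_LOG.* swap concrete, here is a small self-contained sketch. It uses java.util.logging only so that it runs without Hadoop on the classpath; the logger names, the method names, and the message layout are hypothetical and do not reflect the actual JobACLsManager code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class AuditLoggerSketch {
    // Ordinary daemon logger (hypothetical name).
    static final Logger LOG = Logger.getLogger("JobACLsManagerSketch");
    // Dedicated audit logger (hypothetical name); it can be routed to its own file.
    static final Logger AUDIT_LOG = Logger.getLogger("JobACLsManagerSketch.audit");

    private AuditLoggerSketch() {}

    /** Before: the authorization outcome goes to the daemon log. */
    static boolean checkAccessBefore(String user, String operation, String jobId, boolean allowed) {
        LOG.info(message(user, operation, jobId, allowed));
        return allowed;
    }

    /** After: the same single call, now on the audit logger; no extra log lines. */
    static boolean checkAccessAfter(String user, String operation, String jobId, boolean allowed) {
        AUDIT_LOG.info(message(user, operation, jobId, allowed));
        return allowed;
    }

    private static String message(String user, String operation, String jobId, boolean allowed) {
        return user + " " + operation + " " + jobId + " " + (allowed ? "SUCCESS" : "FAILURE");
    }

    /** Collects messages in memory so the number of emitted lines can be inspected. */
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getMessage());
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }
}
```

Each call site still issues exactly one log call, so any time spent logging under a lock would be unchanged by the swap in this sketch.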


> Log messages of JobACLsManager should use security logging of HADOOP-6586
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1543
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1543
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security
>            Reporter: Vinod K V
>             Fix For: 0.22.0
>
>
> {{JobACLsManager}} added in MAPREDUCE-1307 logs the successes and failures w.r.t job-level
authorization in the corresponding Daemons' logs. The log messages should instead use security
logging of HADOOP-6586.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

