hadoop-hdfs-issues mailing list archives

From "Marcelo Vanzin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3680) Allows customized audit logging in HDFS FSNamesystem
Date Wed, 01 Aug 2012 21:54:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426944#comment-13426944 ]

Marcelo Vanzin commented on HDFS-3680:

Answers inline.

bq. With this approach, should namenode run if for some reason we have removed all the audit
loggers from the list? My answer would be no, given the importance of audit log.
bq. Does the system need a mechanism to add/remove audit loggers? When a failed logger is
fixed, do we need a way to refresh the audit logger so it is picked up by the Namenode again?

I never thought making this a list was necessary to begin with (I can't see a scenario where
you'd need more than one custom logger, and my original patch supported only one), so I don't
think the NameNode should be stopped. If an audit logger gets removed from the list, that
logger is buggy. Stopping the NameNode at that point would mean shutting it down because code
outside its control misbehaved, which goes back to the earlier discussions about HDFS being
blamed for failures that are not HDFS's fault.

I could, though, always fall back to having the default logger in the list if it ever becomes
empty. That would still not generate any audit logs if the logging system is not configured
for it.
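
To make the fallback idea concrete, here is a rough sketch of what I have in mind. It is not code from any of the attached patches: the names AuditSink, DefaultSink and Dispatcher are made up for illustration, and the default logger is replaced by an in-memory stand-in so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class AuditDispatchSketch {

    /** Hypothetical callback; not the interface from the attached patches. */
    interface AuditSink {
        void logEvent(boolean succeeded, String user, String cmd, String src);
    }

    /** Stand-in for the built-in logger; collects lines in memory. */
    static final class DefaultSink implements AuditSink {
        final List<String> lines = new ArrayList<>();

        @Override
        public void logEvent(boolean succeeded, String user, String cmd, String src) {
            lines.add("allowed=" + succeeded + " ugi=" + user + " cmd=" + cmd + " src=" + src);
        }
    }

    /** Calls every sink, drops the ones that throw, never lets the list stay empty. */
    static final class Dispatcher {
        // Copy-on-write so a sink can be removed while the list is being iterated.
        private final List<AuditSink> sinks = new CopyOnWriteArrayList<>();
        private final AuditSink fallback;

        Dispatcher(List<AuditSink> configured, AuditSink fallback) {
            this.fallback = fallback;
            sinks.addAll(configured);
            if (sinks.isEmpty()) {
                sinks.add(fallback);
            }
        }

        void audit(boolean succeeded, String user, String cmd, String src) {
            for (AuditSink sink : sinks) {
                try {
                    sink.logEvent(succeeded, user, cmd, src);
                } catch (RuntimeException e) {
                    // A throwing sink is treated as buggy and removed; the caller keeps running.
                    sinks.remove(sink);
                }
            }
            if (sinks.isEmpty()) {
                // Fall back to the default sink instead of stopping the caller.
                sinks.add(fallback);
            }
        }

        int activeSinks() {
            return sinks.size();
        }
    }

    public static void main(String[] args) {
        DefaultSink fallback = new DefaultSink();
        AuditSink buggy = (ok, user, cmd, src) -> {
            throw new IllegalStateException("simulated failure");
        };
        Dispatcher d = new Dispatcher(Collections.singletonList(buggy), fallback);
        d.audit(true, "alice", "open", "/a");   // buggy sink throws and is dropped
        d.audit(true, "alice", "delete", "/a"); // handled by the fallback sink
        System.out.println(fallback.lines);
    }
}
```

Note that in this sketch the event that triggered the removal is not replayed to the fallback; only later events reach it. Whether that is acceptable is a separate question from whether the NameNode keeps running.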

bq. When you have multiple audit loggers, is there a need to keep them in sync, or are
out-of-sync audit loggers okay?

Not sure what you mean by "in sync", but my answer here is the same regardless: there is no
need to do anything other than what's already being done. Custom loggers, once called, do
what they want with the data, and it's out of the NameNode's control at that point.

bq. Alternatively, should we consider a separate daemon that runs off of the audit log written
to disk and updates other sinks instead of doing it inline in the namenode code?

See my comment on 20/Jul/12. I think that's overkill and creates more problems than it solves.

> Allows customized audit logging in HDFS FSNamesystem
> ----------------------------------------------------
>                 Key: HDFS-3680
>                 URL: https://issues.apache.org/jira/browse/HDFS-3680
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>            Priority: Minor
>         Attachments: accesslogger-v1.patch, accesslogger-v2.patch, hdfs-3680-v3.patch, hdfs-3680-v4.patch, hdfs-3680-v5.patch
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to get audit
> logs in some log file. But it makes it kinda tricky to store audit logs in any other way (let's
> say a database), because it would require the code to implement a log appender (and thus know
> what logging system is actually being used underneath the façade), and parse the textual
> log message generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

