hadoop-hdfs-issues mailing list archives

From "Marcelo Vanzin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3680) Allows customized audit logging in HDFS FSNamesystem
Date Fri, 20 Jul 2012 18:26:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419413#comment-13419413 ]

Marcelo Vanzin commented on HDFS-3680:

Thanks for the comments, everyone. Good to know FSNamesystem is a singleton, so there's no need
to worry about that issue.

As for queuing / blocking, I understand the concerns, but I don't see how they're any different
from what you'd face today. To do something like this today, you'd do one of the following:

(i) Process logs post-facto, by tailing the HDFS log file or something along those lines.

This would be the "completely off process" model, not affecting the NN operation.

(ii) Use a custom log appender that parses log messages inside the NN.

This is almost the same as what my patch does; except it's tied to the log system implementation.

Both cases suffer from turning a log message into something expected to be a "stable" interface;
the second approach (which is doable today, just to make that clear) adds on top of that all
the concerns you guys listed.
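
To make the "log message as interface" problem concrete, here is a rough sketch of what either
approach ends up doing: recovering fields from the formatted text. The tab-separated key=value
layout and the AuditLineParser name are assumptions for illustration only, not a documented
format or an existing class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: recovers key=value fields from a formatted audit line.
// The tab-separated "key=value" layout is an assumption made for this sketch.
class AuditLineParser {
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : line.split("\t")) {
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value token; skip it
            }
            // Split on the first '=' only, so values may themselves contain '='.
            fields.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return fields;
    }
}
```

Any change to the message layout silently breaks a consumer like this, which is the stability
concern raised above.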

Does anyone know how the different log systems behave when using file loggers, which I guess
would be the vast majority of cases for this code? Do they do queuing, do they block waiting
for the message to be written, what happens when they flush buffers, what if the log file
is on NFS, etc? Lots of the concerns raised here are similar to those questions.

I agree that implementations of this interface can do all sorts of bad things, but I don't
see how that's any worse than today. The alternative would be to forgo using a log system at
all for audit logging and force writing to files as the only option, with custom code that
does the writing and avoids as many of the issues discussed here as possible.

The code could definitely force queuing on this code path; since not everybody may need that
(the current log approach being the example), I'm wary of turning that into a requirement.
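
If queuing were wanted, it could also live in an optional wrapper rather than in the code path
itself. The following is a minimal sketch under assumptions: the AccessLogger interface shown
here is hypothetical (the interface in the attached patch may look different), and dropping
events when the queue is full is just one possible policy.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical interface; the real one in the patch may look different.
interface AccessLogger {
    void logAccess(String event);
}

// Wraps a (possibly slow) logger so callers only pay for a queue insert.
class QueuingAccessLogger implements AccessLogger {
    private final BlockingQueue<String> queue;
    private final Thread worker;

    QueuingAccessLogger(AccessLogger delegate, int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.worker = new Thread(() -> {
            try {
                while (true) {
                    // take() blocks on the worker thread, not on the caller's thread.
                    delegate.logAccess(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit when interrupted
            }
        }, "audit-queue");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    @Override
    public void logAccess(String event) {
        // offer() never blocks; when the queue is full the event is dropped.
        // Whether dropping is acceptable for audit data is a policy decision.
        queue.offer(event);
    }
}
```

Implementations that don't need queuing would simply not be wrapped, which keeps it from
becoming a requirement.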

So, with those out of the way, a few comments about other things:
. audit logging under the namesystem lock: that can be hacked around. One ugly way would be
to store the audit data in a thread local, and flush it in the unlock() methods.

. using the interface for the existing log: that can be easily done; my goal with not changing
that part was to not change the existing behavior. I could use the "AUDITLOG access logger"
as the default one, that would be very easy to do. A custom access logger would replace it
(or we could make the config option a list, thus allowing the use of both again).
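
The thread-local workaround from the first point above can be sketched roughly as follows.
This is a sketch under assumptions, not FSNamesystem code: DeferredAuditLock and its methods
are hypothetical stand-ins for the namesystem lock handling, and the sink stands in for
whatever access logger is configured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical stand-in for the namesystem lock handling; names are illustrative.
class DeferredAuditLock {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final ThreadLocal<List<String>> pending = ThreadLocal.withInitial(ArrayList::new);
    private final Consumer<String> sink;

    DeferredAuditLock(Consumer<String> sink) {
        this.sink = sink;
    }

    void writeLock() {
        lock.writeLock().lock();
    }

    // Called while holding the lock: only buffers the event in the thread local.
    void audit(String event) {
        pending.get().add(event);
    }

    void writeUnlock() {
        // Flush only when the outermost (reentrant) hold is being released.
        boolean last = lock.getWriteHoldCount() == 1;
        lock.writeLock().unlock();
        if (last) {
            flush();
        }
    }

    // Runs after the lock is released, so a slow sink does not extend lock hold time.
    private void flush() {
        List<String> events = pending.get();
        pending.remove();
        events.forEach(sink);
    }
}
```

The cost is that audit events are emitted slightly later than the operation they describe,
and an exception between audit() and writeUnlock() needs care so buffered events are not lost.
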
> Allows customized audit logging in HDFS FSNamesystem
> ----------------------------------------------------
>                 Key: HDFS-3680
>                 URL: https://issues.apache.org/jira/browse/HDFS-3680
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>            Priority: Minor
>         Attachments: accesslogger-v1.patch, accesslogger-v2.patch
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to get audit
> logs in some log file. But it makes it kinda tricky to store audit logs in any other way (let's
> say a database), because it would require the code to implement a log appender (and thus know
> what logging system is actually being used underneath the façade), and parse the textual
> log message generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.
