hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5241) Provide alternate queuing audit logger to reduce logging contention
Date Wed, 25 Sep 2013 18:16:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777846#comment-13777846 ]

Daryn Sharp commented on HDFS-5241:
-----------------------------------

bq. That is, by the time response is returned, auditlog is updated.

Yes and no.  Yes if logging is working properly.  No if anything goes wrong: log4j apparently
swallows all exceptions, so you think you are logging when you aren't.

{quote}
bq. It's a configurable undocumented option for now since the audit log becomes prone to data
loss and slight offset of timestamps.
{quote}

bq. With this change, it is possible to return a response to the client and auditlog may not
have been updated. Is that not a concern?

Yes, that is absolutely correct.  It's up to the admin to decide their comfort level with speed
vs. security (audit entry loss).  We're at the point of tipping towards speed because we've
got big NNs that are mostly idle during heavy load.  Audit logging is preventing any other
optimization from having a measurable effect.
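
To make the trade-off above concrete, here is a minimal sketch of a queuing audit logger. It is not the code in HDFS-5241.patch: the class name QueuingAuditLogger, the method names, and the drop-when-full policy are all assumptions chosen for illustration. Handler threads hand entries to a bounded queue and return immediately, and a single writer thread drains the queue into the real sink, so a response can go back to the client before its audit entry is written.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Hypothetical sketch only; not the class from HDFS-5241.patch. */
public class QueuingAuditLogger implements AutoCloseable {
    // Shutdown sentinel, compared by identity so no real entry can match it.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue;
    private final Consumer<String> sink;          // e.g. a wrapper around the real logger
    private final AtomicLong dropped = new AtomicLong();
    private final Thread writer;

    public QueuingAuditLogger(int capacity, Consumer<String> sink) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.sink = sink;
        this.writer = new Thread(this::drain, "audit-writer");
        this.writer.setDaemon(true);
        this.writer.start();
    }

    /**
     * Called by handler threads. Never blocks: if the queue is full the
     * entry is dropped and counted (assumed policy; this is the data loss
     * the admin has to accept in exchange for speed).
     */
    public boolean logAuditEvent(String entry) {
        if (queue.offer(entry)) {
            return true;
        }
        dropped.incrementAndGet();
        return false;
    }

    public long droppedCount() {
        return dropped.get();
    }

    /** Single consumer: only this thread ever touches the slow sink. */
    private void drain() {
        try {
            while (true) {
                String entry = queue.take();
                if (entry == STOP) {
                    return;
                }
                try {
                    sink.accept(entry);
                } catch (RuntimeException e) {
                    // Count failures instead of silently swallowing them.
                    dropped.incrementAndGet();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Flushes everything queued so far, then stops the writer thread. */
    @Override
    public void close() {
        try {
            queue.put(STOP);   // waits for room, so earlier entries are kept
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller would construct it with a capacity and a sink, for example new QueuingAuditLogger(1024, line -> auditLog.info(line)), where auditLog stands in for whatever logger is already in use. Entries also pick up a slight timestamp offset if the sink stamps them at write time rather than at enqueue time.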
                
> Provide alternate queuing audit logger to reduce logging contention
> -------------------------------------------------------------------
>
>                 Key: HDFS-5241
>                 URL: https://issues.apache.org/jira/browse/HDFS-5241
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-5241.patch
>
>
> The default audit logger has extremely poor performance.  The internal synchronization
> of log4j causes massive contention between the call handlers (100 by default) which drastically
> limits the throughput of the NN.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
