flume-user mailing list archives

From Eric Sammer <esam...@cloudera.com>
Subject Re: Log4J appender
Date Thu, 01 Dec 2011 18:58:13 GMT
On Wed, Nov 30, 2011 at 8:36 PM, Srinivasan Subramanian <
ssrini_vasan@hotmail.com> wrote:

>  Hi Eric
>
> Thanks for that.  I will look at integrating Log4J appender for flume for
> sure.  Couple of additional questions.
>
> 1. From a performance standpoint, does Log4J appender have any significant
> advantages over tailing the log file?
>

The log4j appender should be more reliable and safer to use than tail, as it
communicates directly via RPC with well-defined semantics. The tail source
has some issues with race conditions around quickly truncated files and
failure recovery. With respect to performance, they're probably close, but
it's hard to say. Tail requires disk I/O, which can be slow, but the log4j
appender uses Avro RPC, which isn't blazing fast either.
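
For illustration, the blocking retry behavior discussed earlier in this
thread (quoted below) can be sketched roughly as follows. This is not the
appender's actual code: RetrySketch, Sender, sendWithRetries and
worstCaseBlockMillis are hypothetical names, and the sketch assumes one
initial attempt followed by up to maxRetries retries, with a fixed pause
before each retry.

```java
// Sketch only: a synchronous send with bounded retries, standard library only.
// All names here are hypothetical stand-ins, not Flume's or log4j's API.
final class RetrySketch {

    /** Stand-in for the appender's RPC call to the agent. */
    interface Sender {
        void send(String message);
    }

    /**
     * Makes one initial attempt, then up to maxRetries retries, pausing
     * backoffMillis before each retry. Returns the number of attempts used.
     * Throws once the retries are exhausted, so the caller sees the failed
     * log call instead of silently losing the event.
     */
    static int sendWithRetries(Sender sender, String message,
                               int maxRetries, long backoffMillis) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
            if (attempt > 1) {
                pause(backoffMillis); // the calling thread blocks here
            }
            try {
                sender.send(message);
                return attempt;
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw new IllegalStateException(
                "giving up after " + maxRetries + " retries", last);
    }

    /** Time spent sleeping when every attempt fails (ignores RPC timeouts). */
    static long worstCaseBlockMillis(int maxRetries, long backoffMillis) {
        return maxRetries * backoffMillis;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```

With the defaults mentioned in the thread (10 retries, 1 second apart),
worstCaseBlockMillis(10, 1000) comes to 10000 ms, which is the per-call
delay an application would see while the agent is down.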


> 2. It would be ideal if the Log4J appender also let me attach some
> metadata that I need to use for output bucketing.  Any ideas how that can
> be achieved?
>

I don't believe there's any way to inject metadata into the event generated
by the appender. Someone did some work to make the log4j appender
understand the MDC / NDC stuff (that I know very little about) but I never
had time to review / integrate the patch, sadly. You should just take a
look at the appender source; it's really simple.
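
As a rough idea of what such a change might look like, the sketch below
copies a per-thread context map (the kind of key/value data an application
would set with log4j's MDC.put) into attributes on the outgoing event, and
then picks a bucket from one of them. It uses only the standard library so
that it stands alone; MetadataSketch, Event, toEvent and bucketFor are
hypothetical names, not part of Flume or log4j, and this is not the
unreviewed patch mentioned above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: carrying logging-context metadata along with an event.
final class MetadataSketch {

    /** Hypothetical event: a message body plus string attributes. */
    static final class Event {
        final String body;
        final Map<String, String> attributes;

        Event(String body, Map<String, String> attributes) {
            this.body = body;
            this.attributes =
                    Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
        }
    }

    /**
     * Builds an event from a rendered log message and a context map.
     * Entries with a null key or value are skipped; other values are
     * converted to strings.
     */
    static Event toEvent(String renderedMessage, Map<String, Object> context) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : context.entrySet()) {
            if (e.getKey() != null && e.getValue() != null) {
                attrs.put(e.getKey(), String.valueOf(e.getValue()));
            }
        }
        return new Event(renderedMessage, attrs);
    }

    /** Example bucketing rule: use one attribute, or a fallback if absent. */
    static String bucketFor(Event event, String key, String fallback) {
        return event.attributes.getOrDefault(key, fallback);
    }
}
```

In a real appender the context map would come from the logging event
rather than being passed in by hand, and the attribute names used for
bucketing would be whatever the application and the output configuration
agree on.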


>
> Regards
> Srini
>
>
>
>
> ------------------------------
> Date: Wed, 30 Nov 2011 10:28:41 -0800
> Subject: Re: Log4J appender
> From: esammer@cloudera.com
> To: flume-user@incubator.apache.org
>
>
> Srini:
>
> On Wed, Nov 30, 2011 at 12:23 AM, Srinivasan Subramanian <
> ssrini_vasan@hotmail.com> wrote:
>
>  I was evaluating the log4j appender provided with Flume.  But there is
> one aspect I don't understand:
>
> The log4j appender makes a connection to the flume-agent and retries a
> maximum of 10 times (default - configurable) if the connection is not made
> successfully.
>
> Questions:
>
> 1. When will the connection fail?  If the agent is not running on the
> node?  In that case given that the default implementation waits for 1
> second before each retry for a total of 10 retries, would this mean that
> each logging call from the application would be delayed by 10 seconds?
>  That would affect performance right?
>
>
> Almost certainly, yes, assuming log4j is synchronous (I'm 99.9% sure it
> is). Of course, synchronous logging is the only way to guarantee event
> delivery in this context; if the application were to log the event and move
> on without waiting for a response, an event could get dropped and no one
> would be responsible for retrying the send.
>
>
>
> 2. What happens to the log message when the agent is not available?  Is it
> lost?
>
>
> If the log4j appender runs out of retries I believe I wrote it to throw an
> exception. This would be the equivalent of using a standard file appender
> and running out of disk space. In other words, the log call failed and
> should be handled by the application.
>
> Let me know if you have any other questions!
>
>
> I am a little confused with the implementation and any help in explaining
> this is appreciated.
>
> Regards
> Srini
>
>
>
>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>



-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com
