logging-log4j-user mailing list archives

From Paul Smith <psm...@aconex.com>
Subject Re: network logging performance
Date Fri, 05 Sep 2008 02:44:26 GMT
[Oops, I originally sent this reply directly to Johannes.  Copying the  
list again.]

On 04/09/2008, at 4:19 PM, Johannes Gutleber wrote:

> Dear Paul,
>
> On Sep 4, 2008, at 03:31 , Paul Smith wrote:
>
>> If anyone out there is using "Logging over the network" of any form  
>> (socket, JMS, Multicast, syslog appenders etc), this topic is for  
>> you. I'm wondering whether people could comment on their experience  
>> setting up high performance logging of application data over the  
>> network.  The cost of shipping an event over the wire compared with  
>> writing it to a local file is obviously higher, and so one can  
>> notice user-visible impact if the logging volume is high when using  
>> network-based logging (unless one wraps it in AsyncAppenders).
>
> In fact at CERN we have been working for about 2 years now with a  
> system of 2000 computers and 10'000 applications, in which we also  
> deploy logging over the network using a
> combination of log4cplus and log4j using the XML data format.
>

Fantastic.  Awesome to hear about this sort of environment.  I really  
appreciate you taking the time to respond in depth here.

>
> The experience with the tools at hand is rather disappointing in  
> that the problem is not the log emitter, but the task of log  
> collection. For that purpose we use a Java/Tomcat-based log  
> collector that
> has the capability to forward the collected logs to multiple  
> outputs: SocketHub appender, OracleDB, JMS, etc. The collector was  
> supposed to implement filtering and prioritizing capabilities, but
> those have practically all been dropped since they just added to the  
> performance bottleneck instead of avoiding it (we experienced that  
> Java is simply not performant enough to do this job
> for a size of system that we have). People went on about a  
> hierarchical layout of log collectors, but that would not resolve  
> the bottlenecks.
>

Are you able to articulate where you thought the bottleneck within the  
Java process was?  What JRE were you using?  Was it GC overhead?  Was  
it log4j itself, or the custom-built collector that was the problem?


> To that log collector we attach JMS subscribers as well as Chainsaw  
> log viewers for online viewing.
>
> Both the log-collector and the Chainsaw clients fail frequently due  
> to performance problems of Java and extensive memory usage.
>

Which version of Chainsaw did you use? Was it the one inbuilt into  
log4j, or the newer Chainsaw v2?  The latter uses a cyclic buffer so  
as not to consume too much memory.  I do agree though that even  
Chainsaw v2 is far from perfect.. :)
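
For anyone unfamiliar with the idea, here is a rough sketch of what a  
cyclic buffer does.  This is not Chainsaw's actual code; the class and  
method names (CyclicEventBuffer, add, snapshot) are made up for  
illustration.  The point is only that memory stays bounded because the  
oldest event is overwritten once the buffer is full.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical fixed-capacity ring buffer: once full, each add overwrites the oldest entry. */
class CyclicEventBuffer<E> {
    private final Object[] slots;
    private int next = 0; // index where the next event will be written
    private int size = 0; // number of events currently held

    CyclicEventBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new Object[capacity];
    }

    /** Stores an event, silently discarding the oldest one when the buffer is full. */
    synchronized void add(E event) {
        slots[next] = event;
        next = (next + 1) % slots.length;
        if (size < slots.length) {
            size++;
        }
    }

    /** Returns the retained events, oldest first. */
    @SuppressWarnings("unchecked")
    synchronized List<E> snapshot() {
        List<E> out = new ArrayList<>(size);
        int start = (next - size + slots.length) % slots.length;
        for (int i = 0; i < size; i++) {
            out.add((E) slots[(start + i) % slots.length]);
        }
        return out;
    }
}
```

With a capacity of 3, adding "a", "b", "c", "d" leaves "b", "c", "d" in  
the buffer: the viewer never holds more than 3 events, at the cost of  
losing the oldest ones.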

>
> So finally after about 2.5 years we went back to writing to local  
> files and collecting/inspecting them when needed. We replaced on-line  
> logging with an error propagation system
> based on an XML protocol and a WS-Eventing publisher/subscriber. All  
> implemented in C++, no Java anywhere. A testing phase of 2 months  
> has proven successful and we
> have been running the system for 1 month without major problems.
>
> Of course my mail reflects the situation of a very peculiar, large  
> distributed soft real-time system that is probably not so  
> common in other fields.
> In addition we wanted to use on-line log viewing and analysis, which  
> is also not the common case in other domains.


My vision, ridiculous as this may sound, is that log4j should be able  
to support the new cloud environments.  Imagine the Hadoop clusters  
that are out there, and making sense of wtf happened during the  
dissemination of the MapReduce portions, with thousands of host nodes  
logging and, somehow, an engineer able to 'see' what happened on his  
job centrally (even if it's not real time).

With a Many->1 collector, the collector host is going to have to be  
one grunty box....

Pinpoint's current design uses ActiveMQ internally.  The Receivers  
inside accept the remote event and place it on the local log4j bus  
(just like it was placed on the remote bus that went to the network  
appender).  Internally a local 'appender' routes the event to an  
internal JMS topic, with ActiveMQ buffering the events to a local  
persistent store temporarily.  A single topic listener then tries to  
chew through the received events to index them (indexing is always  
going to be expensive, hence a producer/consumer pattern).  I am  
hoping this temporary JMS buffer will allow events to be received  
much faster than they can be consumed, which is sort of what JMS (and  
ActiveMQ) is designed to handle, although it'll be interesting to see  
how it translates into practice.  The workplace that is driving me to  
develop Pinpoint has nowhere near the load environment that yours  
has, though.
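
To make the producer/consumer shape concrete, here is a minimal sketch.  
It is not Pinpoint's code: an in-memory LinkedBlockingQueue stands in  
for the ActiveMQ-backed JMS topic (so nothing here is persistent), the  
"indexing" is just a lower-casing placeholder, and the names  
(BufferedIndexerSketch, run, STOP) are hypothetical.  It only shows  
receivers handing events to a buffer while a single listener drains it  
at its own pace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: receivers enqueue events, one listener thread indexes them. */
class BufferedIndexerSketch {
    // Identity sentinel telling the listener that no more events will arrive.
    private static final String STOP = new String("STOP");

    /** Feeds the events through the buffer and returns them in indexed order. */
    static List<String> run(List<String> events) {
        // Assumption: an unbounded in-memory queue replaces the JMS topic.
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
        List<String> indexed = new ArrayList<>();

        // The single "topic listener": takes one event at a time and indexes it.
        Thread indexer = new Thread(() -> {
            try {
                for (String e = buffer.take(); e != STOP; e = buffer.take()) {
                    indexed.add(e.toLowerCase()); // placeholder for expensive indexing
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        indexer.start();

        try {
            // The "receiver" side: enqueueing is cheap and never waits on indexing.
            for (String e : events) {
                buffer.put(e);
            }
            buffer.put(STOP);
            indexer.join(); // join makes the listener's writes visible to this thread
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while buffering events", ex);
        }
        return indexed;
    }
}
```

The receiving side returns as soon as the event is queued, so a burst  
of events costs buffer space rather than receiver latency.  Whether  
that holds up with a persistent store under real load is exactly the  
open question.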

I'd love to hear some actual logging numbers, that is, how many events  
(log lines would do) that each host on average is producing and their  
combined total (even a peak load for a given hour or something would  
be interesting).  Your environment really is exactly what I had in  
mind to support with Pinpoint, and I'm very much focussed on making it  
as non-intrusive as possible.

cheers,

Paul Smith

