cocoon-dev mailing list archives

From Berin Loritsch <blorit...@apache.org>
Subject Re: log analysis
Date Tue, 26 Feb 2002 15:36:55 GMT
Bert Van Kets wrote:
> With all the tools that are built into cocoon lately, is it possible to 
> generate XML from web server log files?  I'd like to parse log files 
> from Apache and IIS.
> How would I do this?
> Bert


Personally, I find that Perl is the language of choice for log munging.
You are going to find that it is the most efficient language for the
job.

Here is what you will discover:

1) Apache/IIS logs get really long really quickly.
2) It will take approx. 30-120 seconds to process a long log file.
3) It should only be done periodically (e.g. as a cron job).  This may
    open security holes, so be careful.  Some gray hats I know strongly
    advise against using the cron daemon.  Discuss it with the admin of
    the machine.
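
As for the XML part of the question, here is a rough sketch of the kind
of batch step a periodic job could run.  It is in Python rather than
Perl, and it assumes the Apache Common Log Format; IIS logs use a
different layout and would need their own pattern.  The names CLF and
log_to_xml, the element names, and the sample line are all made up for
illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed layout: Apache Common Log Format
#   host ident user [time] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+)'
)

def log_to_xml(lines):
    """Build a <log> element with one <entry> child per parseable line."""
    root = ET.Element("log")
    for line in lines:
        m = CLF.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed format
        entry = ET.SubElement(root, "entry")
        for name, value in m.groupdict().items():
            ET.SubElement(entry, name).text = value
    return root

# Hypothetical sample input: one valid line and one junk line.
sample = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326',
    'not a log line',
]
xml_text = ET.tostring(log_to_xml(sample), encoding="unicode")
```

Building the whole tree in memory is fine for a sketch, but with logs
as long as described above you would probably want to write entries out
as you go.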

I assisted a friend of mine with a Perl program to munge the logs from
Squid to see where students were trying to go on the web.  The reports
were broken down into one-month periods.  He had one script to break
the log file apart based on a begin and end date, and another to do the
actual munging.
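
The two-step split described above might look roughly like this.  It is
a sketch, not my friend's scripts: it is in Python, it assumes
Apache-style bracketed timestamps (the pattern would need adapting for
Squid's own log format), and slice_by_date, count_by_client, and the
sample lines are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: each line carries a timestamp like [26/Feb/2002:15:36:55 +0000].
# The timezone offset is ignored, and %b expects English month names.
STAMP = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2})')

def slice_by_date(lines, begin, end):
    """Step 1: yield only lines whose timestamp falls in [begin, end)."""
    for line in lines:
        m = STAMP.search(line)
        if m is None:
            continue  # no recognizable timestamp, so drop the line
        when = datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S")
        if begin <= when < end:
            yield line

def count_by_client(lines):
    """Step 2: count requests per client (first field of each line)."""
    return Counter(line.split(None, 1)[0] for line in lines if line.strip())

# Hypothetical sample input covering two months.
lines = [
    '10.0.0.1 - - [15/Jan/2002:08:00:00 +0000] "GET /a HTTP/1.0" 200 10',
    '10.0.0.2 - - [20/Feb/2002:09:30:00 +0000] "GET /b HTTP/1.0" 200 20',
    '10.0.0.1 - - [25/Feb/2002:10:00:00 +0000] "GET /c HTTP/1.0" 404 0',
]
february = list(slice_by_date(lines, datetime(2002, 2, 1), datetime(2002, 3, 1)))
counts = count_by_client(february)
```

Keeping the date slicing separate from the counting means the munging
step only ever sees one report period's worth of lines.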

The logs were used to find any inappropriate sites that the proxy wasn't
already blocking.  It's amazing how popular Britney Spears is with
teenagers.  The logs also traced each proxy request back to the machine
making it.  That way the teachers were able to help kids with certain
"problems".  (Considering it was a church educating the students it was
entirely appropriate--no comments from bleeding heart liberals please).

Hope this helps.

-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin



