httpd-dev mailing list archives

From Randy Terbush <ra...@zyzzyva.com>
Subject Re: namelookups and databases
Date Wed, 26 Jul 1995 02:06:44 GMT
>  
> > HaHa! <-(PeeWee Herman laugh)
> > 
> > I've been attempting to shove my log data into Postgres and am
> > coming to a sobering realization. It has taken 9 hours to process
> > 15,000 requests.....  
> 
> What are you running it on, an 8086 PC?

With a Z80 chip....

:-) Seriously, 486/66.

> There's something really wrong if it is taking that long.
> Is it reindexing on each entry ?.. that'd be a braindead approach.

Agreed.  I need to get a bit more intimate with the API for
Postgres (assuming I stick with it).  I am reading a line from
the log, formatting it, doing a name lookup, and sending an
INSERT query to the database.  It would be nice to figure out
how to lock the database, shove all of the INSERTs in, and
re-index.
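
Something along these lines is what I'm picturing, as a rough
sketch only (this assumes the DBI/DBD::Pg Perl interface and a
made-up access_log table; the column names and date handling are
purely illustrative):

    #!/usr/bin/perl -w
    use strict;
    use DBI;

    # AutoCommit off: the whole load becomes one transaction
    # instead of one transaction per INSERT.
    my $dbh = DBI->connect("dbi:Pg:dbname=httplog", "", "",
                           { AutoCommit => 0, RaiseError => 1 });

    # Prepare once, execute per log line.
    my $sth = $dbh->prepare(
        "INSERT INTO access_log (host, logtime, request, status, bytes)
         VALUES (?, ?, ?, ?, ?)");

    open(LOG, "access_log") or die "can't open access_log: $!";
    while (<LOG>) {
        # Common Log Format: host ident authuser [date] "request" status bytes
        next unless m/^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d+) (\S+)/;
        my ($host, $date, $req, $status, $bytes) = ($1, $2, $3, $4, $5);
        $bytes = 0 if $bytes eq '-';
        $sth->execute($host, $date, $req, $status, $bytes);
    }
    close(LOG);

    $dbh->commit;      # one commit for the whole batch
    $dbh->disconnect;

The point being one commit (and one re-index) at the end instead
of per-row work.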

As I mentioned in this mail, eliminating the lookup reduced it
to 1 hour (not 2).  I have not determined whether gethostbyaddr()
caches the lookup with the local nameserver or not.  Anyone
know?  If not, I would be wise to include a simple cache in
my perl program.
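
If it doesn't cache, something like the following in the perl
program would at least keep me from resolving the same address
twice in one run (a sketch; cached_lookup and %name_cache are
names I'm inventing here):

    use Socket;            # for inet_aton() and AF_INET

    # Cache of address -> hostname so each address is resolved at
    # most once per run; failures are cached too, so dead addresses
    # aren't retried on every hit.
    my %name_cache;

    sub cached_lookup {
        my ($ip) = @_;
        unless (exists $name_cache{$ip}) {
            my $name = gethostbyaddr(inet_aton($ip), AF_INET);
            $name_cache{$ip} = defined($name) ? $name : $ip;
        }
        return $name_cache{$ip};
    }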

> It shouldn't take more than a couple of minutes of perl time to
> swallow a *100,000* request access log and be ready to do some neat
> tricks with it.

Reading in the log file is not the issue.  Looking up the names is
the real bottleneck.  As Brian suggests in a later reply, having
Apache do the lookup might be wise, but I want to minimize the
load on the HTTP server in anticipation of more traffic.

RST says he is most CPU-poor... NOT!  My server is running on
a Sparc 1+....  This is the one that is currently handling about
10,000 requests/day.

> What exactly are you doing with the data ?

I want to create an accounting system that can be easily
queried for bytes transferred, with specifics for servername and
URL.  I also want a more space-efficient way of storing this
data.  I can't even imagine what a site that is getting 100,000
requests per day, let alone 500,000, is doing with this data.
Log to /dev/null?
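
The kind of query I have in mind looks roughly like this, again
sketched with DBI against the hypothetical table from above (and
assuming servername and url columns get added to it):

    # Bytes transferred per servername and URL, using the same DBI
    # handle as in the loading sketch.
    my $report = $dbh->prepare(
        "SELECT servername, url, SUM(bytes) AS total_bytes
           FROM access_log
          GROUP BY servername, url
          ORDER BY total_bytes DESC");
    $report->execute;
    while (my ($server, $url, $total) = $report->fetchrow_array) {
        printf "%-25s %-40s %12d\n", $server, $url, $total;
    }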

-Randy



