lucene-solr-user mailing list archives

From Jon Baer <jonb...@gmail.com>
Subject Re: Solr configuration to enable indexing/searching webapp log files
Date Thu, 29 Apr 2010 17:58:30 GMT
Good question, +1 on finding an answer. My take:

Depending on how large the log files are, you might be better off doing this
with HDFS / Hadoop (and a scripting language like Pig), or with Amazon EMR:

http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873

Theoretically you could split the logs into fields with a DataImportHandler setup,
using something like LineEntityProcessor to read the files line by line, and then
search / sort on those fields.

http://wiki.apache.org/solr/DataImportHandler#LineEntityProcessor
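
To make the "split the logs into fields" step concrete, here is a rough Python sketch. It is only an illustration and makes several assumptions: that the logs are in Apache/NCSA Common Log Format, that the month names are English, and that the field names (host, time, method, path, status, size) are my own hypothetical choices rather than anything Solr requires. It does not talk to Solr at all; it only shows the kind of per-line parsing you would need before indexing, plus the two counts asked about in the original question (requests per hour and most requested pages).

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: Apache/NCSA Common Log Format. Adjust the pattern for other layouts.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Split one log line into named fields; return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Timestamps look like 29/Apr/2010:17:58:30 +0000
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


def summarize(lines):
    """Count requests per hour of day and per requested path."""
    per_hour = Counter()
    per_path = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue  # skip lines that are not in the assumed format
        per_hour[fields["time"].hour] += 1
        per_path[fields["path"]] += 1
    return per_hour, per_path
```

Calling summarize() on an open log file and then per_hour.most_common(1) or per_path.most_common(10) gives the peak hour and the top pages. Each dict returned by parse_line() is also roughly the shape of document you would index, one per log line.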

I tried using Solr as a log analytics tool (before DataImportHandler existed) and found
it was neither practical nor worth the disk space, but I'd love to hear otherwise. In
general you could flush daily logs to an index, but if you need to work with the data in
another context, HDFS seems like the better fit (I think).

- Jon

On Apr 29, 2010, at 1:46 PM, Stefan Maric wrote:

> 
> I thought I remembered seeing some information about this, but have been
> unable to find it.
> 
> Does anyone know if there is a configuration / module that would allow us to
> set up Solr to take in the (large) log files generated by our web/app
> servers, so that we can query for things like peak-time requests or the most
> frequently requested web page, etc.?
> 
> Thanks
> Stefan Maric
> 

