lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: Lucene Challenge - sum, count, avg, etc.
Date Thu, 01 Apr 2010 15:46:03 GMT
Have you looked at Solr's StatsComponent?

On Mar 31, 2010, at 9:17 PM, Michel Nadeau wrote:

> Hi,
> 
> We're currently in the process of switching many of our screens from MySQL
> to Lucene because MySQL simply dies because we have too much data and it's
> becoming too long to generate the stats we need.
> 
> So here's one MySQL query that we use to find out our Top 10 Affiliates :
> 
> SELECT SUM(sale_amount) AS total_sales, affialiate_id FROM sales WHERE
> sale_date>='2010-03-01' AND sale_date<='2010-03-31' GROUP BY affialiate_id
> ORDER BY total_sales DESC LIMIT 10;
> 
> We currently have our "sales" index, containing all sales and all fields -
> and it's one big index (over 10M records). We could fetch all documents
> within the date range, loop them and add up the total_sales, but it would be
> just crazy to do this all the time (we have a high volume of search).
> 
> We made several tests with Solr (Facets, and even the beta CollapseFields),
> but nothing is really helping us. We could pre-generate the total_sales for
> all possible date ranges... but that would be quite crazy too as the date
> range possibilities quickly become endless.
> 
> So - is there any known way to efficiently do SUM(), COUNT() (and even AVG()
> ) using Lucene/Solr/others? I also checked Bobo Browse but it doesn't seem
> to offer what I need either.
> 
> Thanks for any hints!!!
> 
> - Mike
> akaris@gmail.com

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message