lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Lu <chris...@gmail.com>
Subject Re: Lucene Challenge - sum, count, avg, etc.
Date Thu, 01 Apr 2010 17:30:30 GMT
Hi, Michel,

You can use DBSight free version to test it out.
However, it's a whole solution since you will need to configure it 
first. Like specifying which column you want to do the counting before 
the actual search.

BTW: DBSight also support MIN and MAX, in addition to SUM,AVG.

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro
funding!



Michel Nadeau wrote:
> @Ken: yeah we thought about it - but we have a HUGE amount of data (sales,
> affiliates, etc.) - so pre-calculating everything isn't really an option.
> Plus I don't know how we would sort.. let's say I get the totals for
> affiliate X, loop totals from day 1 to X (range), sum up, great: I can do
> this for all affiliates and have the totals, but how will I sort by that
> total?
>
> @Chris: we don't want to switch to a whole new platform - is it possible to
> use that DBSight module only? It doesn't seem to be an opensource project so
> I can't really consider it.
>
> - Mike
> akaris@gmail.com
>
>
> On Thu, Apr 1, 2010 at 5:00 AM, Chris Lu <chris.lu@gmail.com> wrote:
>
>   
>> Hi, Michel,
>>
>> This has already been implemented in DBSight. Check it out!
>> http://www.dbsight.net
>>
>> You can get sum, avg for Facet searches. And count is included in Facet
>> search directly.
>>
>> --
>> Chris Lu
>> -------------------------
>> Instant Scalable Full-Text Search On Any Database/Application
>> site: http://www.dbsight.net
>> demo: http://search.dbsight.com
>> Lucene Database Search in 3 minutes:
>> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>> DBSight customer, a shopping comparison site, (anonymous per request) got
>> 2.6 Million Euro funding!
>>
>>
>>
>>
>> Michel Nadeau wrote:
>>
>>     
>>> Hi,
>>>
>>> We're currently in the process of switching many of our screens from MySQL
>>> to Lucene because MySQL simply dies because we have too much data and it's
>>> becoming too long to generate the stats we need.
>>>
>>> So here's one MySQL query that we use to find out our Top 10 Affiliates :
>>>
>>> SELECT SUM(sale_amount) AS total_sales, affialiate_id FROM sales WHERE
>>> sale_date>='2010-03-01' AND sale_date<='2010-03-31' GROUP BY affialiate_id
>>> ORDER BY total_sales DESC LIMIT 10;
>>>
>>> We currently have our "sales" index, containing all sales and all fields -
>>> and it's one big index (over 10M records). We could fetch all documents
>>> within the date range, loop them and add up the total_sales, but it would
>>> be
>>> just crazy to do this all the time (we have a high volume of search).
>>>
>>> We made several tests with Solr (Facets, and even the beta
>>> CollapseFields),
>>> but nothing is really helping us. We could pre-generate the total_sales
>>> for
>>> all possible date ranges... but that would be quite crazy too as the date
>>> range possibilities quickly become endless.
>>>
>>> So - is there any known way to efficiently do SUM(), COUNT() (and even
>>> AVG()
>>> ) using Lucene/Solr/others? I also checked Bobo Browse but it doesn't seem
>>> to offer what I need either.
>>>
>>> Thanks for any hints!!!
>>>
>>> - Mike
>>> akaris@gmail.com
>>>
>>>
>>>
>>>       
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>     
>
>   

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message