lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michel Nadeau <aka...@gmail.com>
Subject Re: Lucene Challenge - sum, count, avg, etc.
Date Thu, 01 Apr 2010 13:46:04 GMT
Hi,

Here's an example of raw data that would be in my Sales index:

*Affiliate / SaleDate / SaleAmount*
* mike / 2010-03-01 / 10.00
* john / 2010-03-01 / 10.00
* mike / 2010-03-02 / 15.00
* john / 2010-03-02 /  5.00
* mike / 2010-03-03 / 20.00
* john / 2010-03-03 /  1.00
* mike / 2010-03-04 / 10.00
* john / 2010-03-04 / 10.00
* mike / 2010-03-05 / 15.00
* john / 2010-03-05 /  5.00
* mike / 2010-03-06 / 20.00
* john / 2010-03-06 /  1.00

So our 2 affiliates mike and john made 6 sales each between 2010-03-01 and
2010-03-06. My ultimate query should return this for a query between
2010-03-01 and 2010-03-06 :

*Affiliate / TotalSales*
* mike / 90.00
* john / 32.00

So it's exactly like -

SELECT Affiliate, sum(SaleAmount) as TotalSales FROM Sales
  WHERE SaleDate >= '2010-03-01' AND SaleDate <= '2010-03-06'
  GROUP BY Affiliate
  ORDER BY TotalSales DESC;

- Mike
akaris@gmail.com


On Thu, Apr 1, 2010 at 8:11 AM, prasenjit mukherjee <prasen.bea@gmail.com>wrote:

> Not sure what you mean by "joining" in lucene , since conceptually
> there is only 1 table ( with many field aka columns ) in lucene. A
> representative query would be good to know the use case.
>
> Again didn't get the "sorting" part.  SUM() will return only 1
> aggregated value, so what do you want to sort it on ?
>
> -Prasen
>
> On Thu, Apr 1, 2010 at 7:44 AM, Michel Nadeau <akaris@gmail.com> wrote:
> > Are you planning to be able to sort by these SUMs? A SpanQuery would work
> > great to get the integers... then you would loop and sum up... but what
> > about "joining" with your other data and sorting?
> >
> > - Mike
> > akaris@gmail.com
> >
> >
> > On Wed, Mar 31, 2010 at 9:23 PM, prasenjit mukherjee
> > <prasen.bea@gmail.com>wrote:
> >
> >> I too am trying to achieve something.
> >>
> >> I am thinking of storing the integer values in  payloads and then
> >> using spanquery classes to compute the respective SUMs
> >>
> >> -Prasen
> >>
> >> On Thu, Apr 1, 2010 at 6:47 AM, Michel Nadeau <akaris@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > We're currently in the process of switching many of our screens from
> >> MySQL
> >> > to Lucene because MySQL simply dies because we have too much data and
> >> it's
> >> > becoming too long to generate the stats we need.
> >> >
> >> > So here's one MySQL query that we use to find out our Top 10
> Affiliates :
> >> >
> >> > SELECT SUM(sale_amount) AS total_sales, affialiate_id FROM sales WHERE
> >> > sale_date>='2010-03-01' AND sale_date<='2010-03-31' GROUP BY
> >> affialiate_id
> >> > ORDER BY total_sales DESC LIMIT 10;
> >> >
> >> > We currently have our "sales" index, containing all sales and all
> fields
> >> -
> >> > and it's one big index (over 10M records). We could fetch all
> documents
> >> > within the date range, loop them and add up the total_sales, but it
> would
> >> be
> >> > just crazy to do this all the time (we have a high volume of search).
> >> >
> >> > We made several tests with Solr (Facets, and even the beta
> >> CollapseFields),
> >> > but nothing is really helping us. We could pre-generate the
> total_sales
> >> for
> >> > all possible date ranges... but that would be quite crazy too as the
> date
> >> > range possibilities quickly become endless.
> >> >
> >> > So - is there any known way to efficiently do SUM(), COUNT() (and even
> >> AVG()
> >> > ) using Lucene/Solr/others? I also checked Bobo Browse but it doesn't
> >> seem
> >> > to offer what I need either.
> >> >
> >> > Thanks for any hints!!!
> >> >
> >> > - Mike
> >> > akaris@gmail.com
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message