Return-Path: Delivered-To: apmail-lucene-java-user-archive@www.apache.org Received: (qmail 6636 invoked from network); 1 Apr 2010 15:46:38 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 1 Apr 2010 15:46:38 -0000 Received: (qmail 55552 invoked by uid 500); 1 Apr 2010 15:46:35 -0000 Delivered-To: apmail-lucene-java-user-archive@lucene.apache.org Received: (qmail 55521 invoked by uid 500); 1 Apr 2010 15:46:35 -0000 Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: java-user@lucene.apache.org Delivered-To: mailing list java-user@lucene.apache.org Received: (qmail 55513 invoked by uid 99); 1 Apr 2010 15:46:35 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 01 Apr 2010 15:46:35 +0000 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests=FREEMAIL_FROM,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of gsiasf@gmail.com designates 74.125.83.176 as permitted sender) Received: from [74.125.83.176] (HELO mail-pv0-f176.google.com) (74.125.83.176) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 01 Apr 2010 15:46:27 +0000 Received: by pvg7 with SMTP id 7so477621pvg.35 for ; Thu, 01 Apr 2010 08:46:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:content-type :mime-version:subject:from:in-reply-to:date :content-transfer-encoding:message-id:references:to:x-mailer; bh=FhFkYgK+ve6ypCYlyapZ08NYPN59SR4TMJrrk5G8C8Q=; b=xOJ/ZUQCZKhEF5Jv+gaKTwCvzf7PIE2B8Obp586jH1ePKjhVNjZKIKZg87NlVqnQJe tSJgRFg/c7rn7fkdDRM9TdXgn1tdlccyNm7OFAC09zcyOAzBW2dK1vEs2UTA08ANrlQk e5uLA0sjDWoowapTTXoH8+m9mWNKH98m/lHJ8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:content-type:mime-version:subject:from:in-reply-to:date :content-transfer-encoding:message-id:references:to:x-mailer; b=mGi2VNQg1BcF/Uk0jyMFsRsLiqCYPSPcrL1J0aYVucCLlxOedOfNihlVDp9pS7tD5p AVrMisUmjo7mUlEFmKThfyenjzSjjh/di/5wUo8oaBBUH32/IqV1LMyS2LBDSmZXhQ19 1XUy7Gvznux/1p/sl088RUw0tAFWBE1+pip0o= Received: by 10.142.1.40 with SMTP id 40mr317708wfa.229.1270136766188; Thu, 01 Apr 2010 08:46:06 -0700 (PDT) Received: from [10.0.0.77] (adsl-065-013-152-164.sip.rdu.bellsouth.net [65.13.152.164]) by mx.google.com with ESMTPS id cm22sm8392154ibb.11.2010.04.01.08.46.04 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 01 Apr 2010 08:46:05 -0700 (PDT) Sender: Grant Ingersoll Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Apple Message framework v1078) Subject: Re: Lucene Challenge - sum, count, avg, etc. From: Grant Ingersoll In-Reply-To: Date: Thu, 1 Apr 2010 11:46:03 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: References: To: java-user@lucene.apache.org X-Mailer: Apple Mail (2.1078) X-Virus-Checked: Checked by ClamAV on apache.org Have you looked at Solr's StatsComponent? On Mar 31, 2010, at 9:17 PM, Michel Nadeau wrote: > Hi, >=20 > We're currently in the process of switching many of our screens from = MySQL > to Lucene because MySQL simply dies because we have too much data and = it's > becoming too long to generate the stats we need. >=20 > So here's one MySQL query that we use to find out our Top 10 = Affiliates : >=20 > SELECT SUM(sale_amount) AS total_sales, affialiate_id FROM sales WHERE > sale_date>=3D'2010-03-01' AND sale_date<=3D'2010-03-31' GROUP BY = affialiate_id > ORDER BY total_sales DESC LIMIT 10; >=20 > We currently have our "sales" index, containing all sales and all = fields - > and it's one big index (over 10M records). We could fetch all = documents > within the date range, loop them and add up the total_sales, but it = would be > just crazy to do this all the time (we have a high volume of search). >=20 > We made several tests with Solr (Facets, and even the beta = CollapseFields), > but nothing is really helping us. We could pre-generate the = total_sales for > all possible date ranges... but that would be quite crazy too as the = date > range possibilities quickly become endless. >=20 > So - is there any known way to efficiently do SUM(), COUNT() (and even = AVG() > ) using Lucene/Solr/others? I also checked Bobo Browse but it doesn't = seem > to offer what I need either. >=20 > Thanks for any hints!!! >=20 > - Mike > akaris@gmail.com -------------------------- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: = http://www.lucidimagination.com/search --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org For additional commands, e-mail: java-user-help@lucene.apache.org