From: Erick Erickson
To: java-user@lucene.apache.org
Date: Thu, 5 Jan 2012 19:54:31 -0500
Subject: Re: frequent keyword computation within a search ( and timeinterval )

Hmmm, guess you're right, the stats component does return that data.
It's been a long day...

I still question whether this is a *good* use of Solr, though. I'd
re-examine my approach whenever I found myself trying to translate
SQL queries into Solr. But if, after that examination, I still
required SUM, stats would do it.

Erick

On Thu, Jan 5, 2012 at 7:23 PM, Jason Rutherglen wrote:
>> Short answer is that no, there isn't an aggregate
>> function. And you shouldn't even try
>
> If that is the case why does a 'stats' component exist for Solr with
> the SUM function built in?
>
> http://wiki.apache.org/solr/StatsComponent
>
> On Thu, Jan 5, 2012 at 1:37 PM, Erick Erickson wrote:
>> You will encounter endless grief until you stop
>> thinking of Solr/Lucene as a replacement for
>> an RDBMS. It is a *text search engine*.
>> Whenever you start asking "how do I implement
>> a SQL statement in Solr", you have to stop
>> and reconsider *why* you are trying to do that.
>> Then recast the question in terms of searching.
>>
>> Short answer is that no, there isn't an aggregate
>> function. And you shouldn't even try.
>>
>> Best
>> Erick
>>
>> On Thu, Jan 5, 2012 at 12:53 PM, prasenjit mukherjee
>> wrote:
>>> Thanks Erick for the response.
>>>
>>> Will Lucene/Solr provide me aggregations (of field values) satisfying
>>> a query criterion? e.g. SELECT SUM(price) WHERE item=fruits
>>>
>>> Or do I need to use a HitCollector to achieve that?
>>>
>>> Any sample Solr/Lucene query to compute aggregates (like SUM) would be great.
>>>
>>> -Thanks,
>>> Prasenjit
>>>
>>> On Thu, Jan 5, 2012 at 7:10 PM, Erick Erickson wrote:
>>>> the time interval is just a RangeQuery in the Lucene
>>>> world. The rest is pretty standard search stuff.
>>>>
>>>> You probably want to have a look at the NRT
>>>> (near real time) stuff in trunk.
>>>>
>>>> Your reads/writes are pretty high, so you'll need
>>>> some experimentation to size your site
>>>> correctly.
>>>>
>>>> Best
>>>> Erick
>>>>
>>>> On Wed, Jan 4, 2012 at 12:17 AM, prasenjit mukherjee
>>>> wrote:
>>>>> I have a requirement where reads and writes are quite high (at 100-500
>>>>> per sec). A document has the following fields: timestamp,
>>>>> unique-docid, content-text, keyword. Average content-text length is
>>>>> ~20 bytes, and there is only 1 keyword for a given docid.
>>>>>
>>>>> At runtime, given a query-term (which could be null) and a
>>>>> time-interval, I need to find the top-k most frequent keywords among
>>>>> documents that contain the query-term (skipped if it's null) in their
>>>>> content-text field within that time-interval. I can purge the data
>>>>> every day, hence no need for me to have more than a day's data.
>>>>>
>>>>> I have quite a few options here: MySQL, NoSQL stores (Cassandra,
>>>>> Mongo, Couch, Riak, Redis), and search-engine based (Lucene/Solr),
>>>>> each having its own pros/cons.
>>>>>
>>>>> In MySQL we can achieve this via a GROUP BY/COUNT clause.
>>>>> In NoSQL I can probably write a map/reduce task to query these
>>>>> numbers, although I am not very sure about the query response time.
>>>>> Not sure if we can achieve it via Lucene/Solr OOB.
>>>>>
>>>>> Any suggestions on what would be a good choice for this use case?
>>>>>
>>>>> -Thanks,
>>>>> prasenjit
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
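
For anyone reading this in the archive, here is a rough sketch of the two requests the thread converges on: a StatsComponent request for the SUM(price) example, and a faceted request for the top-k keywords inside a time interval (the RangeQuery remark, expressed as a Solr filter query). It only builds the request URLs and does not talk to a server. The host, port, and select path are assumptions, as are the field names price, item, and content_text (the original post calls the field content-text). The range syntax shown also assumes timestamp is indexed as a Solr date field.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and path are assumptions, adjust to your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def sum_request(query, field):
    """Build a StatsComponent request; the stats section of the response
    carries a sum (among other statistics) for `field` over the matches."""
    params = [
        ("q", query),
        ("rows", 0),             # only the statistics are wanted, not the documents
        ("stats", "true"),
        ("stats.field", field),
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def top_keywords_request(start, end, k, term=None):
    """Build a faceting request: the k most frequent `keyword` values among
    documents inside [start, end], optionally restricted to a query term."""
    # `content_text` is an assumed field name. A user-supplied term would
    # also need query-syntax escaping, which this sketch leaves out.
    query = f"content_text:{term}" if term else "*:*"
    params = [
        ("q", query),
        ("fq", f"timestamp:[{start} TO {end}]"),  # the time interval as a range filter
        ("rows", 0),
        ("facet", "true"),
        ("facet.field", "keyword"),
        ("facet.limit", k),      # top-k
        ("facet.mincount", 1),   # drop keywords with no matching documents
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(sum_request("item:fruits", "price"))
    print(top_keywords_request("2012-01-05T00:00:00Z", "2012-01-06T00:00:00Z", 10))
```

Faceting counts documents per keyword, which matches the GROUP BY/COUNT formulation in the original question because each docid has exactly one keyword. Whether this holds up at 100-500 reads and writes per second is something only experimentation on your own data will tell.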