lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: frequent keyword computation within a search ( and timeinterval )
Date Fri, 06 Jan 2012 00:54:31 GMT
Hmmm, guess you're right, the stats component
does return that data. It's been a long day...

I still question whether this is a *good* use of
Solr, though. I'd re-examine my approach
whenever I found myself trying to translate
SQL queries into Solr....

But if, after that examination, I still required
SUM, the stats component would do it.
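
A minimal sketch of what such a request might look like for the SUM(price) WHERE item=fruits example from the thread. The host, the /solr/select path and the field names (item, price) are assumptions for illustration, and the response layout read by extract_sum is the one expected from wt=json; verify it against your own Solr version.

```python
from urllib.parse import urlencode


def stats_sum_url(base_url, query, field):
    """Build a Solr select URL that asks the StatsComponent about one field."""
    params = [
        ("q", query),            # the WHERE part, e.g. item:fruits
        ("rows", "0"),           # only the statistics are wanted, not the documents
        ("stats", "true"),       # switch the StatsComponent on
        ("stats.field", field),  # numeric field to aggregate
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


def extract_sum(response, field):
    """Pull the sum out of a parsed JSON response (assumed layout)."""
    return response["stats"]["stats_fields"][field]["sum"]


# Hypothetical local Solr instance; adjust to your installation.
url = stats_sum_url("http://localhost:8983/solr/select", "item:fruits", "price")
print(url)
```

The stats section also carries min, max, count and mean for the same field, so one request covers several SQL-style aggregates.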

Erick

On Thu, Jan 5, 2012 at 7:23 PM, Jason Rutherglen
<jason.rutherglen@gmail.com> wrote:
>> Short answer is that no, there isn't an aggregate
>> function. And you shouldn't even try
>
> If that is the case why does a 'stats' component exist for Solr with
> the SUM function built in?
>
> http://wiki.apache.org/solr/StatsComponent
>
> On Thu, Jan 5, 2012 at 1:37 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>> You will encounter endless grief until you stop
>> thinking of Solr/Lucene as a replacement for
>> an RDBMS. It is a *text search engine*.
>> Whenever you start asking "how do I implement
>> a SQL statement in Solr", you have to stop
>> and reconsider *why* you are trying to do that.
>> Then recast the question in terms of searching.
>>
>> Short answer is that no, there isn't an aggregate
>> function. And you shouldn't even try.
>>
>> Best
>> Erick
>>
>> On Thu, Jan 5, 2012 at 12:53 PM, prasenjit mukherjee
>> <prasen.bea@gmail.com> wrote:
>>> Thanks, Erick, for the response.
>>>
>>> Will lucene/solr provide me aggregations (of field values) satisfying
>>> a query criterion? e.g. SELECT SUM(price) WHERE item=fruits
>>>
>>> Or do I need to use a HitCollector to achieve that?
>>>
>>> Any sample solr/lucene query to compute aggregates (like SUM) would be great.
>>>
>>> -Thanks,
>>> Prasenjit
>>>
>>> On Thu, Jan 5, 2012 at 7:10 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>>>> The time interval is just a RangeQuery in the Lucene
>>>> world. The rest is pretty standard search stuff.
>>>>
>>>> You probably want to have a look at the NRT
>>>> (near real time) stuff in trunk.
>>>>
>>>> Your reads/writes are pretty high, so you'll need
>>>> some experimentation to size your site
>>>> correctly.
>>>>
>>>> Best
>>>> Erick
>>>>
>>>> On Wed, Jan 4, 2012 at 12:17 AM, prasenjit mukherjee
>>>> <prasen.bea@gmail.com> wrote:
>>>>> I have a requirement where reads and writes are quite high (at 100-500
>>>>> per second). A document has the following fields: timestamp,
>>>>> unique-docid, content-text, keyword. Average content-text length is
>>>>> ~20 bytes, and there is only 1 keyword for a given docid.
>>>>>
>>>>> At runtime, given a query-term (which could be null) and a
>>>>> time-interval, I need to find the top-k most frequent keywords among
>>>>> documents whose content-text field contains the query-term (no term
>>>>> filter if it is null) within that time-interval. I can purge the data
>>>>> every day, hence no need for me to keep more than a day's data.
>>>>>
>>>>> I have quite a few options here : Starting with MySQL, NoSQLs (
>>>>> Cassandra, Mongo, Couch, Riak, Redis ) , Search-Engine based (
>>>>> lucene/solr ) each having its own pros/cons.
>>>>>
>>>>> In MySQL we can achieve this via a GROUP BY/COUNT clause.
>>>>> In NoSQL I can probably write a map/reduce task to query these
>>>>> numbers, although I am not very sure about the query response time.
>>>>> Not sure if we can achieve it via lucene/solr OOB.
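
One way the GROUP BY/COUNT requirement above might be expressed in Solr is field faceting restricted by a range filter on the timestamp. The sketch below only builds the request and reshapes the counts. The host, path and the field name content_text (underscore instead of hyphen) are assumptions for illustration, the term is assumed to be a single plain token needing no escaping, and the flat [term, count, ...] list is the default JSON facet layout as far as I know.

```python
from urllib.parse import urlencode


def top_keywords_url(base_url, start, end, k, term=None):
    """Build a Solr URL asking for the k most frequent keywords in [start, end]."""
    # With no query term, match every document in the interval.
    query = "content_text:" + term if term else "*:*"
    params = [
        ("q", query),
        ("fq", "timestamp:[%s TO %s]" % (start, end)),  # time interval as a range filter
        ("rows", "0"),               # only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", "keyword"),  # the field to count by, like GROUP BY keyword
        ("facet.limit", str(k)),     # top-k
        ("facet.sort", "count"),     # most frequent first
        ("facet.mincount", "1"),     # skip keywords with no matches
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


def pair_counts(flat):
    """Turn [term, count, term, count, ...] into [(term, count), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical host and dates, for illustration only.
print(top_keywords_url("http://localhost:8983/solr/select",
                       "2012-01-04T00:00:00Z", "2012-01-05T00:00:00Z", 10))
```

Whether this holds up at 100-500 reads and writes per second would still need to be measured against a real index.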
>>>>>
>>>>> Any suggestions on what would be a good choice for this use case ?
>>>>>
>>>>> -Thanks,
>>>>> prasenjit
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>

Mime
View raw message