incubator-cassandra-user mailing list archives

From "Victor K." <>
Subject Re: solandra or pig or....?
Date Tue, 21 Jun 2011 19:04:12 GMT
If I may ask, Sasha, what exactly are you trying to achieve using Solr 
(or Solandra, I guess it's about the same)?
From what I understood of your problem, you need to compute statistics 
on your matches, players, etc. Or do you just want to retrieve 
information that has already been computed?
If it is the first (data aggregation, statistics, etc.), Solr won't be 
of much use because it is not meant to do statistics. If it is the 
second, then Solr is just the tool for you.
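
To make the two cases concrete, here is a minimal Python sketch, not 
anything Solr or Solandra actually runs. It assumes a tiny in-memory 
stand-in for the player and game data described in the quoted message 
below; the names PLAYERS, GRAND_SLAMS, find_players and 
grand_slams_per_hometown are hypothetical, and "leap year" is taken to 
mean that the game date falls in a leap year. find_players is the 
retrieval case (filtering existing records), while 
grand_slams_per_hometown is the aggregation case (computing a new 
statistic).

```python
import calendar
from collections import Counter
from datetime import datetime, timezone

# Hypothetical stand-in for the player rows (player_uuid -> columns).
PLAYERS = {
    "p1": {"fullname": "Player One", "hometown": "Cheektowaga"},
    "p2": {"fullname": "Player Two", "hometown": "Ontario, California"},
    "p3": {"fullname": "Player Three", "hometown": "Buffalo"},
}


def ts(year, month, day):
    """Return a unix timestamp (UTC midnight) for the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Hypothetical grand-slam events, one per (player, game date).
GRAND_SLAMS = [
    {"player_uuid": "p1", "date": ts(2008, 1, 8)},   # Tuesday, leap year
    {"player_uuid": "p2", "date": ts(2008, 1, 9)},   # Wednesday, leap year
    {"player_uuid": "p2", "date": ts(2011, 6, 21)},  # Tuesday, not a leap year
    {"player_uuid": "p3", "date": ts(2008, 1, 15)},  # Tuesday, leap year
]


def is_tuesday_in_leap_year(unix_ts):
    """True if the timestamp falls on a Tuesday of a leap year (UTC)."""
    day = datetime.fromtimestamp(unix_ts, tz=timezone.utc)
    return day.weekday() == 1 and calendar.isleap(day.year)


def find_players(hometowns):
    """Retrieval: names of players from the given hometowns who hit a
    grand slam on a Tuesday in a leap year."""
    wanted = {h.lower() for h in hometowns}
    hits = {
        e["player_uuid"]
        for e in GRAND_SLAMS
        if is_tuesday_in_leap_year(e["date"])
        and PLAYERS[e["player_uuid"]]["hometown"].lower() in wanted
    }
    return sorted(PLAYERS[p]["fullname"] for p in hits)


def grand_slams_per_hometown():
    """Aggregation: count of grand slams grouped by player hometown."""
    return Counter(PLAYERS[e["player_uuid"]]["hometown"] for e in GRAND_SLAMS)


if __name__ == "__main__":
    print(find_players(["Cheektowaga", "Ontario, California"]))
    print(grand_slams_per_hometown())
```

The first function only selects records that already exist, which is 
the kind of question a search index is built for; the second has to 
scan and summarize, which is closer to what a batch job would do.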

On 6/21/2011 2:47 PM, Sasha Dolgy wrote:
> Without getting overly complicated and long winded ... are there
> practical references / examples I can review that demonstrate the
> cassandra/solandra benefits....i had a quick look at
> and it wasn't
> dead obvious to me....
> On Tue, Jun 21, 2011 at 8:19 PM, Jake Luciani<>  wrote:
>> Solandra can answer the question you used as an example, and it's more of a
>> fit for low-latency ad-hoc reporting than Pig.  Pig queries will take
>> minutes, not seconds.
>> On Tue, Jun 21, 2011 at 12:12 PM, Sasha Dolgy<>  wrote:
>>> Folks,
>>> Simple question ... Assuming my current use case is the ability to log
>>> lots of trivial and seemingly useless sports statistics ... I want a
>>> user to be able to query / compare .... For example:
>>> -->  Show me all baseball players in cheektowaga and ontario,
>>> california who have hit a grandslam on tuesdays where it was just a
>>> leap year.
>>> Each baseball player is represented by a single row in a CF:
>>> player_uuid, fullname, hometown, game1, game2, game3, game4
>>> Games are UUIDs that are a reference to another row in the same CF
>>> that provides information about that game...
>>> location, final score, date (unix timestamp or ISO format), and
>>> statistics which are represented as a new column timestamp:player_uuid
>>> I can use PIG, as I understand, to run a query to generate specific
>>> information about specific "things" and populate that data back into
>>> Cassandra in another CF ... similar to the hypothetical search
>>> the information is structured already, i assume PIG is the
>>> right tool for the job, but may not be ideal for a web application and
>>> enabling ad-hoc queries ... it could take anywhere from 2-....?
>>> seconds for that query to generate, populate, and return to the
>>> user...?
>>> On the other hand, I have started to read about Solr / Solandra /
>>> Lucandra .... can this provide similar functionality or better ?  or
>>> is it more geared towards full text search and indexing ...
>>> I don't want to get into the habit of guessing what my potential users
>>> want to search for ... trying to think of ways to offload this to
>>> them.
>>> --
>>> Sasha Dolgy
>> --
