incubator-cassandra-user mailing list archives

From Mark Kerzner <>
Subject Re: solandra or pig or....?
Date Tue, 21 Jun 2011 18:26:27 GMT
Me too!

I would be interested to know how such queries are done in Solandra. I would
understand it if it creates a complete Lucene index of everything that's in
Cassandra, and adds the text search. Then your query goes against Lucene.
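To make that first case concrete, here is a minimal sketch in plain Python. It is not Solandra's or Lucene's actual API; the rows, column names, and the functions `build_index` and `search` are all made up for illustration. It assumes every column of every row is indexed, so the query is answered from the index alone:

```python
from collections import defaultdict

# Hypothetical rows standing in for a Cassandra column family.
rows = {
    "player-1": {"fullname": "Al Smith", "hometown": "cheektowaga"},
    "player-2": {"fullname": "Bo Jones", "hometown": "ontario"},
    "player-3": {"fullname": "Cy Brown", "hometown": "buffalo"},
}

def build_index(rows):
    """Map each (column, term) pair to the set of row keys containing it."""
    index = defaultdict(set)
    for key, columns in rows.items():
        for column, value in columns.items():
            for term in value.lower().split():
                index[(column, term)].add(key)
    return index

def search(index, column, terms):
    """Return sorted row keys whose column matches ANY of the terms (an OR query)."""
    hits = set()
    for term in terms:
        hits |= index.get((column, term.lower()), set())
    return sorted(hits)

index = build_index(rows)
result = search(index, "hometown", ["cheektowaga", "ontario"])
print(result)  # ['player-1', 'player-2']
```

In this model the query never reads the original rows, which is why it only works if everything needed for the query is in the index.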

But if some of the data lives in Cassandra column families and some in
Lucene, then how does the combined query work? Are there examples of its

Thank you,

On Tue, Jun 21, 2011 at 11:19 AM, Jake Luciani <> wrote:

> Solandra can answer the question you used as an example and it's more of a
> fit for low-latency ad-hoc reporting than Pig.  Pig queries will take
> minutes, not seconds.
> On Tue, Jun 21, 2011 at 12:12 PM, Sasha Dolgy <> wrote:
>> Folks,
>> Simple question ... Assuming my current use case is the ability to log
>> lots of trivial and seemingly useless sports statistics ... I want a
>> user to be able to query / compare .... For example:
>> --> Show me all baseball players in cheektowaga and ontario,
>> california who have hit a grand slam on tuesdays where it was just a
>> leap year.
>> Each baseball player is represented by a single row in a CF:
>> player_uuid, fullname, hometown, game1, game2, game3, game4
>> Games are UUIDs that are a reference to another row in the same CF
>> that provides information about that game...
>> location, final score, date (unix timestamp or ISO format), and
>> statistics which are represented as a new column timestamp:player_uuid
>> I can use PIG, as I understand, to run a query to generate specific
>> information about specific "things" and populate that data back into
>> Cassandra in another CF ... similar to the hypothetical search
>> the information is structured already, i assume PIG is the
>> right tool for the job, but may not be ideal for a web application and
>> enabling ad-hoc queries ... it could take anywhere from 2-....?
>> seconds for that query to generate, populate, and return to the
>> user...?
>> On the other hand, I have started to read about Solr / Solandra /
>> Lucandra .... can this provide similar functionality or better ?  or
>> is it more geared towards full text search and indexing ...
>> I don't want to get into the habit of guessing what my potential users
>> want to search for ... trying to think of ways to offload this to
>> them.
>> --
>> Sasha Dolgy
