incubator-cassandra-user mailing list archives

From Riyad Kalla <rka...@gmail.com>
Subject Re: Secondary index issue, unable to query for records that should be there
Date Mon, 07 Nov 2011 23:31:14 GMT
Nate, is this all against a single Cassandra server, or do you have a ring
set up? If you do have a ring, what is your replication factor set to?
Also, what ConsistencyLevel are you writing with when storing the values?
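
The reason for asking: on a multi-node ring, a regular read is only guaranteed
to overlap with the latest write when the number of replicas written plus the
number of replicas read exceeds the replication factor. A minimal sketch of
that arithmetic in Java follows; the class and method names are made up for
illustration and are not Cassandra or Hector API.

```java
/** Hypothetical helper for illustration only; not part of Cassandra or Hector. */
final class ConsistencyOverlap {

    /** Number of replicas a QUORUM operation must reach for a given replication factor. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * True when every set of read replicas must share at least one node
     * with every set of write replicas (W + R > RF).
     */
    static boolean readSeesLatestWrite(int replicationFactor, int writeReplicas, int readReplicas) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        // Writing and reading at ONE on a three-replica ring.
        System.out.println(readSeesLatestWrite(rf, 1, 1));
        // Writing and reading at QUORUM on the same ring.
        System.out.println(readSeesLatestWrite(rf, quorum(rf), quorum(rf)));
    }
}
```

With a replication factor of 3, writing and reading at ONE gives no overlap
guarantee, while using QUORUM for both does. On a single server the question
does not arise.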

-R

On Mon, Nov 7, 2011 at 2:43 PM, Nate Sammons <NSammons@ften.com> wrote:

> Hello,
>
> I’m experimenting with Cassandra (DataStax Enterprise 1.0.3), and I’ve got
> a CF with several secondary indexes to try out some options.  Right now I
> have the following to create my CF using the CLI:
>
> create column family MyTest with
>   key_validation_class = UTF8Type
>   and comparator = UTF8Type
>   and column_metadata = [
>       -- absolute timestamp for this message, also indexed year/month/day/hour/minute
>       -- index these as they are low cardinality
>       {column_name:messageTimestamp, validation_class:LongType},
>       {column_name:messageYear, validation_class:IntegerType, index_type: KEYS},
>       {column_name:messageMonth, validation_class:IntegerType, index_type: KEYS},
>       {column_name:messageDay, validation_class:IntegerType, index_type: KEYS},
>       {column_name:messageHour, validation_class:IntegerType, index_type: KEYS},
>       {column_name:messageMinute, validation_class:IntegerType, index_type: KEYS},
>
>       … other non-indexed columns defined
>   ];
>
> So when I insert data, I calculate a year/month/day/hour/minute and set
> these values on a Hector ColumnFamilyUpdater instance and update that way.
> Then later I can query from the command line with CQL such as:
>
>   get MyTest where messageYear=2011 and messageMonth=6 and messageDay=1 and messageHour=13 and messageMinute=44;
>
> etc.  This generally works; however, at some point queries that I know
> should return data no longer return any rows.
>
> So for instance, part way through my test (inserting 250K rows), I can
> run a query such as the one above and get back the data that should be
> there, but later that same query returns 0 rows.  Similarly, a query with
> fewer clauses in the expression, like this:
>
>   get MyTest where messageYear=2011 and messageMonth=6;
>
> will also return 0 rows.
>
> ???????
>
> Any idea what could be going wrong?  I’m not getting any exceptions in my
> client during the write, and I don’t see anything in the logs (no errors
> anyway).
>
> A second question – is what I’m doing insane?  I’m not sure that
> performance on CQL queries with multiple indexed columns is good (does
> Cassandra intelligently use all available indexes on these queries?)
>
> Thanks,
>
> -nate
>
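
For reference, the year/month/day/hour/minute derivation described in the
quoted message could look roughly like the Java sketch below. The original
code is not shown in the thread, so everything here is an assumption: the
class name is hypothetical, java.time requires Java 8 or later, and UTC is
chosen arbitrarily because the message does not say which time zone is used.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

/** Hypothetical holder for the five indexed fields; not part of Hector. */
final class MessageTimeFields {
    final int year;
    final int month;
    final int day;
    final int hour;
    final int minute;

    private MessageTimeFields(ZonedDateTime t) {
        this.year = t.getYear();
        this.month = t.getMonthValue(); // 1-12, unlike java.util.Calendar's 0-11
        this.day = t.getDayOfMonth();
        this.hour = t.getHour();        // 0-23
        this.minute = t.getMinute();
    }

    /** Breaks an epoch-millisecond timestamp into calendar fields, assuming UTC. */
    static MessageTimeFields fromEpochMillis(long epochMillis) {
        return new MessageTimeFields(Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2011-06-01T13:44:00Z").toEpochMilli();
        MessageTimeFields f = fromEpochMillis(ts);
        System.out.printf("year=%d month=%d day=%d hour=%d minute=%d%n",
                f.year, f.month, f.day, f.hour, f.minute);
    }
}
```

Whatever zone and month numbering the writer uses, the values in the query
have to be computed the same way, otherwise the indexed columns will not
match.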
