cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nate McCall <n...@datastax.com>
Subject Re: Secondary index issue, unable to query for records that should be there
Date Wed, 09 Nov 2011 03:38:06 GMT
I think you wanted to use Int32Type instead of IntegerType for
creating the indexes. IntegerType is actually representative of
java.math.BigInteger.

On Tue, Nov 8, 2011 at 12:28 PM, Nate Sammons <NSammons@ften.com> wrote:
> Interesting…  if I switch the columns to be UTF8 instead of integers, like
> this:
>
>
>
> create column family IndexTest with
>
>   key_validation_class = UTF8Type
>
>   and comparator = UTF8Type
>
>   and column_metadata = [
>
>       {column_name:year, validation_class:UTF8Type, index_type: KEYS},
>
>       {column_name:month, validation_class:UTF8Type, index_type: KEYS},
>
>       {column_name:day, validation_class:UTF8Type, index_type: KEYS},
>
>       {column_name:hour, validation_class:UTF8Type, index_type: KEYS},
>
>       {column_name:minute, validation_class:UTF8Type, index_type: KEYS},
>
>       {column_name:data, validation_class:UTF8Type}
>
>   ];
>
>
>
>
>
> And change the hector code to use setString(…) instead of setInteger(…).
>
>
>
> Then everything works fine.   Is there a CQL bug with respect to non-string
> columns?
>
>
>
>
>
> Thanks,
>
>
>
> -nate
>
>
>
>
>
>
>
> From: Nate Sammons [mailto:NSammons@ften.com]
> Sent: Tuesday, November 08, 2011 11:14 AM
>
> To: user@cassandra.apache.org
> Subject: RE: Secondary index issue, unable to query for records that should
> be there
>
>
>
> Note that I had identical behavior using a fresh download of Cassandra 1.0.2
> as of today.
>
>
>
> Thanks,
>
>
>
> -nate
>
>
>
>
>
> From: Nate Sammons [mailto:NSammons@ften.com]
> Sent: Tuesday, November 08, 2011 10:20 AM
> To: user@cassandra.apache.org
> Subject: RE: Secondary index issue, unable to query for records that should
> be there
>
>
>
> I restarted with logging turned up to DEBUG, and after quite a bit of
> logging during startup, I re-ran a query:
>
>
>
>
>
> get IndexTest where year=2011 and month=1 and day=14 and hour=18 and
> minute=49;
>
>
>
>
>
> produced the following in the following:
>
>
>
> DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,823 CassandraServer.java (line
> 728) scan
>
> DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,824 StorageProxy.java (line
> 1017) restricted ranges for query [-1,-1] are
> [[-1,160425280223280959086247334056682279392],
> (160425280223280959086247334056682279392,-1]]
>
> DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,824 StorageProxy.java (line
> 1104) scan ranges are
> [-1,160425280223280959086247334056682279392],(160425280223280959086247334056682279392,-1]
>
> DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,825 ReadCallback.java (line 77)
> Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1
>
> DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,826 StorageProxy.java (line
> 1131) reading org.apache.cassandra.db.IndexScanCommand@7bc203c from
> natebookpro/127.0.1.1
>
> DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 KeysSearcher.java (line 96)
> Primary scan clause is minute
>
> DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 KeysSearcher.java (line 109)
> Expanding slice filter to entire row to cover additional expressions
>
> DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 KeysSearcher.java (line 151)
> Scanning index 'IndexTest.minute EQ 49' starting with
>
> DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 CollationController.java (line
> 189) collectAllData
>
> DEBUG [ReadStage:46] 2011-11-08 10:19:21,828 KeysSearcher.java (line 163)
> fetched null
>
> DEBUG [ReadStage:46] 2011-11-08 10:19:21,828 IndexScanVerbHandler.java (line
> 46) Sending RangeSliceReply{rows=} to 808@natebookpro/127.0.1.1
>
> DEBUG [RequestResponseStage:21] 2011-11-08 10:19:21,829
> ResponseVerbHandler.java (line 44) Processing response on a callback from
> 808@natebookpro/127.0.1.1
>
> DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,829 ReadCallback.java (line 77)
> Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1
>
> DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,830 StorageProxy.java (line
> 1131) reading org.apache.cassandra.db.IndexScanCommand@6a25a21d from
> natebookpro/127.0.1.1
>
> DEBUG [ReadStage:47] 2011-11-08 10:19:21,831 KeysSearcher.java (line 96)
> Primary scan clause is minute
>
> DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 KeysSearcher.java (line 109)
> Expanding slice filter to entire row to cover additional expressions
>
> DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 KeysSearcher.java (line 151)
> Scanning index 'IndexTest.minute EQ 49' starting with
>
> DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 CollationController.java (line
> 189) collectAllData
>
> DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 KeysSearcher.java (line 163)
> fetched null
>
> DEBUG [ReadStage:47] 2011-11-08 10:19:21,833 IndexScanVerbHandler.java (line
> 46) Sending RangeSliceReply{rows=} to 809@natebookpro/127.0.1.1
>
> DEBUG [RequestResponseStage:22] 2011-11-08 10:19:21,834
> ResponseVerbHandler.java (line 44) Processing response on a callback from
> 809@natebookpro/127.0.1.1
>
>
>
>
>
>
>
> Whereas a direct read of a key using “get IndexTest[2011-1-14-18-49--1];”
> produced a result, and the following in the logs:
>
>
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,153 CassandraServer.java (line
> 323) get_slice
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,159 StorageProxy.java (line 623)
> Command/ConsistencyLevel is SliceFromReadCommand(table='Test',
> key='323031312d312d31342d31382d34392d2d31',
> column_parent='QueryPath(columnFamilyName='IndexTest',
> superColumnName='null', columnName='null')', start='', finish='',
> reversed=false, count=1000000)/ONE
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,159 ReadCallback.java (line 77)
> Blockfor/repair is 1/true; setting up requests to natebookpro/127.0.1.1
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,159 StorageProxy.java (line 639)
> reading data locally
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,160 StorageProxy.java (line 792)
> LocalReadRunnable reading SliceFromReadCommand(table='Test',
> key='323031312d312d31342d31382d34392d2d31',
> column_parent='QueryPath(columnFamilyName='IndexTest',
> superColumnName='null', columnName='null')', start='', finish='',
> reversed=false, count=1000000)
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,161 CollationController.java (line
> 189) collectAllData
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,162 SliceQueryFilter.java (line
> 123) collecting 0 of 1000000: data:false:512@1320769510502017
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line
> 123) collecting 1 of 1000000: day:false:4@1320769510502014
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line
> 123) collecting 2 of 1000000: hour:false:4@1320769510502015
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line
> 123) collecting 3 of 1000000: minute:false:4@1320769510502016
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line
> 123) collecting 4 of 1000000: month:false:4@1320769510502013
>
> DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line
> 123) collecting 5 of 1000000: year:false:4@1320769510502012
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,170 StorageProxy.java (line 689)
> Read: 10 ms.
>
>
>
>
>
>
>
> Note that a query for “get IndexTest where minute=49” (which also returns no
> records) results in the following logs:
>
>
>
>
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,210 CassandraServer.java (line
> 728) scan
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,211 StorageProxy.java (line
> 1017) restricted ranges for query [-1,-1] are
> [[-1,160425280223280959086247334056682279392],
> (160425280223280959086247334056682279392,-1]]
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,211 StorageProxy.java (line
> 1104) scan ranges are
> [-1,160425280223280959086247334056682279392],(160425280223280959086247334056682279392,-1]
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,212 ReadCallback.java (line 77)
> Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,212 StorageProxy.java (line
> 1131) reading org.apache.cassandra.db.IndexScanCommand@547d6c11 from
> natebookpro/127.0.1.1
>
> DEBUG [ReadStage:42] 2011-11-08 10:13:40,213 KeysSearcher.java (line 96)
> Primary scan clause is minute
>
> DEBUG [ReadStage:42] 2011-11-08 10:13:40,213 KeysSearcher.java (line 109)
> Expanding slice filter to entire row to cover additional expressions
>
> DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 KeysSearcher.java (line 151)
> Scanning index 'IndexTest.minute EQ 49' starting with
>
> DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 CollationController.java (line
> 189) collectAllData
>
> DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 KeysSearcher.java (line 163)
> fetched null
>
> DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 IndexScanVerbHandler.java (line
> 46) Sending RangeSliceReply{rows=} to 462@natebookpro/127.0.1.1
>
> DEBUG [RequestResponseStage:17] 2011-11-08 10:13:40,215
> ResponseVerbHandler.java (line 44) Processing response on a callback from
> 462@natebookpro/127.0.1.1
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,216 ReadCallback.java (line 77)
> Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1
>
> DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,216 StorageProxy.java (line
> 1131) reading org.apache.cassandra.db.IndexScanCommand@62132898 from
> natebookpro/127.0.1.1
>
> DEBUG [ReadStage:43] 2011-11-08 10:13:40,217 KeysSearcher.java (line 96)
> Primary scan clause is minute
>
> DEBUG [ReadStage:43] 2011-11-08 10:13:40,217 KeysSearcher.java (line 109)
> Expanding slice filter to entire row to cover additional expressions
>
> DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 KeysSearcher.java (line 151)
> Scanning index 'IndexTest.minute EQ 49' starting with
>
> DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 CollationController.java (line
> 189) collectAllData
>
> DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 KeysSearcher.java (line 163)
> fetched null
>
> DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 IndexScanVerbHandler.java (line
> 46) Sending RangeSliceReply{rows=} to 463@natebookpro/127.0.1.1
>
> DEBUG [RequestResponseStage:18] 2011-11-08 10:13:40,219
> ResponseVerbHandler.java (line 44) Processing response on a callback from
> 463@natebookpro/127.0.1.1
>
>
>
>
>
>
>
>
>
>
>
> Thanks,
>
>
>
> -nate
>
>
>
>
>
>
>
>
>
>
>
>
>
> From: Jake Luciani [mailto:jakers@gmail.com]
> Sent: Tuesday, November 08, 2011 8:56 AM
> To: user@cassandra.apache.org
> Subject: Re: Secondary index issue, unable to query for records that should
> be there
>
>
>
> Hi Nate,
>
>
>
> Could you try running it with debug enabled on the logs? it will give more
> insite into what's going on.
>
>
>
> -Jake
>
>
>
> On Tue, Nov 8, 2011 at 3:45 PM, Nate Sammons <NSammons@ften.com> wrote:
>
> This is against a single server, not a cluster.  Replication factor for the
> keyspace is set to 1, CL is the default for Hector, which I think is QUORUM.
>
>
>
> I’m trying to get a simple test together that shows this.  Does anyone know
> if multiple indexes like this are efficient?
>
>
>
> Thanks,
>
>
>
> -nate
>
>
>
>
>
> From: Riyad Kalla [mailto:rkalla@gmail.com]
> Sent: Monday, November 07, 2011 4:31 PM
> To: user@cassandra.apache.org
> Subject: Re: Secondary index issue, unable to query for records that should
> be there
>
>
>
> Nate, is this all against a single Cassandra server, or do you have a ring
> setup? If you do have a ring setup, what is your replicationfactor set to?
> Also what ConsistencyLevel are you writing with when storing the values?
>
>
>
> -R
>
> On Mon, Nov 7, 2011 at 2:43 PM, Nate Sammons <NSammons@ften.com> wrote:
>
> Hello,
>
>
>
> I’m experimenting with Cassandra (DataStax Enterprise 1.0.3), and I’ve got a
> CF with several secondary indexes to try out some options.  Right now I have
> the following to create my CF using the CLI:
>
>
>
> create column family MyTest with
>
>   key_validation_class = UTF8Type
>
>   and comparator = UTF8Type
>
>   and column_metadata = [
>
>       -- absolute timestamp for this message, also indexed
> year/month/day/hour/minute
>
>       -- index these as they are low cardinality
>
>       {column_name:messageTimestamp, validation_class:LongType},
>
>       {column_name:messageYear, validation_class:IntegerType, index_type:
> KEYS},
>
>       {column_name:messageMonth, validation_class:IntegerType, index_type:
> KEYS},
>
>       {column_name:messageDay, validation_class:IntegerType, index_type:
> KEYS},
>
>       {column_name:messageHour, validation_class:IntegerType, index_type:
> KEYS},
>
>       {column_name:messageMinute, validation_class:IntegerType, index_type:
> KEYS},
>
>
>
>                 … other non-indexed columns defined
>
>
>
>   ];
>
>
>
>
>
> So when I insert data, I calculate a year/month/day/hour/minute and set
> these values on a Hector ColumnFamilyUpdater instance and update that way.
> Then later I can query from the command line with CQL such as:
>
>
>
>                 get MyTest where messageYear=2011 and messageMonth=6 and
> messageDay=1 and messageHour=13 and messageMinute=44;
>
>
>
> etc.  This generally works, however at some point queries that I know should
> return data no longer return any rows.
>
>
>
> So for instance, part way through my test (inserting 250K rows), I can query
> for what should be there and get data back such as the above query, but
> later that same query returns 0 rows.  Similarly, with fewer clauses in the
> expression, like this:
>
>
>
>                 get MyTest where messageYear=2011 and messageMonth=6;
>
>
>
> Will also return 0 rows.
>
>
>
>
>
> ???????
>
> Any idea what could be going wrong?  I’m not getting any exceptions in my
> client during the write, and I don’t see anything in the logs (no errors
> anyway).
>
>
>
>
>
>
>
> A second question – is what I’m doing insane?  I’m not sure that performance
> on CQL queries with multiple indexed columns is good (does Cassandra
> intelligently use all available indexes on these queries?)
>
>
>
>
>
>
>
> Thanks,
>
>
>
> -nate
>
>
>
>
>
> --
> http://twitter.com/tjake

Mime
View raw message