incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: problems while TimeUUIDType-index-querying with two expressions
Date Tue, 15 Mar 2011 01:06:24 GMT
It's failing when comparing two TimeUUID values because one of them is not properly formatted.
In this case it's comparing a stored value with the value passed in the get_indexed_slices()
query expression.

I'm going to assume it's the value passed for the expression. 

When you create the IndexedSlicesQuery, this is incorrect:

IndexedSlicesQuery<String, byte[], byte[]> indexQuery = HFactory
		.createIndexedSlicesQuery(keyspace,
				stringSerializer, bytesSerializer, bytesSerializer);

Use a UUIDSerializer for the last parameter, and then pass the UUID you want when you build
the expression, rather than the string/byte value you are passing now.
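
To illustrate why the format matters, here is a minimal sketch that uses only the JDK (no Hector
or Cassandra calls). It assumes the comparator expects the standard 16-byte UUID layout and,
going by the stack trace in the quoted message, reads the byte at index 6. The class and method
names (UuidBytesSketch, uuidToBytes) are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical illustration class; not part of Hector or Cassandra.
class UuidBytesSketch {

    // Encode a UUID as 16 bytes: the most significant 8 bytes first,
    // then the least significant 8 bytes.
    public static byte[] uuidToBytes(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return buf.array();
    }

    public static void main(String[] args) {
        // The index column name from the quoted schema, as a real UUID.
        UUID useridIndex = UUID.fromString("00000001-0000-1000-0000-000000000000");
        byte[] asUuid = uuidToBytes(useridIndex);

        // A plain string value such as "user01" is only 6 bytes long.
        byte[] asText = "user01".getBytes(StandardCharsets.UTF_8);

        System.out.println("UUID bytes: " + asUuid.length);
        System.out.println("text bytes: " + asText.length);

        // Reading index 6 works on the 16-byte form...
        System.out.println("byte 6 of UUID form: " + ByteBuffer.wrap(asUuid).get(6));

        // ...but fails on the 6-byte string form, whose valid indexes are 0..5.
        try {
            ByteBuffer.wrap(asText).get(6);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("index 6 is out of bounds for the 6-byte value");
        }
    }
}
```

With Hector you would not write the conversion by hand; the point is only that anything compared
as a TimeUUID needs to arrive as 16 bytes, which is what passing a UUID through a UUIDSerializer
should give you.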

Hope that helps.
Aaron

On 15 Mar 2011, at 04:17, Johannes Hoerle wrote:

> Hi all,
>  
> in order to improve our queries, we started to use IndexedSliceQueries from the hector
project (https://github.com/zznate/hector-examples). I followed the instructions for creating
IndexedSlicesQuery with GetIndexedSlices.java. 
> I created the corresponding CF within a keyspace called “Keyspace1” (“create
keyspace Keyspace1;”) with:
> "create column family Indexed1 with column_type='Standard' and comparator='UTF8Type'
and keys_cached=200000 and read_repair_chance=1.0 and rows_cached=20000 and column_metadata=[{column_name:
birthdate, validation_class: LongType, index_name: dateIndex, index_type: KEYS},{column_name:
birthmonth, validation_class: LongType, index_name: monthIndex, index_type: KEYS}];"
> and the example GetIndexedSlices.java worked fine. 
>  
> Output of CF Indexed1:
> ---------------------------------------
> [default@Keyspace1] list Indexed1;
> Using default limit of 100
> -------------------
> RowKey: fake_key_12
> => (column=birthdate, value=1974, timestamp=1300110485826059)
> => (column=birthmonth, value=0, timestamp=1300110485826060)
> => (column=fake_column_0, value=66616b655f76616c75655f305f3132, timestamp=1300110485826056)
> => (column=fake_column_1, value=66616b655f76616c75655f315f3132, timestamp=1300110485826057)
> => (column=fake_column_2, value=66616b655f76616c75655f325f3132, timestamp=1300110485826058)
> -------------------
> RowKey: fake_key_8
> => (column=birthdate, value=1974, timestamp=1300110485826039)
> => (column=birthmonth, value=8, timestamp=1300110485826040)
> => (column=fake_column_0, value=66616b655f76616c75655f305f38, timestamp=1300110485826036)
> => (column=fake_column_1, value=66616b655f76616c75655f315f38, timestamp=1300110485826037)
> => (column=fake_column_2, value=66616b655f76616c75655f325f38, timestamp=1300110485826038)
> -------------------
> ....
>  
>  
> Now to the problem:
> As we have another column format in our cluster (using TimeUUIDType as comparator in
CF definition) I adapted the application to our schema on a cassandra-0.7.3 cluster. 
> We use a manually defined UUID for a mandator id index (00000000-0000-1000-0000-000000000000)
and another one for a userid index (00000001-0000-1000-0000-000000000000). It can be created
with:
> "create column family ByUser with column_type='Standard' and comparator='TimeUUIDType'
and keys_cached=200000 and read_repair_chance=1.0 and rows_cached=20000 and column_metadata=[{column_name:
00000000-0000-1000-0000-000000000000, validation_class: BytesType, index_name: mandatorIndex,
index_type: KEYS}, {column_name: 00000001-0000-1000-0000-000000000000, validation_class: BytesType,
index_name: useridIndex, index_type: KEYS}];"
>  
>  
> which looks in the cluster using cassandra-cli like this:
>  
> [default@Keyspace1] describe keyspace;
> Keyspace: Keyspace1:
>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>     Replication Factor: 1
>   Column Families:
>     ColumnFamily: ByUser
>       Columns sorted by: org.apache.cassandra.db.marshal.TimeUUIDType
>       Row cache size / save period: 20000.0/0
>       Key cache size / save period: 200000.0/14400
>       Memtable thresholds: 0.2953125/63/1440
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 0.01
>       Built indexes: [ByUser.mandatorIndex, ByUser.useridIndex]
>       Column Metadata:
>         Column Name: 00000001-0000-1000-0000-000000000000
>           Validation Class: org.apache.cassandra.db.marshal.BytesType
>           Index Name: useridIndex
>           Index Type: KEYS
>         Column Name: 00000000-0000-1000-0000-000000000000
>           Validation Class: org.apache.cassandra.db.marshal.BytesType
>           Index Name: mandatorIndex
>           Index Type: KEYS
>     ColumnFamily: Indexed1
>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>       Row cache size / save period: 20000.0/0
>       Key cache size / save period: 200000.0/14400
>       Memtable thresholds: 0.2953125/63/1440
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 0.01
>       Built indexes: [Indexed1.dateIndex, Indexed1.monthIndex]
>       Column Metadata:
>         Column Name: birthmonth (birthmonth)
>           Validation Class: org.apache.cassandra.db.marshal.LongType
>           Index Name: monthIndex
>           Index Type: KEYS
>         Column Name: birthdate (birthdate)
>           Validation Class: org.apache.cassandra.db.marshal.LongType
>           Index Name: dateIndex
>           Index Type: KEYS
> [default@Keyspace1] list ByUser;
> Using default limit of 100
> -------------------
> RowKey: testMandator!!user01
> => (column=00000000-0000-1000-0000-000000000000, value=746573744d616e6461746f72, timestamp=1300111213321000)
> => (column=00000001-0000-1000-0000-000000000000, value=757365723031, timestamp=1300111213322000)
> => (column=f064b480-495e-11e0-abc4-0024e89fa587, value=3135, timestamp=1300111213561000)
>  
> 1 Row Returned.
>  
> The values of the index columns 00000000-0000-1000-0000-000000000000 and 00000001-0000-1000-0000-000000000000
represent "testMandator" and "user01" as bytes.
> The third column is a randomly generated one with value "15" that is inserted in the GetTimeUUIDIndexedSlices
app.
> I attached both source codes, GetIndexedSlices and GetTimeUUIDIndexedSlices. Currently
the second index expression for the userid index in GetTimeUUIDIndexedSlices.queryCf(...)
method 
>  
>             indexQuery.addEqualsExpression(asByteArray(MANDATOR_UUID), new StringSerializer().toBytes(mandator));
>         //indexQuery.addEqualsExpression(asByteArray(USERID_INDEX_UUID), new StringSerializer().toBytes(dummyUserId));
>  
> is commented out, so the GetTimeUUIDIndexedSlices will run. Using one IndexQuery works
perfectly fine but as soon as I add a second eq, gt, gte or lt expression I get an IndexOutOfBoundsException
(see below).
>  
> This issue can be easily reproduced by 
> - downloading the zznate example (https://github.com/zznate/hector-examples), 
> - mavenizing it to an eclipse project with "mvn clean eclipse:eclipse", 
> - importing it in eclipse and 
> - letting it run against a locally running cassandra instance (v0.7.3) which has the
default settings (no changes in the .yaml)
>  
> I hope that someone can help me with this issue ... after a couple of days it's driving
me bonkers.
>  
> Thx in advance,
> Johannes
>  
>  
> Exception:
> ERROR 14:47:56,842 Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 6
>         at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:51)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.IndexOutOfBoundsException: 6
>         at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
>         at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
>         at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
>         at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
>         at org.apache.cassandra.db.ColumnFamilyStore.satisfies(ColumnFamilyStore.java:1608)
>         at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1552)
>         at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
>         ... 4 more
> ERROR 14:47:56,852 Fatal exception in thread Thread[ReadStage:14,5,main]
> java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 6
>         at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:51)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> <GetIndexedSlices.java><GetTimeUUIDIndexedSlices.java>

