cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Resolved] (CASSANDRA-7854) Unable to select partition keys directly using IN keyword (no replacement for multi row multiget in thrift)
Date Thu, 29 Jan 2015 09:41:34 GMT


Sylvain Lebresne resolved CASSANDRA-7854.
    Resolution: Duplicate

bq. The issue is not really a duplicate of 7855.

You're right, it's more of a duplicate of CASSANDRA-6875. At least the intent expressed by
Todd is.

What Todd was really asking for here was to be able to use an IN on the partition keys and an
IN on the clustering columns (which was allowed by CASSANDRA-6875).  It is true that the made-up
syntax in the description uses a single multi-column IN spanning both the partition key and the
clustering columns, which is not strictly equivalent to the former, but Todd clearly didn't
intend that generality, since the ticket is in the context of replacing thrift multiget and
thrift multiget has never supported the more general form implied by an IN on the full primary key.
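
To make the "not strictly equivalent" point concrete, here is a minimal sketch in Python. It
models the keys a query would select as plain tuples and involves no Cassandra code; the
values p1, p2, c1, c2 are made up for illustration.

```python
from itertools import product

# Hypothetical key values, for illustration only.
partition_keys = ["p1", "p2"]
clustering_values = ["c1", "c2"]

# IN on the partition key plus a separate IN on the clustering column:
# the query matches every combination of the two lists.
separate_ins = set(product(partition_keys, clustering_values))

# IN on the full primary key: only the tuples explicitly listed match.
full_key_in = {("p1", "c1"), ("p2", "c2")}

# Combinations the two-IN form selects but the tuple form leaves out.
extra = separate_ins - full_key_in
```

The tuple form can name an arbitrary subset of the combinations, which is the extra generality
that multiget never offered.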

So I'm closing again as duplicate to avoid the confusion of reusing it for something it wasn't
intended for.

We can create a separate issue for IN on full primary key *but* I'm not so sure it's a good
idea because it's yet another form of multi-partition query and we discourage those (as doing
multiple one-partition queries concurrently is a better idea in practice). We kind of had
to support as much as what multiget was giving us with thrift for political reasons, but adding
a new form that we'll spend our time discouraging while that form was never supported and
never asked for (to the best of my knowledge) doesn't sound too compelling.
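
As a rough sketch of the "multiple one-partition queries concurrently" approach, the Python
below issues one query per full key from a thread pool. The `execute` callable is a
hypothetical stand-in for whatever driver call is in use, not a real driver API, and
`fetch_all` and `max_workers` are names made up here; only the table and column names come
from the ticket.

```python
from concurrent.futures import ThreadPoolExecutor

# One lookup restricted by equality on every primary key column,
# so each execution touches a single partition.
QUERY = (
    "SELECT timestamp FROM Graph_Marked_Nodes "
    "WHERE scopeId = ? AND scopeType = ? AND nodeId = ? AND nodeType = ?"
)


def fetch_all(execute, keys, max_workers=8):
    """Run one single-partition query per key concurrently.

    `execute(query, params)` is a hypothetical callable supplied by the
    caller (an assumption, not a driver API). Results come back in the
    same order as `keys`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(execute, QUERY, key) for key in keys]
        # result() blocks until that query finishes and re-raises its error, if any.
        return [future.result() for future in futures]
```

A driver with native asynchronous execution would normally replace the thread pool; the point
is only that each request targets one partition and the fan-out happens on the client.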

> Unable to select partition keys directly using IN keyword (no replacement for multi row multiget in thrift)
> -----------------------------------------------------------------------------------------------------------
>                 Key: CASSANDRA-7854
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Todd Nine
>            Assignee: Benjamin Lerer
> We're converting some old thrift CF's to CQL.  We aren't looking to change the underlying
> physical structure, since this has proven effective in production.  In order to migrate, we
> need full select via multi equivalent.  In thrift, the format was as follows.
> (scopeId, scopeType, nodeId, nodeType){ 0x00, timestamp }
> Where we have deliberately designed only 1 column per row.  To translate this to CQL,
> I have defined the following table.
> {code}
> CREATE TABLE Graph_Marked_Nodes ( 
>  scopeId uuid,
>  scopeType varchar,
>  nodeId uuid,
>  nodeType varchar,
>  timestamp bigint,
>  PRIMARY KEY(scopeId, scopeType, nodeId, nodeType)
> )
> {code}
> I then try to select using the IN keyword.
> {code}
> select timestamp from Graph_Marked_Nodes
> WHERE (scopeId, scopeType, nodeId, nodeType) IN (
>   (5a391596-3181-11e4-a87e-600308a690e2, 'organization', 5a3a2708-3181-11e4-a87e-600308a690e2, 'test'),
>   (5a391596-3181-11e4-a87e-600308a690e2, 'organization', 5a3a2709-3181-11e4-a87e-600308a690e2, 'test'),
>   (5a391596-3181-11e4-a87e-600308a690e2, 'organization', 5a39fff7-3181-11e4-a87e-600308a690e2, 'test')
> )
> {code}
> Which results in the following stack trace
> {code}
> Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Multi-column relations can only be applied to clustering columns: scopeid
> 	at com.datastax.driver.core.Responses$Error.asException(
> 	at com.datastax.driver.core.DefaultResultSetFuture.onSet(
> 	at com.datastax.driver.core.RequestHandler.setFinalResult(
> 	at com.datastax.driver.core.RequestHandler.onSet(
> 	at com.datastax.driver.core.Connection$Dispatcher.messageReceived(
> {code}
> This is still possible via the thrift API.  Apologies in advance if I've filed this erroneously.
> I can't find any examples of this type of query anywhere.
> Note that our size grows far too large to fit in a single physical partition (row) if
> we use only scopeId and scopeType, so we need all 4 data elements to be part of our partition
> key to ensure we have the distribution we need.
