cassandra-commits mailing list archives

From "Bill Mitchell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN
Date Tue, 27 May 2014 02:17:02 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009075#comment-14009075 ]

Bill Mitchell edited comment on CASSANDRA-6875 at 5/27/14 2:16 AM:
-------------------------------------------------------------------

To try this out, I cobbled together a test case that accesses TupleType directly on the
client side, as this feature is not yet supported in the Java driver.  My approach was to
serialize my two ordering column values, use TupleType.buildValue() to concatenate them into
a single ByteBuffer, build a List of these, call serialize on a ListType<ByteBuffer> instance
to get a single ByteBuffer representing the entire list, and bind that using setBytesUnsafe().
I'm not totally sure of all this, but it seems reasonable.
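
For readers who want to see the shape of that buffer, here is a minimal standalone sketch of the packing steps. It does not use Cassandra's TupleType or ListType. The class and method names (TupleListSketch, packTuple, packList) are hypothetical, and the length-prefix widths (4-byte lengths inside a tuple, 2-byte count and lengths for the list) are an assumption about the wire layout at the time, not something confirmed in this comment. Real code should rely on the Cassandra classes named above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class TupleListSketch {
    // Assumed tuple layout: each component is written as [4-byte length][bytes].
    static ByteBuffer packTuple(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components)
            size += 4 + c.remaining();
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putInt(c.remaining());
            out.put(c.duplicate()); // duplicate() leaves the caller's buffer position untouched
        }
        out.flip();
        return out;
    }

    // Assumed list layout: [2-byte element count], then [2-byte length][bytes] per element.
    static ByteBuffer packList(List<ByteBuffer> elements) {
        if (elements.size() > 0xFFFF)
            throw new IllegalArgumentException("too many elements for a 2-byte count");
        int size = 2;
        for (ByteBuffer e : elements) {
            if (e.remaining() > 0xFFFF)
                throw new IllegalArgumentException("element too large for a 2-byte length");
            size += 2 + e.remaining();
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putShort((short) elements.size());
        for (ByteBuffer e : elements) {
            out.putShort((short) e.remaining());
            out.put(e.duplicate());
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical values for the two ordering columns (a timestamp and a string).
        List<ByteBuffer> tuples = new ArrayList<>();
        for (long day = 0; day < 3; day++) {
            ByteBuffer createdate = ByteBuffer.allocate(8).putLong(0, day);
            ByteBuffer emailcrypt = ByteBuffer.wrap(("user" + day).getBytes(StandardCharsets.UTF_8));
            tuples.add(packTuple(createdate, emailcrypt));
        }
        ByteBuffer bound = packList(tuples);
        System.out.println("bytes to bind: " + bound.remaining());
    }
}
```

The single buffer returned by packList corresponds to the value that would be bound to the one `?` marker, which is why the statement only needs to be prepared once.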

My SELECT statement syntax followed the first of the three forms Tyler suggested: ... WHERE
(c1, c2) IN ?, as this allows the statement to be prepared only once, irrespective of the
number of compound keys provided.

What I saw was the following traceback on the server:
14/05/26 14:33:09 ERROR messages.ErrorMessage: Unexpected exception during request
java.util.NoSuchElementException
	at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:396)
	at java.util.LinkedHashMap$ValueIterator.next(LinkedHashMap.java:409)
	at org.apache.cassandra.cql3.statements.SelectStatement.buildMultiColumnInBound(SelectStatement.java:941)
	at org.apache.cassandra.cql3.statements.SelectStatement.buildBound(SelectStatement.java:814)
	at org.apache.cassandra.cql3.statements.SelectStatement.getRequestedBound(SelectStatement.java:977)
	at org.apache.cassandra.cql3.statements.SelectStatement.makeFilter(SelectStatement.java:444)
	at org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:340)
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:210)
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:61)
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
	at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:309)
	at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:132)
	at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)

Stepping through the code, it appears to have analyzed my statement correctly.  In
buildMultiColumnInBound, splitInValues contains 1426 tuples, which is the number I intended to
pass.  The names parameter identifies two columns, createdate and emailcrypt.  The iterator
over names is created once, before the loop, and advanced once per tuple, so the loop body
executes twice, but on the third iteration there are no more elements in names, thus the
exception.
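
The failure mode is easy to reproduce outside Cassandra. The sketch below is a standalone illustration, not the SelectStatement code: the class and method names are hypothetical, and a LinkedHashMap with two entries stands in for the names parameter. It counts how many loop iterations complete before next() throws, first with one iterator shared across the loop and then with a fresh iterator per tuple.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class SharedIteratorSketch {
    // Hypothetical stand-in for the `names` parameter: two clustering columns.
    static Map<String, String> twoColumns() {
        Map<String, String> names = new LinkedHashMap<>();
        names.put("createdate", "createdate");
        names.put("emailcrypt", "emailcrypt");
        return names;
    }

    // Returns the number of iterations that complete before next() fails,
    // or `tuples` if every iteration succeeds.
    static int iterationsBeforeFailure(Map<String, String> names, int tuples,
                                       boolean freshIteratorPerTuple) {
        Iterator<String> shared = names.values().iterator();
        for (int i = 0; i < tuples; i++) {
            Iterator<String> iter = freshIteratorPerTuple ? names.values().iterator() : shared;
            try {
                iter.next(); // one call per tuple, as in the loop described above
            } catch (NoSuchElementException e) {
                return i;
            }
        }
        return tuples;
    }

    public static void main(String[] args) {
        Map<String, String> names = twoColumns();
        System.out.println(iterationsBeforeFailure(names, 1426, false)); // prints 2
        System.out.println(iterationsBeforeFailure(names, 1426, true));  // prints 1426
    }
}
```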

Moving the construction of the iterator inside the loop fixed the exception.  The code still
looks suspect, though, as it calculates a bound b based on whether the first column is
reversed, then uses bound, not b, in the following statement.  I've not researched which
would be correct, as this appears closely related to the fix Sylvain just developed for
CASSANDRA-7105.  In my test case, where the columns were declared as DESC, the code as fixed
below did return all the expected rows.

{code}
        TreeSet<ByteBuffer> inValues = new TreeSet<>(isReversed ? cfDef.cfm.comparator.reverseComparator : cfDef.cfm.comparator);
        for (List<ByteBuffer> components : splitInValues)
        {
            ColumnNameBuilder nameBuilder = builder.copy();
            for (ByteBuffer component : components)
                nameBuilder.add(component);

            Iterator<CFDefinition.Name> iter = names.iterator();
            Bound b = isReversed == isReversedType(iter.next()) ? bound : Bound.reverse(bound);
            inValues.add((bound == Bound.END && nameBuilder.remainingCount() > 0) ? nameBuilder.buildAsEndOfRange() : nameBuilder.build());
        }
        return new ArrayList<>(inValues);
{code}

P.S. I changed my test configuration to declare the ordering columns as ASC instead of DESC
and reran the tests.  There was no failure with the code as changed.  So apparently testing
bound, rather than b, in the comparison works fine, which should mean that both iter and b
can be dropped.
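
As a rough illustration of what the loop reduces to once iter and b are gone, here is a standalone sketch. It is not the Cassandra method. The class and method names are hypothetical, buildName is a simplistic stand-in for ColumnNameBuilder that just concatenates components, ByteBuffer's natural ordering stands in for cfDef.cfm.comparator, and the buildAsEndOfRange() branch is left out entirely.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class InBoundSketch {
    // Hypothetical helper: a two-component tuple of single-byte values.
    static List<ByteBuffer> tuple(int c1, int c2) {
        return Arrays.asList(ByteBuffer.wrap(new byte[] { (byte) c1 }),
                             ByteBuffer.wrap(new byte[] { (byte) c2 }));
    }

    // Stand-in for ColumnNameBuilder: concatenates the components of one tuple.
    static ByteBuffer buildName(List<ByteBuffer> components) {
        int size = 0;
        for (ByteBuffer c : components)
            size += c.remaining();
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components)
            out.put(c.duplicate());
        out.flip();
        return out;
    }

    // Collects one name per tuple into a sorted, de-duplicated set; no iterator
    // over the column names and no per-column bound are needed in this sketch.
    static List<ByteBuffer> sortedInValues(List<List<ByteBuffer>> splitInValues, boolean isReversed) {
        Comparator<ByteBuffer> comparator = Comparator.naturalOrder();
        TreeSet<ByteBuffer> inValues = new TreeSet<>(isReversed ? comparator.reversed() : comparator);
        for (List<ByteBuffer> components : splitInValues)
            inValues.add(buildName(components));
        return new ArrayList<>(inValues);
    }

    public static void main(String[] args) {
        List<List<ByteBuffer>> in = Arrays.asList(tuple(1, 1), tuple(0, 0), tuple(1, 1));
        System.out.println(sortedInValues(in, false).size()); // prints 2 (duplicate removed)
    }
}
```

The TreeSet does two jobs here, as in the original snippet: it removes duplicate IN values and returns them in comparator order, reversed when isReversed is set.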
 


> CQL3: select multiple CQL rows in a single partition using IN
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-6875
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>            Reporter: Nicolas Favre-Felix
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.0.9, 2.1 rc1
>
>
> In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is important
> to support reading several distinct CQL rows from a given partition using a distinct set of
> "coordinates" for these rows within the partition.
> CASSANDRA-4851 introduced a range scan over the multi-dimensional space of clustering
> keys. We also need to support a "multi-get" of CQL rows, potentially using the "IN" keyword
> to define a set of clustering keys to fetch at once.
> (reusing the same example:)
> Consider the following table:
> {code}
> CREATE TABLE test (
>   k int,
>   c1 int,
>   c2 int,
>   PRIMARY KEY (k, c1, c2)
> );
> {code}
> with the following data:
> {code}
>  k | c1 | c2
> ---+----+----
>  0 |  0 |  0
>  0 |  0 |  1
>  0 |  1 |  0
>  0 |  1 |  1
> {code}
> We can fetch a single row or a range of rows, but not a set of them:
> {code}
> > SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
> Bad Request: line 1:54 missing EOF at ','
> {code}
> Supporting this syntax would return:
> {code}
>  k | c1 | c2
> ---+----+----
>  0 |  0 |  0
>  0 |  1 |  1
> {code}
> Being able to fetch these two CQL rows in a single read is important to maintain
> partition-level isolation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
