cassandra-commits mailing list archives

From "Jonathan Ellis (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2878) Allow CQL-based map/reduce
Date Thu, 19 Jan 2012 14:54:40 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189138#comment-13189138
] 

Jonathan Ellis commented on CASSANDRA-2878:
-------------------------------------------

So I'm attempting option (1) above, with a method like this:

{code}
CqlResult execute_partitioned_cql_query(1:required binary query,
                                        2:required string start_token,
                                        3:required string end_token,
                                        4:Compression compression)
{code}

The problem I'm running into is that paginating this on the CQL side is enormously painful.  Suppose
the last resultset entry is for a composite column (first = a, second = b) in row k.  Then
I need to request my next page as "... WHERE (key > k OR (key = k AND (first > a OR
(first = a AND second > b))))".  QueryProcessor just isn't up to this task, even without the
added complexity of other IndexExpressions.
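
To make the page-boundary condition concrete, here is a rough Python sketch of the "strictly after the last entry" test that such a WHERE clause has to encode. It is only an illustration, not Cassandra code: `Cursor`, `after`, and `next_page` are hypothetical names, the cells are assumed to be already sorted, and keys are compared directly as a stand-in for whatever order the partitioner actually imposes.

```python
from typing import List, NamedTuple, Optional


class Cursor(NamedTuple):
    """Hypothetical position of one result entry: row key plus a two-part composite column."""
    key: str
    first: str
    second: str


def after(cell: Cursor, last: Cursor) -> bool:
    """Return True if `cell` sorts strictly after `last`.

    Spelled out the way a WHERE clause would need it:
    key > k OR (key = k AND (first > a OR (first = a AND second > b)))
    """
    if cell.key != last.key:
        return cell.key > last.key
    if cell.first != last.first:
        return cell.first > last.first
    return cell.second > last.second


def next_page(cells: List[Cursor], last: Optional[Cursor], page_size: int) -> List[Cursor]:
    """Return up to `page_size` cells following `last` (or the first page if `last` is None).

    Assumes `cells` is sorted by (key, first, second).
    """
    remaining = cells if last is None else [c for c in cells if after(c, last)]
    return remaining[:page_size]
```

Each additional composite component adds another nested OR/AND level to the predicate, which is part of why expressing this in the query itself gets unwieldy.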

Starting to think that we're better off with (4) for 1.1, after all.
                
> Allow CQL-based map/reduce
> --------------------------
>
>                 Key: CASSANDRA-2878
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2878
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Mck SembWever
>            Assignee: Jonathan Ellis
>            Priority: Critical
>             Fix For: 1.1
>
>
> Currently, when running a MapReduce job against data in a Cassandra data store, it reads
> through all the data for a particular ColumnFamily.  This could be optimized to only read
> through those rows that have to do with the query.
> Adding CQL support to m/r will allow using an index more simply than trying to cram support
> for more parameters into the job configuration.


        
