cassandra-commits mailing list archives

From "Byron Clark (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2894) add paging to get_count
Date Tue, 26 Jul 2011 19:32:09 GMT


Byron Clark commented on CASSANDRA-2894:

[^CASSANDRA-2894.patch] implements this behavior without the global setting for count_slice_size.

Behavior is based on the paging done in HintedHandoffManager from the 0.8 branch. The maximum
page size is 16384 columns, largely because I have no idea what a good value for that setting
would be.

Assumptions made:
- Using the last column returned from get_slice as the starting column for the next call to
get_slice won't skip over any columns.
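
The paging approach described above can be sketched roughly as follows. This is not the code from the attached patch (which lives in CassandraServer and is written in Java); it is a minimal Python illustration. The `get_slice` function here is a hypothetical in-memory stand-in, and it assumes the slice start is inclusive, so each page after the first repeats the previous page's last column and that repeat must not be counted twice. The names `paged_count` and `page_size` are likewise illustrative.

```python
from bisect import bisect_left


def get_slice(columns, start, count):
    """Hypothetical stand-in for get_slice over one row.

    `columns` is a sorted list of column names. Returns up to `count`
    names that are >= `start` (inclusive). An empty `start` means
    "from the beginning of the row".
    """
    i = bisect_left(columns, start) if start else 0
    return columns[i:i + count]


def paged_count(columns, page_size=16384):
    """Count columns one page at a time.

    Returns (total, calls), where `calls` is the number of get_slice
    calls made. At most `page_size` column names are held at once.
    """
    if page_size < 2:
        # With an inclusive start, a page of 1 would only ever return
        # the repeated column and the loop could not make progress.
        raise ValueError("page_size must be at least 2")

    total = 0
    calls = 0
    start = ""
    while True:
        page = get_slice(columns, start, page_size)
        calls += 1
        # Every page after the first begins with the last column of the
        # previous page, so it contributes one fewer new column.
        total += len(page) if calls == 1 else len(page) - 1
        if len(page) < page_size:
            # A short page means the end of the row was reached.
            return total, calls
        start = page[-1]


if __name__ == "__main__":
    row = sorted("col%05d" % i for i in range(10))
    print(paged_count(row, page_size=4))
```

Because consecutive pages overlap by one column, a count that spans several pages needs slightly more calls than the column count divided by the page size, which is one source of the extra latency for large counts.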

> add paging to get_count
> -----------------------
>                 Key: CASSANDRA-2894
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Vijay
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.0
>         Attachments: CASSANDRA-2894.patch
> It is non-intuitive that get_count materializes the entire slice-to-count on the coordinator
> node (to perform read repair and > CL.ONE consistency).  Even experienced users have been
> known to cause memory problems by requesting large counts.
> The user cannot page the count himself, because you need a start and stop column to do
> that, and get_count only returns an integer.
> So the best fix is for us to do the paging under the hood, in CassandraServer.  Add a
> limit to the slicepredicate they specify, and page through it.
> We could add a global setting for count_slice_size, and document that counts of more
> columns than that will have higher latency (because they make multiple calls through
> StorageProxy for the pages).

This message is automatically generated by JIRA.
For more information on JIRA, see:

