cassandra-commits mailing list archives

From "Tupshin Harper (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6167) Add end-slice termination predicate
Date Fri, 14 Feb 2014 15:06:21 GMT


Tupshin Harper commented on CASSANDRA-6167:

Sylvain: given the complexity and ambiguity about the best approach for my example above, I'll
just return to the simple case:
A partition represents a slice of time-series events for a particular source.
The client wants to know whether a value >= X appeared in the last 10,000 events (or since time Y).
Outlier detection is one reason. There are others.
Currently we would need to either do a single large slice and retrieve all 10,000 events
even if the value was found in the first few, or fetch smaller batches and keep retrieving additional
batches until the target value is found.
No matter what the syntax or implementation is, I'm quite convinced of the utility of being
able to short-circuit reading from a partition.
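
To make the two current client-side options concrete, here is a minimal sketch in Python. It does not use any Cassandra driver: the partition is just an in-memory list, and `fetch_slice`, `find_single_slice`, `find_batched`, and the `(timestamp, value)` event layout are all hypothetical names chosen for illustration. It only shows how many events each approach transfers before the answer is known.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, float]  # (timestamp, value); hypothetical layout


def fetch_slice(partition: List[Event], start: int, limit: int) -> List[Event]:
    """Stand-in for one slice read against a partition (hypothetical helper)."""
    return partition[start:start + limit]


def find_single_slice(
    partition: List[Event], threshold: float, max_events: int = 10_000
) -> Tuple[Optional[Event], int]:
    """Option 1: one large slice. Every event is transferred before checking."""
    events = fetch_slice(partition, 0, max_events)
    hit = next((e for e in events if e[1] >= threshold), None)
    return hit, len(events)


def find_batched(
    partition: List[Event],
    threshold: float,
    max_events: int = 10_000,
    batch_size: int = 100,
) -> Tuple[Optional[Event], int]:
    """Option 2: small batches. Stop issuing reads once a match is seen."""
    read = 0
    while read < max_events:
        batch = fetch_slice(partition, read, min(batch_size, max_events - read))
        if not batch:
            break  # partition exhausted
        read += len(batch)
        for event in batch:
            if event[1] >= threshold:
                return event, read
    return None, read
```

With a match near the start of the partition, the single slice still transfers all 10,000 events, while the batched version transfers one batch but pays an extra round trip for every batch that does not contain a match.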

> Add end-slice termination predicate
> -----------------------------------
>                 Key: CASSANDRA-6167
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API, Core
>            Reporter: Tupshin Harper
>            Priority: Minor
>              Labels: ponies
> When performing storage-engine slices, it would sometimes be beneficial to have
the slice terminate for reasons other than number of columns or min/max cell name.
> Since we are able to look at the contents of each cell as we read it, this is potentially
doable with very little overhead. 
> Probably more challenging than the storage-engine implementation itself is coming up
with appropriate CQL syntax (Thrift, should we decide to support it, would be trivial).
> Two possibilities are:
> 1) special where function:
> SELECT pk,event from cf WHERE pk IN (1,5,10,11) AND partition_predicate({predicate})
> or a bigger language change, but I think the one I prefer, more like:
> 2) SELECT pk,event from cf where pk IN (1,5,10,11) UNTIL PARTITION event {predicate}
> Neither feels perfect, but I do like the fact that the second one at least clearly states
what it is intended to do.
> By using "UNTIL PARTITION", we could re-use the UNTIL keyword to handle other kinds of
early-termination of selects that the coordinator might be able to do, such as stop retrieving
additional rows from shards after a particular criterion was met.
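
The intended read-path behaviour can be sketched as follows. This is an illustration in Python, not Cassandra's storage-engine code: `slice_until`, `stop_when`, and the `(name, value)` cell layout are hypothetical, and the sketch assumes the cell that satisfies the predicate is itself returned, which the proposal does not specify.

```python
from typing import Callable, Iterable, Iterator, Tuple

Cell = Tuple[int, float]  # (cell name, value); hypothetical layout


def slice_until(
    cells: Iterable[Cell],
    stop_when: Callable[[Cell], bool],
    limit: int,
) -> Iterator[Cell]:
    """Yield cells in storage order, ending the slice early.

    The slice ends after the first cell for which stop_when is true
    (assumption: that cell is included in the result), or after
    `limit` cells, whichever comes first.
    """
    if limit <= 0:
        return
    for count, cell in enumerate(cells, start=1):
        yield cell
        if stop_when(cell) or count >= limit:
            return
```

Because the check happens as each cell is read, cells after the terminating one are never pulled from the underlying iterator, which is the short-circuit the ticket asks for.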
