cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4304) Add bytes-limit clause to queries
Date Fri, 15 Jun 2012 15:26:42 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295722#comment-13295722 ]

Brandon Williams commented on CASSANDRA-4304:
---------------------------------------------

I do like the idea of limiting by bytes instead of count, as CASSANDRA-3911 does.
However, I think that ticket has the right approach: the operator, not the client, should
define the limit. If clients set it, they will still be able to abuse it and OOM the
server, and an OOM is the operator's problem.
                
> Add bytes-limit clause to queries
> ---------------------------------
>
>                 Key: CASSANDRA-4304
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4304
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Christian Spriegel
>             Fix For: 1.2
>
>         Attachments: TestImplForSlices.patch
>
>
> The idea is to add a second limit clause to (slice) queries. This would allow easy loading
> of batches, even if the content is variable-sized.
> Imagine the following use case:
> You want to load a batch of XMLs, where each is between 100 bytes and 5 MB in size.
> Currently you can load either
> - a large number of XMLs, but risk OOMs or timeouts
> or
> - a small number of XMLs, and do too many queries where each query usually retrieves
>   very little data.
> With Cassandra being able to limit by size and not just count, we could do a single query
> which would never OOM but always return a decent amount of data -- with no extra overhead
> for multiple queries.
> A few thoughts from my side:
> - The limit should be a soft limit, not a hard limit. Therefore it will always return
>   at least one row/column, even if that one is larger than the limit specifies.
> - HintedHandoffManager:303 is already doing an InMemoryCompactionLimit/averageColumnSize
>   to avoid OOM. It could then simply use the new limit clause :-)
> - A bytes-limit on a range- or indexed-query should always return a complete row
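
The soft-limit behaviour proposed in the quoted description can be illustrated with a
minimal sketch. It is not taken from the attached patch and is not Cassandra code. The
function name, the (name, value) column representation, and the choice to stop before the
first column that would exceed the limit (unless nothing has been returned yet) are all
assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-in for a column: (name, value bytes).
Column = Tuple[str, bytes]


def slice_with_soft_byte_limit(
    columns: Iterable[Column], count_limit: int, bytes_limit: int
) -> List[Column]:
    """Collect columns until the count limit or the soft byte limit is hit.

    The byte limit is soft: if the very first column is larger than
    bytes_limit it is still returned, so a query with count_limit >= 1
    over non-empty input never comes back empty.
    """
    result: List[Column] = []
    total = 0
    for name, value in columns:
        if len(result) >= count_limit:
            break
        size = len(value)
        # Only enforce the byte limit once at least one column was collected.
        if result and total + size > bytes_limit:
            break
        result.append((name, value))
        total += size
    return result


if __name__ == "__main__":
    # Variable-sized payloads, as in the XML use case: 100 bytes up to 5 MB.
    xmls = [("a", b"x" * 100), ("b", b"y" * 5_000_000), ("c", b"z" * 50)]
    batch = slice_with_soft_byte_limit(xmls, count_limit=100, bytes_limit=1_000_000)
    print([name for name, _ in batch])
```

Under these assumptions the size of a response is bounded by the larger of the byte limit
and the size of a single column, which is the property the description relies on to avoid
OOMs while still making progress on oversized values.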


        
