cassandra-commits mailing list archives

From "Nathan Ziebart (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13995) Don't fetch unnecessary data in SliceQueryFilter
Date Mon, 06 Nov 2017 18:01:00 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nathan Ziebart updated CASSANDRA-13995:
---------------------------------------
    Fix Version/s:     (was: 2.2.12)
                   2.2.x

> Don't fetch unnecessary data in SliceQueryFilter
> ------------------------------------------------
>
>                 Key: CASSANDRA-13995
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13995
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Nathan Ziebart
>             Fix For: 2.2.x
>
>
> Link to patch branch on github: https://github.com/apache/cassandra/pull/170
> Slice queries currently fetch more data than necessary when there is only one column
not part of the primary key. Specifically, SliceQueryFilter does not stop reading until it
has seen `limit + 1` live cells, even though in that case we can stop after seeing `limit`
live cells.
> We have a use case where we use wide rows to implement versioning, by including a timestamp
as part of the primary key. Every once in a while, we "garbage collect" old versions by deleting
them. This results in a single column containing the latest version, followed by many tombstones.
> We use a `LIMIT 1` query to select the latest version (which is the first column in the
row). However, because SliceQueryFilter does not stop until it has seen `limit + 1` live cells,
we have to read all the tombstones following the single live cell. Furthermore, if these tombstones
are covering data in other sstables, we have to read all the corresponding data when merging
the sstable iterators. This can be a massive performance hit, and is unexpectedly caused by
deleting data.
> This patch allows the `ColumnCounter` implementation to decide when it has seen enough
cells. For counters that don't require grouping, we can stop immediately after finding the
first cell.
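
The effect described above can be illustrated with a small, self-contained sketch. It is
not Cassandra code: `Cell`, `scan`, and the `stop_at_limit` flag are hypothetical names
used only to model a scan that stops after `limit + 1` live cells versus one that stops
after `limit` live cells, on a row holding one live cell followed by tombstones.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a cell; live=False models a tombstone."""
    name: str
    live: bool


def scan(cells: Iterable[Cell], limit: int, stop_at_limit: bool) -> Tuple[List[Cell], int]:
    """Return (live cells collected, total cells read).

    stop_at_limit=False models stopping only after `limit + 1` live cells.
    stop_at_limit=True models stopping as soon as `limit` live cells are seen.
    """
    threshold = limit if stop_at_limit else limit + 1
    collected: List[Cell] = []
    live_seen = 0
    cells_read = 0
    for cell in cells:
        cells_read += 1
        if cell.live:
            live_seen += 1
            if len(collected) < limit:
                collected.append(cell)
        if live_seen >= threshold:
            break
    return collected, cells_read


# One live "latest version" followed by nine tombstones (assumed example data).
row = [Cell("v9", True)] + [Cell(f"v{i}", False) for i in range(8, -1, -1)]

old_result, old_reads = scan(row, limit=1, stop_at_limit=False)
new_result, new_reads = scan(row, limit=1, stop_at_limit=True)
print(old_reads, new_reads)
```

In this model both scans return the same single live cell, but the first one reads every
tombstone in the row because it never finds a second live cell, while the second one stops
after the first cell it reads.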



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

