cassandra-commits mailing list archives

From "Carl Yeksigian (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4885) Remove or rework per-row bloom filters
Date Tue, 26 Mar 2013 19:01:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614438#comment-13614438 ]

Carl Yeksigian commented on CASSANDRA-4885:
-------------------------------------------

This won't upgrade properly.

The current IndexHelper.skipBloomFilter() only handles Type.SHA SSTable bloom filters. This
is because the number of bytes used depends on the filter type. If the filter type
is not SHA, we need to read the byte length from the input, then skip that many bytes. This
means that after the upgrade, if a row filter was written, we will not skip over it properly.
Upgrading from 1.2 to 2.0 will cause a CorruptSSTableException since we aren't advancing far
enough.
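
To illustrate the "read the byte length, then skip that many bytes" step, here is a minimal
sketch. It assumes a hypothetical layout of a 4-byte length followed by that many filter bytes;
the class and method names are made up for illustration, and this is neither the actual
IndexHelper code nor Cassandra's real on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper, not part of Cassandra.
class SkipLengthPrefixed {

    // Skips a blob stored as: int length, then `length` bytes (assumed layout).
    static void skipLengthPrefixed(DataInput in) throws IOException {
        int size = in.readInt();
        if (size < 0) {
            throw new IOException("negative length: " + size);
        }
        int remaining = size;
        // skipBytes may skip fewer bytes than requested, so loop until done.
        while (remaining > 0) {
            int skipped = in.skipBytes(remaining);
            if (skipped <= 0) {
                throw new EOFException("blob truncated, " + remaining + " bytes missing");
            }
            remaining -= skipped;
        }
    }

    // Writes a fake filter followed by a marker int, skips the filter,
    // and returns whatever int is read next.
    static int readMarkerAfterFilter(byte[] filter, int marker) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(filter.length);
        out.write(filter);
        out.writeInt(marker);
        out.flush();

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        skipLengthPrefixed(in);
        return in.readInt();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readMarkerAfterFilter(new byte[] {1, 2, 3, 4, 5}, 42));
    }
}
```

If the skip advanced by the wrong number of bytes, the following readInt() would return garbage
or hit end of stream, which is the kind of misalignment described above.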

I'm reopening and posting a patch which works for this case; it will not currently work with
the scrub test.
                
> Remove or rework per-row bloom filters
> --------------------------------------
>
>                 Key: CASSANDRA-4885
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4885
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jason Brown
>             Fix For: 2.0
>
>         Attachments: 0001-CASSANRDA-4885-Remove-per-row-bloom-filter.patch, 0002-CASSANRDA-4885-update-test.patch, 4885-indexhelper.patch, 4885-v1.patch, 4885-v2.patch
>
>
> Per-row bloom filters may be a misfeature.
> On small rows we don't create them.
> On large rows we essentially only do slice queries that can't take advantage of it.
> And on very large rows if we ever did deserialize it, the performance hit of doing so
> would outweigh the benefit of skipping the actual read.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
