cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files
Date Wed, 03 Aug 2011 22:39:27 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079088#comment-13079088 ]

Stu Hood commented on CASSANDRA-2988:
-------------------------------------

bq. which totally applies to the index sampling both after a restart and when opening a new streamed file.
Does it? If you are using and warming the key cache, then maybe you don't want the index to
be warm, but in any other case the index should essentially be locked in RAM. Also, the streaming
case is no longer linked to SSTableReader.load in trunk.

bq. loading a new, streamed sstable (probably don't)
Same comment as above: .load() isn't involved.

----

Melvin: either way, you should post the numbers that you collected.

> Improve SSTableReader.load() when loading index files
> -----------------------------------------------------
>
>                 Key: CASSANDRA-2988
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Melvin Wang
>
> * When we create a BufferedRandomAccessFile, we pass skipCache=true. This hurts read
> performance because we always process the index files sequentially. A simple fix would be
> to set it to false.
> * Multiple index files of a single column family can be loaded in parallel. This helps
> a lot when you have several very large index files.
> * We may also change how we buffer. With BufferedRandomAccessFile, every read requires
> a number of checks, such as:
>   - do we need to rebuffer?
>   - isEOF()?
>   - assertions
>   These can be simplified to some extent. We can blindly buffer the index file in chunks
>   and process the buffer until a key straddles a chunk boundary. Then we rebuffer and restart
>   from the beginning of the partially read key. Conceptually, this is the same as what BRAF
>   does, but without the overhead in the read**() methods in BRAF.
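
The chunked approach in the last bullet of the issue description can be sketched roughly as follows. This is an illustration only, not the patch discussed on the ticket: the class ChunkedIndexScanner, its Entry type, and the record layout (2-byte unsigned key length, key bytes, 8-byte data-file position, all big-endian) are assumptions made for the example and may not match the real index format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: scans an index stream chunk by chunk.
final class ChunkedIndexScanner {
    /** One decoded entry: the raw key and the data-file position it points to. */
    static final class Entry {
        final byte[] key;
        final long position;
        Entry(byte[] key, long position) { this.key = key; this.position = position; }
    }

    /** Assumed layout per entry: unsigned short key length, key bytes, long position. */
    static List<Entry> scan(InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        List<Entry> entries = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        buf.limit(0); // nothing buffered yet
        boolean eof = false;
        while (true) {
            // Decode every complete entry currently in the buffer, with no per-read checks.
            while (buf.remaining() >= 2) {
                int start = buf.position();
                int keyLength = buf.getShort() & 0xFFFF;
                if (buf.remaining() < keyLength + 8) {
                    buf.position(start); // entry straddles the chunk boundary: rewind to its start
                    break;
                }
                byte[] key = new byte[keyLength];
                buf.get(key);
                entries.add(new Entry(key, buf.getLong()));
            }
            if (eof) {
                if (buf.hasRemaining()) throw new IOException("truncated index entry");
                return entries;
            }
            // Rebuffer: move the partial entry to the front, then fill the rest of the chunk.
            buf.compact();
            if (buf.position() == buf.capacity()) {
                // A single entry is larger than the chunk: grow the buffer.
                ByteBuffer bigger = ByteBuffer.allocate(buf.capacity() * 2);
                buf.flip();
                bigger.put(buf);
                buf = bigger;
            }
            int n = in.read(buf.array(), buf.position(), buf.remaining());
            if (n < 0) eof = true;
            else buf.position(buf.position() + n);
            buf.flip();
        }
    }
}
```

The rebuffer and end-of-file checks run once per chunk rather than once per read, which is the saving the bullet is after. Whether that is measurable in practice would have to be confirmed with numbers.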

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
