accumulo-dev mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-550) Collocate rfile index entries within file
Date Fri, 29 Jun 2012 19:03:43 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404130#comment-13404130 ]

Keith Turner commented on ACCUMULO-550:
---------------------------------------

I do not think that having the blocks above level 0 interspersed among the level 0 blocks
is a problem for sequentially reading the index.  Blocks above level 0 are read so infrequently
compared to level 0 blocks that I suspect the cost of the occasional random read for a
level 1 block is amortized away.
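
To make the amortization argument concrete, here is a rough arithmetic sketch.  It assumes
a two-level index in which each level 1 block points at a fixed number of level 0 blocks.
The class name, method names, and the fan-out value are hypothetical and are not taken from
the RFile code.

```java
/** Hypothetical arithmetic sketch; the fan-out is an assumed parameter, not a measured RFile value. */
class IndexScanCost {

    /** Number of level 1 blocks touched while scanning level0Blocks leaf index blocks in order. */
    static long level1Reads(long level0Blocks, long fanOut) {
        if (level0Blocks < 0 || fanOut <= 0) {
            throw new IllegalArgumentException("level0Blocks must be >= 0 and fanOut must be > 0");
        }
        // Ceiling division: one level 1 block covers up to fanOut level 0 blocks.
        return (level0Blocks + fanOut - 1) / fanOut;
    }

    /** Fraction of all index block reads in the scan that are level 1 reads. */
    static double level1Fraction(long level0Blocks, long fanOut) {
        long l1 = level1Reads(level0Blocks, fanOut);
        long total = level0Blocks + l1;
        return total == 0 ? 0.0 : (double) l1 / total;
    }
}
```

With 1,000 level 0 blocks and an assumed fan-out of 100, level1Reads returns 10, so roughly
one read in a hundred is a potential random access to a level 1 block.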
                
> Collocate rfile index entries within file
> -----------------------------------------
>
>                 Key: ACCUMULO-550
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-550
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.5.0, 1.4.1
>
>
> Before multi-level indexes were introduced, when an rfile was written its entire index
was held in memory and written out when the file was closed.  With the introduction of the
multi-level index, each index block is written when it fills up, as the file is being written.
This was done to handle the case where the index may not fit into memory.  As a result, index
blocks are sprinkled throughout the file, so any operation that iterates over the entire index
can be slow because it turns into a lot of random accesses.
> One possible solution is to buffer index blocks up to some threshold and
write out many index blocks at once.  This would make a scan of the index much faster, as
it would turn into a set of sequential reads of large chunks of data.
> We could buffer all blocks at a particular level and write them out when the parent index
block fills up.
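
The threshold-based buffering idea described above could look roughly like the following
sketch.  It is not Accumulo's RFile writer: the class BufferedIndexWriter, its methods, and
the byte threshold are all hypothetical, and an in-memory ByteArrayOutputStream stands in
for the file.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of buffering index blocks; not the actual RFile implementation. */
class BufferedIndexWriter {
    private final ByteArrayOutputStream file = new ByteArrayOutputStream(); // stands in for the rfile
    private final List<byte[]> pendingIndexBlocks = new ArrayList<>();
    private final List<Integer> indexRunOffsets = new ArrayList<>();
    private final int thresholdBytes; // assumed tuning knob
    private int pendingBytes = 0;

    BufferedIndexWriter(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Data blocks go straight to the file, as before. */
    void writeDataBlock(byte[] block) {
        file.write(block, 0, block.length);
    }

    /** Index blocks are held in memory until the buffered size reaches the threshold. */
    void addIndexBlock(byte[] block) {
        pendingIndexBlocks.add(block);
        pendingBytes += block.length;
        if (pendingBytes >= thresholdBytes) {
            flushIndexBlocks();
        }
    }

    /** Writes all buffered index blocks as one contiguous run and records where the run starts. */
    void flushIndexBlocks() {
        if (pendingIndexBlocks.isEmpty()) {
            return;
        }
        indexRunOffsets.add(file.size());
        for (byte[] block : pendingIndexBlocks) {
            file.write(block, 0, block.length);
        }
        pendingIndexBlocks.clear();
        pendingBytes = 0;
    }

    /** Flushes whatever index blocks remain when the file is closed. */
    void close() {
        flushIndexBlocks();
    }

    List<Integer> indexRunOffsets() {
        return indexRunOffsets;
    }

    int size() {
        return file.size();
    }
}
```

A reader scanning the index would then seek once per recorded run offset and read each run
sequentially, instead of seeking once per index block.  Flushing when a parent index block
fills up, rather than at a byte threshold, would only change the condition in addIndexBlock.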


        
