accumulo-dev mailing list archives

From "Keith Turner (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-501) RFile should store the key count in metadata
Date Thu, 29 Mar 2012 12:42:24 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241184#comment-13241184
] 

Keith Turner commented on ACCUMULO-501:
---------------------------------------

One thing we have discussed before is storing a key count in the index for each block.  With
that, a scan of the index over the region of the file that overlaps the tablet would give a
fairly accurate count.
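
As a rough illustration of the per-block idea, here is a minimal Python sketch. It is not the
RFile index format or any Accumulo API: the index is modeled as a hypothetical sorted list of
(last key in block, key count) pairs, and the tablet range is treated as (start, end] with
None meaning unbounded. Blocks that only partly overlap the tablet are counted in full, so
the result can overestimate by up to one block at each edge.

```python
from typing import List, Optional, Tuple

# Hypothetical model of an index: one (last_key_in_block, key_count) entry per
# data block, sorted by last key. This is an assumption, not the RFile layout.
IndexEntry = Tuple[str, int]


def estimate_from_index(index: List[IndexEntry],
                        start: Optional[str],
                        end: Optional[str]) -> int:
    """Sum the counts of all blocks that may hold keys in the range (start, end]."""
    total = 0
    prev_last = None  # last key of the previous block; None before the first block
    for last_key, count in index:
        # This block holds keys in (prev_last, last_key].
        starts_before_end = end is None or prev_last is None or prev_last < end
        ends_after_start = start is None or last_key > start
        if starts_before_end and ends_after_start:
            total += count
        prev_last = last_key
    return total


index = [("c", 100), ("h", 100), ("p", 100), ("z", 100)]
print(estimate_from_index(index, "a", "m"))    # three blocks may overlap (a, m]
print(estimate_from_index(index, None, None))  # whole file
```

The error is bounded by the size of the edge blocks, which is why the index scan would be
"fairly accurate" rather than exact.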
                
> RFile should store the key count in metadata
> --------------------------------------------
>
>                 Key: ACCUMULO-501
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-501
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>             Fix For: 1.5.0
>
>
> BulkImport estimates the number of keys in a file to be zero.  We store the largest and
> smallest key in metadata, so I think we can afford to store the key count and use it to provide an
> estimate when we load the file into the tablet.  Perhaps if we know the start key is "a" and the
> end key is "z" and the tablet's range is "a->m", we can just estimate 50% of the key count.
> When a bulk file fits completely in a range, the key count estimate will be accurate.
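
The proportional estimate described in the issue could be sketched as follows. This is only
an illustration in Python under stated assumptions, not Accumulo code: keys are treated as
byte strings, the first 8 bytes are read as a big-endian integer to place a key on a number
line, and keys are assumed to be uniformly spread between the file's first and last key. The
names key_to_number and estimate_key_count are hypothetical.

```python
from typing import Optional


def key_to_number(key: bytes, width: int = 8) -> int:
    """Map a key to an integer using its first `width` bytes (zero padded)."""
    return int.from_bytes(key[:width].ljust(width, b"\x00"), "big")


def estimate_key_count(file_first: bytes, file_last: bytes, key_count: int,
                       tablet_start: Optional[bytes],
                       tablet_end: Optional[bytes]) -> int:
    """Scale key_count by the fraction of the file's key range inside the tablet.

    None for tablet_start or tablet_end means the tablet is unbounded on that side.
    """
    lo, hi = key_to_number(file_first), key_to_number(file_last)
    if hi <= lo:
        # Degenerate range (e.g. keys share their first 8 bytes): no basis for
        # splitting, so fall back to the full count.
        return key_count
    t_lo = lo if tablet_start is None else max(lo, key_to_number(tablet_start))
    t_hi = hi if tablet_end is None else min(hi, key_to_number(tablet_end))
    if t_hi <= t_lo:
        return 0  # tablet does not overlap the file's key range
    return round(key_count * (t_hi - t_lo) / (hi - lo))


# File spans "a".."z" with 1000 keys; tablet covers "a".."m".
print(estimate_key_count(b"a", b"z", 1000, b"a", b"m"))  # roughly half of the keys
# File fits completely inside the tablet: the stored count is used as is.
print(estimate_key_count(b"a", b"z", 1000, None, None))
```

With "a", "z" and "m" this mapping gives 12/25 of the count, close to the 50% guessed in the
description; skewed key distributions would make the estimate less accurate.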

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
