hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1200) Add bloomfilters
Date Fri, 07 May 2010 23:05:51 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1200:
-------------------------

    Attachment: Bloom_Filters_in_HBase.pdf

Doc as PDF.

Here are some of Nicolas's answers to a few questions on the doc:

{code}
15:41 < St^Ack> So, what you do your hashing w/?
15:42 < nspiegelberg> I do murmur hash with combinatorial generation
15:43 < nspiegelberg> it's cache miss, but only need to compute the murmur twice, no matter the hashKey count
15:44  * St^Ack excellent
15:44 < St^Ack> So, its in the LRU cache.. whats that mean?
15:45 < nspiegelberg> every call to bloom.contain calls getMetaBlock(BF_DATA), which is LRU cache
15:45 < nspiegelberg> so CFs that aren't used don't have their blooms cached
15:46 < St^Ack> excellent
{code}
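
To make the "compute the murmur twice" remark concrete, here is a minimal Java sketch of combinatorial (double-hashing) position generation: two base hashes are computed once per key, and every further bit position is derived as h1 + i * h2. This is not the code from the attached patch. The class name DoubleHashBloomSketch, the inline MurmurHash3-style function, and seeding the second hash with the first are all assumptions made for illustration; the patch may use a different Murmur variant and a different seeding scheme.

```java
import java.util.BitSet;

/** Illustrative sketch only; names and hash variant are assumptions, not the HBASE-1200 patch. */
final class DoubleHashBloomSketch {
    private final BitSet bits;
    private final int bitCount;
    private final int hashCount;

    DoubleHashBloomSketch(int bitCount, int hashCount) {
        if (bitCount <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("bitCount and hashCount must be positive");
        }
        this.bits = new BitSet(bitCount);
        this.bitCount = bitCount;
        this.hashCount = hashCount;
    }

    /** Sets hashCount bit positions for the key using only two hash computations. */
    void add(byte[] key) {
        int h1 = murmur(key, 0);
        int h2 = murmur(key, h1); // assumption: second hash is seeded with the first
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(h1, h2, i));
        }
    }

    /** Returns false only if the key was definitely never added. */
    boolean mightContain(byte[] key) {
        int h1 = murmur(key, 0);
        int h2 = murmur(key, h1);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }

    /** i-th derived position: (h1 + i * h2) mod bitCount, kept non-negative. */
    private int position(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, bitCount);
    }

    /** Stand-in 32-bit MurmurHash3-style hash, written inline to stay self-contained. */
    static int murmur(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int blocks = data.length / 4;
        for (int i = 0; i < blocks; i++) {
            int k = (data[4 * i] & 0xff)
                  | ((data[4 * i + 1] & 0xff) << 8)
                  | ((data[4 * i + 2] & 0xff) << 16)
                  | ((data[4 * i + 3] & 0xff) << 24);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }
        // Mix in the 0-3 trailing bytes that did not fill a 4-byte block.
        int tail = blocks * 4;
        int k = 0;
        for (int i = data.length - 1; i >= tail; i--) {
            k = (k << 8) | (data[i] & 0xff);
        }
        if (tail < data.length) {
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
        }
        // Final avalanche.
        h ^= data.length;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

With this shape, raising hashCount adds only cheap integer arithmetic per lookup; the number of passes over the key bytes stays at two.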

> Add bloomfilters
> ----------------
>
>                 Key: HBASE-1200
>                 URL: https://issues.apache.org/jira/browse/HBASE-1200
>             Project: Hadoop HBase
>          Issue Type: Task
>    Affects Versions: 0.20.5
>            Reporter: stack
>            Assignee: Nicolas Spiegelberg
>             Fix For: 0.20.5
>
>         Attachments: Bloom Filters in HBase.docx, Bloom_Filters_in_HBase.pdf, HBASE-1200-0.20.5.patch, ryan_bloomfilter.patch
>
>
> Add bloomfiltering to hfile.  Can be enabled on a family-level basis.  Ability to configure a row vs row+col level bloom.  We size the bloomfilter with the number of entries we are about to flush which seems like usually we'd be making a filter too big, so our implementation needs to take that into account.
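
The sizing concern in the description can be illustrated with the standard Bloom filter formulas: for n entries and a target false-positive rate p, the bit count is m = -n ln(p) / (ln 2)^2 and the hash count is k = (m / n) ln 2. The Java sketch below is a hedged illustration of that arithmetic, not the sizing logic from the patch; the class name BloomSizingSketch and its method names are hypothetical.

```java
/** Illustrative sizing arithmetic only; not taken from the HBASE-1200 patch. */
final class BloomSizingSketch {
    private BloomSizingSketch() {}

    /** Bits needed for the given entry count and target false-positive rate. */
    static long optimalBits(long entries, double errorRate) {
        if (entries <= 0 || errorRate <= 0.0 || errorRate >= 1.0) {
            throw new IllegalArgumentException("entries must be > 0 and 0 < errorRate < 1");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-entries * Math.log(errorRate) / (ln2 * ln2));
    }

    /** Hash count that minimizes the false-positive rate for the given bits per entry. */
    static int optimalHashCount(long bits, long entries) {
        if (bits <= 0 || entries <= 0) {
            throw new IllegalArgumentException("bits and entries must be positive");
        }
        return Math.max(1, (int) Math.round((double) bits / entries * Math.log(2.0)));
    }
}
```

Because the bit count grows linearly with n, sizing from the number of entries about to be flushed (rather than, say, the number of distinct rows) inflates the filter in direct proportion to the overcount, which is the effect the description says the implementation has to account for.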

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

