asterixdb-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chen Luo (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ASTERIXDB-2128) Bloom Filter for LSMBTree is totally useless
Date Fri, 06 Oct 2017 23:26:00 GMT

     [ https://issues.apache.org/jira/browse/ASTERIXDB-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Chen Luo updated ASTERIXDB-2128:
--------------------------------
    Description: 
We use bloom filters on primary index (as LSMBTree) to skip unnecessary pk lookups on some
disk components. The idea is that if the bloom filter reports false on some pk, we totally
skip searching that disk component.

However, in our current complicated BTree/LSMBTree software stack, the bloom filter is completely
useless (in terms of saving I/Os). Take a look at LSMBTreePointSearchCursor. At line 74, we
first call btreeAccessors[i].search(rangeCursors[i], predicate), and then call BloomFilterAwareBTreePointSearchCursor.hasNext()
which performs bloom filter check. However, the call sequence is wrong. Actually, btreeAccessors[i].search
has already searched the primary key against the BTree, from the root down to the leaf, rendering
bloom filter completely useless in this case.


  was:
We use bloom filters on primary index (as LSMBTree) to skip unnecessary pk lookups on some
disk components. The idea is that if the bloom filter reports false on some pk, we totally
skip searching that disk component.

However, in our current complicated BTree/LSMBTree software stack, the bloom filter is completely
useless (in terms of saving I/Os). Take a look at LSMBTreePointSearchCursor. At line 74, we
first call btreeAccessors[i].search(rangeCursors[i], predicate), and then call BloomFilterAwareBTreePointSearchCursor.hasNext()
which performs bloom filter check. However, the call sequence is wrong. Actually, btreeAccessors[i].search
has already searched the primary key against the BTree, from the root down to the leaf, rending
bloom filter completely useless in this case.



> Bloom Filter for LSMBTree is totally useless
> --------------------------------------------
>
>                 Key: ASTERIXDB-2128
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2128
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: IDX - Indexes, STO - Storage
>            Reporter: Chen Luo
>            Assignee: Chen Luo
>
> We use bloom filters on primary index (as LSMBTree) to skip unnecessary pk lookups on
some disk components. The idea is that if the bloom filter reports false on some pk, we totally
skip searching that disk component.
> However, in our current complicated BTree/LSMBTree software stack, the bloom filter is
completely useless (in terms of saving I/Os). Take a look at LSMBTreePointSearchCursor. At
line 74, we first call btreeAccessors[i].search(rangeCursors[i], predicate), and then call
BloomFilterAwareBTreePointSearchCursor.hasNext() which performs bloom filter check. However,
the call sequence is wrong. Actually, btreeAccessors[i].search has already searched the primary
key against the BTree, from the root down to the leaf, rendering bloom filter completely useless
in this case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message