asterixdb-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF subversion and git services (Jira)" <>
Subject [jira] [Commented] (ASTERIXDB-2708) Optimize Primary Point Searches Via Batching and Stateful Cursors
Date Sat, 11 Apr 2020 16:31:00 GMT


ASF subversion and git services commented on ASTERIXDB-2708:

Commit 6e6b3427d82a5b8b10aadb8aff095259cbfb9535 in asterixdb's branch refs/heads/master from
[;h=6e6b342 ]

[ASTERIXDB-2708] Introduce batch and stateful point cursors

- user model changes: no
- storage format changes: no
- interface changes: no

- Add a stateful btree point search cursor that uses previous search history
and exponential search algorithm to optimize point search performance
- Add a batching LSM btree point search cursor to perform point searches for
a batch of keys. Search states are cleared after each batch.

Change-Id: I0b0ade723895bcd71463df7a9703fe78a238e6c7
Contrib: Jenkins <>
Integration-Tests: Jenkins <>
Tested-by: Jenkins <>
Reviewed-by: Luo Chen <>
Reviewed-by: Murtadha Hubail <>

> Optimize Primary Point Searches Via Batching and Stateful Cursors
> -----------------------------------------------------------------
>                 Key: ASTERIXDB-2708
>                 URL:
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: RT - Runtime, STO - Storage
>            Reporter: Chen Luo
>            Assignee: Chen Luo
>            Priority: Major
> Currently, the primary index point searches can be expensive, especially when a query
is not selecitivity, for a few reasons:
> * Enter and exit LSM components for each search key
> * Always traverse from root to leaf when searching a key
> To optimize primary point searches, we introduce a number of optimizations here:
> * Introduce a batched point search cursor that enters an LSM index for a batch of keys
to amortize the cost
> * Introduce a stateful BTree search algorithm that reuses the previous search history
to speedup subsequently searches. Specifically, we keep track of the last leaf page ID and
the last key index. For the next search key, if it still exists in the last leaf page, we
do not have to traverse from root to leaf again. Moreover, instead of using binary search,
we use exponential search to reduce the search cost in case there are a lot of keys.

This message was sent by Atlassian Jira

View raw message