db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-642) SELECT MAX doesn't use indices optimally
Date Wed, 16 Feb 2011 08:39:57 GMT

    [ https://issues.apache.org/jira/browse/DERBY-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995216#comment-12995216

Knut Anders Hatlen commented on DERBY-642:

Hi Dag,

Yes, that's what I was thinking. To be precise, we reposition to the row that was immediately
to the right when we started waiting. When we wake up, there may be many rows between the
row we waited for and the row we reposition to. In the current patch, we reposition on the
row we're waiting for and then move one row to the right. These two approaches should be equivalent
in the case where no potential phantom rows have been inserted while we were waiting.

There may be a simpler solution, though. The existing code doesn't take previous key locks
in the max scan, with comments saying that previous key locks aren't necessary for a max scan
(which is true as long as the scan is restarted once we have to wait for a lock, like the
existing code does). But it should be possible to perform previous key locking while moving
backwards too by always locking the row immediately to the left of the one we're interested
in. Then we'd always be one row ahead with the locking in serializable transactions, and even
if we have to wait for a lock, the current position will be protected by the lock we acquired
in the previous iteration, so it should be safe to reposition back to that position.

Unfortunately, it's somewhat difficult to test whether this works as long as the optimizer
insists on table locking for BTreeMaxScan in serializable transactions. I haven't yet found
out exactly where this decision is made.

> SELECT MAX doesn't use indices optimally
> ----------------------------------------
>                 Key: DERBY-642
>                 URL: https://issues.apache.org/jira/browse/DERBY-642
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions:
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: derby-642-1a.diff, derby-642-1b-withTests.diff
> I tried running SELECT MAX on an indexed column in a big (8 GB)
> table. It took 12 minutes, which is about 12 minutes more than I
> expected.
> After a bit of investigation, I found out that a full index scan was
> performed because all the rows referenced from the rightmost B-tree
> node were actually deleted.
> Possible improvements:
>   1. Implement backwards scan in the B-trees (this is also suggested
>      in the comments in BTreeMaxScan).
>   2. Go up to the parent node and down to the next leaf node on the
>      left side, and continue until a valid max value is found. In
>      Derby, traversing up in a B-tree is not allowed, but would it be
>      ok to go up if the latches were kept on the higher-level nodes in
>      the tree? Of course, this would have negative impact on
>      concurrency.
>   3. Right-to-left traversal on the leaf level is possible (using
>      ControlRow.getLeftSibling()), but will throw an exception if the
>      page cannot be latched without waiting. It is therefore possible
>      to try a right-to-left search for the max value, and only fall
>      back to the left-to-right approach if a conflict arises.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message