accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-4066) Conditional mutation processing performance could be improved.
Date Fri, 29 Jan 2016 00:07:39 GMT


Keith Turner commented on ACCUMULO-4066:

bq. To confirm, the server doesn't rely on the sorted order, just hopes for it for performance


bq. I see a lot of changes in IteratorUtil (I assume to your point about loading iterators
from the table config). How did this use to work?

I needed to parse the table iterator config once, cache the result, and later merge in the
condition iterators.  I changed IteratorUtil to support this use case.  Previously it would
parse the table iterator config for each condition.  I also modified the code to support
caching a class name to class mapping (instead of going to the VFS classloader for each
condition to load a class).
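
To illustrate the class caching idea, here is a minimal sketch. It is not the actual IteratorUtil change; the class name ClassNameCache and its methods are hypothetical, and a plain java.lang.ClassLoader stands in for the VFS classloader.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not Accumulo code: remembers the Class resolved for
// each class name so the class loader is consulted only once per name.
class ClassNameCache {
  private final ClassLoader loader;
  private final Map<String, Class<?>> cache = new HashMap<>();
  private int loaderLookups = 0;

  ClassNameCache(ClassLoader loader) {
    this.loader = loader;
  }

  // Returns the cached class if present; otherwise asks the loader and caches the result.
  Class<?> load(String className) {
    Class<?> clazz = cache.get(className);
    if (clazz == null) {
      try {
        clazz = loader.loadClass(className);
      } catch (ClassNotFoundException e) {
        throw new IllegalArgumentException("Unknown class: " + className, e);
      }
      loaderLookups++;
      cache.put(className, clazz);
    }
    return clazz;
  }

  // Number of times the underlying class loader was actually consulted.
  int loaderLookups() {
    return loaderLookups;
  }
}
```

With a cache like this scoped to a batch of conditions, repeated conditions that name the same iterator class would hit the map instead of the class loader.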

Adding some tests for IteratorUtil is a good idea.

> Conditional mutation processing performance could be improved.
> --------------------------------------------------------------
>                 Key: ACCUMULO-4066
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.6.4, 1.7.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.6.5, 1.7.1, 1.8.0
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
> When processing conditional mutations, tablet reads are done.   The way the current implementation
> does tablet reads has a lot of overhead.   For each condition, the following is done:
>  * Opens and reserves iterators files.
>  * Parses table iterators from table config (involves scanning and filtering entire table
>  * Merges condition iterators and table iterators
>  * Constructs iterator stack.
> I created a branch where these operations (except for constructing the iterator stack) are
> done per tablet and/or per batch of conditional mutations.   With this change I am seeing a 3x
> speedup in conditional mutation processing rates when data is cached.
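
The per-batch idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the branch: the class BatchedConditionSetup and its methods are hypothetical, iterators are represented as plain strings, and the merge simply appends condition iterators after table iterators rather than ordering them as the real code would.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: parse the table iterator config once per batch,
// then merge it with each condition's iterators.
class BatchedConditionSetup {
  // Assumed property prefix for this illustration.
  static final String ITER_PREFIX = "table.iterator.scan.";

  private int parseCount = 0;

  // Stand-in for the expensive step: scan and filter the whole table config.
  List<String> parseTableIterators(Map<String, String> tableConfig) {
    parseCount++;
    List<String> iters = new ArrayList<>();
    for (Map.Entry<String, String> e : tableConfig.entrySet()) {
      if (e.getKey().startsWith(ITER_PREFIX)) {
        iters.add(e.getValue());
      }
    }
    return iters;
  }

  // Simplified merge: table iterators first, then the condition's iterators.
  List<String> merge(List<String> tableIters, List<String> conditionIters) {
    List<String> merged = new ArrayList<>(tableIters);
    merged.addAll(conditionIters);
    return merged;
  }

  // One parse for the whole batch; only the cheap merge runs per condition.
  List<List<String>> processBatch(Map<String, String> tableConfig,
                                  List<List<String>> conditions) {
    List<String> tableIters = parseTableIterators(tableConfig);
    List<List<String>> stacks = new ArrayList<>();
    for (List<String> conditionIters : conditions) {
      stacks.add(merge(tableIters, conditionIters));
    }
    return stacks;
  }

  int parseCount() {
    return parseCount;
  }
}
```

In this sketch the config scan cost is paid once regardless of how many conditions are in the batch, which is the overhead the description says was previously paid per condition.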

This message was sent by Atlassian JIRA
