accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (ACCUMULO-519) support in-memory compactions
Date Wed, 13 Feb 2013 19:54:13 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577870#comment-13577870 ]

Keith Turner edited comment on ACCUMULO-519 at 2/13/13 7:52 PM:
----------------------------------------------------------------

Two things to consider

 * Isolation: we need a strategy for handling this. Currently, isolated reads keep around all
rfiles that they are currently reading a row from, even rfiles that were compacted away.
 * Memory allocation: we need a new strategy for deciding the maximum memory that in-memory
maps can use, since these in-memory compactions will have to both read from and write to memory.
                
      was (Author: kturner):
    Two things to consider

 * Isolation : need a strategy for handling this.  Currently isolated reads keep all rfiles
they are currently reading a row from around.
 * Memory allocation : need a new strategy for deciding maximum memory that can be used by
in memory maps since will have to read and write to memory for these in memory compactions.
                  
> support in-memory compactions
> -----------------------------
>
>                 Key: ACCUMULO-519
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-519
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Adam Fuchs
>            Assignee: Adam Fuchs
>
> There are several factors that influence how big to make the in-memory write buffer
> (tserver.memory.maps.max) for Accumulo. Two dominant factors that conflict with each other are:
> # Overall disk I/O depends somewhat on the log of the ratio of tablet size to initial
> file size. A bigger write buffer leads to bigger initial files, and can lead to less overall
> disk I/O.
> # Aggregation, versioning, and deleting take place in the iterator tree, which only applies
> during compactions and scans. The in-memory write buffer can hold many versions of a given
> key, and scans can be slow if compactions are infrequent.
> One solution would be to run a stepped compaction in memory, in which the iterator tree
> is applied in a log-structured fashion. We can consider the minor compaction to be two
> pipelined steps: serializing the map entries, and writing the serialized form to disk.
> After we have written the serialized form to disk, we can free up the write-ahead
> logs associated with that data.
> I propose the following:
> # We should buffer the serialized RFile form in-memory instead of writing it to disk
> (call it a micro-compaction).
> # We should implement a merging step for merging existing buffered RFiles with newly
> serialized buffers, using the same algorithm that we use for major compaction file selection.
> # The in-memory buffer should be micro-compacted aggressively (whenever we have a thread
> free, with some minimum allocation of CPU and memory I/O resources to this task).
> # The current triggers that we use for minor compactions should be used to select buffered
> RFiles from memory and dump them to disk, at which point we can drop the write-ahead log
> references.
> Overall, this will allow users to keep the initial files generated by minor compactions
> large while alleviating the second concern of buffering too many versions of the same key.
> Two use cases that will benefit greatly from this are ACCUMULO-348 (lots of updates to the
> default tablet info in the !METADATA table), and aggregation in which there are a small number
> of keys. Other considerations that also affect this space are:
> # RFiles are column-oriented (with locality groups), while the in-memory map is only
> row-oriented. Moving to a column-oriented structure sooner would benefit some queries.
> # RFiles are optimized for sequential access, while the in-memory write buffer requires
> lots of random memory access to read a stream of key/value pairs in key order.
> # RFiles use configurable compression, while the in-memory map only uses hierarchical
> organization. RFiles generally get better compression.
> # Currently, writing a column-oriented RFile requires scanning the entire in-memory map
> for each locality group. Bigger in-memory maps can take a long time to re-order for minor
> compaction.
> # Memory fragmentation and garbage collection in the JVM are big concerns that a lot
> of work has already gone into. We need to be considerate of those factors in implementing
> this change.
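
For illustration only, the merging step in the proposal quoted above can be sketched as a k-way merge over sorted in-memory buffers that keeps just the newest version of each key. This is a rough sketch under several assumptions, not Accumulo code: `MicroCompactionSketch`, `Entry`, `Cursor`, and `microCompact` are hypothetical names, plain Java lists stand in for serialized RFile buffers, and a hard-coded "newest version wins" rule stands in for the configurable iterator tree. Delete markers are kept in the output because older versions of the key may still live in files on disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; none of these types exist in Accumulo.
class MicroCompactionSketch {

    /** Simplified stand-in for a key/value pair with a timestamp and a delete flag. */
    static final class Entry {
        final String key;
        final long timestamp;
        final boolean deleted;
        final String value;

        Entry(String key, long timestamp, boolean deleted, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.deleted = deleted;
            this.value = value;
        }
    }

    /** Sort order assumed for every buffer: key ascending, then newest timestamp first. */
    static final Comparator<Entry> ORDER = (a, b) -> {
        int c = a.key.compareTo(b.key);
        return c != 0 ? c : Long.compare(b.timestamp, a.timestamp);
    };

    /** Read position inside one sorted buffer. */
    private static final class Cursor {
        final List<Entry> buffer;
        int pos = 0;

        Cursor(List<Entry> buffer) { this.buffer = buffer; }

        Entry head() { return buffer.get(pos); }
    }

    /** Merges sorted buffers into one sorted buffer holding only the newest version per key. */
    static List<Entry> microCompact(List<List<Entry>> buffers) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> ORDER.compare(a.head(), b.head()));
        for (List<Entry> buffer : buffers) {
            if (!buffer.isEmpty()) {
                heap.add(new Cursor(buffer));
            }
        }

        List<Entry> out = new ArrayList<>();
        String lastKey = null;
        while (!heap.isEmpty()) {
            Cursor cursor = heap.poll();
            Entry e = cursor.head();
            // The first entry seen for a key is its newest version; older ones are dropped.
            if (!e.key.equals(lastKey)) {
                lastKey = e.key;
                out.add(e); // delete markers are retained on purpose
            }
            cursor.pos++;
            if (cursor.pos < cursor.buffer.size()) {
                heap.add(cursor); // re-insert so the heap sees the cursor's new head
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> older = Arrays.asList(
            new Entry("a", 1, false, "v1"),
            new Entry("b", 1, false, "v1"));
        List<Entry> newer = Arrays.asList(
            new Entry("a", 2, false, "v2"),
            new Entry("b", 3, true, ""),
            new Entry("c", 2, false, "v1"));

        for (Entry e : microCompact(Arrays.asList(older, newer))) {
            System.out.println(e.key + "@" + e.timestamp + (e.deleted ? " <delete>" : " " + e.value));
        }
        // prints:
        // a@2 v2
        // b@3 <delete>
        // c@2 v1
    }
}
```

In this toy run, five buffered entries shrink to three, which is the effect the proposal is after: fewer versions per key held in memory before anything is dumped to disk. How buffers would be selected for merging, and how memory for the merged output would be accounted for, is left open here, as it is in the discussion above.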
