accumulo-dev mailing list archives

From "Eric Newton (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-201) tablet server runs out of memory performing a major compaction
Date Wed, 02 May 2012 13:22:54 GMT


Eric Newton commented on ACCUMULO-201:

> tablet server runs out of memory performing a major compaction
> --------------------------------------------------------------
>                 Key: ACCUMULO-201
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>             Fix For: 1.5.0
> An Accumulo user watched their cluster slowly shrink: one tablet server would fail every
> 8-10 minutes.
> We determined that a major compaction of a single tablet would cause the tablet server
> to run out of memory. That tablet would then be assigned to a new server, which would
> schedule a major compaction and run out of memory as well.
>  # it was harder than it should have been to identify the tablet causing the problem
>  # the tablet had a combination of several large existing files and a few bulk-loaded
>    files with a few very large key/values
>  # large key/values were between *10 and 100 megabytes each*; the tablet server had a
>    1G memory limit
>  # the next key for each file will sit in memory while performing the merge-sort
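The last point explains the failure: a merge-sort over several files holds the next entry of every input file in memory at once, so the resident data is roughly the sum of those entries. The sketch below works through that arithmetic. The class name, the count of 12 files, and the 50% usable-heap fraction are illustrative assumptions and do not come from the report; only the 100 MB entry size and the 1G limit do.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Accumulo: estimates the memory held by the
// "head" entries of a k-way merge, where each input file contributes its next entry.
public class MergeHeadroom {
    public static final long MB = 1024L * 1024L;

    // Sum of the sizes of the next entry buffered for each input file.
    public static long residentBytes(long[] nextEntryBytes) {
        long total = 0;
        for (long size : nextEntryBytes) {
            total += size;
        }
        return total;
    }

    // True if the buffered heads fit in the assumed usable share of the heap.
    public static boolean fits(long[] nextEntryBytes, long heapBytes, double usableFraction) {
        return residentBytes(nextEntryBytes) <= (long) (heapBytes * usableFraction);
    }

    public static void main(String[] args) {
        long[] heads = new long[12];        // assumed number of files in the merge
        Arrays.fill(heads, 100 * MB);       // worst case from the report: 100 MB entries
        long heap = 1024 * MB;              // the 1G tablet server limit
        System.out.println(residentBytes(heads) / MB + " MB buffered, fits: "
                + fits(heads, heap, 0.5));
    }
}
```

Under these assumptions the heads alone exceed the heap, before counting any copies made while writing the output file.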
> A Constraint can limit the size of mutations during normal ingest. However, there is no
> constraint or check on the size of the key/values in files that are bulk loaded.
> The tablet server should log the key extent (range) of a tablet prior to attempting a
> major compaction.
> Large key/values (those that take up a significant portion of the working memory of
> the JVM) might need to go into a separate merge file, or might result in multi-stage merges
> just to defend against an out-of-memory failure.
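One possible reading of "multi-stage merges" is to split the input files into groups so that each pass buffers only a bounded amount of data, then merge the outputs of those passes in a later pass. The sketch below plans only the first pass with a simple greedy grouping. The class, method, and parameter names are hypothetical, and it assumes the size of the largest entry in each file is known in advance, which the report does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner, not Accumulo code: groups input files so that the
// largest entries buffered together in one merge pass stay within a byte budget.
public class MergePassPlanner {

    // Returns groups of file indexes; each group is merged in its own pass.
    // A file whose largest entry alone exceeds the budget gets a group to itself.
    public static List<List<Integer>> planFirstPass(long[] largestEntryBytes, long budgetBytes) {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long used = 0;
        for (int i = 0; i < largestEntryBytes.length; i++) {
            long size = largestEntryBytes[i];
            // Close the current group if adding this file would exceed the budget.
            if (!current.isEmpty() && used + size > budgetBytes) {
                groups.add(current);
                current = new ArrayList<>();
                used = 0;
            }
            current.add(i);
            used += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

Each pass produces a file whose largest entry is the largest among its inputs, so a later pass over those outputs can be planned the same way. Greedy grouping in file order is the simplest choice here, not necessarily the one that minimizes the number of passes.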
> Tablet servers could mark tablets during a major compaction attempt. Tablets with multiple
> markers could use a multi-pass merge to attempt to survive the merge. Alternatively, the
> master could refuse to assign tablets with too many markers.
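The marker proposal can be sketched as a small policy: record a marker before each major compaction attempt, clear it on success, and choose a strategy from the number of markers left behind by servers that died mid-compaction. Everything named below (the class, the `Action` values, the two thresholds) is hypothetical. An in-memory map stands in for whatever persistent store would be needed in practice, since the markers must survive the death of the tablet server.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical policy sketch, not Accumulo code: counts unfinished major
// compaction attempts per tablet and picks a strategy from that count.
public class CompactionMarkers {
    public enum Action { SINGLE_PASS, MULTI_PASS, REFUSE_ASSIGNMENT }

    // Stand-in for persistent storage keyed by tablet extent.
    private final Map<String, Integer> markers = new HashMap<>();
    private final int multiPassThreshold;
    private final int refuseThreshold;

    public CompactionMarkers(int multiPassThreshold, int refuseThreshold) {
        this.multiPassThreshold = multiPassThreshold;
        this.refuseThreshold = refuseThreshold;
    }

    // Called before a major compaction starts.
    public void markAttempt(String extent) {
        markers.merge(extent, 1, Integer::sum);
    }

    // Called only when the compaction finishes; a crash leaves the marker in place.
    public void clearOnSuccess(String extent) {
        markers.remove(extent);
    }

    public Action decide(String extent) {
        int count = markers.getOrDefault(extent, 0);
        if (count >= refuseThreshold) {
            return Action.REFUSE_ASSIGNMENT;
        }
        if (count >= multiPassThreshold) {
            return Action.MULTI_PASS;
        }
        return Action.SINGLE_PASS;
    }
}
```

With thresholds of, say, 2 and 4, a tablet that has killed two servers would be compacted with a multi-pass merge, and one that has killed four would not be assigned at all until an operator intervenes.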
