accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <>
Subject [jira] [Created] (ACCUMULO-3901) tserver.tablet.split.midpoint.files.max default value is probably too small
Date Fri, 12 Jun 2015 14:15:00 GMT
Eric Newton created ACCUMULO-3901:

             Summary: tserver.tablet.split.midpoint.files.max default value is probably too small
                 Key: ACCUMULO-3901
             Project: Accumulo
          Issue Type: Bug
          Components: tserver
            Reporter: Eric Newton
            Assignee: Eric Newton
            Priority: Minor
             Fix For: 1.8.0

On a large cluster, 50K files were bulk loaded into a single tablet.

This is bad, and not a result of "normal" ingest.

Each file was fairly small (50-100K).

Once loaded, the tablet server decided to try to split the tablet. Due to the number of
files, the tablet server attempted to determine the split files using multiple passes. This
was taking a very long time and held a tablet lock, preventing additional bulk imports.

In desperation, we raised tserver.tablet.split.midpoint.files.max and restarted the tablet server.
The tablet was re-hosted elsewhere, and the multi-pass approach was not used. In a few minutes,
the tablet was examined and split.

So, using tserver.tablet.split.midpoint.files.max=55000 works perfectly well. Of course this
is on production nodes, and we tend to make the default settings appropriate for a single-node
development system.

I suggest raising the default for this setting to at least 300; a value that high should not cause concern.
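
For a rough sense of how this limit relates to the number of passes, here is a minimal Python sketch. It assumes, purely for illustration, that while more files remain than the limit allows, each pass merges them in groups of at most that many files, producing one intermediate file per group. That model and the function name reduction_passes are assumptions, not taken from the tablet server code. The limit of 10 is a hypothetical small value; 300 and 55000 are the values discussed above.

```python
def reduction_passes(num_files: int, max_open: int) -> int:
    """Count merge passes needed before the remaining files fit under max_open.

    Assumed model (not Accumulo's actual implementation): while more than
    max_open files remain, one pass merges them in groups of max_open,
    leaving ceil(remaining / max_open) intermediate files.
    """
    if max_open < 2:
        raise ValueError("max_open must be at least 2 for merging to make progress")
    passes = 0
    remaining = num_files
    while remaining > max_open:
        # Integer ceiling division: number of groups of size max_open.
        remaining = (remaining + max_open - 1) // max_open
        passes += 1
    return passes


# 50,000 files, as in the tablet described in this report.
for limit in (10, 300, 55000):
    print(limit, reduction_passes(50_000, limit))
```

Under that model, 50,000 files need four passes with a limit of 10, one pass with 300, and none with 55000.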

I spoke offline with [~kturner], who confirms that the original default was arbitrarily chosen.

Examining other production systems, the multi-pass approach is being used more often than
expected, probably as a result of depending on massive numbers of bulk imports.

This message was sent by Atlassian JIRA