accumulo-notifications mailing list archives

From "Brian Loss (JIRA)" <>
Subject [jira] [Reopened] (ACCUMULO-727) Bulk Import retry time needs to be longer/configurable
Date Thu, 14 Mar 2013 17:30:15 GMT


Brian Loss reopened ACCUMULO-727:

We need more than a configurable timeout; we also need to retry more slowly.  If a tserver dies,
bulk import can try and fail on the dead tserver before the master ever has a chance to reassign the tablet.
> Bulk Import retry time needs to be longer/configurable
> ------------------------------------------------------
>                 Key: ACCUMULO-727
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.4.1
>            Reporter: Brian Loss
>            Assignee: Eric Newton
>             Fix For: 1.5.0
> Bulk import retries way too fast (at least under some circumstances).  We had a tablet
> server that the master killed (we were overloading it with ingest and the hold time got too
> big so the master killed it).  At the same time, a bulk import operation had begun and several
> map files were assigned to the server that was just killed.  The bulk import retried three
> times in an 8 second span, each time failing with a connection refused error, and then gave
> up, failing the file completely.  Meanwhile, it took the master about 1m 20s to reassign the
> tablet to another server.
> The bulk import process should account for this possibility.  Either it needs to recognize
> that it can't connect to a tablet server so it must be down and the tablet will be reassigned
> somewhere else, or it should wait longer (such that the default max wait time is > the
> average tablet reassignment time).  In the latter case, the retry interval should be made
> into a configurable option at the same time.
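
To illustrate the "wait longer" option described above, here is a minimal sketch of a retry loop with a configurable, capped exponential delay. It is not Accumulo's bulk import code: the class `BulkRetrySketch`, its methods, and the parameters `maxRetries`, `initialDelayMillis`, and `maxDelayMillis` are all hypothetical names chosen for this example. It retries only on `IOException` (a refused connection surfaces as `java.net.ConnectException`, a subclass) and lets other failures propagate.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch only; names and parameters are assumptions, not Accumulo APIs.
final class BulkRetrySketch {

    // Delay before the retry that follows failed attempt number `attempt` (0-based):
    // initialDelayMillis doubled `attempt` times, capped at maxDelayMillis.
    static long delayMillis(int attempt, long initialDelayMillis, long maxDelayMillis) {
        long delay = initialDelayMillis;
        for (int i = 0; i < attempt && delay < maxDelayMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMillis);
    }

    // Total time spent sleeping if the first attempt and every retry fail.
    // This is the figure to compare against the tablet reassignment time.
    static long totalWaitMillis(int maxRetries, long initialDelayMillis, long maxDelayMillis) {
        long total = 0;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            total += delayMillis(attempt, initialDelayMillis, maxDelayMillis);
        }
        return total;
    }

    // Runs `action` once, then retries up to maxRetries more times on IOException,
    // sleeping between attempts. Rethrows the last IOException if all attempts fail.
    static <T> T callWithRetry(Callable<T> action, int maxRetries,
                               long initialDelayMillis, long maxDelayMillis) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(delayMillis(attempt, initialDelayMillis, maxDelayMillis));
                }
            }
        }
        throw last;
    }
}
```

With the illustrative values `maxRetries = 5`, `initialDelayMillis = 5000`, and `maxDelayMillis = 60000`, the delays are 5 s, 10 s, 20 s, 40 s, and 60 s, for a total of 135 s of waiting. That exceeds the roughly 1m 20s reassignment time reported above, whereas three retries inside 8 seconds do not.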
