accumulo-notifications mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-727) Bulk Import retry time needs to be longer/configurable
Date Sat, 16 Mar 2013 00:52:14 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604050#comment-13604050 ]

Hudson commented on ACCUMULO-727:
---------------------------------

Integrated in Accumulo-1.5 #34 (See [https://builds.apache.org/job/Accumulo-1.5/34/])
    ACCUMULO-727 retry more times (Revision 1457059)
    ACCUMULO-727 add exponential back-off when bulk loading files (Revision 1457055)

    Result = ABORTED
ecn : 
Files : 
* /accumulo/branches/1.5/core/src/main/java/org/apache/accumulo/core/conf/Property.java

ecn : 
Files : 
* /accumulo/branches/1.5/server/src/main/java/org/apache/accumulo/server/client/BulkImporter.java

                
> Bulk Import retry time needs to be longer/configurable
> ------------------------------------------------------
>
>                 Key: ACCUMULO-727
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-727
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.4.1
>            Reporter: Brian Loss
>            Assignee: Eric Newton
>             Fix For: 1.5.0
>
>
> Bulk import retries way too fast (at least under some circumstances).  We had a tablet
> server that the master killed (we were overloading it with ingest and the hold time got too
> big so the master killed it).  At the same time, a bulk import operation had begun and several
> map files were assigned to the server that was just killed.  The bulk import retried three
> times in an 8 second span, each time failing with a connection refused error, and then gave
> up, failing the file completely.  Meanwhile, it took the master about 1m 20s to reassign the
> tablet to another server.
> The bulk import process should account for this possibility.  Either it needs to recognize
> that it can't connect to a tablet server so it must be down and the tablet will be reassigned
> somewhere else, or it should wait longer (such that the default max wait time is > the
> average tablet reassignment time).  In the latter case, the retry interval should be made
> into a configurable option at the same time.
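
The arithmetic behind the report can be made concrete with a small sketch. The
code below is not the code from the Accumulo commits; the class and method names
(BackoffSketch, delays, totalWait, retriesToCover) are hypothetical, and the 1 s
initial delay and 60 s cap are assumed values chosen only for illustration. It
computes a capped exponential back-off schedule and the number of retries needed
before the cumulative wait exceeds the roughly 80 s reassignment time quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from BulkImporter.java.
class BackoffSketch {

    // Sleep before each retry: starts at initialMs, doubles each time, capped at maxMs.
    static List<Long> delays(long initialMs, long maxMs, int retries) {
        if (initialMs <= 0 || maxMs <= 0) {
            throw new IllegalArgumentException("delays must be positive");
        }
        List<Long> out = new ArrayList<>();
        long d = Math.min(initialMs, maxMs);
        for (int i = 0; i < retries; i++) {
            out.add(d);
            d = Math.min(maxMs, d * 2);
        }
        return out;
    }

    // Total time spent sleeping across the whole schedule.
    static long totalWait(List<Long> schedule) {
        long sum = 0;
        for (long d : schedule) {
            sum += d;
        }
        return sum;
    }

    // Smallest number of retries whose cumulative wait reaches targetMs.
    static int retriesToCover(long initialMs, long maxMs, long targetMs) {
        if (initialMs <= 0 || maxMs <= 0) {
            throw new IllegalArgumentException("delays must be positive");
        }
        long d = Math.min(initialMs, maxMs);
        long sum = 0;
        int n = 0;
        while (sum < targetMs) {
            sum += d;
            n++;
            d = Math.min(maxMs, d * 2);
        }
        return n;
    }

    public static void main(String[] args) {
        // Assumed values: 1 s initial delay, 60 s cap, 80 s target (the "1m 20s" above).
        List<Long> schedule = delays(1000, 60000, retriesToCover(1000, 60000, 80000));
        System.out.println("schedule (ms): " + schedule);
        System.out.println("total wait (ms): " + totalWait(schedule));
    }
}
```

Under these assumed values, three retries would still give up well before the
tablet is reassigned, which is why a back-off schedule also needs a higher
(ideally configurable) retry count.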

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
