accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-3193) bulkImport file rename is a bottleneck
Date Thu, 02 Oct 2014 19:23:33 GMT
Eric Newton created ACCUMULO-3193:
-------------------------------------

             Summary: bulkImport file rename is a bottleneck
                 Key: ACCUMULO-3193
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3193
             Project: Accumulo
          Issue Type: Bug
          Components: master
    Affects Versions: 1.6.1, 1.6.0, 1.5.2, 1.5.1, 1.5.0
         Environment: very large cluster
            Reporter: Eric Newton
            Assignee: Eric Newton
             Fix For: 1.5.3, 1.6.2, 1.7.0


On a very large cluster, importing a few thousand files takes several minutes.  Most of that
time is spent renaming the user's files into the Accumulo bulk-load directory.  For these
renames, the master competes with every other demand on the HDFS NameNode (NN).  The master
could adopt the same strategy as the file GC and run the renames in parallel, keeping more
operations in flight at the NN at one time.
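
As a rough illustration of the proposal (not the actual patch), the sketch below submits each
rename to a fixed-size thread pool and then waits for all of them.  It makes several
assumptions: the class name ParallelRename, the method renameAll, and the thread count are
hypothetical, and java.nio.file.Files.move stands in for the HDFS rename call so that the
example runs without a cluster.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names come from the Accumulo code base.
public class ParallelRename {

    // Moves every source path to its destination using `threads` workers
    // and returns the number of completed renames.
    public static int renameAll(Map<Path, Path> moves, int threads)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> results = new ArrayList<>();
            for (Map.Entry<Path, Path> e : moves.entrySet()) {
                Path src = e.getKey();
                Path dst = e.getValue();
                // Files.move is a local stand-in for a rename against the NN.
                Callable<Path> task = () -> Files.move(src, dst);
                results.add(pool.submit(task));
            }
            int done = 0;
            for (Future<Path> f : results) {
                try {
                    f.get(); // blocks until this rename finishes
                    done++;
                } catch (ExecutionException ex) {
                    // Surface the first failure to the caller.
                    throw new IOException("rename failed", ex.getCause());
                }
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a pool, the elapsed time is bounded by how many renames can be outstanding at once
rather than by the round-trip latency of each rename in turn.  The pool size would need
tuning so that the master does not itself overload the NN.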



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
