Date: Fri, 3 Oct 2014 20:41:34 +0000 (UTC)
From: "Eric Newton (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Resolved] (ACCUMULO-3193) bulkImport file rename is a bottleneck

     [ https://issues.apache.org/jira/browse/ACCUMULO-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Newton resolved ACCUMULO-3193.
-----------------------------------

    Resolution: Fixed

> bulkImport file rename is a bottleneck
> --------------------------------------
>
>                 Key: ACCUMULO-3193
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3193
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1
>         Environment: very large cluster
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>             Fix For: 1.7.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> On a very large cluster, importing a few thousand files takes several minutes. Most of that time is spent renaming the user's files into the accumulo bulk-load directory. In this case, the master is competing against the other demands on the NN. The master could adopt the same strategy as the file GC, and run the renames in parallel, to push more operations into the NN at one time.
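
The parallel-rename idea in the description can be sketched roughly as follows. This is only an illustration in Python, not the change that was committed to Accumulo (which is Java and issues its renames against the HDFS NameNode). Here `os.rename` on a local temporary directory stands in for the NameNode call, and `rename_all`, `demo`, the pool size of 8, and the file names are all hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def rename_all(pairs, workers=8):
    """Rename every (src, dst) pair using a pool of worker threads.

    Returns a list of (src, dst, error) tuples for the renames that failed.
    """
    def rename_one(pair):
        src, dst = pair
        try:
            # Assumption: os.rename stands in for a NameNode rename request.
            os.rename(src, dst)
            return None
        except OSError as err:
            return (src, dst, err)

    # Many renames are in flight at once instead of one after another.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(rename_one, pairs))
    return [r for r in results if r is not None]


def demo(file_count=200):
    """Create empty files in a 'user' directory and move them to 'bulk' in parallel.

    Returns (files now in bulk, files left in user, number of failed renames).
    """
    with tempfile.TemporaryDirectory() as root:
        user_dir = os.path.join(root, "user")
        bulk_dir = os.path.join(root, "bulk")
        os.mkdir(user_dir)
        os.mkdir(bulk_dir)

        pairs = []
        for i in range(file_count):
            src = os.path.join(user_dir, f"f{i:05d}.rf")
            open(src, "w").close()
            pairs.append((src, os.path.join(bulk_dir, f"I{i:05d}.rf")))

        failures = rename_all(pairs)
        return len(os.listdir(bulk_dir)), len(os.listdir(user_dir)), len(failures)


if __name__ == "__main__":
    print(demo())
```

On a local disk the speedup is likely negligible. The point of the sketch is the shape: when each rename is a remote call to a busy NameNode, keeping several requests outstanding lets the master make progress while individual calls wait their turn. Collecting failures rather than raising on the first one is a design choice of this sketch, so the caller can decide whether to retry or abort the import.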