Date: Fri, 6 May 2011 04:29:03 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1924242507.27104.1304656143293.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <429496083.25457.1301600525730.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Resolved] (HBASE-3721) Speedup LoadIncrementalHFiles

     [ https://issues.apache.org/jira/browse/HBASE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3721.
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0
     Hadoop Flags: [Reviewed]

Committed to TRUNK. Thanks for the patch, Ted. (Thanks, Adam, for testing.)

> Speedup LoadIncrementalHFiles
> -----------------------------
>
>                 Key: HBASE-3721
>                 URL: https://issues.apache.org/jira/browse/HBASE-3721
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.92.0
>
>         Attachments: 3721-v2.txt, 3721-v3.txt, 3721-v4.txt, 3721-v6.patch, 3721.txt, LoadIncrementalHFiles.java
>
>
> From Adam Phelps:
> from the logs it looks like <1% of the hfiles we're loading have to be split. Looking at the code for LoadIncrementalHFiles (hbase v0.90.1), I'm actually thinking our problem is that this code loads the hfiles sequentially. Our largest table has over 2500 regions and the data being loaded is fairly well distributed across them, so there end up being around 2500 HFiles for each load period. At 1-2 seconds per HFile that means the loading process is very time consuming.
> Currently server.bulkLoadHFile() is a blocking call.
> We can use an ExecutorService to achieve better parallelism on multi-core machines.
> A new configuration parameter, "hbase.loadincremental.threads.max", is introduced; it sets the maximum number of threads for parallel bulk load.
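
For readers who want to see the general idea, the approach described above (submitting each blocking load to a bounded thread pool) might look roughly like the sketch below. This is not the committed patch: the class name ParallelBulkLoadSketch, the method loadAll, and the loader callback are hypothetical stand-ins, with loader taking the place of the blocking server.bulkLoadHFile() call, and maxThreads playing the role of the value read from "hbase.loadincremental.threads.max".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical illustration only; not code from the HBASE-3721 patch.
final class ParallelBulkLoadSketch {

    /**
     * Loads every HFile path using at most maxThreads worker threads.
     * The loader argument is an assumed stand-in for the blocking
     * bulk-load call. Returns the number of files loaded.
     */
    static int loadAll(List<String> hfilePaths, int maxThreads, Consumer<String> loader) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            // One task per HFile; each task runs the blocking load on a pool thread.
            List<Callable<String>> tasks = new ArrayList<>();
            for (String path : hfilePaths) {
                tasks.add(() -> {
                    loader.accept(path);
                    return path;
                });
            }
            int loaded = 0;
            // invokeAll blocks until every task has completed.
            for (Future<String> result : pool.invokeAll(tasks)) {
                result.get(); // throws ExecutionException if that load failed
                loaded++;
            }
            return loaded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while bulk loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("bulk load failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would invoke something like loadAll(paths, 4, p -> loadOne(p)), where loadOne is whatever blocking call performs a single load. With around 2500 HFiles at 1-2 seconds each, the wall-clock time would shrink roughly in proportion to the thread count, assuming the region servers can absorb the concurrent requests.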