Date: Tue, 12 Jul 2011 05:11:01 +0000 (UTC)
From: "jiraposter@reviews.apache.org (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1397745518.5017.1310447461402.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1354461066.33245.1304973363147.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-3871) Speedup LoadIncrementalHFiles by parallelizing HFile splitting

[ 
https://issues.apache.org/jira/browse/HBASE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063719#comment-13063719 ]

jiraposter@reviews.apache.org commented on HBASE-3871:
------------------------------------------------------

bq. On 2011-07-12 04:49:26, Michael Stack wrote:
bq. > Patch looks fine to me but are you addressing Andrew's comment that perhaps futures not needed? Good stuff.
bq.
bq. Ted Yu wrote:
bq.     CountDownLatch ctor is passed the total number of items (HFiles in our case). tryLoad() decides which HFile's to split, making number of items dynamic.
bq.     This is why I didn't use CountDownLatch.
bq.
bq.     With patch v2, we wouldn't spend much time waiting for any HFile to finish splitting.
bq.

OK. +1 ship it.

- Michael

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/704/#review1033
-----------------------------------------------------------

On 2011-07-09 01:54:46, Ted Yu wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/704/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-07-09 01:54:46)
bq.
bq. Review request for hbase and Michael Stack.
bq.
bq. Summary
bq. -------
bq.
bq. This JIRA complements HBASE-3721 by parallelizing HFile splitting which was done in the main thread.
bq.
bq. From Adam w.r.t. HFile splitting:
bq. There's actually a good number of messages of that type (HFile no longer fits inside a single region), unfortunately I didn't take a timestamp on just when I was running with the patched jars vs the regular ones, however from the logs I can say that this is occurring fairly regularly on this system. 
The cluster I tested this on is our backup cluster, the mapreduce jobs on our production cluster output HFiles which are copied to the backup and then loaded into HBase on both. Since the regions may be somewhat different on the backup cluster I would expect it to have to split somewhat regularly.
bq.
bq. This addresses bug HBASE-3871.
bq.     https://issues.apache.org/jira/browse/HBASE-3871
bq.
bq. Diffs
bq. -----
bq.
bq.   /src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 1144493
bq.
bq. Diff: https://reviews.apache.org/r/704/diff
bq.
bq. Testing
bq. -------
bq.
bq. TestHFileOutputFormat and TestLoadIncrementalHFiles passed with this patch.
bq.
bq. Thanks,
bq.
bq. Ted
bq.

> Speedup LoadIncrementalHFiles by parallelizing HFile splitting
> --------------------------------------------------------------
>
>                 Key: HBASE-3871
>                 URL: https://issues.apache.org/jira/browse/HBASE-3871
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 0.90.2
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 3871.patch
>
>
> From Adam w.r.t. HFile splitting:
> There's actually a good number of messages of that type (HFile no longer fits inside a single region), unfortunately I didn't take a timestamp on just when I was running with the patched jars vs the regular ones, however from the logs I can say that this is occurring fairly regularly on this system. The cluster I tested this on is our backup cluster, the mapreduce jobs on our production cluster output HFiles which are copied to the backup and then loaded into HBase on both. Since the regions may be somewhat different on the backup cluster I would expect it to have to split somewhat regularly.
> This JIRA complements HBASE-3721 by parallelizing HFile splitting which is done in the main thread.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira