Date: Tue, 18 Oct 2016 11:53:58 +0000 (UTC)
From: "Tim Robertson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-12596) bulkload needs to follow locality

[ https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15585264#comment-15585264 ]

Tim Robertson commented on HBASE-12596:
---------------------------------------

Do any of the original committers recall if there was a technical reason why this could not be applied to 1.0.0?
I'm about to try to produce a patched implementation for a running 1.0 (CDH 5.4.10) installation, but if someone knows this is doomed, it would be nice to know.

> bulkload needs to follow locality
> ---------------------------------
>
> Key: HBASE-12596
> URL: https://issues.apache.org/jira/browse/HBASE-12596
> Project: HBase
> Issue Type: Improvement
> Components: HFile, regionserver
> Affects Versions: 0.98.8
> Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7
> Reporter: Victor Xu
> Assignee: Victor Xu
> Fix For: 2.0.0, 0.98.14, 1.3.0
>
> Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-0.98-v2.patch, HBASE-12596-0.98-v3.patch, HBASE-12596-0.98-v4.patch, HBASE-12596-0.98-v5.patch, HBASE-12596-0.98-v6.patch, HBASE-12596-branch-1-v1.patch, HBASE-12596-branch-1-v2.patch, HBASE-12596-master-v1.patch, HBASE-12596-master-v2.patch, HBASE-12596-master-v3.patch, HBASE-12596-master-v4.patch, HBASE-12596-master-v5.patch, HBASE-12596-master-v6.patch, HBASE-12596.patch
>
>
> Normally, we have 2 steps to perform a bulkload: 1. Use a job to write the HFiles to be loaded; 2. Move these HFiles to the right HDFS directory. However, locality can be lost during the first step. Why not just write the HFiles directly into the right place? We can do this easily because StoreFile.WriterBuilder has the "withFavoredNodes" method, and we just need to call it in HFileOutputFormat's getNewWriter().
> This feature is enabled by default, and we could use 'hbase.bulkload.locality.sensitive.enabled=false' to disable it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
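
For anyone evaluating a backport, the idea in the quoted description (find the server hosting the target region and hand it to the HFile writer as a favored node) can be modelled roughly as below. This is only a sketch using the Java standard library, not the code from the attached patches: the class `FavoredNodeSketch`, its methods, the host names and the port are all hypothetical, and row keys are compared as strings rather than as bytes. Only the property name `hbase.bulkload.locality.sensitive.enabled` and its enabled-by-default behaviour come from the description.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical model of "pick a favored node for the HFile writer".
final class FavoredNodeSketch {
    // Property name from the issue description; treated as enabled unless set to false.
    static final String LOCALITY_KEY = "hbase.bulkload.locality.sensitive.enabled";

    // Stand-in for region metadata: region start key -> server hosting that region.
    private final TreeMap<String, InetSocketAddress> regionHosts = new TreeMap<>();
    private final boolean localityEnabled;

    FavoredNodeSketch(Properties conf) {
        this.localityEnabled =
            Boolean.parseBoolean(conf.getProperty(LOCALITY_KEY, "true"));
    }

    void addRegion(String startKey, String host, int port) {
        // createUnresolved avoids a DNS lookup for the made-up host names.
        regionHosts.put(startKey, InetSocketAddress.createUnresolved(host, port));
    }

    // Returns the server that a writer for this row key would be asked to favor,
    // or empty when the feature is disabled or no region covers the key.
    Optional<InetSocketAddress> favoredNodeFor(String rowKey) {
        if (!localityEnabled) {
            return Optional.empty();
        }
        // The region containing rowKey is the one with the greatest start key <= rowKey.
        Map.Entry<String, InetSocketAddress> region = regionHosts.floorEntry(rowKey);
        return region == null ? Optional.empty() : Optional.of(region.getValue());
    }

    public static void main(String[] args) {
        FavoredNodeSketch sketch = new FavoredNodeSketch(new Properties());
        sketch.addRegion("", "rs1.example.org", 16020);   // first region, empty start key
        sketch.addRegion("m", "rs2.example.org", 16020);  // second region starts at "m"
        sketch.favoredNodeFor("zebra")
              .ifPresent(node -> System.out.println(node.getHostString()));
    }
}
```

In a real job the lookup would come from the cluster's region locations, and the result would be passed to the writer through the "withFavoredNodes" method named in the description; whether that wiring works unchanged on 1.0.0 is exactly the open question above.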