Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 5E5B7200D09 for ; Tue, 12 Sep 2017 11:07:05 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 5D1261609C7; Tue, 12 Sep 2017 09:07:05 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A63211609C6 for ; Tue, 12 Sep 2017 11:07:04 +0200 (CEST) Received: (qmail 38333 invoked by uid 500); 12 Sep 2017 09:07:02 -0000 Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list issues@hbase.apache.org Received: (qmail 38321 invoked by uid 99); 12 Sep 2017 09:07:02 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 12 Sep 2017 09:07:02 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id E7D6618B687 for ; Tue, 12 Sep 2017 09:07:01 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -99.202 X-Spam-Level: X-Spam-Status: No, score=-99.202 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id Lh4mMJnEatuN for ; Tue, 12 Sep 2017 09:07:01 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id F04BD60FC8 for ; Tue, 12 Sep 2017 09:07:00 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 7717DE0663 for ; Tue, 12 Sep 2017 09:07:00 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 316C824147 for ; Tue, 12 Sep 2017 09:07:00 +0000 (UTC) Date: Tue, 12 Sep 2017 09:07:00 +0000 (UTC) From: "Xiang Li (JIRA)" To: issues@hbase.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HBASE-15482) Provide an option to skip calculating block locations for SnapshotInputFormat MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Tue, 12 Sep 2017 09:07:05 -0000 [ https://issues.apache.org/jira/browse/HBASE-15482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162699#comment-16162699 ] Xiang Li commented on HBASE-15482: ---------------------------------- Hi, this JIRA is still valid now? I plan to work on it if it is still valid. > Provide an option to skip calculating block locations for SnapshotInputFormat > ----------------------------------------------------------------------------- > > Key: HBASE-15482 > URL: https://issues.apache.org/jira/browse/HBASE-15482 > Project: HBase > Issue Type: Improvement > Components: mapreduce > Reporter: Liyin Tang > Priority: Minor > > When a MR job is reading from SnapshotInputFormat, it needs to calculate the splits based on the block locations in order to get best locality. However, this process may take a long time for large snapshots. > In some setup, the computing layer, Spark, Hive or Presto could run out side of HBase cluster. In these scenarios, the block locality doesn't matter. Therefore, it will be great to have an option to skip calculating the block locations for every job. That will super useful for the Hive/Presto/Spark connectors. -- This message was sent by Atlassian JIRA (v6.4.14#64029)