Date: Mon, 25 Sep 2017 03:11:00 +0000 (UTC)
From: "Huafeng Wang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-12534) Provide logical BlockLocations for EC files for better split calculation

[ https://issues.apache.org/jira/browse/HDFS-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178479#comment-16178479 ]

Huafeng Wang commented on HDFS-12534:
-------------------------------------

Hi [~andrew.wang],

I have a question here.
{quote}
Applications depend on HDFS BlockLocation to understand where the split points are.
{quote}
I think the logical BlockLocation currently returned per block group already contains the locations of all its data blocks and parity blocks. Isn't this information enough? What's the difference between splitting a single block group and multiple logical block locations here?

> Provide logical BlockLocations for EC files for better split calculation
> ------------------------------------------------------------------------
>
>                 Key: HDFS-12534
>                 URL: https://issues.apache.org/jira/browse/HDFS-12534
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>    Affects Versions: 3.0.0-beta1
>            Reporter: Andrew Wang
>              Labels: hdfs-ec-3.0-must-do
>
> I talked to [~vanzin] and [~alex.behm] some more about split calculation with EC. It turns out HDFS-12222 was resolved prematurely. Applications depend on HDFS BlockLocation to understand where the split points are. The current scheme of returning one BlockLocation per block group loses this information.
> We should change this to provide logical blocks. Divide the file length by the block size and provide suitable BlockLocations to match, with virtual offsets and lengths too.
> I'm not marking this as incompatible, since changing it this way would in fact make it more compatible from the perspective of applications that are scheduling against replicated files. Thus, it'd be good for beta1 if possible, but okay for later too.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
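
As a rough illustration of the "divide the file length by the block size" proposal in the issue description, here is a minimal Python sketch of how the virtual offsets and lengths might be computed. The names `LogicalBlock` and `logical_blocks` are hypothetical and not part of the HDFS API. The sketch deliberately leaves out host information, which a real BlockLocation would also carry, and it does not model how block groups map onto the logical blocks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogicalBlock:
    """Hypothetical stand-in for a BlockLocation: virtual offset and length only."""
    offset: int  # virtual byte offset of this logical block within the file
    length: int  # number of bytes covered by this logical block


def logical_blocks(file_length: int, block_size: int) -> List[LogicalBlock]:
    """Cut a file of file_length bytes into logical blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if file_length < 0:
        raise ValueError("file_length must not be negative")

    blocks = []
    offset = 0
    while offset < file_length:
        # The last logical block may be shorter than block_size.
        length = min(block_size, file_length - offset)
        blocks.append(LogicalBlock(offset, length))
        offset += length
    return blocks


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Assumed example values: a 300 MiB file with a 128 MiB block size.
    for block in logical_blocks(300 * MIB, 128 * MIB):
        print(block.offset // MIB, block.length // MIB)
```

With the assumed 300 MiB file and 128 MiB block size, this yields three logical blocks (two full ones and a 44 MiB tail), which is the same set of split points an application would see for a replicated file of the same size.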