Date: Tue, 25 Jul 2017 18:21:00 +0000 (UTC)
From: "George Smith (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-17146) Spark on Hive - Exception while joining tables - "Requested replication factor of 10 exceeds maximum of x"

    [ https://issues.apache.org/jira/browse/HIVE-17146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100502#comment-16100502 ]

George Smith commented on HIVE-17146:
-------------------------------------

@[~ruili] The point is that the current code ignores {{dfs.replication}} (whenever its value is < 10), and the only known workaround is to set {{dfs.replication.max}} (which is not obvious until you have spent a day troubleshooting the problem...).
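For illustration, here is a minimal, self-contained sketch (plain Java, not an actual Hive patch; the class, method, and variable names are hypothetical) of how the current SparkHashTableSinkOperator logic arrives at a replication factor of 10 on a small cluster, compared with fix option 3 from the issue description (capping by _dfs.replication.max_ instead of raising to the hard-coded minimum):

{code}
// Hypothetical sketch comparing the current behaviour with proposed fix #3.
// Inputs mirror the cluster settings from the report:
// dfs.replication=2, dfs.replication.max=512, fewer than 10 data nodes.
public class ReplicationSketch {

  // Current SparkHashTableSinkOperator logic (simplified).
  static short currentReplication(short dfsReplication, int dfsMaxReplication) {
    int minReplication = 10;                                    // hard-coded magic number
    minReplication = Math.min(minReplication, dfsMaxReplication);
    int numOfPartitions = dfsReplication;                       // stands in for fs.getDefaultReplication(path)
    return (short) Math.max(minReplication, numOfPartitions);   // raises the value to 10
  }

  // Proposed fix #3: never exceed dfs.replication.max, never raise above what HDFS reports.
  static short proposedReplication(short dfsReplication, int dfsMaxReplication) {
    int numOfPartitions = dfsReplication;                       // stands in for fs.getDefaultReplication(path)
    return (short) Math.min(numOfPartitions, dfsMaxReplication);
  }

  public static void main(String[] args) {
    System.out.println(currentReplication((short) 2, 512));   // prints 10 (dfs.replication is ignored)
    System.out.println(proposedReplication((short) 2, 512));  // prints 2  (dfs.replication is honoured)
  }
}
{code}

The sketch only models the arithmetic; the real operator reads these values from the Hadoop configuration and the target FileSystem, as quoted in the issue description below.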
> Spark on Hive - Exception while joining tables - "Requested replication factor of 10 exceeds maximum of x"
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-17146
>                 URL: https://issues.apache.org/jira/browse/HIVE-17146
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.1.1, 3.0.0
>            Reporter: George Smith
>            Assignee: Ashutosh Chauhan
>
> We found a bug in the current implementation of [org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/SparkHashTableSinkOperator.java].
> The *magic number 10* used as the minReplication factor can cause this exception when the configuration parameter _dfs.replication_ is lower than 10.
> Consider this [properties configuration|https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml] on our cluster (with fewer than 10 nodes):
> {code}
> dfs.namenode.replication.min=1
> dfs.replication=2
> dfs.replication.max=512 (that's the default value)
> {code}
> The current implementation computes the target file replication as follows (relevant snippets of the code):
> {code}
> private int minReplication = 10;
> ...
> int dfsMaxReplication = hconf.getInt(DFS_REPLICATION_MAX, minReplication);
> // minReplication value should not cross the value of dfs.replication.max
> minReplication = Math.min(minReplication, dfsMaxReplication);
> ...
> FileSystem fs = path.getFileSystem(htsOperator.getConfiguration());
> short replication = fs.getDefaultReplication(path);
> ...
> int numOfPartitions = replication;
> replication = (short) Math.max(minReplication, numOfPartitions);
> // use replication value in fs.create(path, replication);
> {code}
> With the current code the replication value actually used is 10, and the config value _dfs.replication_ is not used at all.
> There are several (easy) ways to fix it:
> # Set the field {code}private int minReplication = 1;{code} I don't see any obvious reason for the value 10. Or
> # Initialize minReplication from the config value _dfs.namenode.replication.min_ with a default value of 1. Or
> # Compute the replication this way: {code}replication = Math.min(numOfPartitions, dfsMaxReplication);{code} Or
> # Use replication = numOfPartitions; directly.
> The config value _dfs.replication_ has a default value of 3 and is supposed to always be lower than _dfs.replication.max_, so no extra check is probably needed.
> Any suggestions on which option to choose?
> As a *workaround* for this issue we had to set dfs.replication.max=2, but obviously the _dfs.replication_ value should NOT be ignored and the problem should be resolved.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)