Date: Tue, 14 Feb 2017 19:52:41 +0000 (UTC)
From: "Xuefu Zhang (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-15893) Followup on HIVE-15671

    [ https://issues.apache.org/jira/browse/HIVE-15893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866506#comment-15866506 ]

Xuefu Zhang commented on HIVE-15893:
------------------------------------

[~lirui], I didn't mean that HIVE-15860 will provide a solution to the problem described here, which is about detecting issues in the driver. I was saying that with the job monitoring thread monitoring jobs submitted to the driver, plus the fix here, maybe the problem is mitigated or avoided. If this is true, then we might not need the proposal here. This needs further investigation though.

> Followup on HIVE-15671
> ----------------------
>
>                 Key: HIVE-15893
>                 URL: https://issues.apache.org/jira/browse/HIVE-15893
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>    Affects Versions: 2.2.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>
> In HIVE-15671, we fixed a typo where server.connect.timeout was used in place of client.connect.timeout. This might solve some potential problems, but the original problem reported in HIVE-15671 might still exist. (Not sure if HIVE-15860 helps).
> Here is the proposal suggested by Marcelo:
> {quote}
> bq: server detecting a driver problem after it has connected back to the server.
> Hmm. That is definitely not any of the "connect" timeouts, which probably means it isn't configured and is just using netty's default (which is probably no timeout?). Would probably need something using io.netty.handler.timeout.IdleStateHandler, and also some periodic "ping" so that the connection isn't torn down without reason.
> {quote}
> We will use this JIRA to track the issue.
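
For illustration only, the idle-timeout-plus-ping idea in the quoted proposal can be modeled without Netty. The sketch below is not Hive's code and does not use io.netty.handler.timeout.IdleStateHandler; it only shows the decision logic such a handler pair would need: close the connection when nothing has been read for too long, and send a ping when nothing has been written for a while so a healthy but quiet peer is not disconnected. The class name IdleMonitor, the Action enum, and both interval parameters are hypothetical, and timestamps are passed in explicitly so the behavior is easy to check.

```java
/**
 * Hypothetical, stdlib-only model of "idle timeout plus periodic ping".
 * Not Hive or Netty code; all names here are assumptions for illustration.
 */
class IdleMonitor {
    enum Action { NONE, SEND_PING, CLOSE }

    private final long pingIntervalMs;  // assumed: max quiet time before we ping
    private final long readTimeoutMs;   // assumed: max time without any inbound data
    private long lastReadMs;
    private long lastWriteMs;

    IdleMonitor(long pingIntervalMs, long readTimeoutMs, long nowMs) {
        if (pingIntervalMs <= 0 || readTimeoutMs <= pingIntervalMs) {
            // The ping interval must be shorter than the read timeout,
            // otherwise a quiet but healthy peer would be closed before a ping.
            throw new IllegalArgumentException("need 0 < pingIntervalMs < readTimeoutMs");
        }
        this.pingIntervalMs = pingIntervalMs;
        this.readTimeoutMs = readTimeoutMs;
        this.lastReadMs = nowMs;
        this.lastWriteMs = nowMs;
    }

    /** Record that data (including a ping reply) arrived from the peer. */
    void onRead(long nowMs) { lastReadMs = nowMs; }

    /** Record that data (including a ping) was sent to the peer. */
    void onWrite(long nowMs) { lastWriteMs = nowMs; }

    /** Decide what to do at time nowMs; intended to be called periodically. */
    Action check(long nowMs) {
        if (nowMs - lastReadMs >= readTimeoutMs) {
            return Action.CLOSE;      // peer has been silent too long: assume it is gone
        }
        if (nowMs - lastWriteMs >= pingIntervalMs) {
            return Action.SEND_PING;  // keep the link active so idleness means trouble
        }
        return Action.NONE;
    }
}
```

In a real Netty pipeline the timing would presumably come from IdleStateHandler's idle events rather than explicit timestamps; whether that fits the Hive RPC layer is exactly what the issue leaves open for investigation.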