Date: Thu, 18 May 2017 17:07:04 +0000 (UTC)
From: "Siddharth Seth (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-16692) LLAP: Keep alive connection in shuffle handler should not be closed until entire data is flushed out

     [ https://issues.apache.org/jira/browse/HIVE-16692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated HIVE-16692:
----------------------------------
    Status: Patch Available  (was: Reopened)

> LLAP: Keep alive connection in shuffle handler should not be closed until entire data is flushed out
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16692
>                 URL: https://issues.apache.org/jira/browse/HIVE-16692
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16692.02.patch, HIVE-16692.1.patch, HIVE-16692.addendum.patch
>
>
> In corner cases with keep-alive enabled, it is possible that the headers are written out in the response and the downstream client is able to read them.
> However, the mapOutput construction may take much longer on the server side (due to disk or any other issue).
> In the meantime, the keep-alive timeout can kick in and close the connection from the server side. In such cases, the downstream client can get "Connection reset". Ideally, the keep-alive timeout should kick in only after the entire response has been flushed downstream.
> e.g. error message on the client side:
> {noformat}
> java.net.SocketException: Connection reset
>         at java.net.SocketInputStream.read(SocketInputStream.java:209) ~[?:1.8.0_112]
>         at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_112]
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:1.8.0_112]
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[?:1.8.0_112]
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_112]
>         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) ~[?:1.8.0_112]
>         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) ~[?:1.8.0_112]
>         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:675) ~[?:1.8.0_112]
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569) ~[?:1.8.0_112]
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) ~[?:1.8.0_112]
>         at org.apache.tez.http.HttpConnection.getInputStream(HttpConnection.java:260) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.setupConnection(Fetcher.java:460) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:492) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:417) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:215) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
>         at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:73) ~[tez-runtime-library-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.8.4.2.6.1.0-11.jar:0.8.4.2.6.1.0-11]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
>         at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> {noformat}
> This corner case handling was not pulled in earlier from the MR handler fixes.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
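
The race described in the issue can be illustrated with a small, deterministic model. This is only a sketch under stated assumptions, not the Hive shuffle handler code or the attached patch: the class KeepAliveSketch, the nested Connection type, the serve helper, and the millisecond values are all hypothetical names and numbers chosen for illustration. It compares arming the idle (keep-alive) timer when the headers are written against arming it only after the whole response has been flushed.

```java
/** Hypothetical model of a keep-alive connection; not the Hive or Netty API. */
public class KeepAliveSketch {

    /** Minimal stand-in for a server-side connection with an idle timer. */
    static final class Connection {
        private final long idleTimeoutMs;
        private final boolean armOnlyAfterFlush;
        // Long.MAX_VALUE means "idle timer not armed yet".
        private long idleDeadlineMs = Long.MAX_VALUE;
        private boolean closed = false;

        Connection(long idleTimeoutMs, boolean armOnlyAfterFlush) {
            this.idleTimeoutMs = idleTimeoutMs;
            this.armOnlyAfterFlush = armOnlyAfterFlush;
        }

        /** Headers go out first; the problematic variant starts the idle timer here. */
        void writeHeaders(long nowMs) {
            if (!armOnlyAfterFlush) {
                idleDeadlineMs = nowMs + idleTimeoutMs;
            }
        }

        /** Closes the connection if the idle deadline has passed. */
        void checkIdle(long nowMs) {
            if (!closed && nowMs >= idleDeadlineMs) {
                closed = true;
            }
        }

        /** Flushes the body; returns false if the idle timer already closed the connection. */
        boolean flushBody(long nowMs) {
            checkIdle(nowMs);
            if (closed) {
                return false; // the client would observe "Connection reset"
            }
            // The response is complete, so the connection is now genuinely idle.
            idleDeadlineMs = nowMs + idleTimeoutMs;
            return true;
        }
    }

    /** Simulates one response whose body takes bodyDelayMs to prepare after the headers. */
    public static boolean serve(boolean armOnlyAfterFlush, long idleTimeoutMs, long bodyDelayMs) {
        Connection connection = new Connection(idleTimeoutMs, armOnlyAfterFlush);
        connection.writeHeaders(0);
        return connection.flushBody(bodyDelayMs);
    }

    public static void main(String[] args) {
        // Body preparation (8 s) is slower than the idle timeout (5 s).
        System.out.println("timer armed at headers:  delivered=" + serve(false, 5_000, 8_000));
        System.out.println("timer armed after flush: delivered=" + serve(true, 5_000, 8_000));
    }
}
```

In this model the response is lost only when the timer is armed at header time and the body takes longer than the timeout, which matches the corner case described above. A real server would presumably tie the timer to the completion of the final write rather than to a simulated clock.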