hive-issues mailing list archives

From "Gabor Szadovszky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14714) Finishing Hive on Spark causes "java.io.IOException: Stream closed"
Date Thu, 15 Sep 2016 16:00:23 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493760#comment-15493760 ]

Gabor Szadovszky commented on HIVE-14714:
-----------------------------------------

The original problem was the listed exception and that beeline exited only after 10s.

The root cause of the 10s delay was that in many cases the spark-submit process does not exit
even after the RemoteDriver has ended on the other side. Therefore, driverThread.join(10000)
really waits the full 10s, and then we interrupt the thread. This is also the root cause of the
logged exception: when we interrupt child.waitFor(), the related streams get closed, so the
redirector threads get IOExceptions on their next readLine().
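
For illustration only, the join-with-timeout-then-interrupt pattern described above can be
sketched as follows. This is not the SparkClientImpl code: the class and method names are
hypothetical, and a CountDownLatch that is never released stands in for the blocking
child.waitFor() call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the shutdown pattern; not taken from SparkClientImpl.
class JoinThenInterrupt {

    /**
     * Waits up to timeoutMillis for the worker to finish and interrupts it if
     * it is still alive afterwards. Returns true if an interrupt was needed.
     */
    static boolean shutdown(Thread worker, long timeoutMillis) throws InterruptedException {
        worker.join(timeoutMillis);   // blocks for the full timeout if the worker never ends
        if (worker.isAlive()) {
            worker.interrupt();       // wakes the worker out of its blocking wait
            return true;
        }
        return false;
    }

    /** Starts a worker that blocks on the latch, as a stand-in for child.waitFor(). */
    static Thread startBlockedWorker(CountDownLatch never, AtomicBoolean sawInterrupt) {
        Thread t = new Thread(() -> {
            try {
                never.await();
            } catch (InterruptedException e) {
                // Real code would clean up the child process at this point.
                sawInterrupt.set(true);
            }
        }, "Driver");
        t.start();
        return t;
    }
}
```

With a worker that never finishes on its own, shutdown() always costs the caller the whole
timeout, which corresponds to the 10s delay seen in beeline.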

I've redesigned the Redirector class so that it no longer uses any IO call that might hang the
thread when it is interrupted (e.g. BufferedReader.readLine() cannot be interrupted; it blocks
indefinitely if the related stream is open but no input arrives). After this redesign we can
simply interrupt the driver thread and let it keep working in the background for as long as
there is output to be gathered, or until the related timeout occurs. The client side no longer
has to hang while waiting for all the threads to finish.
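
One way to get an interruptible redirector is to poll InputStream.available() and sleep
between polls instead of blocking in readLine(). The sketch below shows that idea under
stated assumptions: it is not the Redirector from the attached patch, and the class name,
the poll interval and the idle timeout are all hypothetical. Lines are split on '\n' only
and collected into a list instead of being logged.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch; not the Redirector from the HIVE-14714 patch.
class PollingRedirector implements Runnable {
    private final InputStream in;
    private final long pollMillis;
    private final long idleTimeoutMillis;
    private final List<String> lines = new CopyOnWriteArrayList<>();

    PollingRedirector(InputStream in, long pollMillis, long idleTimeoutMillis) {
        this.in = in;
        this.pollMillis = pollMillis;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    List<String> lines() {
        return lines;
    }

    @Override
    public void run() {
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        long lastData = System.nanoTime();
        try {
            while (!Thread.currentThread().isInterrupted()) {
                int available = in.available();            // does not block
                if (available > 0) {
                    byte[] buf = new byte[available];
                    // Should not block: available() reported this many bytes.
                    int read = in.read(buf, 0, available);
                    for (int i = 0; i < read; i++) {
                        if (buf[i] == '\n') {
                            emit(current);
                        } else {
                            current.write(buf[i]);
                        }
                    }
                    lastData = System.nanoTime();
                } else if ((System.nanoTime() - lastData) / 1_000_000 >= idleTimeoutMillis) {
                    break;                                  // no output for too long: give up
                } else {
                    Thread.sleep(pollMillis);               // interruptible wait
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();             // preserve the interrupt status
        } catch (IOException e) {
            // Stream closed underneath us; nothing more to read.
        } finally {
            if (current.size() > 0) {
                emit(current);                              // flush a trailing partial line
            }
        }
    }

    private void emit(ByteArrayOutputStream current) {
        lines.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
        current.reset();
    }
}
```

The trade-off of polling is a small latency (up to one poll interval) and some idle wakeups,
in exchange for a thread that always reacts to interrupt() and never outlives its idle timeout.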

Then came the unit test failure. The root cause was that protocol.endSession() only sends
a job via RPC asynchronously to close the session on the other side. As there is no 10s delay
anymore, unit tests executed one after another ran into the issue that the previous session
was not yet closed properly. Therefore I've implemented a trick to make the end of the session
synchronous.
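
A generic way to turn an asynchronous "end session" request into a synchronous one is to
wait, with a timeout, on a future that completes when the other side acknowledges. The
sketch below only illustrates that idea and is not the trick used in the patch: the
SessionProtocol interface and its endSessionAsync() method are hypothetical and are not
part of the Hive RPC API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; the interface below is an assumption, not Hive's protocol class.
class SyncEndSession {

    /** Stand-in for an RPC layer whose end-of-session call completes asynchronously. */
    interface SessionProtocol {
        CompletableFuture<Void> endSessionAsync();
    }

    /**
     * Sends the end-of-session request and blocks until it is acknowledged.
     * Returns false if the acknowledgement fails or does not arrive in time.
     */
    static boolean endSessionAndWait(SessionProtocol protocol, long timeoutMillis)
            throws InterruptedException {
        try {
            protocol.endSessionAsync().get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;
        }
    }
}
```

Bounding the wait keeps the caller from hanging if the remote side is already gone, while a
test that starts a new session right afterwards can rely on the previous one being closed
whenever the method returns true.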

I hope this describes my change properly and, together with my code comments, makes it
understandable. Any comments here or on the review board are more than welcome. :)

> Finishing Hive on Spark causes "java.io.IOException: Stream closed"
> -------------------------------------------------------------------
>
>                 Key: HIVE-14714
>                 URL: https://issues.apache.org/jira/browse/HIVE-14714
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 1.1.0
>            Reporter: Gabor Szadovszky
>            Assignee: Gabor Szadovszky
>         Attachments: HIVE-14714.2.patch, HIVE-14714.patch
>
>
> After executing a Hive command with Spark, finishing the beeline session or
> even switching the engine causes an IOException. The example below used Ctrl-D to
> finish the session, but "!quit" or even "set hive.execution.engine=mr;" also causes
> the issue.
> From HS2 log:
> {code}
> 2016-09-06 16:15:12,291 WARN  org.apache.hive.spark.client.SparkClientImpl: [HiveServer2-Handler-Pool: Thread-106]: Timed out shutting down remote driver, interrupting...
> 2016-09-06 16:15:12,291 WARN  org.apache.hive.spark.client.SparkClientImpl: [Driver]: Waiting thread interrupted, killing child process.
> 2016-09-06 16:15:12,296 WARN  org.apache.hive.spark.client.SparkClientImpl: [stderr-redir-1]: Error in redirector thread.
> java.io.IOException: Stream closed
>         at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>         at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
>         at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
>         at java.io.InputStreamReader.read(InputStreamReader.java:184)
>         at java.io.BufferedReader.fill(BufferedReader.java:154)
>         at java.io.BufferedReader.readLine(BufferedReader.java:317)
>         at java.io.BufferedReader.readLine(BufferedReader.java:382)
>         at org.apache.hive.spark.client.SparkClientImpl$Redirector.run(SparkClientImpl.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
