knox-dev mailing list archives

From "Larry McCay (Jira)" <j...@apache.org>
Subject [jira] [Updated] (KNOX-755) retry logic for replayBuffer limit errors is incorrect.
Date Wed, 11 Nov 2020 17:38:00 GMT

     [ https://issues.apache.org/jira/browse/KNOX-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Larry McCay updated KNOX-755:
-----------------------------
    Fix Version/s:     (was: 1.5.0)
                   1.6.0

> retry logic for replayBuffer limit errors is incorrect.
> -------------------------------------------------------
>
>                 Key: KNOX-755
>                 URL: https://issues.apache.org/jira/browse/KNOX-755
>             Project: Apache Knox
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Major
>             Fix For: 1.6.0
>
>
> Hive receives corrupted Thrift requests when Knox proxies a large Hive query and the replayBuffer is too small:
> {noformat}
> org.apache.thrift.transport.TTransportException
> 	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> 	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> 	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:354)
> 	at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:347)
> 	at org.apache.hive.service.cli.thrift.TExecuteStatementReq$TExecuteStatementReqStandardScheme.read(TExecuteStatementReq.java:618)
> ...
> {noformat}
> The retry logic for this error appears to be incorrect. The sequence is as follows (names changed to generic ones):
> {noformat}
> 2016-10-05 15:25:51,104 DEBUG http.wire (Wire.java:wire(63)) - >> "[0x80][0x1][0x0][0x1][0x0][0x0][0x0][0x10]ExecuteStatement[0x0][0x0][0x0]...![0x88]SELECT 1 AS `number_of_records`,[\n]"
> ...
> 2016-10-05 15:25:51,117 DEBUG http.wire (Wire.java:wire(77)) - >> "  `tablename`.`columnn"
> 2016-10-05 15:25:51,118 DEBUG http.wire (Wire.java:wire(63)) - >> "[\r][\n]"
> ...
> 2016-10-05 15:25:51,119 INFO  client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(726)) - I/O exception (java.io.IOException) caught when processing request: Hit replay buffer max limit
> 2016-10-05 15:25:51,120 DEBUG client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(731)) - Hit replay buffer max limit
> java.io.IOException: Hit replay buffer max limit
> 	at org.apache.hadoop.gateway.dispatch.CappedBufferHttpEntity$ReplayStream.read(CappedBufferHttpEntity.java:143)
> 	at java.io.InputStream.read(InputStream.java:101)
> 	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1792)
> 	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
> 	at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
> 	at org.apache.hadoop.gateway.dispatch.CappedBufferHttpEntity.writeTo(CappedBufferHttpEntity.java:93)
> 	at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
> {noformat}
> However, the client then retries:
> {noformat}
> 2016-10-05 15:25:51,121 INFO  client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(733)) - Retrying request
> 2016-10-05 15:25:51,121 DEBUG client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(703)) - Reopening the direct connection.
> {noformat}
> After authentication (during which the same incorrect request shown below is sent but not parsed, because the server responds with 401), the client resends the request with the correct auth header:
> {noformat}
> 2016-10-05 15:25:51,166 DEBUG client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(713)) - Attempt 3 to execute request
> 2016-10-05 15:25:51,166 DEBUG conn.DefaultClientConnection (DefaultClientConnection.java:sendRequestHeader(269)) - Sending request: POST /cliservice?doAs=... HTTP/1.1
> 2016-10-05 15:25:51,167 DEBUG http.wire (Wire.java:wire(63)) - >> "POST /cliservice?doAs=... HTTP/1.1[\r][\n]"
> ...
> 2016-10-05 15:25:51,169 DEBUG http.wire (Wire.java:wire(63)) - >> "Authorization: Negotiate ...
> 2016-10-05 15:25:51,170 DEBUG http.wire (Wire.java:wire(63)) - >> "[\r][\n]"
> ...
> 2016-10-05 15:25:51,172 DEBUG http.wire (Wire.java:wire(63)) - >> "1000[\r][\n]"
> 2016-10-05 15:25:51,173 DEBUG http.wire (Wire.java:wire(63)) - >> "[0x80][0x1][0x0][0x1][0x0][0x0][0x0][0x10]ExecuteStatement[0x0] ... ![0x88]SELECT 1 AS `number_of_records`,[\n]"
> ...
> 2016-10-05 15:25:51,186 DEBUG http.wire (Wire.java:wire(77)) - >> "  `tablename`.`columnn"
> 2016-10-05 15:25:51,187 DEBUG http.wire (Wire.java:wire(63)) - >> "[\r][\n]"
> 2016-10-05 15:25:51,187 DEBUG http.wire (Wire.java:wire(63)) - >> "1f3[\r][\n]"
> 2016-10-05 15:25:51,187 DEBUG http.wire (Wire.java:wire(63)) - >> "ther` AS `anothercolumnnameother`,[\n]"
> ... rest of the query
> {noformat}
> Note that there is a gap at "columnn", where "columnname" should be.
> This results in the TTransportException shown above when Hive reads the request, and a 500 error on the gateway side.
> I think the retry logic should be fixed to send the correct buffer, or removed for this type of error.
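
A minimal Java sketch of the failure mode described above, and of the "do not retry" option. It is not Knox's CappedBufferHttpEntity or HttpClient's retry code: the names ReplayBufferSketch, CappedReplayEntity, naiveReplayTo, send, and sendString are hypothetical, and the byte-at-a-time read and tiny buffer limit are assumptions made to keep the example small. The point it illustrates is that once the buffer limit is hit, some bytes have been consumed from the source without being stored, so a replay of "buffer plus remaining stream" leaves a gap.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical names, not Knox or HttpClient classes.
final class ReplayBufferSketch {

    /** Buffers the request body up to a fixed limit so it can be replayed. */
    static final class CappedReplayEntity {
        private final InputStream source;
        private final byte[] buffer;
        private int buffered = 0;
        private boolean overflowed = false;

        CappedReplayEntity(InputStream source, int limit) {
            this.source = source;
            this.buffer = new byte[limit];
        }

        /** A replay is only safe while every consumed byte is still buffered. */
        boolean isRepeatable() {
            return !overflowed;
        }

        /** Streams the body to out, buffering as it goes; fails past the limit. */
        void writeTo(OutputStream out) throws IOException {
            out.write(buffer, 0, buffered);
            int b;
            while ((b = source.read()) != -1) {
                if (buffered == buffer.length) {
                    // The byte just read is consumed from the source but never stored.
                    overflowed = true;
                    throw new IOException("Hit replay buffer max limit");
                }
                buffer[buffered++] = (byte) b;
                out.write(b);
            }
        }

        /** Naive replay: buffered prefix plus whatever is left in the source. */
        void naiveReplayTo(OutputStream out) throws IOException {
            out.write(buffer, 0, buffered);
            int b;
            while ((b = source.read()) != -1) {
                out.write(b);
            }
        }
    }

    /** Sends the body once; on failure either retries naively or fails fast. */
    static byte[] send(CappedReplayEntity entity, boolean checkRepeatable) throws IOException {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            entity.writeTo(wire);
            return wire.toByteArray();
        } catch (IOException e) {
            if (checkRepeatable && !entity.isRepeatable()) {
                throw e; // do not resend a body that can no longer be reconstructed
            }
            ByteArrayOutputStream retryWire = new ByteArrayOutputStream();
            entity.naiveReplayTo(retryWire);
            return retryWire.toByteArray();
        }
    }

    /** Convenience wrapper for strings; wraps IOException as unchecked. */
    static String sendString(String body, int limit, boolean checkRepeatable) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        CappedReplayEntity entity = new CappedReplayEntity(new ByteArrayInputStream(bytes), limit);
        try {
            return new String(send(entity, checkRepeatable), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String query = "SELECT columnname FROM t";
        // Limit of 14 bytes: the buffer holds "SELECT columnn" and the next byte is lost.
        System.out.println(sendString(query, 14, false));
        try {
            sendString(query, 14, true);
        } catch (UncheckedIOException e) {
            System.out.println("failed fast: " + e.getCause().getMessage());
        }
    }
}
```

With the naive retry the body arrives as "SELECT columnnme FROM t", a small-scale analogue of the gap at "columnn" in the wire log. With the repeatability check enabled, the original "Hit replay buffer max limit" error is surfaced instead of a corrupted request.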



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
