hc-httpclient-users mailing list archives

From Henrich Kraemer <henrich.krae...@us.ibm.com>
Subject Premature EOF on socket read
Date Wed, 29 Aug 2007 22:15:02 GMT


I am using HttpClient 3.0.0 to download a resource to a file. If a download
is interrupted, for example by a dropped connection, the partially
downloaded file is kept in place. On the next attempt to download the same
resource, only the remainder is fetched, using a Range request header.
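
For illustration, a minimal sketch of how such a resume request can be
derived from the partial file. The class and method names are hypothetical
and not taken from our code or from HttpClient; the sketch only builds the
header value and assumes the bytes already on disk are a valid prefix of the
resource.

```java
import java.io.File;

/** Hypothetical helper: derives the Range header value for a resumed download. */
final class ResumeRange {

    /**
     * Returns the Range value for resuming into the given partial file,
     * or null if there is nothing to resume (file missing or empty).
     */
    static String rangeValueFor(File partial) {
        // File.length() returns 0 when the file does not exist.
        return rangeValueFor(partial.length());
    }

    /** Same computation from a known number of bytes already on disk. */
    static String rangeValueFor(long bytesOnDisk) {
        // An open-ended range asks for everything from this offset onward.
        return bytesOnDisk > 0 ? "bytes=" + bytesOnDisk + "-" : null;
    }

    public static void main(String[] args) {
        // 11695824 is the size of the partial file in the trace further down.
        System.out.println("Range: " + rangeValueFor(11695824L));
    }
}
```

With 11695824 bytes on disk this yields "bytes=11695824-", the value that
appears in the wire trace at the end of this message.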

There is some evidence that users of our download code run into cases where
socket reads return -1 prematurely, especially for large downloads. In this
case our code validates the downloaded bytes and notices that the content
is bad, but because it assumes the entire file was downloaded, it does not
resume from the current position and instead restarts the download from
the beginning.
Perhaps these users are running into limitations imposed by a (nonstandard?)
proxy server. In any case, our code should be robust enough to handle this
situation by noticing that it received only part of the file.
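
One way the application could notice this is to count the bytes it copies
and compare the count with the declared Content-Length once read returns -1.
The following is only a sketch under that assumption; DownloadCheck and its
methods are hypothetical names, and the body is simulated with an in-memory
stream rather than a real HttpClient response.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical application-level check for a short response body. */
final class DownloadCheck {

    /** Copies until read() reports EOF and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    /**
     * True if fewer bytes arrived than the response declared.
     * A negative declaredLength means "unknown", so nothing can be concluded.
     */
    static boolean isTruncated(long copied, long declaredLength) {
        return declaredLength >= 0 && copied < declaredLength;
    }

    public static void main(String[] args) throws IOException {
        // Simulated body: only 5 bytes arrive although 8 were declared.
        InputStream body = new ByteArrayInputStream(new byte[5]);
        long copied = copy(body, new ByteArrayOutputStream());
        if (isTruncated(copied, 8)) {
            System.out.println("short read: keep the partial file and resume later");
        } else {
            System.out.println("complete: validate the content");
        }
    }
}
```

When the check reports a short read, the partial file can be kept and the
next attempt can resume instead of restarting from the beginning.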

In the debugger I can force a similar situation by pausing at a breakpoint
for a couple of minutes after the transfer for a range request has begun.
The breakpoint is in the loop that reads the socket input stream and writes
the bytes to the file. Below is a portion of the stack trace:

Thread [Download Thread 0] (Suspended (breakpoint at line 133 in SocketInputStream))
    owns: BufferedInputStream  (id=4954)
    SocketInputStream.read(byte[], int, int) line: 133
    SocketEvents$SocketEventEmittingWrapper$SocketEventEmittingInputStream.read(byte[], int, int) line: 298
    BufferedInputStream.read1(byte[], int, int) line: 265
    BufferedInputStream.read(byte[], int, int) line: 324
    ContentLengthInputStream.read(byte[], int, int) line: 169
    AutoCloseInputStream(FilterInputStream).read(byte[], int, int) line: 134
    AutoCloseInputStream.read(byte[], int, int) line: 107
    ...

In this case SocketInputStream.read returns -1, apparently before all
expected bytes have been read. ContentLengthInputStream has seen only
172278 bytes, while according to the Content-Length response header
15219053 bytes are to be retrieved:
 << "Content-Length: 15219053[\r][\n]"
 << "Content-Range: bytes 11695824-26914876/26914877[\r][\n]"
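
The two headers above agree with each other: 26914876 - 11695824 + 1 =
15219053. The same arithmetic gives the offset at which a follow-up request
would have to resume. A small sketch, assuming only the simple
"bytes first-last/total" form of Content-Range; the ContentRange class is
hypothetical and not an HttpClient type.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the "bytes first-last/total" form of Content-Range. */
final class ContentRange {
    private static final Pattern FORM = Pattern.compile("bytes (\\d+)-(\\d+)/(\\d+)");

    final long first;
    final long last;
    final long total;

    private ContentRange(long first, long last, long total) {
        this.first = first;
        this.last = last;
        this.total = total;
    }

    static ContentRange parse(String value) {
        Matcher m = FORM.matcher(value.trim());
        if (!m.matches()) {
            // Other forms (for example an unknown total) are not handled in this sketch.
            throw new IllegalArgumentException("Unsupported Content-Range: " + value);
        }
        return new ContentRange(Long.parseLong(m.group(1)),
                                Long.parseLong(m.group(2)),
                                Long.parseLong(m.group(3)));
    }

    /** Number of bytes the range covers; should equal Content-Length. */
    long length() {
        return last - first + 1;
    }

    /** Absolute offset to resume from after 'received' bytes of this range arrived. */
    long resumeOffset(long received) {
        return first + received;
    }

    public static void main(String[] args) {
        ContentRange r = ContentRange.parse("bytes 11695824-26914876/26914877");
        long received = 172278L; // value of pos seen in the debugger
        System.out.println("range length  = " + r.length());                // 15219053
        System.out.println("still missing = " + (r.length() - received));   // 15046775
        System.out.println("resume offset = " + r.resumeOffset(received));  // 11868102
    }
}
```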

Below are the relevant stack trace and variable values as shown in the
debugger, with execution stopped in AutoCloseInputStream.close():
AutoCloseInputStream(FilterInputStream).close() line: 183 [local variables unavailable]
AutoCloseInputStream.notifyWatcher() line: 176
AutoCloseInputStream.checkClose(int) line: 152
AutoCloseInputStream.read(byte[], int, int) line: 108

this    AutoCloseInputStream  (id=4961)
    in    ContentLengthInputStream  (id=4960)
        closed    true
        contentLength    15219053 [0xe8396d]
        pos    172278 [0x2a0f6]
        wrappedStream    BufferedInputStream  (id=4954)
    selfClosed    false
    streamOpen    true
    watcher    HttpMethodBase$1  (id=4969)
        this$0    GetMethod  (id=4970)

Subsequently no IOException is thrown. My reading of
http://www.mail-archive.com/httpclient-user@jakarta.apache.org/msg03923.html
was that one can assume the entire resource has been downloaded once EOF
is encountered. Now I wonder how and where this situation should be handled.

One thought is that ContentLengthInputStream is in a position to know that
the body is short and could throw some kind of "premature EOF" exception.
Presumably the default retry logic would then retry such a request.
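
To make the idea concrete, here is a rough sketch of a stream that knows the
declared length and fails loudly when EOF arrives early. It is a standalone
wrapper with a hypothetical name, written against java.io only. It is not
the actual ContentLengthInputStream and not a patch for it, and whether the
retry handler would treat such an exception as retryable is an assumption I
have not verified.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper, not part of HttpClient: rejects a short body at EOF. */
final class LengthCheckingInputStream extends FilterInputStream {
    private final long expected; // declared Content-Length
    private long pos;            // bytes delivered or skipped so far

    LengthCheckingInputStream(InputStream in, long expected) {
        super(in);
        this.expected = expected;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b == -1) {
            failIfShort();
        } else {
            pos++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n == -1) {
            failIfShort();
        } else {
            pos += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        pos += skipped;
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would invalidate the byte count
    }

    private void failIfShort() throws EOFException {
        if (pos < expected) {
            throw new EOFException("Premature EOF: got " + pos + " of " + expected + " bytes");
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulated body: 4 bytes arrive although 10 were declared.
        InputStream in = new LengthCheckingInputStream(new ByteArrayInputStream(new byte[4]), 10);
        byte[] buf = new byte[16];
        try {
            while (in.read(buf) != -1) {
                // discard the data; only the EOF handling matters here
            }
        } catch (EOFException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The application could wrap the response body stream in such a class itself,
which would also answer the question of handling it on the application side.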

Or should this be handled by the application?

Thanks,

Henrich

Below are excerpts from the HttpClient TRACE-level wire log:
00:18.45 DEBUG [Thread:ModalContext]
org.apache.commons.httpclient.params.DefaultHttpParams setParameter
 Set parameter http.useragent = Jakarta Commons-HttpClient/3.0
 Set parameter http.protocol.version = HTTP/1.1
 Set parameter http.connection-manager.class = class
org.apache.commons.httpclient.SimpleHttpConnectionManager
 Set parameter http.protocol.cookie-policy = rfc2109
 Set parameter http.protocol.element-charset = US-ASCII
 Set parameter http.protocol.content-charset = ISO-8859-1
 Set parameter http.method.retry-handler =
org.apache.commons.httpclient.DefaultHttpMethodRetryHandler@1dc21dc2
 Set parameter http.dateparser.patterns = [EEE, dd MMM yyyy HH:mm:ss zzz,
EEEE, dd-MMM-yy HH:mm:ss zzz, EEE MMM d HH:mm:ss yyyy, EEE, dd-MMM-yyyy
HH:mm:ss z, EEE, dd-MMM-yyyy HH-mm-ss z, EEE, dd MMM yy HH:mm:ss z, EEE
dd-MMM-yyyy HH:mm:ss z, EEE dd MMM yyyy HH:mm:ss z, EEE dd-MMM-yyyy
HH-mm-ss z, EEE dd-MMM-yy HH:mm:ss z, EEE dd MMM yy HH:mm:ss z,
EEE,dd-MMM-yy HH:mm:ss z, EEE,dd-MMM-yyyy HH:mm:ss z, EEE, dd-MM-yyyy
HH:mm:ss z]
00:18.51 DEBUG [Thread:ModalContext]
org.apache.commons.httpclient.HttpClient <clinit>
 Java version: 1.5.0
 Java vendor: IBM Corporation
 Java class path: C:\AD\Target\e_33GA\eclipse\plugins
\org.eclipse.equinox.launcher_1.0.0.v20070606.jar
 Operating system name: Windows XP
 Operating system architecture: x86
 Operating system version: 5.1 build 2600 Service Pack 2
 IBMJSSE2 1.5: IBM JSSE provider2 (implements IbmX509 key/trust factories, SSLv3, TLSv1)
 IBMJCE 1.2: IBMJCE Provider implements the following: HMAC-SHA1, MD2, MD5, MARS, SHA, MD2withRSA, MD5withRSA, SHA1withRSA, RSA, SHA1withDSA, RC2, RC4, Seal)implements the following:
 Signature algorithms               : SHA1withDSA, SHA1withRSA, MD5withRSA, MD2withRSA,
                                      SHA2withRSA, SHA3withRSA, SHA5withRSA
 Cipher algorithms                  : Blowfish, AES, DES, TripleDES, PBEWithMD2AndDES,
                                      PBEWithMD2AndTripleDES, PBEWithMD2AndRC2,
                                      PBEWithMD5AndDES, PBEWithMD5AndTripleDES,
                                      PBEWithMD5AndRC2, PBEWithSHA1AndDES
                                      PBEWithSHA1AndTripleDES, PBEWithSHA1AndRC2
                                      PBEWithSHAAnd40BitRC2, PBEWithSHAAnd128BitRC2
                                      PBEWithSHAAnd40BitRC4, PBEWithSHAAnd128BitRC4
                                      PBEWithSHAAnd2KeyTripleDES, PBEWithSHAAnd3KeyTripleDES
                                      Mars, RC2, RC4, ARCFOUR
                                      RSA, Seal
 Message authentication code (MAC)  : HmacSHA1, HmacSHA256, HmacSHA384, HmacSHA512, HmacMD2, HmacMD5
 Key agreement algorithm            : DiffieHellman
 Key (pair) generator               : Blowfish, DiffieHellman, DSA, AES, DES, TripleDES, HmacMD5,
                                      HmacSHA1, Mars, RC2, RC4, RSA, Seal, ARCFOUR
 Message digest                     : MD2, MD5, SHA-1, SHA-256, SHA-384, SHA-512
 Algorithm parameter generator      : DiffieHellman, DSA
 Algorithm parameter                : Blowfish, DiffieHellman, AES, DES, TripleDES, DSA, Mars,
                                      PBEwithMD5AndDES, RC2
 Key factory                        : DiffieHellman, DSA, RSA
 Secret key factory                 : Blowfish, AES, DES, TripleDES, Mars, RC2, RC4, Seal, ARCFOUR
                                      PKCS5Key, PBKDF1 and PBKDF2 (PKCS5Derived Key).
 Certificate                        : X.509
 Secure random                      : IBMSecureRandom
 Key store                          : JCEKS, PKCS12KS (PKCS12), JKS

 IBMJGSSProvider 1.5: IBMJGSSProvider supports Kerberos V5 Mechanism
 IBMCertPath 1.1: IBMCertPath Provider implements the following:
 CertificateFactory                : X.509
 CertPathValidator              : PKIX
 CertStore                      : Collection, LDAP
 CertPathBuilder                : PKIX

 IBMSASL 1.5: IBM SASL provider(implements client mechanisms for:
DIGEST-MD5, GSSAPI, EXTERNAL, PLAIN, CRAM-MD5; server mechanisms for:
DIGEST-MD5, GSSAPI, CRAM-MD5)
00:18.56 DEBUG [Thread:ModalContext]
org.apache.commons.httpclient.params.DefaultHttpParams setParameter
 Set parameter http.authentication.credential-provider =
com.ibm.cic.common.transports.httpclient.HttpCredentialsProvider@4ada4ada
 Set parameter http.connection-manager.timeout = 30000
 Set parameter http.socket.timeout = 30000
 Set parameter http.tcp.nodelay = true
 Set parameter http.connection-manager.max-per-host = {HostConfiguration
[]=1}
 Set parameter http.connection-manager.max-total = 20
 Set parameter http.method.retry-handler =
com.ibm.cic.common.transports.httpclient.HttpClientDownloadHandler
$MethodRetryHandler@60ce60ce
...
03:17.93 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.Wire wire
 >> "HEAD /bluewhale/products/AllGAs/repository/plugins/com.ibm.process.config.rsm_7.0.0.v20061101.jar HTTP/1.1[\r][\n]"
03:17.93 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.HttpMethodBase addHostRequestHeader
 Adding Host request header
03:17.93 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.Wire wire
 >> "Accept-Language: en_US[\r][\n]"
 >> "User-Agent: Jakarta Commons-HttpClient/3.0[\r][\n]"
 >> "Host: constellation.beaverton.ibm.com[\r][\n]"
 >> "[\r][\n]"
 << "HTTP/1.1 200 OK[\r][\n]"
 << "Date: Tue, 28 Aug 2007 18:40:02 GMT[\r][\n]"
 << "Server: Apache/2.0.52 (Red Hat)[\r][\n]"
 << "Last-Modified: Tue, 12 Jun 2007 20:48:27 GMT[\r][\n]"
 << "ETag: "4f946f-19ab03d-9e8a9cc0"[\r][\n]"
 << "Accept-Ranges: bytes[\r][\n]"
 << "Content-Length: 26914877[\r][\n]"
 << "Connection: close[\r][\n]"
 << "Content-Type: text/plain; charset=UTF-8[\r][\n]"
03:17.95 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.HttpMethodBase shouldCloseConnection
 Should close connection in response to directive: close
03:27.56 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.HttpConnection releaseConnection
 Releasing connection back to connection manager.
03:27.56 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.ConnectionPool
 freeConnection
 Freeing connection, hostConfig=HostConfiguration[host=
http://constellation.beaverton.ibm.com.]
03:27.56 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.util.IdleConnectionHandler add
 Adding connection at: 1188326399109
03:27.56 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.ConnectionPool
 notifyWaitingThread
 Notifying no-one, there are no waiting threads
03:27.56 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.params.DefaultHttpParams setParameter
 Set parameter http.method.retry-handler =
com.ibm.cic.common.transports.httpclient.HttpClientDownloadHandler
$MethodRetryHandler@1db61db6
03:27.56 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager
getConnectionWithTimeout
 HttpConnectionManager.getConnection:  config = HostConfiguration[host=
http://constellation.beaverton.ibm.com.], timeout = 30000
03:27.59 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.ConnectionPool
 getFreeConnection
 Getting free connection, hostConfig=HostConfiguration[host=
http://constellation.beaverton.ibm.com.]
03:27.59 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.HttpConnection open
 Open connection to constellation.beaverton.ibm.com:80
03:27.59 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.Wire wire
 >> "GET /bluewhale/products/AllGAs/repository/plugins/com.ibm.process.config.rsm_7.0.0.v20061101.jar HTTP/1.1[\r][\n]"
03:27.59 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.HttpMethodBase addHostRequestHeader
 Adding Host request header
03:27.59 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.Wire wire
 >> "Range: bytes=11695824-[\r][\n]"
 >> "Accept-Language: en_US[\r][\n]"
 >> "User-Agent: Jakarta Commons-HttpClient/3.0[\r][\n]"
 >> "Host: constellation.beaverton.ibm.com[\r][\n]"
 >> "[\r][\n]"
 << "HTTP/1.1 206 Partial Content[\r][\n]"
 << "Date: Tue, 28 Aug 2007 18:40:12 GMT[\r][\n]"
 << "Server: Apache/2.0.52 (Red Hat)[\r][\n]"
 << "Last-Modified: Tue, 12 Jun 2007 20:48:27 GMT[\r][\n]"
 << "ETag: "4f946f-19ab03d-9e8a9cc0"[\r][\n]"
 << "Accept-Ranges: bytes[\r][\n]"
 << "Content-Length: 15219053[\r][\n]"
 << "Content-Range: bytes 11695824-26914876/26914877[\r][\n]"
 << "Connection: close[\r][\n]"
 << "Content-Type: text/plain; charset=UTF-8[\r][\n]"
54:38.48 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.HttpMethodBase shouldCloseConnection
 Should close connection in response to directive: close
55:26.04 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.HttpConnection releaseConnection
 Releasing connection back to connection manager.
55:26.04 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.ConnectionPool
 freeConnection
 Freeing connection, hostConfig=HostConfiguration[host=
http://constellation.beaverton.ibm.com.]
55:26.04 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.util.IdleConnectionHandler add
 Adding connection at: 1188329517593
55:26.04 DEBUG [Thread:Download Thread 0]
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.ConnectionPool
 notifyWaitingThread
 Notifying no-one, there are no waiting threads