hc-httpclient-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Oleg Kalnichevski <ol...@apache.org>
Subject Re: HttpClient 3.1 to 4.0 migration
Date Wed, 02 Sep 2009 18:51:55 GMT
Gerald Turner wrote:
> Hello HttpClient Users List, I have spent the last couple days upgrading
> a dozen applications from HttpClient 3.1 to 4.0.
> 
> First off, I must say that I'm very pleased that
> MultiThreadedHttpConnectionManager (now ThreadSafeClientConnManager) is
> using synchronization rather than thread interrupts.  Bugs like
> HTTPCLIENT-633 and HTTPCLIENT-731 have been a problem with one
> application in particular for a long time.
> 
> Also X509HostnameVerifier and MultihomePlainSocketFactory look
> interesting and should help with projects in the future.
> 
> The rest of this email is a few critiques and wishlist requests,
> experiences while migrating, maybe helpful information for other people.
> Also I may be incorrect in some cases and would like feedback before
> going live with the new versions of these applications.
> 
> 
> • Releasing connections has increased the amount of code (finally block
> logic) that client applications need:
> 
>   HttpClient 3.x:
> 
>     HttpMethod method = null;
>     try {
>       …
>     }
>     finally {
>       if (method != null) {
>         try {
>           method.releaseConnection()
>         }
>         catch (Exception e) {
>           log.warn("Failed to release connection: "
>                    + e.getMessage(), e);
>         }
>       }
>     }
> 
>   HttpClient 4.0:
> 
>     HttpUriRequest request = null;
>     HttpResponse response = null;
>     try {
>       …
>     }
>     finally {
>       boolean released = false;
> 
>       // Try consumeContent first, so that connection may be
>       // kept-alive and reused.  Note there is no response entity
>       // for HTTP codes like 304
>       if (response != null) {
>         try {
>           HttpEntity entity = response.getEntity();
>           if (entity != null) {
>             entity.consumeContent();
>             released = true;
>           }
>         }
>         catch (Exception e) {
>           log.warn("Failed to consume HttpEntity: "
>                    + e.getMessage(), e);
>         }
>       }
> 
>       // Otherwise abort the connection, seems allow the
>       // connection to be kept-alive and reused in some cases
>       // (like when there was no response entity), this is a good
>       // thing even if “abort” sounds like it's going to close.
>       if (!released && request != null) {
>         try {
>           request.abort();
>         }
>         catch (Exception e) {
>           log.warn("Failed to abort HttpUriRequest: "
>                    + e.getMessage(), e);
>         }
>       }
>     }
>   }
> 

Pretty much all this code is not necessary. (1) HttpClient will 
automatically release the underlying connection as long as the entity 
content is consumed to the end of stream; (2) HttpClient will 
automatically release the underlying connection on any I/O exception 
thrown while reading the response content. No special processing is 
required in such as case.

In fact, this is perfectly sufficient to ensure proper release of resources:

HttpResponse rsp = httpclient.execute(target, req);
HttpEntity entity = rsp.getEntity();
if (entity != null) {
     InputStream instream = entity.getContent();
     try {
         // process content
     } finally {
         instream.close();
        // entity.consumeContent() would also do
     }
}

That is it.

> 
> • In addition to this complex finally-block, I have created a class
> which extends HttpEntity (CompressedEntity, does gzip Transfer-Encoding
> of requests), it's assigned to requests, it needs to release resources,
> HttpService.handleRequest does call consumeContent, however I am
> concerned with HTTP request executions that don't make it that far
> (connection manager timeout for instance), in that case my finally block
> has to call request.getEntity().consumeContent().
> 

See above. In case of an I/O error the connection will get automatically 
closed and released back to the manager.

> 
> • HttpException does not extend IOException like 3.x did.  During
> migration I tripped on some code that was explicitly catching
> IOException from HttpClient and other URL encoding, charset
> manipulating, and java.io APIs, while letting other exceptions throw up
> the stack, nothing a little cast/re-throw magic couldn't fix.
> 

Yes, that was a conscious decision. HttpClient, however, re-throws 
HttpException as ClientProtocolException, which as a subclass of 
IOException. So, I am not sure I fully understand the problem here.

> 
> • Wishlist request to add an HttpResponse.getHeader(String) method that
> returns the first header or null.  During the HttpClient 3.x → 4
> migration I found a couple dozen of these:
> 
>   Before:
> 
>     Header lastModifiedHeader =
>       method.getResponseHeader("Last-Modified");
>     if (lastModifiedHeaders != null) {
>       Date headerDate =
>         DateUtil.parseDate(lastModifiedHeader.getValue());
>       …
>     }
> 
>   After:
> 
>     Header[] lastModifiedHeaders =
>       response.getHeaders("Last-Modified");
>     if (lastModifiedHeaders != null
>         && lastModifiedHeaders.length > 0) {
>       Date headerDate =
>         DateUtils.parseDate(lastModifiedHeaders[0].getValue());
>       …
>     }
> 
> No big deal, just additional array subscripts.  This code is going to be
> blind to length > 1, so may as well have a convenience method that
> returns only the first element.
> 
> 

What is wrong with HttpMessage#getFirstHeader / HttpMessage#getLastHeader?


> • Wishlist request to add an additional constructor to StringEntity
> which takes the MIME-type portion of the Content-Type.
> 
>   Without additional constructor:
> 
>     StringEntity entity = new StringEntity(xml, "UTF-8");
>     entity.setContentType("text/xml; charset=UTF-8");
> 
>   With additional constructor:
> 
>     StringEntity entity = new StringEntity(xml, "text/xml", "UTF-8");
> 
> 

Raise a JIRA.


> • Wishlist request for preemptive authentication to be included in the
> API, like HttpClient 3.x had.  There is an example
> ClientPreemptiveBasicAuthentication.java that uses
> HttpRequestInterceptor which I had adapted to my application and it
> works fine.
> 

Raise a JIRA.

> 
> • Multi-part requests in HttpClient 3.x were a little easier to work
> with in a few special cases, the Part interface included the ‘name’
> field, whereas the ‘name’ field is now a parameter in the
> MultipartEntity.addPart(String,BodyContent) call.
> 
>   Before:
> 
>     Part[] parts = {
>       new StringPart("Some-Field", "Some-Value"),
>       new FilePart("Some-File", f)
>     };
> 
>     MultipartRequestEntity entity =
>       new MultipartRequestEntity(parts, method.getParams());
> 
>   After:
> 
>     LinkedHashMap<String,ContentBody> parts =
>       new LinkedHashMap<String,ContentBody>();
>     parts.put("Some-Field", new StringBody("Some-Value"));
>     parts.put("Some-File", new FileBody(f));
> 
>     MultipartEntity entity = new MultipartEntity();
>     for (Entry<String,ContentBody> part : parts)
>       entity.addPart(part.getKey(), part.getValue());
> 
> In some respects I like the API better, but I do have some code which
> needs to assemble the parts (including names) independent of the HTTP
> request execution, so in one application I wrote a Part wrapper:
> 
>   for (Part part : parts)
>     entity.addPart(part.getName(), part.getContent());
> 

If you can think of a fix for the issue, raise a JIRA.


> 
> • The HttpParams framework does not support nested introspection.  I
> have a utility class which configures HttpClient from a configuration
> file, and with HttpClient 3.x I used BeanUtils to copy parameters from
> the configuration file, for example the following would work:
> 
>   PropertyUtils.setProperty
>    (client,
>     "params.soTimeout",
>     "30000");
> 
>   PropertyUtils.setProperty
>    (client,
>     "httpConnectionManager.params.defaultMaxConnectionsPerHost",
>     "2");
> 
>   /* These are just example statements, the actual parameter names
>      are not hard-coded and instead passed from the Configuration
>      API like the following: */
> 
>   Configuration subset = config.subset("HttpClient");
>   for (String key : subset.getKeys())
>     PropertyUtils.setProperty(client, key, subset.getProperty(key));
> 
> I see there are bean classes that wrap HttpParams (e.g.
> ClientParamBean), so potentially similar magic could be done with code
> like:
> 
>   // config has:
>   // HttpClient.ClientParam.maxRedirects = 10
>   // HttpClient.ConnManagerParamBean.MaxTotalConnections = 100
> 
>   HttpAbstractParamBean bean = null;
>   Configuration subset = null;
> 
>   bean = new ClientParamBean(client.getParams());
>   subset = config.subset("HttpClient.ClientParam");
>   for (String key : subset.getKeys())
>     PropertyUtils.setProperty(bean, key, subset.getProperty(key));
> 
>   bean = new ConnManagerParamBean(client.getParams());
>   subset = config.subset("HttpClient.ConnManagerParamBean");
>   for (String key : subset.getKeys())
>     PropertyUtils.setProperty(bean, key, subset.getProperty(key));
> 
> However there are two notable problems:
> 
>   1. Not all parameters have a bean class (CoreConnectionPNames
>      with SO_TIMEOUT)

Bug. Raise a JIRA


> 
>   2. Many parameter types are too complex for ConvertUtils, the
>      worst being MAX_CONNECTIONS_PER_ROUTE with ConnPerRouteBean
>      type.  However this could be alleviated with registering
>      Converters with ConvertUtils, maybe that's going too far for
>      the convenience of this utility class!
> 

ConnPerRouteBean was a bug mistake on my part. I am thinking about 
deprecating the damn thing in 4.1.

> 
> • The method DateUtils#formatDate(Date) is very handy and I use it in a
> lot of applications, nothing wrong here except that the package
> “….impl.cookie” throws me off — having me rely on any external API
> implementation (“impl”) doesn't feel like a smart thing for me to do,
> also I don't use cookies at all, I use this method for handling
> Last-Modified and If-Modified-Since kinds of headers.
> 
> 

We could deprecate DateUtils class in the impl package and move it to 
another package within the 'public' API space.



> • FileBody does not allow the filename field in the Content-Disposition
> header to be overriden, the filename taken from the File object - I have
> software that creates temporary files and needs to assign an implicit
> logical filename.
> 
> 

Raise a JIRA.


> • I see there's already a bug filed for the jcip-annotations.jar compile
> time dependency in client applications (HTTPCLIENT-866), even if it
> doesn't get fixed I'm not too worried about adding the jar to a dozen
> build scripts.
> 

Fixed. It is regrettable this problem slipped into the official release.

Oleg

PS: In case you open a change request in JIRA be prepared that evil 
Russians will try to extort a patch from you.


> 
> Thanks!
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org


Mime
View raw message