hc-httpclient-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gerald Turner <gtur...@unzane.com>
Subject Re: HttpClient 3.1 to 4.0 migration
Date Thu, 03 Sep 2009 22:04:33 GMT
Oleg Kalnichevski <olegk@apache.org> writes:

> Gerald Turner wrote:
>> • Releasing connections has increased the amount of code (finally
>> block logic) that client applications need:
>>
>>   HttpClient 3.x:
>>
>>     HttpMethod method = null;
>>     try {
>>       …
>>     }
>>     finally {
>>       if (method != null) {
>>         try {
>>           method.releaseConnection()
>>         }
>>         catch (Exception e) {
>>           log.warn("Failed to release connection: "
>>                    + e.getMessage(), e);
>>         }
>>       }
>>     }
>>
>>   HttpClient 4.0:
>>
>>     HttpUriRequest request = null;
>>     HttpResponse response = null;
>>     try {
>>       …
>>     }
>>     finally {
>>       boolean released = false;
>>
>>       // Try consumeContent first, so that connection may be
>>       // kept-alive and reused.  Note there is no response entity
>>       // for HTTP codes like 304
>>       if (response != null) {
>>         try {
>>           HttpEntity entity = response.getEntity();
>>           if (entity != null) {
>>             entity.consumeContent();
>>             released = true;
>>           }
>>         }
>>         catch (Exception e) {
>>           log.warn("Failed to consume HttpEntity: "
>>                    + e.getMessage(), e);
>>         }
>>       }
>>
>>       // Otherwise abort the connection, seems allow the
>>       // connection to be kept-alive and reused in some cases
>>       // (like when there was no response entity), this is a good
>>       // thing even if “abort” sounds like it's going to close.
>>       if (!released && request != null) {
>>         try {
>>           request.abort();
>>         }
>>         catch (Exception e) {
>>           log.warn("Failed to abort HttpUriRequest: "
>>                    + e.getMessage(), e);
>>         }
>>       }
>>     }
>>   }
>>
>
> Pretty much all this code is not necessary. (1) HttpClient will
> automatically release the underlying connection as long as the entity
> content is consumed to the end of stream; (2) HttpClient will
> automatically release the underlying connection on any I/O exception
> thrown while reading the response content. No special processing is
> required in such as case.
>

Ah!, I should've mentioned that much of the code I've been working on
fires off GETs/POSTs (even a Range-conditioned GET in one place) and
doesn't bother reading the response content, it's happy with a 200 (or
206).

Some of the code I've been working on expects 304's (If-Modified-Since)
and 404's, there is response content (like a user-friendly File Not
Found HTML page), but again it only cares about the status code.

Anyway the heavy finally block is safe correct?  For instance a 304 has
no content, but I fall thru can call request.abort() even though it's
already been recycled.

> In fact, this is perfectly sufficient to ensure proper release of
> resources:
>
> HttpResponse rsp = httpclient.execute(target, req);
> HttpEntity entity = rsp.getEntity();
> if (entity != null) {
>     InputStream instream = entity.getContent();
>     try {
>         // process content
>     } finally {
>         instream.close();
>        // entity.consumeContent() would also do
>     }
> }
>
> That is it.
>
>>
>> • In addition to this complex finally-block, I have created a class
>> which extends HttpEntity (CompressedEntity, does gzip
>> Transfer-Encoding of requests), it's assigned to requests, it needs
>> to release resources, HttpService.handleRequest does call
>> consumeContent, however I am concerned with HTTP request executions
>> that don't make it that far (connection manager timeout for
>> instance), in that case my finally block has to call
>> request.getEntity().consumeContent().
>>
>
> See above. In case of an I/O error the connection will get
> automatically closed and released back to the manager.
>

I'm not so sure about this - it's a request entity that begins life with
some resource that needs to be cleaned.  If HttpClient doesn't get as
far as obtaining a connection from it's pool, wouldn't the request *not*
have it's consumeContent called?

>> • Wishlist request to add an HttpResponse.getHeader(String) method
>> that returns the first header or null.  During the HttpClient 3.x → 4
>> migration I found a couple dozen of these:
>>
>>   Before:
>>
>>     Header lastModifiedHeader =
>>       method.getResponseHeader("Last-Modified");
>>     if (lastModifiedHeaders != null) {
>>       Date headerDate =
>>         DateUtil.parseDate(lastModifiedHeader.getValue());
>>       …
>>     }
>>
>>   After:
>>
>>     Header[] lastModifiedHeaders =
>>       response.getHeaders("Last-Modified");
>>     if (lastModifiedHeaders != null
>>         && lastModifiedHeaders.length > 0) {
>>       Date headerDate =
>>         DateUtils.parseDate(lastModifiedHeaders[0].getValue());
>>       …
>>     }
>>
>> No big deal, just additional array subscripts.  This code is going to
>> be blind to length > 1, so may as well have a convenience method that
>> returns only the first element.
>>
>>
>
> What is wrong with HttpMessage#getFirstHeader /
> HttpMessage#getLastHeader?
>

Duh... not sure how I missed that!  Thanks.

>> • Wishlist request to add an additional constructor to StringEntity
>> which takes the MIME-type portion of the Content-Type.
>>
>>   Without additional constructor:
>>
>>     StringEntity entity = new StringEntity(xml, "UTF-8");
>>     entity.setContentType("text/xml; charset=UTF-8");
>>
>>   With additional constructor:
>>
>>     StringEntity entity = new StringEntity(xml, "text/xml", "UTF-8");
>>
>>
>
> Raise a JIRA.
>

HTTPCORE-204 with patch.

>> • Wishlist request for preemptive authentication to be included in
>> the API, like HttpClient 3.x had.  There is an example
>> ClientPreemptiveBasicAuthentication.java that uses
>> HttpRequestInterceptor which I had adapted to my application and it
>> works fine.
>>
>
> Raise a JIRA.
>

HTTPCLIENT-872 with patch.

>> • Multi-part requests in HttpClient 3.x were a little easier to work
>> with in a few special cases, the Part interface included the ‘name’
>> field, whereas the ‘name’ field is now a parameter in the
>> MultipartEntity.addPart(String,BodyContent) call.
>>
>>   Before:
>>
>>     Part[] parts = {
>>       new StringPart("Some-Field", "Some-Value"),
>>       new FilePart("Some-File", f)
>>     };
>>
>>     MultipartRequestEntity entity =
>>       new MultipartRequestEntity(parts, method.getParams());
>>
>>   After:
>>
>>     LinkedHashMap<String,ContentBody> parts =
>>       new LinkedHashMap<String,ContentBody>();
>>     parts.put("Some-Field", new StringBody("Some-Value"));
>>     parts.put("Some-File", new FileBody(f));
>>
>>     MultipartEntity entity = new MultipartEntity();
>>     for (Entry<String,ContentBody> part : parts)
>>       entity.addPart(part.getKey(), part.getValue());
>>
>> In some respects I like the API better, but I do have some code which
>> needs to assemble the parts (including names) independent of the HTTP
>> request execution, so in one application I wrote a Part wrapper:
>>
>>   for (Part part : parts)
>>     entity.addPart(part.getName(), part.getContent());
>>
>
> If you can think of a fix for the issue, raise a JIRA.
>

FormBodyPart does the trick (couples name and content).

MultipartEntity.addPart looks like this:

  public void addPart(final String name, final ContentBody contentBody) {
    this.multipart.addBodyPart(new FormBodyPart(name, contentBody));
    this.dirty = true;
  }

How about an addPart override that looks like:

  public void addPart(final BodyPart bodyPart) {
    this.multipart.addBodyPart(bodyPart);
    this.dirty = true;
  }

>> • The HttpParams framework does not support nested introspection.  I
>> have a utility class which configures HttpClient from a configuration
>> file, and with HttpClient 3.x I used BeanUtils to copy parameters
>> from the configuration file, for example the following would work:
>>
>>   PropertyUtils.setProperty
>>    (client,
>>     "params.soTimeout",
>>     "30000");
>>
>>   PropertyUtils.setProperty
>>    (client,
>>     "httpConnectionManager.params.defaultMaxConnectionsPerHost",
>>     "2");
>>
>>   /* These are just example statements, the actual parameter names
>>      are not hard-coded and instead passed from the Configuration
>>      API like the following: */
>>
>>   Configuration subset = config.subset("HttpClient");
>>   for (String key : subset.getKeys())
>>     PropertyUtils.setProperty(client, key, subset.getProperty(key));
>>
>> I see there are bean classes that wrap HttpParams (e.g.
>> ClientParamBean), so potentially similar magic could be done with
>> code like:
>>
>>   // config has:
>>   // HttpClient.ClientParam.maxRedirects = 10
>>   // HttpClient.ConnManagerParamBean.MaxTotalConnections = 100
>>
>>   HttpAbstractParamBean bean = null;
>>   Configuration subset = null;
>>
>>   bean = new ClientParamBean(client.getParams());
>>   subset = config.subset("HttpClient.ClientParam");
>>   for (String key : subset.getKeys())
>>     PropertyUtils.setProperty(bean, key, subset.getProperty(key));
>>
>>   bean = new ConnManagerParamBean(client.getParams());
>>   subset = config.subset("HttpClient.ConnManagerParamBean");
>>   for (String key : subset.getKeys())
>>     PropertyUtils.setProperty(bean, key, subset.getProperty(key));
>>
>> However there are two notable problems:
>>
>>   1. Not all parameters have a bean class (CoreConnectionPNames
>>      with SO_TIMEOUT)
>
> Bug. Raise a JIRA
>

HTTPCLIENT-873.

>>
>>   2. Many parameter types are too complex for ConvertUtils, the
>>      worst being MAX_CONNECTIONS_PER_ROUTE with ConnPerRouteBean
>>      type.  However this could be alleviated with registering
>>      Converters with ConvertUtils, maybe that's going too far for
>>      the convenience of this utility class!
>>
>
> ConnPerRouteBean was a bug mistake on my part. I am thinking about
> deprecating the damn thing in 4.1.
>

That's too bad, although it does seem useful to configure varying pool
sizes by hostname.

What do you think about the nested introspection thing?

It could be done by having the bean classes reference each other, like
ClientParamBean.getConnManagerParam (returns ConnManagerParamBean), but
there is also the problem of complex types — maybe make a policy for the
bean classes to deal with int/long/boolean/String types only (and
getters for nested beans).

>> • The method DateUtils#formatDate(Date) is very handy and I use it in
>> a lot of applications, nothing wrong here except that the package
>> “….impl.cookie” throws me off — having me rely on any external API
>> implementation (“impl”) doesn't feel like a smart thing for me to do,
>> also I don't use cookies at all, I use this method for handling
>> Last-Modified and If-Modified-Since kinds of headers.
>>
>
> We could deprecate DateUtils class in the impl package and move it to
> another package within the 'public' API space.
>

I vote yes ☺

>> • FileBody does not allow the filename field in the
>> Content-Disposition header to be overriden, the filename taken from
>> the File object - I have software that creates temporary files and
>> needs to assign an implicit logical filename.
>>
>
> Raise a JIRA.
>

HTTPCLIENT-871 with patch.

>> • I see there's already a bug filed for the jcip-annotations.jar
>> compile time dependency in client applications (HTTPCLIENT-866), even
>> if it doesn't get fixed I'm not too worried about adding the jar to a
>> dozen build scripts.
>>
>
> Fixed. It is regrettable this problem slipped into the official
> release.
>

Thanks.

> PS: In case you open a change request in JIRA be prepared that evil
> Russians will try to extort a patch from you.
>

Done ☺

-- 
Gerald Turner  Email: gturner@unzane.com  JID: gturner@jabber.unzane.com
GPG: 0xFA8CD6D5  21D9 B2E8 7FE7 F19E 5F7D  4D0C 3FA0 810F FA8C D6D5

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org


Mime
View raw message