hc-httpclient-users mailing list archives

From "Christophe Bouhier" <christo...@kualasoft.com>
Subject Re: HttpClient bandwidth
Date Wed, 31 Aug 2005 05:13:24 GMT
Hi,

What we do is use the HTTP HEAD method (HeadMethod) to retrieve the
last-modified date of the resource on the server without downloading the body.
Something like this, after invoking execute():

        if ((header = method.getResponseHeader("Date")) != null) {
            date = header.getValue();
        }
        if ((header = method.getResponseHeader("Last-Modified")) != null) {
            lastModifiedDate = header.getValue();
        }

        if ((header = method.getResponseHeader("Content-Type")) != null) {
            contentType = header.getValue();
        }

        if ((header = method.getResponseHeader("Content-Length")) != null) {
            contentLength = header.getValue();
        }

The values are strings, so you will need to convert them. The dates are in
the RFC 822 style (as updated by RFC 1123), so you need to parse them into a
Date to compare. I use a simple date formatter.

                // HTTP dates use a 00-23 hour field, so the pattern uses HH
                // rather than kk (which means 1-24).
                SimpleDateFormat simpleFormatter = new SimpleDateFormat(
                        "EEE, dd MMM yyyy HH:mm:ss z", Locale.US);
                if (modifiedString != null && !"".equals(modifiedString)) {
                    // parse() throws ParseException on malformed input.
                    Date lDate = simpleFormatter.parse(modifiedString);
                    modified = lDate.getTime();
                }

Next you do some date comparison with your file on disk.
Hope this helps a bit.
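
The comparison with the file on disk could look roughly like the sketch
below. It only uses the JDK. The class name FreshnessCheck, the helper names
and the cache path are made up for illustration, and treating a missing
Last-Modified header as "download anyway" is an assumption, not something
HttpClient does for you.

```java
import java.io.File;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class, not part of HttpClient.
final class FreshnessCheck {

    /** Parses an HTTP date such as "Wed, 31 Aug 2005 05:13:24 GMT" into epoch milliseconds. */
    public static long parseHttpDate(String value) throws ParseException {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat format =
                new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);
        Date parsed = format.parse(value);
        return parsed.getTime();
    }

    /**
     * Returns true when the server copy looks newer than the local copy.
     * localModifiedMillis would typically come from File.lastModified(),
     * which returns 0 for a missing file, so a missing file counts as stale.
     */
    public static boolean isRemoteNewer(String lastModifiedHeader, long localModifiedMillis)
            throws ParseException {
        if (lastModifiedHeader == null || lastModifiedHeader.isEmpty()) {
            // Assumption: without a Last-Modified header we cannot tell, so download.
            return true;
        }
        return parseHttpDate(lastModifiedHeader) > localModifiedMillis;
    }

    public static void main(String[] args) throws ParseException {
        // Hypothetical cache location for the previously downloaded page.
        long local = new File("cache/page.html").lastModified();
        System.out.println(isRemoteNewer("Wed, 31 Aug 2005 05:13:24 GMT", local));
    }
}
```

You would pass in the Last-Modified string from the HEAD response and only
issue the GET when isRemoteNewer() returns true. Note that this only helps if
the server actually sends a meaningful Last-Modified value.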

Cheers / Christophe



----- Original Message ----- 
From: "B K" <griip@hotmail.com>
To: <httpclient-user@jakarta.apache.org>
Sent: Wednesday, August 31, 2005 3:09 AM
Subject: RE: HttpClient bandwidth


> I am a little confused by your comments, Gus. As I understand it, the
> request headers are sent as part of the request to the server, so how
> would checking these headers help me out?
>
> Cheers
>
>
> >From: Gustavo Hexsel <ghexsel@sagebrushcorp.com>
> >Reply-To: "HttpClient User Discussion" <httpclient-user@jakarta.apache.org>
> >To: 'HttpClient User Discussion' <httpclient-user@jakarta.apache.org>
> >Subject: RE: HttpClient bandwidth
> >Date: Tue, 30 Aug 2005 10:25:51 -0500
> >
> >   Can't you just check the header's value with getRequestHeader() before
> >you call getResponseBodyAsStream()?
> >
> >   If it doesn't match, just close the connection... the website will
> >generate the response page anyway and the load should be about the same
> >for it, but the httpClient doesn't need all that.
> >
> >   []s Gus
> >
> >
> >-----Original Message-----
> >From: Oleg Kalnichevski [mailto:olegk@apache.org]
> >Sent: August 30, 2005 6:24 AM
> >To: httpclient-user@jakarta.apache.org
> >Subject: Re: HttpClient bandwidth
> >
> >
> >On Tue, Aug 30, 2005 at 10:15:50PM +1000, B K wrote:
> > > Hi all,
> > > I have developed an application using httpclient, and now that I
> > > have started using it, it is using too much bandwidth. I wonder if
> > > anybody has some pointers on how to reduce the amount of data being
> > > transferred. My idea was to use the response headers to only retrieve
> > > the response if the data had changed; unfortunately I can't find any
> > > way of doing this. The web sites I am dealing with are very dynamic
> > > and change every couple of minutes, so I need to check every minute
> > > for updates to the data.
> > >
> > > The application is gathering data from 10 web sites, and I have
> > > developed it so there are 10 instances of httpclient, one for each
> > > web site. The big problem I see is that every time I send out the
> > > requests I have to download the response even if there is no change
> > > in the data. Has anybody got some bright design ideas on how to cut
> > > this down? I have researched and come up with nothing.
> > >
> > > Thanks
> > >
> >
> >B K,
> >
> >There's not much you can do unless the target servers play along. Please
> >refer to the HTTP spec [1] and take a look at the 304 Not Modified
> >mechanism [2] for details.
> >
> >Hope this helps
> >
> >Oleg
> >[1] http://www.w3.org/Protocols/rfc2616/rfc2616.html
> >[2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5
> >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail: httpclient-user-help@jakarta.apache.org
> > >
> > >

