hc-httpclient-users mailing list archives

From "Graeme" <coolki...@hotmail.com>
Subject RE: How can I find certain words in a html page?
Date Thu, 13 Oct 2005 12:20:55 GMT
OK, thanks. I should be able to search for those two things in the string
and then get the substring of what is between them.
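
For anyone finding this in the archives, here is a rough sketch of that idea: find the two markers with indexOf() and take the substring between them. The class name, the between() helper, and the sample markers are made up for illustration; the real markers are whatever surrounds the data in the page.

```java
class BetweenMarkers {
    // Returns the text between the first occurrence of 'start' and the next
    // occurrence of 'end' after it, or null if either marker is not found.
    static String between(String page, String start, String end) {
        int from = page.indexOf(start);
        if (from < 0) {
            return null;               // start marker not present
        }
        from += start.length();        // skip past the start marker itself
        int to = page.indexOf(end, from);
        if (to < 0) {
            return null;               // end marker not present after start
        }
        return page.substring(from, to);
    }

    public static void main(String[] args) {
        // Hypothetical page source; in practice this would be the response body.
        String page = "<html><body><td>first value</td><td>second</td></body></html>";
        System.out.println(between(page, "<td>", "</td>"));
    }
}
```

This only returns the first match. To collect every occurrence, the same search could be repeated in a loop, starting each indexOf() call from the end of the previous match.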

-----Original Message-----
From: Thom Hehl [mailto:thom@nowhereatall.com] 
Sent: 13 October 2005 11:33
To: HttpClient User Discussion
Subject: Re: How can I find certain words in a html page?

Start by looking at String.matches(). If that will meet your needs, it 
could save you a bit of work.
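
A small sketch of the regexp route, for comparison. As far as I know, String.matches() only reports whether the whole string matches the pattern, so pulling the text out needs java.util.regex.Pattern and Matcher with a capturing group. The class name, the extract() helper, and the sample tags are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexExtract {
    // Returns the text between the first startTag/endTag pair, or null if
    // no such pair is found. Pattern.quote() keeps the tags literal.
    static String extract(String page, String startTag, String endTag) {
        Pattern p = Pattern.compile(
                Pattern.quote(startTag) + "(.*?)" + Pattern.quote(endTag),
                Pattern.DOTALL);       // let '.' match across line breaks
        Matcher m = p.matcher(page);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical page source.
        String page = "<p>intro</p><td>line one\nline two</td>";

        // matches() answers yes/no for the whole string; it does not extract.
        System.out.println(page.matches("(?s).*<td>.*</td>.*"));

        // find() plus a capturing group returns the text between the tags.
        System.out.println(extract(page, "<td>", "</td>"));
    }
}
```

The reluctant quantifier (.*?) stops at the nearest closing tag; a greedy .* would run to the last one on the page.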

Owen Smith wrote:

>Since you have a pretty exact idea of what surrounds the data that
>you're looking for, a bit of work with regular expressions (regexps)
>should be enough to extract the data you want.  There are a bunch of
>packages you can use to provide regexp functionality.  A bit of
>searching with Google should be enough to get you started.
>
>HtH,
>Owen
>
>On 10/12/05, Graeme <coolkidd3@hotmail.com> wrote:
>  
>
>>I am going to be using HTTPCLIENT to get the source of a web page and I am
>>hoping to be able to extract certain information from that webpage. It will
>>all be HTML and I am looking for all the information between these tags
>>
>>    
>>
><snipped HTML stuff>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: httpclient-user-help@jakarta.apache.org
>
>
>  
>


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpclient-user-help@jakarta.apache.org

