commons-dev mailing list archives

From (Oliver Heger)
Subject [Latka]Following references
Date Fri, 12 Dec 2003 19:08:25 GMT

As promised, I have created a patch (Bugzilla bug ID 25487) that 
adds to Latka the ability to extract references from downloaded pages. 
These references can be referred to in later <request> tags.

The basic idea is that there are three new Jelly tags - link, form and 
frame - that can be put in the body of a <request> tag. With the 
attributes of these tags it is possible to refer to a corresponding 
reference on a page that has already been downloaded. The tags alter the 
actual request to load the URL defined by this reference. To demonstrate 
this, I have written an alternative version of the TestCommonsWebsite.xml 
script that makes use of these new tags.
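
To illustrate the idea of turning a reference on a downloaded page into 
the URL of a later request, here is a minimal Java sketch. It is not code 
from the patch: the class PageLinks, its methods, and the example.org URLs 
are hypothetical, and selecting a link by its text is only an assumption 
about how a tag attribute might identify a reference.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: remembers the links of one
// downloaded page and resolves a chosen link to an absolute URL.
class PageLinks {
    private final URI pageUrl;
    private final Map<String, String> hrefByText = new LinkedHashMap<>();

    PageLinks(String pageUrl) {
        this.pageUrl = URI.create(pageUrl);
    }

    // Records one link found on the page (link text -> raw href value).
    void add(String text, String href) {
        hrefByText.put(text, href);
    }

    // Returns the absolute URL a later request would load for this link.
    URI resolve(String text) {
        String href = hrefByText.get(text);
        if (href == null) {
            throw new IllegalArgumentException("No link with text: " + text);
        }
        // Relative hrefs are resolved against the URL of the page itself.
        return pageUrl.resolve(href);
    }
}
```

A relative href such as "latka/index.html" found on 
http://example.org/commons/index.html would resolve to 
http://example.org/commons/latka/index.html under this scheme.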

Some remarks on the code:

There is a new package o.a.c.latka.html that contains the classes for 
parsing HTML pages and extracting references. I could not use the 
URLExtractor class in the attic for this purpose because I had to obtain 
much more information from the page, e.g. the body of the <a> tags or 
hidden input fields in the body of a <form> tag.
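
As a rough sketch of the kind of information meant here (link targets 
together with the link body, plus hidden input fields), the following uses 
the HTML parser that ships with the JDK. It does not show the classes in 
o.a.c.latka.html: ReferenceCollector and its fields are hypothetical names, 
and for brevity hidden fields are not grouped by the form that contains them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical collector; the classes in the patch may look quite different.
class ReferenceCollector extends HTMLEditorKit.ParserCallback {
    final List<String[]> links = new ArrayList<>();            // {href, link text}
    final Map<String, String> hiddenFields = new LinkedHashMap<>();
    private String currentHref;                                // non-null while inside <a href=...>
    private StringBuilder currentText;

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        Object href = attrs.getAttribute(HTML.Attribute.HREF);
        if (tag == HTML.Tag.A && href != null) {
            currentHref = href.toString();
            currentText = new StringBuilder();
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        // Only text inside an open <a> element belongs to a link body.
        if (currentHref != null) {
            currentText.append(data);
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.A && currentHref != null) {
            links.add(new String[] {currentHref, currentText.toString().trim()});
            currentHref = null;
        }
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        // <input> has no body, so the parser reports it as a simple tag.
        Object type = attrs.getAttribute(HTML.Attribute.TYPE);
        Object name = attrs.getAttribute(HTML.Attribute.NAME);
        if (tag == HTML.Tag.INPUT && name != null && type != null
                && "hidden".equalsIgnoreCase(type.toString())) {
            Object value = attrs.getAttribute(HTML.Attribute.VALUE);
            hiddenFields.put(name.toString(), value == null ? "" : value.toString());
        }
    }

    // Parses the given HTML text and returns the filled collector.
    static ReferenceCollector parse(String html) {
        ReferenceCollector collector = new ReferenceCollector();
        try {
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector;
    }
}
```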

The RequestImpl class had to be slightly modified to allow sub tags of 
the <request> tag to change the URL of the actual request (creation of 
the HttpMethod object is now deferred). Also the ResponseImpl class now 
maintains an object with all references extracted from this page, which 
is created on demand.
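
The deferred creation described above can be sketched as follows. This is 
not the modified RequestImpl: DeferredRequest, setUrl, getMethod and the 
factory parameter are hypothetical, and the type parameter M merely stands 
in for whatever object performs the request (an HttpMethod in Latka). The 
same create-on-first-use pattern would apply to the references object kept 
by the response.

```java
import java.util.function.Function;

// Hypothetical stand-in for a request whose underlying method object
// is created only when it is first needed.
class DeferredRequest<M> {
    private final Function<String, M> methodFactory; // builds the method object from a URL
    private String url;
    private M method;                                // stays null until first use

    DeferredRequest(String url, Function<String, M> methodFactory) {
        this.url = url;
        this.methodFactory = methodFactory;
    }

    // Sub tags may redirect the request as long as the method object
    // has not been created yet.
    void setUrl(String newUrl) {
        if (method != null) {
            throw new IllegalStateException("Request has already been created");
        }
        this.url = newUrl;
    }

    // Creates the method object on first call and reuses it afterwards.
    // Not thread-safe; a single-threaded caller is assumed.
    M getMethod() {
        if (method == null) {
            method = methodFactory.apply(url);
        }
        return method;
    }
}
```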

I hope that I have followed all the coding style guidelines. Please have a look at 
the code and let me know what you think. If you decide to apply the 
patch, I can also provide some documentation updates.

