jmeter-user mailing list archives

From Bruce Atherton
Subject Using JMeter for Archiving a Website?
Date Mon, 10 Dec 2001 17:52:43 GMT
I am trying to archive the contents of an extranet website which is mostly 
dynamic content. I'd like to record its contents every day, and store 
them in a format where you can open up the website and browse it as it 
existed on any given day.

I posted a message on Usenet and one respondent suggested I look at JMeter 
for a solution. I was wondering whether anyone on this list had set up 
JMeter to do something similar, or had other suggestions as to how I could 
accomplish my task, involving JMeter or not. I'm willing to code some Java 
if that would help.

Some of the features I require for this website snapshot program:

1. Parse the HTML and extract further URLs to follow, just like any spider.

2. Provide support for URL Encoding of a Session ID

3. Parse forms to recognize Submit URLs and the field data that must be 
returned in a POST, including hidden fields.

4. Allow setting a configuration file to provide the data that should be 
returned for a particular field in a form (for example, setting what should 
be returned in "username" and "password" fields).

5. Support regular expressions so that you can make sure the session is 
going the way it should. For example, if you get "Login Failed" in the 
returned HTML you should be able to recognize that as an error condition.

6. Replace any absolute URLs with relative ones, so that if you open the 
archive on disk it will look and act exactly the same way the web site did 
that day.

7. Do depth-first searches (which a user could conceivably do) rather than 
breadth-first (which a user could not do) so that context within the 
session is kept sensible.
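
To make several of these requirements concrete, here is a minimal sketch 
in Python using only the standard library. It is not JMeter and not a 
finished tool. Every name in it (PageParser, fill_form, find_errors, 
relativize, crawl_depth_first) is hypothetical, and the fetch function is 
assumed to be supplied by the caller and to return a page's HTML as a 
string. It touches on items 1, 3, 4, 5, 6 and 7. Session ID handling 
(item 2) is left out.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collects link targets and form fields from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []    # absolute URLs found in <a href="...">
        self.forms = []    # one dict per <form>: action, method, fields
        self._form = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "form":
            self._form = {
                "action": urljoin(self.base_url, attrs.get("action") or ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
            self.forms.append(self._form)
        elif tag == "input" and self._form is not None and attrs.get("name"):
            # Hidden fields are kept too, since they must be sent back.
            self._form["fields"][attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._form = None


def fill_form(form, overrides):
    """Merge configured values (e.g. username/password) into a form's fields."""
    data = dict(form["fields"])
    data.update({k: v for k, v in overrides.items() if k in data})
    return data


# Assumed error markers; a real list would come from configuration.
ERROR_PATTERNS = [re.compile(r"Login Failed", re.IGNORECASE)]


def find_errors(html):
    """Return the patterns that match, so the caller can flag the session."""
    return [p.pattern for p in ERROR_PATTERNS if p.search(html)]


def relativize(url, page_url):
    """Rewrite a same-site absolute URL relative to the page that links to it."""
    target, page = urlparse(url), urlparse(page_url)
    if (target.scheme, target.netloc) != (page.scheme, page.netloc):
        return url  # off-site links are left alone
    page_dir = posixpath.dirname(page.path) or "/"
    rel = posixpath.relpath(target.path or "/", page_dir)
    return rel + ("?" + target.query if target.query else "")


def crawl_depth_first(start_url, fetch, max_pages=100):
    """Visit pages depth first, following the first link on each page first."""
    seen, order, stack = set(), [], [start_url]
    while stack and len(order) < max_pages:
        url = stack.pop()
        if url in seen:
            continue
        seen.add(url)
        parser = PageParser(url)
        parser.feed(fetch(url))
        order.append(url)
        # Push in reverse so the first link on the page is popped next.
        stack.extend(reversed(parser.links))
    return order
```

With a real site, fetch would wrap an HTTP client that keeps the session 
alive. If the session ID is encoded in the URL, it would probably need to 
be stripped before the "seen" check, otherwise the same page could be 
visited under several different URLs.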

Any pointers, suggestions, guidelines? I'd be most appreciative of any 
information. Thanks.
