jmeter-user mailing list archives

From sebb <seb...@gmail.com>
Subject Re: Regular expression extractor for spider
Date Tue, 03 Sep 2013 18:36:40 GMT
On 3 September 2013 19:08, Jordi Carretero <jordicarretero@gmail.com> wrote:
> Hi
>
> I'm building a spider using a regular expression extractor and a for-each-
> controller and works pretty well but..
>
> I'm using <a href="[.]*/([^"]+)" as a expression extractor , and works well
> to extract links like:
> <a href="../rel/c/items" >
> <a href="/professions.html"
>
> but I can not find any expression that will work at the same time for
> expressions found in some sites like:
>
> <a href="http://www.mysite.es/index.php?main_page=page&amp;id=20"
>
> that include the full domain at the beginning (and has to be removed)
>
> It's a matter of working with the perl expression but after some days I
> could not manage to make it work, so any help will be appreciated

If you want to ignore an optional string, use something like:

(?:http://www\.mysite\.es)?

The form (abc)? matches abc or nothing; the (?:...) form is a
non-capturing group, so its contents are not saved as a match group.
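
A minimal sketch of that behaviour, using Python's re module purely for
illustration (JMeter's Regular Expression Extractor has its own Perl5-style
engine, but this group syntax is the same). The surrounding <a href="..."
pattern and the sample strings are assumptions, not part of the original
expression:

```python
import re

# Optional, non-capturing domain prefix followed by a single capturing group.
# The domain comes from the question; the rest of the pattern is illustrative.
pattern = re.compile(r'<a href="(?:http://www\.mysite\.es)?(/[^"]+)"')

with_domain = pattern.search('<a href="http://www.mysite.es/professions.html"')
without_domain = pattern.search('<a href="/professions.html"')

# Group 1 holds the path in both cases; the prefix is matched but never saved.
print(with_domain.group(1))
print(without_domain.group(1))
```

Because the prefix group is non-capturing, the path stays in group 1 whether
or not the domain is present, so the template in the extractor would not need
to change.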

In your case, if you want to ignore any of ".", ".." or
"http://www.mysite.es" you could use:

(?:http://www\.mysite\.es|\.\.?)?
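
As a rough sketch of how this fragment might slot into the expression from
the question, the example below puts it in place of the leading "[.]*" and
runs it over the three sample links. It uses Python's re module only for
illustration; combining the fragment with the /([^"]+)" tail is an assumption
about the intended use, and the html string is made up from the links quoted
above:

```python
import re

# Optional prefix (full domain, "." or "..") that is matched but not captured,
# then the slash and the capture group from the original expression.
pattern = re.compile(r'<a href="(?:http://www\.mysite\.es|\.\.?)?/([^"]+)"')

# Sample markup assembled from the links in the question (illustrative only).
html = '''<a href="../rel/c/items" >
<a href="/professions.html">
<a href="http://www.mysite.es/index.php?main_page=page&amp;id=20">'''

# With exactly one capturing group, findall returns that group for each match.
links = pattern.findall(html)
for link in links:
    print(link)
```

Each extracted value is the path after the first slash, with the domain or
the leading dots dropped. HTML entities such as &amp; are left as they
appear in the page.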

BTW, rather than use "[.]" to escape the meta-character ".", the usual
method is "\.".

> Thanks
