jmeter-user mailing list archives

From Felix Schumacher <felix.schumac...@internetallee.de>
Subject Re: Test Script Recorder XML Regex Matching
Date Tue, 07 Oct 2014 19:13:09 GMT
On 06.10.2014 at 02:08, sebb wrote:
> On 5 October 2014 15:24, Felix Schumacher
> <felix.schumacher@internetallee.de> wrote:
>> On 05.10.2014 at 14:35, sebb wrote:
>>
>>> On 5 October 2014 13:26, Felix Schumacher
>>> <felix.schumacher@internetallee.de> wrote:
>>>> On 05.10.2014 at 11:30, sebb wrote:
>>>>
>>>>> On 4 October 2014 19:41, Philippe Mouawad <philippe.mouawad@gmail.com>
>>>>> wrote:
>>>>>> On Sat, Oct 4, 2014 at 2:10 PM, Felix Schumacher <
>>>>>> felix.schumacher@internetallee.de> wrote:
>>>>>>
>>>>>>> On 29.09.2014 at 22:32, Philippe Mouawad wrote:
>>>>>>>
>>>>>>>> Hi Felix,
>>>>>>>>
>>>>>>> Hi
>>>>>>> I agree with sebb, patch is interesting.
>>>>>>>> But it clearly needs to be documented (I think many users don't know
>>>>>>>> about this feature, which is really interesting) as well as the code;
>>>>>>>> reading the patch first, it wasn't clear to me what was intended.
>>>>>>>>
>>>>>>> I have added documentation to the patch and found two other things
>>>>>>> that I changed in the same bug entry.
>>>>>>>
>>>>>>> The random order of applying the matchers seems a bit strange, so I
>>>>>>> sorted the matchers first by their length and, if the matchers are the
>>>>>>> same length, then by the name of their keys. So the set
>>>>>>>     {'domain': 'example.com', 'server': 'www', 'regex': 'w.*'}
>>>>>>> would be applied in the order ['domain', 'regex', 'server'], since
>>>>>>> 'domain' has the longest matcher and 'regex' comes before 'server'
>>>>>>> alphabetically (both matchers are the same length).
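
The ordering proposed above (longest matcher first, ties broken by variable name) can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and it is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JMeter.
final class MatcherOrderSketch {

    // Returns the variable names ordered by matcher length (longest first);
    // matchers of equal length are ordered alphabetically by variable name.
    static List<String> orderByLengthThenName(Map<String, String> variables) {
        List<Map.Entry<String, String>> entries = new ArrayList<>(variables.entrySet());
        entries.sort((a, b) -> {
            int byLength = Integer.compare(b.getValue().length(), a.getValue().length());
            return byLength != 0 ? byLength : a.getKey().compareTo(b.getKey());
        });
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String> entry : entries) {
            names.add(entry.getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> variables = new LinkedHashMap<>();
        variables.put("domain", "example.com");
        variables.put("server", "www");
        variables.put("regex", "w.*");
        System.out.println(orderByLengthThenName(variables)); // [domain, regex, server]
    }
}
```

Note that the sort key is the length of the matcher text, not the length of whatever it ends up matching, which is the point the replies below take issue with.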
>>>>>>>
>>>>>> Isn't it better to order by longest value or regexp ?
>>>>>> www is more specific than w.*
>>>>>> So would be :
>>>>>> domain, server , regex
>>>>> Or the code could try to match every variable and select the one that
>>>>> produces the longest match.
>>>>>
>>>>> But rather than try and sort the regexes, which is always going to be
>>>>> tricky to do "correctly" (whatever that means), maybe the user should
>>>>> be given control of the matching order.
>>>>>
>>>>> For example, it is probably possible to match by order of appearance.
>>>>>
>>>>> It would certainly be possible to match the variables in sorted order
>>>>> by name.
>>>>> This would be a bit more awkward to use than changing the order of
>>>>> variable definitions.
>>>> I just wanted to give a simple algorithm for ordering, which I think is
>>>> better than random ordering.
>>>>
>>>> Correctness will be hard to implement when everyone has a different view
>>>> of the correct ordering.
>>>>
>>>> I had thought of giving more control to the user by appending something
>>>> to sort by to the variable names.
>>>>
>>>> For example, extending the above example with variable names ['domain',
>>>> 'server', 'regex'], the names could be changed to ['domain_3',
>>>> 'server_1', 'regex_2'] to impose replacement in the order ['server',
>>>> 'regex', 'domain'].
>>>> But what should we do with the suffix '_\d+'? (A prefix could be used,
>>>> too.)
>>>>
>>>> We could look for a specially named variable like '_regex_order', which
>>>> could hold a comma-separated list of the variable names in the desired
>>>> order.
>>>>
>>>> The longer I think about it, the more I am inclined to take the simple
>>>> ordering algorithm of length and then name. One can always make any
>>>> regex longer by adding useless junk like '(?:WILLNOTBEFOUNDANYWAY)?'
>>>> and in that way influence the order.
>>> No, length of regex is not useful.
>> But it is easy to do and can be done consistently before trying to match :)
> Just because it is easy does not make it useful.
>
>>> More useful would be sorting by matched string.
>> I will try to do a patch which will do that, but I think it will be more
>> complex.
> Yes, it will be more complex.
> But I think it is more likely to be correct, but that's not
> guaranteed, which is why I think the user should have control.
>
>>> Sorting by name is awkward to use, and anyway what about non-regexes
>>> that happen to match the same text?
>> Well in regex mode every string happens to be a regex. And with sorting by
>> name do you include using
>> (and possibly stripping off) a prefix or suffix?
> It's not only regex matching that has potential ordering issues.
>
>>>
>>> I don't think it's possible to automatically sort correctly by regex.
>> Well it is simple to order it correctly, when you want to have it sorted by
>> the current algorithm. But that is
>> obviously not your preferred order.
> I just don't think it's possible to guarantee the correct order automatically.
> The best one can hope for is that it will be right more often than
> not, but there will always be edge cases.
>
> Which is why I don't think it's worth trying to guess what people will need.
> The user needs to be in control.
>
>> As I said, I think any repeatable
>> ordering is better than no order.
> It needs to be predictable (and documented).
> The current order is repeatable (on a given system), but is not easily
> predictable.
>
>>> So we should allow the user to control the search order, as I already
>>> suggested a short while ago.
>> Right, what is your suggestion of means to accomplish that order?
> I already suggested using the order of definition of the variables on
> the test plan.
> That should be possible, and is easier to use than variable renaming -
> which may result in awkward names.
Having looked a bit deeper into this, it seems the order is already 
defined :)

In Arguments#getArgumentsAsMap the arguments are added to a LinkedHashMap,
so we already have deterministic behaviour.
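
As a small illustration of why a LinkedHashMap gives a predictable order: it iterates over its entries in the order they were inserted. The sketch below uses only java.util; the class name and the sample definitions are made up, and it does not call any JMeter code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of JMeter.
final class DefinitionOrderSketch {

    // Copies name/value pairs into a LinkedHashMap, which keeps insertion order.
    static Map<String, String> asOrderedMap(String[][] definitions) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String[] definition : definitions) {
            map.put(definition[0], definition[1]);
        }
        return map;
    }

    // Lists the variable names in the map's iteration order.
    static List<String> names(Map<String, String> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        String[][] definitions = {
            {"server", "www"},
            {"regex", "w.*"},
            {"domain", "example.com"},
        };
        System.out.println(names(asOrderedMap(definitions))); // [server, regex, domain]
    }
}
```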

Sorry for the noise on that one.

Regards
  Felix
>
>> Would you like it to be another variable with a special name?
>> (I called that one '_regex_order' above).
> Not unless it's not possible to use the definition order.
> It's even more awkward to use than variable renaming.
>
>> What happens to variables that the user forgot to mention?
> Hopefully not relevant, but I imagine they would be handled in alpha
> order after the others.
>
>
>> Regards
>>
>>   Felix
>>>>>>> If no one objects, I will submit it next week.
>>>>>>>
>>>>>>> Regards
>>>>>>>     Felix
>>>>>>>
>>>>>>>> Thanks for contributing
>>>>>>>> Regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On Monday, September 29, 2014, sebb <sebbaz@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On 29 September 2014 15:49, Felix Schumacher
>>>>>>>>> <felix.schumacher@internetallee.de> wrote:
>>>>>>>>>
>>>>>>>>>> On 29 September 2014 at 12:46:19 CEST, sebb <sebbaz@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 29 September 2014 11:24, Felix Schumacher
>>>>>>>>>>> <felix.schumacher@internetallee.de> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 29.09.2014 at 11:56, sebb wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 28 September 2014 18:11, Felix Schumacher
>>>>>>>>>>>>> <felix.schumacher@internetallee.de> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 22.09.2014 at 11:13, Marijn Wijbenga wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've attached a JMeter project file and an HTML file that
>>>>>>>>>>>>>> demonstrate the issue. In order to reproduce:
>>>>>>>>>>>>>> 1. Load up xml-bug-test.jmx in JMeter.
>>>>>>>>>>>>>> 2. Start the proxy (recorder).
>>>>>>>>>>>>>> 3. Place xml-bug-test.html on a web server somewhere (if on
>>>>>>>>>>>>>>    localhost, do not forget to remove localhost from the proxy
>>>>>>>>>>>>>>    exclusion if applicable).
>>>>>>>>>>>>>> 4. Navigate with a browser to this file (using the proxy).
>>>>>>>>>>>>>> 5. Click both buttons in order.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I could not post to an HTML file, hence the "test 2" button will
>>>>>>>>>>>>>> post to Google. The page that loads has an error, but it still
>>>>>>>>>>>>>> records the post request, which is what we want to see.
>>>>>>>>>>>>>> I also discovered that when I was using a "get" request instead
>>>>>>>>>>>>>> (I've made that "test 1"), it doesn't match the first character
>>>>>>>>>>>>>> (%). I think this is related.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The project has a user defined variable called "TEST" with a
>>>>>>>>>>>>>> value of ".*", I've ticked the box
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To see the results: in the recording controller, the last two
>>>>>>>>>>>>>> requests contain a parameter with these values:
>>>>>>>>>>>>>> Test 1: %${TEST}
>>>>>>>>>>>>>> Test 2: <${TEST}>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Both should be just ${TEST} I believe.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the current implementation the regex will be matched against
>>>>>>>>>>>>>> a pattern which looks like
>>>>>>>>>>>>>>      \b(YOUR_VALUE)\b
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As % and < are non-word characters, the \b word boundaries fall
>>>>>>>>>>>>>> just inside them, so they are excluded from your match.
>>>>>>>>>>>>> This is deliberate.
>>>>>>>>>>>>> There were problems previously as partial values were being
>>>>>>>>>>>>> unexpectedly matched.
>>>>>>>>>>>>>
>>>>>>>>>>>>> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52678
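
The effect of the \b(YOUR_VALUE)\b wrapping described above can be reproduced with plain java.util.regex. This is a sketch under assumptions: the recorder may use a different regex engine internally, the class and method names are hypothetical, and the sample inputs are made up to resemble the reported values.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of JMeter.
final class WordBoundarySketch {

    // Wraps the variable's value as \b(VALUE)\b and replaces the first match
    // with a ${NAME} reference.
    static String replaceFirstMatch(String input, String value, String variableName) {
        Pattern pattern = Pattern.compile("\\b(" + value + ")\\b");
        // quoteReplacement stops "$" from being read as a group reference.
        String reference = Matcher.quoteReplacement("${" + variableName + "}");
        return pattern.matcher(input).replaceFirst(reference);
    }

    public static void main(String[] args) {
        // The match must start and end at a word boundary, so the outer
        // '<' and '>' (or the leading '%') stay outside the match.
        System.out.println(replaceFirstMatch("<saml>token</saml>", ".*", "TEST")); // <${TEST}>
        System.out.println(replaceFirstMatch("%3Csaml%3E", ".*", "TEST"));         // %${TEST}
    }
}
```

Inner '<' and '>' characters are still consumed by ".*"; only the outermost non-word characters are left behind, which matches what was reported.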
>>>>>>>>>>>>>
>>>>>>>>>>>> I thought so. Maybe that would have been helped by adding more
>>>>>>>>>>>> documentation, but then it is regex...
>>>>>>>>>>>>
>>>>>>>>>>>>>> I would consider this a bug, or at least the documentation could
>>>>>>>>>>>>>> be a bit more concise.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Patches welcome.
>>>>>>>>>>>>>
>>>>>>>>>>>> A patch was attached :)
>>>>>>>>>>>>
>>>>>>>>>>> I meant that we would welcome a patch for the documentation.
>>>>>>>>>>> Or at least some indication of where the documentation needs to be
>>>>>>>>>>> updated to clarify the current behaviour.
>>>>>>>>>>>
>>>>>>>>>> I will look into that.
>>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>>> What is your opinion on the option to detect parens and modify the
>>>>>>>>>> regex behavior?
>>>>>>>>>
>>>>>>>>> Looks good to me.
>>>>>>>>>
>>>>>>>>> The parens are very unlikely to have been used in existing tests, so
>>>>>>>>> the modified behaviour is unlikely to break anything.
>>>>>>>>> But we should document it in the release notes just in case.
>>>>>>>>>
>>>>>>>>>     Felix
>>>>>>>>>>>>> Attached is a patch against trunk, which checks whether the regex
>>>>>>>>>>>>> starts with '(' and ends with ')' and, if so, uses the regex as
>>>>>>>>>>>>> given instead of building its own version.
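
A rough sketch of the behaviour this patch description suggests: a value that starts with '(' and ends with ')' is used as the pattern unchanged, and anything else is wrapped in \b(...)\b as before. The names below are hypothetical, the code is not taken from the patch, and java.util.regex is assumed in place of whatever engine the recorder actually uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not the patched JMeter code.
final class RecorderPatternSketch {

    // Uses a fully parenthesised value as given; otherwise adds \b( ... )\b.
    static String buildPattern(String value) {
        if (value.startsWith("(") && value.endsWith(")")) {
            return value;
        }
        return "\\b(" + value + ")\\b";
    }

    // Replaces the first match of the built pattern with a ${NAME} reference.
    static String substitute(String input, String value, String variableName) {
        String reference = Matcher.quoteReplacement("${" + variableName + "}");
        return Pattern.compile(buildPattern(value)).matcher(input).replaceFirst(reference);
    }

    public static void main(String[] args) {
        System.out.println(substitute("<saml>token</saml>", ".*", "TEST"));   // <${TEST}>
        System.out.println(substitute("<saml>token</saml>", "(.*)", "TEST")); // ${TEST}
    }
}
```

With the parenthesised form the user takes over responsibility for boundaries, so partial matches like those the \b wrapping was meant to prevent become possible again.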
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Please use Bugzilla for patches; it's easier to keep track of
>>>>>>>>>>>>> them.
>>>>>>>>>>>>>
>>>>>>>>>>>> I already did so yesterday, shortly after sending my mail.
>>>>>>>>>>>> It is https://issues.apache.org/bugzilla/show_bug.cgi?id=57032
>>>>>>>>>>>>
>>>>>>>>>>>> What is missing from the patch is documentation. If the feature as
>>>>>>>>>>>> such is ok, then I would add that to the existing documentation.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>>      Felix
>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, see notes below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: sebb [mailto:sebbaz@gmail.com]
>>>>>>>>>>>>>> Sent: 21 September 2014 01:52
>>>>>>>>>>>>>> To: JMeter Users List
>>>>>>>>>>>>>> Subject: Re: Test Script Recorder XML Regex Matching
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 19 September 2014 16:45, Marijn Wijbenga
>>>>>>>>>>>>>> <Marijn.Wijbenga@cgpbooks.co.uk> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have an issue, which might well be a potential bug, where a
>>>>>>>>>>>>>> posted value is not being matched by the Test Script Recorder's
>>>>>>>>>>>>>> Regex Matching functionality.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The request I'm recording has a post value containing XML (a SAML
>>>>>>>>>>>>>> token, to be exact) which I'd like to replace with a variable
>>>>>>>>>>>>>> automatically.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What does the value look like?
>>>>>>>>>>>>>> Does it have multiple lines?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, it did not have multiple lines. I did check if this was the
>>>>>>>>>>>>>> case, but it wasn't.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For testing purposes I have configured a User Defined Variable
>>>>>>>>>>>>>> (called TEST) with a value of "(?s)^.*$"; I've tried "^.*$" and
>>>>>>>>>>>>>> ".*" as well (all without double quotes).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Only ".*" replaces the content with this: <${TEST}>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That does not make sense.
>>>>>>>>>>>>>> ".*" will match everything, including < and >, so the content
>>>>>>>>>>>>>> would become ${TEST}
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I know. It doesn't really. Hence I think this might be a bug.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've tried other expressions as well and I'm able to match
>>>>>>>>>>>>>> anything within the <> characters, but not those characters
>>>>>>>>>>>>>> themselves.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Again, that does not make sense.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The weird thing is that inside the outer <> characters there are
>>>>>>>>>>>>>> other <> characters that are matched fine. It's just the first
>>>>>>>>>>>>>> and last character.
>>>>>>>>>>>>>> Has anyone else experienced the same thing, or is this a known
>>>>>>>>>>>>>> issue?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is not a known issue, and may not even be an issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Or should I post this in the developer's mailing list?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, the developers all follow this list.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Great, please see attachment for an example.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@jmeter.apache.org
For additional commands, e-mail: user-help@jmeter.apache.org

