commons-issues mailing list archives

From "Marcus Craske (JIRA)" <>
Subject [jira] [Created] (VALIDATOR-429) UrlValidator - path is invalid due to using java.net.URI for validation (regression)
Date Wed, 20 Sep 2017 22:47:00 GMT
Marcus Craske created VALIDATOR-429:

             Summary: UrlValidator - path is invalid due to using java.net.URI for validation
                 Key: VALIDATOR-429
             Project: Commons Validator
          Issue Type: Bug
          Components: Routines
    Affects Versions: 1.6
            Reporter: Marcus Craske

h1. Summary
We've been hit by a bug in a real-world application after upgrading from 1.4.1 to 1.6: previously
valid URLs are no longer considered valid, which looks to be due to java.net.URI being used to
validate the path of a URL.

h2. Steps to Reproduce
Our application went to validate URLs similar to the following:

This is no longer valid in 1.6, but the following cases are:

h2. Impact
It seems paths in UrlValidator are being parsed/validated as host-names, per's

h2. Technical
It looks like this may have been introduced by the following change:

Specifically, this appears to be due to java.net.URI now being used to validate a path. The usage is as follows in
URI uri = new URI(null, null, path, null);

It looks like URI tries to parse the path as a hostname when the scheme and hostname are
not specified.

Example to reproduce:
new URI(null, null, "//_test", null);   // throws URISyntaxException

Same example with other parts, no longer throwing exception:
new URI(null, "test", "//_test", null);

Even though the documentation states that string components can be null, the URL string that is
built internally, and then validated, is slightly different. When a hostname is specified, URI
internally constructs:
* //test//_test

With no hostname, which is how UrlValidator calls it, the following is constructed and validated:
* //_test
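
The difference between the two cases can be checked with a small standalone program. This is only a sketch for reproducing the behaviour described above; the class name UriPathRepro and the helper parsedPath are made up for illustration and are not part of Commons Validator.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriPathRepro {

    /**
     * Hypothetical helper: builds a URI from a host and a path using the same
     * four-argument constructor as in the report, and returns the path that
     * java.net.URI parsed, or null if the constructor rejected the input.
     */
    static String parsedPath(String host, String path) {
        try {
            return new URI(null, host, path, null).getPath();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // No host: the leading "//" of the path is read as the start of an authority.
        System.out.println("host=null: " + parsedPath(null, "//_test"));
        // Host given: the path is appended after the authority and parsed as a path.
        System.out.println("host=test: " + parsedPath("test", "//_test"));
    }
}
```

With a null host the helper should return null for "//_test" (the constructor throws), while with host "test" the same path should come back unchanged.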

Therefore it looks like there's either a bug in, or its usage is not correctly

h2. Fix
A potential fix is to change org.apache.commons.validator.routines.UrlValidator to pass an
empty string as the hostname. Internally, in java.net.URI, this produces:
* ////_test

Thus the hostname is parsed as empty, which is accepted, and the correct path is validated.

Would this fix be appropriate, or considered too fragile?
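
A minimal sketch of the empty-hostname idea, assuming the check is only that the path survives parsing intact. The class EmptyHostPathCheck and the method pathParses are hypothetical names for illustration; this is not the actual UrlValidator code.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EmptyHostPathCheck {

    /**
     * Hypothetical check: passes an empty string (not null) as the host so
     * that java.net.URI emits an empty authority before the path, then
     * confirms the parsed path matches the input.
     */
    static boolean pathParses(String path) {
        try {
            URI uri = new URI(null, "", path, null);
            return path.equals(uri.getPath());
        } catch (URISyntaxException e) {
            // Construction failed, so the path is treated as invalid.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(pathParses("//_test"));
        System.out.println(pathParses("/foo/bar"));
    }
}
```

This relies on java.net.URI tolerating an empty authority in front of a non-empty path, which is part of why the approach may be considered fragile.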

Alternatively the fix could be to extract similar logic to, to validate the path,
which appears to be just checking the characters are valid and between a certain range. This
logic can be seen in, which calls upon checkChars.

This message was sent by Atlassian JIRA
