commons-issues mailing list archives

From "Bruno P. Kinoshita (JIRA)" <>
Subject [jira] [Commented] (VALIDATOR-429) UrlValidator - path is invalid due to using java.net.URI for validation (regression)
Date Thu, 12 Oct 2017 07:57:00 GMT


Bruno P. Kinoshita commented on VALIDATOR-429:

Might be easier to review the suggested changes if there is a pull request on GitHub, or a
patch attached (the former is preferable IMO). I might have time to give it a try and review
it in the next few days.

> UrlValidator - path is invalid due to using java.net.URI for validation (regression)
> ------------------------------------------------------------------------------------
>                 Key: VALIDATOR-429
>                 URL:
>             Project: Commons Validator
>          Issue Type: Bug
>          Components: Routines
>    Affects Versions: 1.6
>            Reporter: limpygnome
>              Labels: easyfix
> h1. Summary
> We've been hit by a bug in a real-world application after upgrading from 1.4.1 to 1.6, where
> previously valid URLs are no longer valid. This looks to be due to java.net.URI being used
> to validate the path of a URL.
> h1. Steps to Reproduce
> Our application went to validate URLs similar to the following:
> *
> This is no longer valid in 1.6.1, but the following cases are:
> *
> *
> h1. Impact
> It seems paths in UrlValidator are being parsed/validated as host names by java.net.URI.
> h1. Technical
> It looks like this may have been introduced by the following change:
> Specifically, this appears to be due to java.net.URI now being used to validate a path. The
> usage in org.apache.commons.validator.routines.UrlValidator is as follows:
> {code}
> URI uri = new URI(null,null,path,null);
> {code}
> It looks like URI is trying to parse the path as a hostname when the scheme and hostname
> are not specified.
> Example to reproduce:
> {code}
> new URI(null, null, "//_test", null);   // throws URISyntaxException
> {code}
> Same example with other parts, no longer throwing exception:
> {code}
> new URI(null, "test", "//_test", null);
> {code}
> Even though the java.net.URI documentation states that string components can be null, it
> seems the URL built internally, which is what gets validated, is slightly different. So when
> specifying a hostname with URI, internally it constructs:
> * //test//_test
> Using no hostname, in the same way as UrlValidator, the following is constructed and
> validated internally:
> * //_test
> Therefore it looks like there's either a bug in java.net.URI, or its usage is not correct.
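
To make the behaviour described above easier to check locally, here is a minimal sketch that exercises the four-argument java.net.URI constructor with and without a host. The class and method names (UriHostParsingDemo, tryBuild) are made up for illustration and are not part of Commons Validator.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriHostParsingDemo {

    /**
     * Hypothetical helper: builds a URI from a host and path only (scheme and
     * fragment are null, as in the snippets above). Returns null if the
     * constructor rejects the combination.
     */
    static URI tryBuild(String host, String path) {
        try {
            return new URI(null, host, path, null);
        } catch (URISyntaxException e) {
            System.out.println("rejected: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        // No host: the leading "//" of the path is read as the start of an
        // authority, so "_test" is parsed as a hostname.
        URI withoutHost = tryBuild(null, "//_test");
        System.out.println("without host: " + withoutHost);

        // With a host: the path is kept as a path.
        URI withHost = tryBuild("test", "//_test");
        if (withHost != null) {
            System.out.println("with host:    " + withHost
                    + " (host=" + withHost.getHost()
                    + ", path=" + withHost.getPath() + ")");
        }
    }
}
```

Printing the resulting URI shows the string that was assembled and parsed internally, which is a convenient way to compare the two cases.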
> h1. Fix
> A potential fix is to change org.apache.commons.validator.routines.UrlValidator to pass
> an empty string as the hostname. Internally, in java.net.URI, this produces:
> * ////_test
> Thus the hostname is empty, which is considered empty, and the correct path is validated.
> Would this fix be appropriate, or considered too fragile?
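
A rough sketch of the empty-hostname idea, for discussion only. This is not the actual UrlValidator code; the class and method names are hypothetical, and it assumes the path being checked starts with "/".

```java
import java.net.URI;
import java.net.URISyntaxException;

class EmptyHostWorkaround {

    /**
     * Hypothetical helper: passes an empty string (instead of null) as the
     * host so that the path cannot be mistaken for an authority.
     * Returns the path as parsed by java.net.URI, or null if it is rejected.
     * Assumes the path starts with "/".
     */
    static String pathViaEmptyHost(String path) {
        try {
            URI uri = new URI(null, "", path, null);
            return uri.getPath();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // With an empty host, the path from the report is parsed as a path.
        System.out.println(pathViaEmptyHost("//_test"));
        System.out.println(pathViaEmptyHost("/test"));
    }
}
```

One reason this could be considered fragile: it relies on how java.net.URI treats an empty authority, and a path that does not begin with "/" would be concatenated directly after the "//" and read as a host instead.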
> Alternatively, the fix could be to extract logic similar to that in java.net.URI to validate
> the path, which appears to just check that the characters are valid and within a certain
> range. This logic can be seen in java.net.URI's parser, which calls upon checkChars.
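
The character-check alternative could look roughly like the sketch below. It is neither the JDK's checkChars nor existing Commons Validator code: the allowed set is an assumption modelled on the RFC 2396 path characters (alphanumerics, the "mark" characters, a few punctuation characters, "/" and ";", plus %XX escapes), and the names are hypothetical.

```java
class PathCharCheck {

    // Assumed allowed punctuation, modelled on RFC 2396 path characters.
    private static final String ALLOWED_PUNCTUATION = "-_.!~*'():@&=+$,;/";

    private static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    private static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    /** Hypothetical helper: true if every character of the path is allowed. */
    static boolean hasOnlyPathChars(String path) {
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (isAsciiAlphanumeric(c) || ALLOWED_PUNCTUATION.indexOf(c) >= 0) {
                continue;
            }
            // A percent sign must be followed by exactly two hex digits.
            if (c == '%' && i + 2 < path.length()
                    && isHexDigit(path.charAt(i + 1))
                    && isHexDigit(path.charAt(i + 2))) {
                i += 2;
                continue;
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasOnlyPathChars("//_test"));
        System.out.println(hasOnlyPathChars("/a b"));
    }
}
```

A check like this never builds a URI string, so the path cannot be reinterpreted as a hostname, at the cost of maintaining the character set by hand.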
