directory-dev mailing list archives

From Steven Legg
Subject Re: [filter] interpretting presence verses substring with whitespace
Date Fri, 12 Nov 2004 03:33:29 GMT


Kurt D. Zeilenga wrote:
> At 08:59 PM 11/8/2004, Steven Legg wrote:
>>Hi Alex,
>>Alex Karasulu wrote:
>>>I have some questions regarding the interpretation of LDAP search filters, specifically
>>>differentiating between presence and substring items when whitespace is present.  According
>>>to the ABNF describing these rules in [FILTERS], and some additional rules in [MODELS],
>>>    ...
>>>    present        = attr EQUALS ASTERISK
>>>    substring      = attr EQUALS [initial] any [final]
>>>    initial        = assertionvalue
>>>    any            = ASTERISK *(assertionvalue ASTERISK)
>>>    final          = assertionvalue
>>>    attr           = attributedescription
>>>    ...,
>>>the presence of whitespace is considered significant in the assertionvalue.
> This wording is, I think, causing your problem.
> Any and all whitespace is part of some assertionvalue.
> Whether or not its significant to the evaluation of the
> filter depends on the rule involved.
>>>Please correct me if I'm wrong but this means that the following filter expressions
>>>are interpreted differently:
>>>(for simplicity I'm equating whitespace to be a single space character, %x20)
>>>1. (ou=*)
>>>  - there is no whitespace at all
>>>  - interpreted as a presence filter
>>>  - matches all entries containing the ou attribute
>>>2. (ou= *)
>>>  - there is whitespace before the ASTERISK after the EQUALS
>>>  - interpreted as a substring filter
>>>  - the space is interpreted as the [initial]
>>>  - matches all values of ou starting with a space, %x20
>>The exact matching behaviour depends on the attribute type. Typically though,
>>it will be equivalent to caseIgnoreSubstringsMatch. Assuming that is the case
>>then the current ldapbis specifications would invoke stringprep on each
>>candidate attribute value and each substring of the assertion.
> I argue that the behavior described by X.521 is the same
> as prescribed by [Syntaxes][LDAPprep].
>>The result will
>>be that no attribute value will have (for matching purposes) a leading space.
>>The initial substring will get reduced to empty which then becomes a
>>single space. After that it is a code point comparison. Since no attribute
>>value has a leading space, none are matched, and the result is empty.
> IMO, that's what X.521 says should happen.
>>This isn't the intuitive result either. The same occurs in the other examples for
>>much the same reasons.
> Yes.
>>Treating the whitespace as insignificant (unless escaped) in the string
>>representation of the filter partly helps as it makes all your examples
>>equivalent to a present match,
> I don't understand this statement.
>         (ou= *) and (ou=\20*) are two encodings of the same filter
>         (substrings assertion for the initial string " "),
>         neither of which is equivalent to a present match.

Alex was postulating an alternative solution where unescaped whitespace in the string
representation of the filter is insignificant and would be stripped in converting
the string representation into an LDAP search filter in protocol. If that were so
then (ou= *) and (ou=*) would be carried in LDAP as presence matches, and
(ou=\20*) would be carried as a substrings match with initial substring " ".
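
To make the difference concrete, here is a rough Python sketch of the two readings. It is only an illustration under my own assumptions: the names classify and unescape are hypothetical, the input is just the text after the "=" in a simple filter item, and empty "any" pieces are dropped in both modes for simplicity.

```python
import re

def unescape(s):
    # Decode \XX hex escapes from the string representation (e.g. \20 -> " ").
    return re.sub(r"\\([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), s)

def classify(value, strip_unescaped_ws=False):
    """Classify the text after '=' in an item such as (ou= *).

    strip_unescaped_ws=False follows the ABNF reading (whitespace is part
    of an assertionvalue); True models the alternative where unescaped
    spaces are stripped before the filter is encoded for the protocol.
    """
    # Split on literal asterisks first; an escaped asterisk is \2a and
    # therefore is not split here.
    parts = value.split("*")
    if len(parts) == 1:
        raise ValueError("no ASTERISK: not a present or substring item")
    if strip_unescaped_ws:
        # Strip before unescaping so that \20 survives as a real space.
        parts = [p.strip(" ") for p in parts]
    parts = [unescape(p) for p in parts]
    if len(parts) == 2 and parts[0] == "" and parts[1] == "":
        return ("present",)
    initial = parts[0] or None
    final = parts[-1] or None
    any_parts = [p for p in parts[1:-1] if p]
    return ("substrings", initial, any_parts, final)

print(classify("*"))                                # present
print(classify(" *"))                               # substrings, initial " "
print(classify(" *", strip_unescaped_ws=True))      # present
print(classify("\\20*", strip_unescaped_ws=True))   # substrings, initial " "
```

Under the ABNF reading, (ou= *) and (ou=\20*) classify identically; only under the alternative does the escape make a difference.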

>>but there would still be a problem with
>>cases where the whitespace is explicitly escaped. Stringprep will still
>>cause (ou=\20*) to match nothing.
>>It seems to me that stringprep should allow a result string to be empty,
>>rather than replacing it by a single space. If that were the case then an
>>initial substring of " " would be reduced to an empty string, which would
>>trivially match every value, giving the same effect as a presence match.
> That's not consistent with the behavior described in X.521.

It may be consistent with the behaviour described in RFC 2252, which omitted
to say that a string of all spaces is replaced with a single space.

I personally think that replacement step is unwise. It leads to odd results
like the following: an initial substring of " " matches a value of " ", but
doesn't match a value of " foo" even though " " is clearly a prefix of " foo".
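
A small Python sketch of that oddity, using a deliberately simplified model of the space handling as described in this thread (it is not the actual LDAPprep algorithm, and the names normalize and initial_matches are hypothetical): leading and trailing spaces are dropped, inner runs collapse to one space, and an all-space string optionally becomes a single space.

```python
def normalize(s, replace_empty=True):
    # Simplified model: only %x20 is treated as whitespace.
    words = [w for w in s.split(" ") if w]
    if not words:
        # The disputed step: an empty result is replaced by a single space.
        return " " if replace_empty else ""
    return " ".join(words)

def initial_matches(value, initial, replace_empty=True):
    # After normalization the comparison is a plain code point prefix test.
    return normalize(value, replace_empty).startswith(
        normalize(initial, replace_empty))

# With the replacement step: " " matches " " but not " foo".
print(initial_matches(" ", " "))       # True
print(initial_matches(" foo", " "))    # False

# Without it, the initial substring becomes empty and matches everything.
print(initial_matches(" foo", " ", replace_empty=False))  # True
```

With replace_empty=False the initial substring " " behaves like a presence match, which is the effect described in the quoted text above.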

>>Similarly, an any substring that reduces to an empty string is trivially
>>satisfied and so is effectively ignored. In fact, this change to stringprep
>>would make escaping of whitespace in the string representation of filters
>>largely moot.
> I don't understand your last point here.

This was again in reference to Alex's alternative solution. If stringprep/LDAPprep
didn't replace an empty string with a single space then the effects would be
much the same as Alex's solution with respect to string filters.


> Regardless of
> how the matching is performed, escaping whitespace in the
> string representation of the filter produces the same filter
> wire-encoding as if the whitespace were not escaped in the
> string representation, and hence matches in the exact same
> manner.  That is, the escaping has only been moot.  Introduction
> of LDAPprep doesn't change that.
>>>3. (ou=* )
>>>  - there is whitespace after the ASTERISK before the RPAREN
>>>  - interpreted as a substring filter
>>>  - the space is interpreted as the [final]
>>>  - matches all values of ou ending with a space, %x20
>>>4. (ou= * )
>>>  - there is whitespace before the ASTERISK and after the ASTERISK
>>>  - interpreted as a substring filter
>>>  - the first and last spaces are interpreted as the [initial] and [final] values
>>>  - matches all values of ou starting and ending with a space, %x20
>>>5. there's another class where two or more ASTERISKS sandwich whitespace: (ou=*
>>>  - although other forms would be a bit nonsensical this one may be valid and
>>>    would match all entries with ou values starting or ending with a space, %x20
>>>Are these correct interpretations according to the ABNF and is the matching behavior
>>>Now I'd like to open for discussion whether or not these interpretations are intuitively
>>>correct.  As an end user issuing search filters to a directory I've come to expect the directory
>>>to be extra forgiving when it comes to things like whitespace.  Users have gotten this feeling
>>>regarding whitespace forgiveness from the way distinguished names are normalized by the directory.
>>>It's intuitive for the user to presume some of this forgiving nature extends to filters which
>>>can match on attributes with the DN syntax.  So looking at the examples above I can see how
>>>a user may think that all these filters are in fact equal to one another.  The user is not
>>>thinking, "=* is a distinct atomic operator token to a parser and is inseparable where a space
>>>makes it no longer a presence filter."  The user thinks, "Well, I'm matching for anything."
>>>What if they just like to put spaces around parentheses in their filter expressions?
>>>This space forgiving nature is "turned on" for matching normal equality expressions on attributes
>>>like ou and is especially forgiving if distinguishedNameMatch is in effect for respective
>>>So would you agree that there is some mismatch between the hard ABNF interpretation
>>>and the mental interpolation of users writing filters?
>>>IMO I think all whitespace should be escaped if significant.  
> See above.  Escaping whitespace in the string representation has
> zero impact upon the wire encoding of the filter or its evaluation.
>>>Otherwise whitespace should be trimmed from the edges of attributevalues.  Also
>>>whitespace within the interior of the value should be reduced to a single space to preserve
>>>tokenization order while matching.  With regard to substring items the 'any' pieces between
>>>two ASTERISKS that are purely composed of whitespace should be discarded and the ASTERISKS
>>>consolidated into one.
>>>This makes life tougher on those that really want to match based on whitespace.
>>>However they can just escape out the whitespace in their filters like so:
>>>1. (ou=*)
>>>2. (ou=\20*)
>>>3. (ou=*\20)
>>>4. (ou=\20*\20)
>>>5. (ou=*\20*)
>>>Comments? Thoughts?
>>>[Filters]     Smith, M. (editor), LDAPbis WG, "LDAP: String
>>>              Representation of Search Filters",
>>>              draft-ietf-ldapbis-filter-xx.txt, a work in progress.
>>>[Models]      Zeilenga, K. (editor), "LDAP: Directory Information Models",
>>>              draft-ietf-ldapbis-models-xx.txt, a work in progress.
