directory-dev mailing list archives

From Jérôme Baumgarten <jbaumgar...@gmail.com>
Subject Re: LDAP protocol implementation and data containing accents
Date Wed, 31 Aug 2005 09:00:02 GMT
On 8/30/05, Emmanuel Lecharny <elecharny@gmail.com> wrote:
> > Also, accents in a filter are incorrectly received (decoded ?) in
> > SearchHandler, for example the filter (sn=*é*) is retrieved as
> > (sn=*Ã©*).
> 
> Are you using UTF-8 to encode your string? Data are stored in UTF-8
> format in Ldap.

I ran some more tests and got the following results (clients and
server running on a Windows box):

 * JXplorer (build JXv3.2b2 2005-08-18 13:46 EST): filter is
incorrect w.r.t. accents

 * Softerra LDAP Browser 2.6: filter is incorrect w.r.t. accents

 * JNDI test code: filter is incorrect w.r.t. accents

 * JLDAP test code: filter is incorrect w.r.t. accents

 * OpenLDAP ldapsearch (but running on a Linux box): filter is
correct w.r.t. accents

I can work around these problems with the following:

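// Workaround: re-encode the filter with the platform default charset,
// then decode the resulting bytes as UTF-8.
// Requires: import java.io.UnsupportedEncodingException;
String filter = LdapProxyUtils.filterToString(request.getFilter());
try {
  // getBytes() with no argument uses the JVM's default charset
  filter = new String(filter.getBytes(), "UTF-8");
} catch (UnsupportedEncodingException ueEx) {
  // Should not happen: every Java platform is required to support UTF-8
  throw new RuntimeException(ueEx);
}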

But I don't really understand why I must do so, since "RFC 2254 - The
String Representation of LDAP Search Filters" says that a filter is
represented as a UTF-8 string. I would therefore expect the filter
value to be correct regardless of the platform my LDAP proxy is
running on.
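
For what it's worth, here is a small sketch of what I suspect is
happening, and why the re-encoding workaround above would help. It is
only a guess at the mechanism: it assumes the UTF-8 bytes of the filter
are decoded somewhere with a single-byte Western charset. The class
name MojibakeDemo and its two methods are made up for this
illustration, and ISO-8859-1 stands in for whatever charset is really
involved (it maps the two bytes of "é" the same way windows-1252 does).
StandardCharsets needs Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Assumption: the charset that wrongly decoded the bytes.
    // ISO-8859-1 is a stand-in; the real one is not known from the tests.
    static final Charset WRONG = StandardCharsets.ISO_8859_1;

    /** Simulates the suspected bug: UTF-8 bytes decoded with the wrong charset. */
    public static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, WRONG);
    }

    /** Undoes it: get the raw bytes back with the same charset, then decode as UTF-8. */
    public static String repair(String garbled) {
        return new String(garbled.getBytes(WRONG), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the demo independent of the source file encoding.
        String filter = "(sn=*\u00e9*)";              // (sn=*é*)
        String garbled = misdecode(filter);

        // "é" is the two bytes C3 A9 in UTF-8; read one byte at a time
        // they become U+00C3 and U+00A9, i.e. "Ã©".
        System.out.println(garbled.equals("(sn=*\u00c3\u00a9*)"));
        System.out.println(repair(garbled).equals(filter));
    }
}
```

If this is indeed the mechanism, the workaround only succeeds when the
platform default charset happens to be the one that did the wrong
decoding, so naming the charset explicitly in getBytes() would be less
fragile than relying on the default.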

Also, has anyone tested searches on ApacheDS with a filter containing
accents? The problems I'm facing right now may also be present in
ApacheDS.

> >
> > A wild guess would be some kind of coding / decoding problem (charset
> > related ?). I believe that if I get it in my own SearchHandler
> > implementation, the one provided with ApacheDS should also get it.
> > That may be a problem for people dealing with data having accents
> > (like my firstname).
> >
> > Are my assumptions correct ?
> 
> Hmmmm. Just try using new String("Jérôme", "UTF-8");
> 
> Emmanuel L&eacute;charny ;)

Jérôme
