directory-api mailing list archives

From Radovan Semancik <radovan.seman...@evolveum.com>
Subject Re: Binary values and humanRedable flag
Date Mon, 10 Aug 2015 19:02:51 GMT
On 08/10/2015 06:15 PM, Emmanuel Lécharny wrote:
>> No magic needed here (although some magic might come very useful with
>> some LDAP servers :-) ) .... I just expect that when no
>> X-NOT-HUMAN-READABLE is present then the default from the RFC is used.
>> Isn't that a reasonable expectation?
> No, sadly.

I still do not see why that is a problem. A default value based on the RFC 
seems much more reasonable than a plain "true", especially if it 
works with both ApacheDS and at least two other servers out of the 
box.
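
To make the proposal concrete, here is a rough sketch of the lookup I have in mind. The class name, the method signature and the table contents are all made up for illustration: the four OIDs are a small subset modelled on the human-readable column of the syntax table in RFC 2252, and nothing here is existing Directory API code.

```java
import java.util.Map;

/** Hypothetical helper; not part of the Apache Directory API. */
final class HumanReadableDefaults {
    // Illustrative subset of per-syntax defaults (assumed, modelled on RFC 2252).
    private static final Map<String, Boolean> DEFAULTS = Map.of(
        "1.3.6.1.4.1.1466.115.121.1.15", true,   // Directory String
        "1.3.6.1.4.1.1466.115.121.1.5", false,   // Binary
        "1.3.6.1.4.1.1466.115.121.1.8", false,   // Certificate
        "1.3.6.1.4.1.1466.115.121.1.28", false   // JPEG
    );

    /**
     * @param syntaxOid         OID of the attribute syntax
     * @param xNotHumanReadable value of X-NOT-HUMAN-READABLE, or null if the
     *                          server did not send the extension
     */
    static boolean isHumanReadable(String syntaxOid, Boolean xNotHumanReadable) {
        if (xNotHumanReadable != null) {
            // An explicit flag from the server always wins.
            return !xNotHumanReadable;
        }
        // No flag: use the per-syntax default; unknown syntaxes stay "true".
        return DEFAULTS.getOrDefault(syntaxOid, true);
    }
}
```

Servers that do send X-NOT-HUMAN-READABLE are unaffected, since the explicit flag takes precedence over the table.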

I can do it ... I only need a blessing.

But anyway, if the answer is still "no" then I can cope with it. I would 
just like to express slight disappointment. I had hoped that the ambition 
of the Apache Directory API was broad interoperability, not just 
interoperability with a single directory server.

> It's more complex.
>
> Values are stored in two forms:
> - binary
> - String
>
> except that binary values are not stored in String form.

Are you sure?

StringValue.getBytes() returns field bytes (byte[]).
StringValue.getString() returns field wrappedValue (String).

So it looks like both values are stored now (at least in memory, 
but that is what matters for the API).

> When we receive a new value for a H-R attribute, we receive it on the
> server as an UTF-8 byte[], and we convert it immediately to a String.

Yes. And this still can happen. Just store the original byte[] as it was 
received.

If I understand the code correctly, the current code 
(StringValue.readExternal(...)) decodes the value from the stream into a 
String (wrappedValue = in.readUTF()) and then immediately encodes it back 
(bytes = Strings.getBytesUtf8( wrappedValue )). That means two 
conversions instead of one. But there is an even worse aspect: the content 
of "bytes" may not be the same as the value that was originally 
received. This is really bad for diagnostics (at the very least), 
because the API is in fact lying about what it received.
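
Here is a small standalone demonstration of the effect, using only the JDK. It does not call StringValue or readExternal(); the class and method names are made up, and the sample bytes (the first bytes of a JPEG file) are just a convenient example of input that is not valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical demo class; not Directory API code. */
final class RoundTripDemo {
    /** Decodes the bytes as UTF-8, then encodes the resulting String again. */
    static byte[] roundTrip(byte[] received) {
        String decoded = new String(received, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] valid = "L\u00e9charny".getBytes(StandardCharsets.UTF_8);
        // First bytes of a JPEG file: malformed as UTF-8.
        byte[] invalid = { (byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0 };

        // Valid UTF-8 survives the round trip unchanged.
        System.out.println(Arrays.equals(valid, roundTrip(valid)));     // true
        // Malformed input is replaced by U+FFFD, so the bytes change.
        System.out.println(Arrays.equals(invalid, roundTrip(invalid))); // false
    }
}
```

After the round trip the second array no longer contains what came over the wire, which is exactly the diagnostics problem.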

> for the on going processing, up to the point we store the data into the
> backend, we always use the String representation, because this is how we
> do comparison and normalization. When we store the data in the backend,
> in order to save the String Preparation processing which is already
> over-expensive (http://tools.ietf.org/search/rfc4518

And this can still happen. Just modify StringValue.readExternal(...) to 
store the original value in "bytes" and then convert it to 
"wrappedValue". You will save one conversion and you will have a more 
robust and predictable system.
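
A minimal sketch of the idea, with a hypothetical class standing in for StringValue. Only the field names "bytes" and "wrappedValue" are taken from the existing code; the class name and constructor are invented, and the sketch does not try to reproduce the serialization format used by readExternal().

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical value holder; not the actual StringValue class. */
final class RawPreservingValue {
    private final byte[] bytes;         // exactly what was received
    private final String wrappedValue;  // decoded once, for comparison and normalization

    RawPreservingValue(byte[] received) {
        // Keep the original bytes first, then decode a single time.
        this.bytes = received.clone();
        this.wrappedValue = new String(received, StandardCharsets.UTF_8);
    }

    /** Returns a copy of the bytes as originally received. */
    byte[] getBytes() {
        return bytes.clone();
    }

    /** Returns the String form used for all further processing. */
    String getString() {
        return wrappedValue;
    }
}
```

Callers of getString() see no difference, while getBytes() now always reports the value as it arrived.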

> Switching to a byte[] for everything *in the server* would force us to
> rewrite all the syntax checkers/comparators/normalizers to work with
> UTF-8, which would be a major burden.

If the server is using StringValue then you will not need to change 
anything. The interface will remain the same and the behavior compatible 
(unless you count "compatible bugs", of course).

> Life is complicated...

Maybe not that much :-)


-- 
Radovan Semancik
Software Architect
evolveum.com

