directory-dev mailing list archives

From Kiran Ayyagari <kayyag...@apache.org>
Subject Re: Strig vs Byte[] for values in the server : some new ideas
Date Thu, 22 Aug 2013 09:32:54 GMT
On Thu, Aug 22, 2013 at 2:56 PM, Emmanuel Lécharny <elecharny@gmail.com> wrote:

> Hi guys,
>
> I have been thinking for years about using byte[] inside the server for
> values. I have tried more than once to get rid of the Strings, with no
> success so far: we are too dependent on Strings to get rid of them
> (for example, the PrepareString method works on String, not on byte[], and
> the same goes for the various comparators, normalizers and syntax checkers).
>
> Bottom line, we have to keep the values as Strings.
>
> But is this true for every value?
>
> In fact, we always store the received attribute's values in two
> different formats:
> - a normalized String (if it's a human-readable (HR) attribute), produced
> by the normalization step
> - a UP (user-provided) String, which is the value as provided by the user,
> and which is left untouched.
>
> Now, consider an add operation, followed by a search operation, from a
> specific attribute's point of view (say, the 'description' AT)
>
> User add :
> ----------
> description:String ---> API ---> conversion to UTF-8 ---> Server
>
> Server AddHandler :
> -------------------
> description:byte[] ---> decoder ---> conversion to String ---> creation
> of the normValue ---> storage on disk ---> conversion of upValue and
> NormValue to byte[]
>
>
> User search :
> -------------
> send searchRequest
> ...
> wait for response
>
> Server SearchHandler :
> ----------------------
> fetch the entry => deserialize the Up and Norm value of the description
> AT (ie, byte[] to String conversion)
> entry processing through the interceptors
> write the SearchResultEntry ---> conversion of the description AT
> UpValue to byte[] (we don't care about the normValue at this point)
>
> User search :
> -------------
> ...
> convert the description Up value to String
>
>
> As we can see, in both operations we are doing too much work: there is no
> need to convert the UpValue to a String, as we will do a byte[] -> String ->
> byte[] of this UP value in the search. For the Add, it's slightly better
> (or less bad): we can avoid a String -> byte[] conversion when
> storing the value.
>
>
> Making the UpValue a byte[] will save us a lot of wasted CPU, and
> probably a bit of space on disk, as a String requires 2 bytes per char
> to be serialized.
>
Even the byte[] will be of the same size, so I don't see where the gain is.
Am I missing something?
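
To make the two claims in the quoted message concrete, here is a minimal Java sketch. It is not ApacheDS code: the class and method names are hypothetical, and the "2 bytes per char" figure is taken from the quoted message as an assumption about how the String is serialized. If the store already writes the String as UTF-8, the two sizes would match and only the conversion cost would differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in ApacheDS.
public class UpValueSketch {

    /** Size of the value as UTF-8 bytes, the form received from the client. */
    static int utf8Size(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Size if every Java char is written as 2 bytes (UTF-16, no BOM). */
    static int twoBytesPerCharSize(String value) {
        return value.getBytes(StandardCharsets.UTF_16BE).length;
    }

    /**
     * The byte[] -> String -> byte[] round trip described for the search path.
     * For well-formed UTF-8 input the result equals the input, so for a value
     * that is only passed through, both conversions are pure overhead.
     */
    static byte[] roundTrip(byte[] wireBytes) {
        String asString = new String(wireBytes, StandardCharsets.UTF_8);
        return asString.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample 'description' value containing one non-ASCII character.
        String description = "D\u00e9veloppeur LDAP";
        byte[] wire = description.getBytes(StandardCharsets.UTF_8);

        System.out.println("UTF-8 size:        " + utf8Size(description));
        System.out.println("2 bytes/char size: " + twoBytesPerCharSize(description));
        System.out.println("round trip equal:  " + Arrays.equals(wire, roundTrip(wire)));
    }
}
```

Under these assumptions, any saving on disk depends entirely on the current serialization format, while the CPU saving comes from skipping the decode and re-encode of a value that is returned unchanged.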

> This is something we have to work on before 2.0, as the underlying
> database will be impacted, as we will not serialize the UpValue as a
> String but as a byte[].
>
> Thoughts ?
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>


-- 
Kiran Ayyagari
http://keydap.com
