cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: DO NOT REPLY [Bug 23299] - [PATCH] UTFDataFormatException: String cannot be longer than 32k.
Date Wed, 12 Nov 2003 14:16:35 GMT

On 12 Nov 2003, at 13:54, Torsten Curdt wrote:

>>> Let's not discuss this to death. I'll fix it :)
>> Minor suggestion. >32k events do not happen every day, but <32k events 
>> do, a lot. So, instead of adding size to the <32k events, it's better 
>> to add some size to >32k events.
>> What I mean is:
>> 1) Leave length as it is, 2 bytes.
>> 2) Have one value reserved, say 0xFFFF
>
> Let's say 0x7FFF because the highest bit is reserved

yes, 0x7FFF makes sense. If the bitmap is 0 followed by all 1's, then 
we have an indication that the next four bytes (the 32 bit length) are 
the real length of the string.

I like it.
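
Just to make sure we mean the same thing, here is a minimal sketch of the scheme as I understand it. The class and method names (LengthPrefix, writeLength, readLength, encode, decode) are made up for illustration and are not the actual Cocoon code. It also assumes the reserved highest bit is always 0 for a length, and that a string of exactly 0x7FFF bytes has to take the long form, since that value is now the marker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not the actual Cocoon code.
final class LengthPrefix {
    // Reserved marker: highest bit 0, the other 15 bits all 1.
    static final int ESCAPE = 0x7FFF;

    // Lengths below ESCAPE take 2 bytes; anything else takes 2 + 4 bytes.
    static void writeLength(DataOutputStream out, int length) throws IOException {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        if (length < ESCAPE) {
            out.writeShort(length);   // common case, unchanged size
        } else {
            out.writeShort(ESCAPE);   // marker: the real length follows
            out.writeInt(length);     // real length as 32 bits
        }
    }

    static int readLength(DataInputStream in) throws IOException {
        int length = in.readUnsignedShort();
        return length == ESCAPE ? in.readInt() : length;
    }

    // Convenience wrappers so the round trip can be tried in memory.
    static byte[] encode(int length) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            writeLength(out, length);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int decode(byte[] encoded) {
        try {
            return readLength(new DataInputStream(new ByteArrayInputStream(encoded)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode(100).length);      // 2
        System.out.println(encode(100000).length);   // 6
        System.out.println(decode(encode(100000)));  // 100000
    }
}
```

Anything already written with a length below 0x7FFF reads back exactly as before; only the long strings pay the extra 4 bytes.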

>> 3) In case of >32k event, write first 0xFFFF, and then 32 bit length.
>
> ...instead of chopping the string into 0x7FFF chunks

yep

>
>> This way, old events do not grow larger, only the large event header 
>> increases - it becomes larger by 4 bytes.
>
> ...I guess I like it even more than chopping up the strings because
> the additional overhead is fixed :)
>
> Excellent

I like this too. It is similar to the UTF-8 encoding that uses 1, 2 or 3 
bytes depending on the likelihood of occurrence of the char.

Good. +1
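
For the record, a rough comparison of the header overhead of the two approaches. This is only a sketch under my own assumptions: the chunking variant is taken to write one 2-byte length per chunk of at most 0x7FFF bytes (nobody spelled out the details, and a real version might need a continuation flag as well), and the names HeaderOverhead, chunkedOverhead and escapedOverhead are hypothetical.

```java
// Hypothetical comparison for illustration only; not the actual Cocoon code.
final class HeaderOverhead {
    static final int MAX_CHUNK = 0x7FFF;

    // Assumed chunking scheme: one 2-byte length per chunk of up to MAX_CHUNK bytes.
    static long chunkedOverhead(long length) {
        long chunks = Math.max(1, (length + MAX_CHUNK - 1) / MAX_CHUNK);
        return 2 * chunks;
    }

    // Escape scheme: 2 bytes for short strings, 2-byte marker + 4-byte length otherwise.
    static long escapedOverhead(long length) {
        return length < MAX_CHUNK ? 2 : 6;
    }

    public static void main(String[] args) {
        long[] sizes = {100, 100000, 1000000};
        for (long size : sizes) {
            System.out.println(size + " bytes: chunked=" + chunkedOverhead(size)
                    + ", escaped=" + escapedOverhead(size));
        }
    }
}
```

Under those assumptions the chunked header keeps growing with the string, while the escaped header stays at 6 bytes no matter how long the string gets.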

> --
Stefano.

