axis-c-dev mailing list archives

From Samisa Abeysinghe <sam...@wso2.com>
Subject Re: Caching support for large attachments
Date Sun, 16 Mar 2008 10:56:27 GMT
Senaka Fernando wrote:
>> Hi Manjula, Thilina and others,
>>
>> Yes, I think I share exactly the same viewpoint as Thilina when it comes
>> to handling attachment data, at least for the chunking part. I think I
>> misunderstood Thilina in his first e-mail.
>>
>> However, one file per MIME part may not always be optimal. I would
>> rather give each file a fixed maximum size and, if that is exceeded,
>> split it into two. Also, a user should always be given the option to
>> choose between Thilina's method and this method through axis2.xml (or
>> services.xml). That way, a user can fine-tune memory use.
>>
>> When it comes to base64-encoded binary data, you can use a mechanism where
>> the buffer size is always a multiple of 4; when you flush, you decode the
>> buffer and copy the result to the file, so caching should essentially
>> look the same to the user.
>>
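A minimal sketch of the multiple-of-4 idea in C, assuming standard base64
(RFC 4648 alphabet, '=' padding) and an ASCII character set. The names
b64_value and b64_decode_chunk are hypothetical and are not Axis2/C
functions. Because every 4 input characters decode to at most 3 whole
bytes, a buffer whose length is a multiple of 4 can be decoded and
flushed on its own, with no state carried between flushes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if invalid. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode in_len characters (must be a multiple of 4) into out.
 * out must have room for in_len / 4 * 3 bytes.
 * Returns the number of bytes written, or -1 on invalid input. */
long b64_decode_chunk(const char *in, size_t in_len, unsigned char *out)
{
    size_t i;
    long n = 0;

    assert(in_len % 4 == 0);
    for (i = 0; i < in_len; i += 4) {
        int pad2 = (in[i + 2] == '=');
        int pad3 = (in[i + 3] == '=');
        int v0 = b64_value(in[i]);
        int v1 = b64_value(in[i + 1]);
        int v2 = pad2 ? 0 : b64_value(in[i + 2]);
        int v3 = pad3 ? 0 : b64_value(in[i + 3]);

        /* "x=y" style padding is malformed: '=' in slot 2 needs '=' in slot 3 */
        if (v0 < 0 || v1 < 0 || v2 < 0 || v3 < 0 || (pad2 && !pad3))
            return -1;

        out[n++] = (unsigned char)((v0 << 2) | (v1 >> 4));
        if (!pad2)
            out[n++] = (unsigned char)(((v1 & 0x0F) << 4) | (v2 >> 2));
        if (!pad3)
            out[n++] = (unsigned char)(((v2 & 0x03) << 6) | v3);
    }
    return n;
}
```

Each flush would call b64_decode_chunk() on the filled buffer and write
the returned number of bytes to the cache file. The sketch does not skip
line breaks inside the encoded data; a real caller would have to strip
them before filling the buffer.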
>> OK, so Manjula, you mean when the MIME boundary appears partially in the
>> first read and partially in the second?
>>
>> Well this is probably the best solution.
>>
>> You will allocate a buffer twice the size of the MIME boundary, and in
>> your very first read you will read 2 times the MIME boundary, then
>> you will search for the existence of the MIME boundary. Next you will do a
>> memmove() and move all the contents of the buffer, from the midpoint
>> until the end, to the beginning of the buffer. After doing this,
>> you will read a size equivalent to 1/2 the buffer (which again is the size
>> of the MIME boundary marker) and store it from the midpoint of the buffer
>> to the end. Then you will search again. You will iterate this procedure
>> until you read less than half the size of the buffer.
>>     
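The read / search / memmove() loop described above can be sketched in C
roughly as follows. This is an illustration under assumptions, not the
Axis2/C parser: mem_stream and stream_read are hypothetical stand-ins for
the real transport stream, find_bytes is a plain memcmp() scan used so
that binary data is handled safely, and a short read is taken to mean end
of stream (a real socket can return short reads earlier).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the transport stream. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;
} mem_stream;

/* Copy up to `want` bytes into dst; returns the number of bytes copied. */
static size_t stream_read(mem_stream *s, unsigned char *dst, size_t want)
{
    size_t left = s->len - s->pos;
    size_t n = want < left ? want : left;
    memcpy(dst, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Binary-safe search: offset of needle in hay, or -1 if absent. */
static long find_bytes(const unsigned char *hay, size_t hay_len,
                       const unsigned char *needle, size_t n_len)
{
    size_t i;
    for (i = 0; i + n_len <= hay_len; i++)
        if (memcmp(hay + i, needle, n_len) == 0)
            return (long)i;
    return -1;
}

/* Returns the stream offset of the first boundary, or -1 if not found. */
long find_boundary_in_stream(mem_stream *s, const char *boundary)
{
    size_t half = strlen(boundary);   /* half the buffer = boundary size */
    unsigned char *buf;
    size_t filled;
    long base = 0;                    /* stream offset of buf[0] */
    long hit, result = -1;

    assert(half > 0);
    buf = malloc(2 * half);
    if (buf == NULL)
        return -1;

    filled = stream_read(s, buf, 2 * half);        /* first read: 2x */
    for (;;) {
        hit = find_bytes(buf, filled, (const unsigned char *)boundary, half);
        if (hit >= 0) {
            result = base + hit;
            break;
        }
        if (filled < 2 * half)
            break;                                 /* short read: done */
        memmove(buf, buf + half, half);            /* midpoint -> start */
        base += (long)half;
        filled = half + stream_read(s, buf + half, half);
    }
    free(buf);
    return result;
}
```

Since each window overlaps the previous one by the length of the
boundary, a boundary that is split across two reads always falls entirely
inside the next window.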
>
> If you are interested further in this mechanism, I used this approach
> for resending binary data in TCPMon. You may check that as well.
>
> Also, strstr() has issues when you have '\0' in the middle of the data.
> Thus you will have to use a temporary search pointer (temp) and use that
> in the process. Before calling strstr() you will check whether
> strlen(temp) is greater than or equal to the length of the MIME boundary
> marker. If it is greater, you only need to search once. If it is equal,
> you will need to search exactly twice. If it is less, you increment temp
> by strlen(temp) and repeat until you cross the midpoint. This makes the
> search even more efficient.
>
> If you want to make the search even more efficient, you can make the
> buffer size one less than the size of the MIME boundary marker, so when
> you get the equals scenario, you will have to search only once.
>
> The fact I rely on here is that strstr() and strlen() behave consistently
> in a given implementation. On Windows, if strlen() is multibyte-aware, so
> is strstr(). So, no worries.
>   
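One possible reading of the strlen()/strstr() walk described above, as a
sketch in C. The name find_marker is hypothetical. Two assumptions are
made that the text does not spell out: the buffer carries one extra '\0'
sentinel byte at buf[len] so strlen() can never run past the end, and the
pointer advances by strlen(temp) + 1 so that it steps over the embedded
'\0' itself. A marker that contains no '\0' cannot span one, so segments
shorter than the marker are skipped without searching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Search buf[0..len) for marker, tolerating embedded '\0' bytes.
 * Precondition (assumption of this sketch): buf[len] == '\0'.
 * Returns a pointer to the first match, or NULL if there is none. */
const char *find_marker(const char *buf, size_t len, const char *marker)
{
    size_t mlen = strlen(marker);
    const char *temp = buf;
    const char *end = buf + len;

    assert(mlen > 0);
    assert(buf[len] == '\0');

    while (temp < end) {
        size_t seg = strlen(temp);      /* length up to the next '\0' */
        if (seg >= mlen) {
            const char *hit = strstr(temp, marker);
            if (hit != NULL)
                return hit;
        }
        temp += seg + 1;                /* skip the segment and its '\0' */
    }
    return NULL;
}
```

A length-based scan with memcmp() avoids the sentinel requirement
altogether and may be simpler where the data is known to be binary.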

We already have an efficient parsing mechanism, tested and proven to 
work, with 1.3. Why on earth are we discussing this over and over again?

Is caching affected by the MIME parser logic? IMHO no. They are 
two separate concerns, so why are we wasting time discussing parsing 
while the problem at hand is not parsing but caching?

Writing the partially parsed buffer was a solution to caching. Do we 
have any other alternatives? If so, in short, what are they?

Samisa...


