directory-dev mailing list archives

From "Emmanuel Lecharny" <elecha...@gmail.com>
Subject Re: Streaming / Serializing Big Objects
Date Fri, 08 Sep 2006 17:47:59 GMT
Ole,

just keep in mind that we are talking about byte[] or String, not complex Java
objects :)

What we need is a simple mechanism that will allow the server to stream those
two kinds of objects. The main issue, if we stream to disk, is to avoid creating
zillions of small files. We need a storage that can keep all those blobs in a
single file, even if it grows to 10 GB.
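
To make the single-file idea concrete, here is roughly the kind of blob store I have in mind (just a sketch: the class and method names are invented, and it leaves out the index from entry to offset as well as any free-space reclamation):

import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Appends every blob to one big data file and hands back an (offset, length) reference. */
public class BlobStore {
    private final FileChannel channel;

    public BlobStore(String dataFile) throws IOException {
        channel = new RandomAccessFile(dataFile, "rw").getChannel();
    }

    /** Copies the stream into the data file; the entry only keeps the returned reference. */
    public synchronized BlobRef append(InputStream in) throws IOException {
        long offset = channel.size();
        channel.position(offset);
        byte[] buffer = new byte[8192];
        long length = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            ByteBuffer bb = ByteBuffer.wrap(buffer, 0, n);
            while (bb.hasRemaining()) {
                length += channel.write(bb);
            }
        }
        return new BlobRef(offset, length);
    }

    /** Small value object stored in the entry instead of the blob itself. */
    public static final class BlobRef {
        public final long offset;
        public final long length;

        public BlobRef(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }
}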

Another point is that we can't use XML: it's overkill. You would end up with
structures like:
<jpegPhoto name="MyFace.jpg">
  Ar45tYU...Rt==  (2Mbytes of base64 data)
</jpegPhoto>

Don't over(ab)use XML ;)

(OK, I know: compared to the disk access, it's at least 2 orders of magnitude
faster, but the less CPU we eat, the more can be used by other threads.)
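
Just to put a number on the base64 cost, a throwaway snippet (it uses java.util.Base64, which only exists in recent JDKs, purely for illustration):

import java.util.Base64;

public class Base64Overhead {
    public static void main(String[] args) {
        byte[] photo = new byte[2 * 1024 * 1024];            // a 2 MB jpegPhoto value
        byte[] encoded = Base64.getEncoder().encode(photo);  // what XML would force us to ship
        // Prints 2097152 -> 2796204: one third more bytes on the wire, plus an encode
        // on write and a decode on read for every large value.
        System.out.println(photo.length + " -> " + encoded.length);
    }
}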

Any idea is welcome, and maybe we can start a page on Confluence with those
ideas. ATM, we are just in a ...

Emmanuel.

On 9/8/06, Ole Ersoy <ole_ersoy@yahoo.com> wrote:
>
>
> 1- Decoder
> So if the decoded request object is above the
> configured threshold, then ADS would need to persist
> it per the configured persistence mechanism (Prevayler,
> ...), otherwise we store it in memory.
>
> The MyFaces upload component looks at its size
> threshold and serializes the uploaded file if it's
> above the specified threshold.  I'm sure it just
> uses Java serialization straight up, but the component
> can be hooked up to any integration/persistence layer
> naturally.
>
> Suppose the whole directory tree was stored using the
> Eclipse EMF API.
>
> Then the decoder would map the request object directly
> to an EMF object, and EMF's persistence mechanism could
> be invoked to persist to XML or straight-up object
> serialization, the Service Data Objects API could be
> invoked to serialize to databases, etc.  Web Services
> could be invoked; it's a pretty sexy API, with a lot
> of possibilities.
>
> When it comes to streaming images, resources, etc., I
> would think the Tomcat APIs should be really good for
> that....
>
> --- Emmanuel Lecharny <elecharny@gmail.com> wrote:
>
> > Here is what we have to do to stream large objects:
> >
> > 1- Decoder:
> > When we read the user request, we decode it from ASN.1 BER to a byte[] or
> > to a String, depending on the object type. But basically, we get a byte[].
> > In any case, we have two concerns:
> >  A- If the length of this object - which is always known - is above a
> > certain size (let's say 1K), then we must store the object somewhere else
> > than in memory. To do so, we must have a storage which can handle Strings,
> > byte[] and StreamedObject[]. This has an impact on all messages (we can't
> > just work on some attributes, we have to be generic). So this is a huge
> > refactoring, with accessors for those objects, and especially a
> > Stream.read() accessor (sketched below, after point B).
> >  B- If we have to store a String (even a big one), we have to convert
> > the byte[] to a String. If the String is big, then we must find a way to
> > apply the byte[] -> String UTF-8 conversion from a stream, and stream back
> > the result. Not so easy ...
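
For point A, here is roughly the kind of value abstraction I have in mind (the names are invented, and a real version would spill into the single-file blob store sketched above rather than into one temp file per value):

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Holds a decoded attribute value in memory if it is small, on disk if it is big. */
public class StreamedValue {
    private static final int THRESHOLD = 1024;   // "let's say 1K"

    private final byte[] inMemory;   // non-null for small values
    private final File spilled;      // non-null for large values

    public static StreamedValue wrap(byte[] value) throws IOException {
        if (value.length <= THRESHOLD) {
            return new StreamedValue(value, null);
        }
        File tmp = File.createTempFile("ads-value", ".bin");
        OutputStream out = new FileOutputStream(tmp);
        try {
            out.write(value);
        } finally {
            out.close();
        }
        return new StreamedValue(null, tmp);
    }

    private StreamedValue(byte[] inMemory, File spilled) {
        this.inMemory = inMemory;
        this.spilled = spilled;
    }

    /** The generic accessor: every consumer reads the value through a stream. */
    public InputStream read() throws IOException {
        return inMemory != null ? new ByteArrayInputStream(inMemory)
                                : new FileInputStream(spilled);
    }
}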
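
For point B, the JDK can already do the byte[] -> String conversion incrementally; a sketch of streaming the decoded characters back out instead of building one huge String:

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;

/** Streams the UTF-8 decoding: bytes go in, characters come out in chunks. */
public class StreamingUtf8 {
    public static void copyAsChars(InputStream utf8In, Writer out) throws IOException {
        // InputStreamReader decodes incrementally through a CharsetDecoder under the hood.
        Reader reader = new InputStreamReader(utf8In, "UTF-8");
        char[] chunk = new char[4096];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            out.write(chunk, 0, n);   // stream the decoded characters back out
        }
    }
}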
> >
> > 2- Database storage:
> > Well, we have now decoded a request, and we have to store the value. The
> > backend is not Stream-ready at all. It should be able to handle a Stream
> > and store the data without having to allocate a huge bunch of byte[].
> > Another problem is the reverse operation: we read an entry from the
> > backend, and we want streamed data to remain streamed. Again, a huge
> > modification.
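
What a Stream-ready backend contract could look like (a hypothetical interface, not the current partition API):

import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stream-aware storage contract: values go in and come out as streams. */
public interface StreamedAttributeStore {
    /** Stores the value by copying from the stream; never materializes a full byte[]. */
    void write(String entryDn, String attributeId, InputStream value) throws IOException;

    /** Reads the value back as a stream, so it stays streamed all the way to the encoder. */
    InputStream read(String entryDn, String attributeId) throws IOException;
}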
> >
> > 3- Encoder:
> > Now, let's suppose that we successfully get some data from the backend,
> > and let's suppose that those data are streamed. We want to send them back
> > to the client without having to create a big byte[]. That means we must be
> > able to ask MINA to send chunks of data until we are done with the
> > streamed data. ATM, what we do is write a full PDU - the result of the
> > encode() method - and MINA sends it all. Here, the mechanism will be
> > totally different: we should tell MINA to send some data as soon as we
> > have a block of bytes ready (if we send 1500-byte blocks, then we may have
> > to call MINA many times for a jpegPhoto).
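
The chunking loop itself is simple; the real work is in the encoder and the MINA integration. A rough sketch, with the transport hidden behind a made-up callback (in the server it would wrap the MINA session):

import java.io.IOException;
import java.io.InputStream;

/** Pushes a streamed value to the client in fixed-size blocks instead of one big byte[]. */
public class ChunkedEncoder {
    /** Made-up callback; the real implementation would hand each block to MINA. */
    public interface BlockWriter {
        void write(byte[] block, int length) throws IOException;
    }

    public static void encode(InputStream value, BlockWriter out, int blockSize) throws IOException {
        byte[] block = new byte[blockSize];   // e.g. 1500 bytes, roughly one Ethernet frame
        int n;
        while ((n = value.read(block)) != -1) {
            out.write(block, n);              // ship each block as soon as it is ready
        }
    }
}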
> >
> > I may have forgotten some issues, so please tell me! Regarding using an
> > existing piece of code, I have to say: "well, why not?". Right now, I
> > think we should think seriously about the points I mentioned, maybe on a
> > Confluence page. Streaming will take at least 2 weeks to write... Any
> > already written piece of code that can help is OK :)
> >
> > Emmanuel
> >
> > On 9/8/06, Ole Ersoy <ole_ersoy@yahoo.com> wrote:
> > >
> > > I accidentally deleted the original message...
> > >
> > > The MyFaces file upload component can be configured to
> > > serialize objects larger than a specified size.
> > >
> > > If that sounds useful, I can extract some code...
> > >
> > > Cheers,
> > > - Ole
> > >
> > >
> >
> >
> >
> > --
> > Regards,
> > Emmanuel Lécharny
> >
>
>
>



-- 
Regards,
Emmanuel Lécharny
