jackrabbit-users mailing list archives

From jde...@21technologies.com
Subject Re: Jackrabbit performance with large binaries
Date Mon, 11 Dec 2006 23:20:52 GMT
Thanks for the response.
I've been experimenting with this memory problem some more, and it appears 
to be a PostgreSQL issue.  I ran my identical repository test code against 
an embedded Derby DB instead of SimpleDBPersistenceManager configured to 
use PostgreSQL.  I tried both DerbyPersistenceManager and 
SimpleDBPersistenceManager configured to use Derby.  In both cases, I was 
able to dump large binary files into the repository without the memory 
increase I saw with PostgreSQL (as suggested, 64MB of JVM heap was enough 
to add all the files to a Derby-based repository).  Therefore, I believe 
this is either a problem with the way PostgreSQL handles blobs, or a 
problem with Jackrabbit's management and configuration of PostgreSQL.
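
For anyone following along, the difference between streaming and 
materializing a binary can be sketched with plain java.io.  This is not 
Jackrabbit or PostgreSQL driver code; the class StreamVsMaterialize and 
all of its methods are made up for illustration.  The streaming copy 
needs only one fixed 16 KB buffer no matter how large the input is, 
while the materializing variant has to hold the whole content on the heap.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical illustration only; not part of Jackrabbit or any JDBC driver.
public class StreamVsMaterialize {

    /** Copies through one fixed-size buffer; extra memory stays at bufferSize bytes. */
    public static long copyStreaming(InputStream in, OutputStream out, int bufferSize) {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Reads everything into a byte[]; heap use grows with the input size. */
    public static byte[] materialize(InputStream in) {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        copyStreaming(in, all, 16384);
        return all.toByteArray();
    }

    /** Stand-in for a large file: produces 'length' zero bytes on the fly. */
    public static InputStream syntheticFile(final long length) {
        return new InputStream() {
            private long remaining = length;

            @Override
            public int read() {
                if (remaining <= 0) return -1;
                remaining--;
                return 0;
            }

            @Override
            public int read(byte[] b, int off, int len) {
                if (len == 0) return 0;
                if (remaining <= 0) return -1;
                int n = (int) Math.min(len, remaining);
                Arrays.fill(b, off, off + n, (byte) 0);
                remaining -= n;
                return n;
            }
        };
    }

    /** Sink that throws the data away, standing in for a streaming consumer. */
    public static OutputStream discard() {
        return new OutputStream() {
            @Override public void write(int b) { }
            @Override public void write(byte[] b, int off, int len) { }
        };
    }

    public static void main(String[] args) {
        // A 500 MB input passes through a 16 KB buffer without a large heap.
        long size = 500L * 1024 * 1024;
        long copied = copyStreaming(syntheticFile(size), discard(), 16384);
        System.out.println("streamed bytes: " + copied);

        // Materializing even 1 MB allocates at least that much on the heap.
        byte[] small = materialize(syntheticFile(1024 * 1024));
        System.out.println("materialized bytes: " + small.length);
    }
}
```

If some layer between the repository and the database behaves like 
materialize() rather than copyStreaming(), heap use would grow with file 
size in the way I observed, though this sketch does not show where that 
happens.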

Has anyone out there successfully dumped large binary files into a 
SimpleDBPersistenceManager using PostgreSQL and only 64MB of JVM heap?
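
To put numbers on that question: in my original report (quoted below) I 
needed roughly 7.5 times the file size in heap with PostgreSQL.  Assuming 
that factor holds linearly, which I only observed up to 103 MB, the 
arithmetic is sketched below.  The class HeapEstimate and its method 
names are hypothetical helpers, not anything from Jackrabbit.

```java
// Hypothetical helper; the 7.5x factor is an observation from my tests,
// not a documented property of Jackrabbit or PostgreSQL.
public class HeapEstimate {

    /** Observed ratio of required JVM heap to file size. */
    public static final double OBSERVED_FACTOR = 7.5;

    /** Heap (in MB) needed to save a file of the given size under the linear model. */
    public static double requiredHeapMb(double fileMb) {
        return fileMb * OBSERVED_FACTOR;
    }

    /** Largest file (in MB) that fits a given heap under the same model. */
    public static double largestFileMb(double heapMb) {
        return heapMb / OBSERVED_FACTOR;
    }

    public static void main(String[] args) {
        System.out.println("heap for 103 MB file: " + requiredHeapMb(103) + " MB");
        System.out.println("largest file in 64 MB heap: " + largestFileMb(64) + " MB");
    }
}
```

Under this model the 103 MB file needs about 772.5 MB of heap, and a 
64MB heap would top out at files of roughly 8.5 MB.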

Thanks,
Joe.


"Stefan Guggisberg" <stefan.guggisberg@gmail.com> wrote on 12/08/2006 
03:16:49 AM:

> hi joe,
> 
> On 12/8/06, jdente@21technologies.com <jdente@21technologies.com> wrote:
> > Hi,
> > I've been storing binary files of different sizes using
> > SimpleDBPersistenceManager configured to use postgresql.  I have
> > successfully added files of 2.5 MB (around 1 second to save) up through
> > 103 MB (around 80 seconds to save).  I am storing the binary files by
> > creating a file system using nt:folder, nt:file, and nt:resource.  The
> > binary files are then getting streamed into the jcr:data field of the
> > appropriate resource node:
> >
> >         Node resourceNode = fileNode.addNode("jcr:content", "nt:resource");
> >         resourceNode.setProperty("jcr:mimeType", typeHandler.getMimeType());
> >         resourceNode.setProperty("jcr:encoding", typeHandler.getTextEncoding());
> >         resourceNode.setProperty("jcr:data", resourceInput);
> >
> > resourceInput is defined as a BufferedInputStream(new
> > FileInputStream(binaryFile), 16384);
> > I then save the session.
> >
> > I have been getting a lot of out-of-memory exceptions running these
> > tests.  The amount of memory needed to successfully save a file increases
> > linearly with the size of the file.  In order to avoid an out-of-memory
> > exception I need to set aside at least 7.5 times as much memory in the VM
> > as the size of the file I want to save.  I have a similar problem when
> > deleting files, since the entire node is brought into transient memory
> > before it is deleted.  Is there a better way to save binary content that
> > won't require a constant increase in memory? Is there any way to avoid
> > bringing the entire file into memory before it's saved (or bringing the
> > entire node into memory again when it's deleted)?
> 
> this sounds really bad. jackrabbit should be able to store e.g. a 500mb
> file with 64mb of jvm heap without any problems.
> 
> large binary data in jackrabbit is always streamed, never materialized.
> 
> i guess the postgres jdbc driver does materialize the binary stream,
> hence the increase in memory you experience. i don't know much about
> postgres, i only verified that the postgres schema works in general.
> 
> when you use jackrabbit's default persistence (i.e. embedded derby)
> you shouldn't have this problem.
> 
> i'd say you have 3 options:
> 1. store binary data in the fs rather than in the db (externalBlobs=true);
>     however, the fs is not transactional; if you experience a power loss in
>     the middle of a transaction you might end up with inconsistent binary
>     data (e.g. file has been updated although tx never succeeded).
> 2. use another db (e.g. derby)
> 3. do some research regarding this issue in the postgres mailing lists;
>     maybe there's a configuration option or something similar
> cheers
> stefan
> 
> >
> > Thanks for the help,
> > Joe.
> >


