jackrabbit-dev mailing list archives

From "Jukka Zitting" <jukka.zitt...@gmail.com>
Subject Re: NGP: Prototyping
Date Fri, 23 Nov 2007 11:15:23 GMT

On Nov 23, 2007 10:23 AM, Thomas Mueller <thomas.tom.mueller@gmail.com> wrote:
> In any case, I wouldn't architect Jackrabbit around memory mapped
> files.

I guess you're right, thanks for the background.

I was hoping to offload the paging logic (deciding when to load a part
of the file into memory and when to discard it) to the operating
system, but it seems we need to handle that explicitly.
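For context, this is roughly what I had in mind with OS-managed paging: mapping the journal file lets the kernel decide when pages are loaded and evicted. A minimal sketch (the file name, size, and record layout here are made up for illustration):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedJournalSketch {
    public static void main(String[] args) throws Exception {
        RandomAccessFile file = new RandomAccessFile("journal.bin", "rw");
        file.setLength(4096); // ensure there is something to map
        MappedByteBuffer buffer = file.getChannel()
                .map(FileChannel.MapMode.READ_WRITE, 0, file.length());
        buffer.putInt(0, 42);                 // write through the mapping
        System.out.println(buffer.getInt(0)); // read back; the OS handles paging
        file.close();
    }
}
```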

> Your patch exposes java.nio.ByteBuffer: Record.getBuffer(); I'm
> not sure why this is required.

Instead of parsing node data structures into Java object graphs, I
would like to access them directly using the on-disk serialization
format. For example, a Node.getNodes() call should return an iterator
that just keeps a pointer for traversing the underlying buffer. This
would be a perfect fit for a memory-mapped file, but I guess we need
to provide some other way to get random access to the underlying
records.
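To make the idea concrete, here is a hypothetical sketch of such an iterator: child entries are serialized as (length, UTF-8 name) pairs in a buffer, and the iterator just advances a position instead of building an object graph. The record format here is invented, not the actual on-disk format:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class ChildNodeIterator implements Iterator<String> {
    private final ByteBuffer buffer;

    public ChildNodeIterator(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    public boolean hasNext() {
        return buffer.remaining() >= 4; // at least one entry header left
    }

    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        int length = buffer.getInt();   // entry header: name length
        byte[] name = new byte[length];
        buffer.get(name);               // advance past the name bytes
        return new String(name, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.allocate(64);
        for (String s : new String[] { "foo", "bar" }) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            b.putInt(bytes.length).put(bytes);
        }
        b.flip();
        Iterator<String> it = new ChildNodeIterator(b);
        while (it.hasNext()) System.out.println(it.next());
    }
}
```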

> > > Journal:
> > > "A record is never modified or removed once it has been added to a
> > > journal.". I agree, if there is a mechanism for removing old journal
> > > files.
> >
> > The way I see it, during normal operation we only need to be able to
> > append data to the file, so this shouldn't be a problem. At some
> > points the journal file needs to be vacuumed to save space, but that
> > operation would typically produce a new file that's then used to
> > replace the old journal. I'm not yet sure how often and at which
> > points such vacuuming needs to take place.
> I'm a bit worried that for long running processes (and Jackrabbit
> usually runs on the server) this could be a problem.

If we don't need to worry about being unable to delete mapped files,
then compacting the journal and removing old files while Jackrabbit is
running should be no problem.
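A rough sketch of the vacuuming step: live records are copied to a fresh file, which then replaces the old journal (production code would want an atomic rename here). The record filtering and file names are made up for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;

public class JournalCompactor {
    public static void main(String[] args) throws IOException {
        Path journal = Path.of("journal.log");
        Files.write(journal, List.of("live:a", "dead:b", "live:c"));

        // Copy only the live records to a new file.
        Path compacted = Path.of("journal.log.new");
        List<String> live = Files.readAllLines(journal).stream()
                .filter(r -> r.startsWith("live:"))
                .collect(Collectors.toList());
        Files.write(compacted, live);

        // Swap in the compacted journal; readers holding the old file
        // handle can finish before the old data goes away.
        Files.move(compacted, journal, StandardCopyOption.REPLACE_EXISTING);
        System.out.println(Files.readAllLines(journal));
    }
}
```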

> > Also, I have an experimental C version of the journal and tree code.
> > The idea behind that is possibly to implement a mod_dav module to
> > enable direct WebDAV access to the tree structure stored in the
> > journal.
> I don't understand why you would write a WebDAV server in C. What
> advantages does this have?

I'm doing it mostly as a proof of concept.

> I see there are systems where Java is not supported. However, now
> that Java is the most common programming language and has become
> more open source, I don't understand why.

There are many cases where a full Java stack is not available (even if
installed on a system). For example a typical Apache+PHP server. It
would be nice to make general content repository features easily
available for such systems.

Also, it's much easier to integrate a C library with various other
programming languages and environments.

> JNI has a big overhead, both performance wise and development wise.
> [...]
> So I don't buy the 'performance' argument - unless you can show I'm
> wrong of course ;-)

I guess you're right.


Jukka Zitting
