jackrabbit-users mailing list archives

From Jean-Baptiste Quenot <...@apache.org>
Subject Re: eXist
Date Mon, 23 Apr 2007 07:29:14 GMT
* Marcin Nowak:

> Recently I've  discovered XML database quite  similar in general
> concepts to Jackrabbit,  in fact it does  not provide versioning
> and  referencing  between  nodes  but   it  is  really  fast  as
> I  compared  it  with  Jackrabbit, especially  in  querying  and
> importing nodes, question is why Jackrabbit performs so badly in
> comparison to eXist?

This question is obviously flame bait, so I won't answer it
directly, but a few things are worth mentioning:

1. eXist is an XML database and Jackrabbit is not, so you are
   comparing two unrelated things.  Moreover, even if the query
   syntax can look similar, eXist returns XML, whereas JCR returns
   Java objects.  You need to understand the implications of this:
   parsing the resulting XML and working with it can quickly lead
   to memory and CPU starvation, especially when the query returns
   a lot of documents.  JCR copes well with this case, as it
   returns an iterator over the result set.

2. Jackrabbit is mostly used as a Java API, whereas eXist is a
   standalone server with specific servlets that speak XML-RPC,
   REST, and so on.  It is mostly accessed through HTTP requests,
   which adds overhead.  eXist even has a front end based on
   Cocoon.  A *lot* of caching is done on the eXist side, while
   with Jackrabbit you will need a second-level cache in your own
   code to get the same effect.

3. In my  book, eXist is not  designed to let you  query the whole
   database at  once, whereas  Jackrabbit allows  you to  return a
   sorted  subset  of documents  from  the  whole repository  very
   efficiently,  by design.   Accessing one  XML document  is very
   different from querying the whole database with 10k+ documents.
   Play with eXist for more than 5 minutes with a serious data set
   and you will see this for yourself.

4. Jackrabbit's efficiency  at importing nodes depends  largely on
   the persistence  and filesystem  implementation you  are using.
   For example, I've seen the BDB storage backend perform 10 times
   faster than the XML-file-based one.

5. When you compare two approaches (one XML database, one JCR
   repository) for your own use case, and especially when you ask
   for feedback about your experiments, publish the results of
   your benchmarks.  Be very careful to state *what* you tested
   and *how*, and of course include the actual numbers.
   Otherwise you're just spreading FUD.
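
To make point 1 more concrete, here is a minimal sketch in plain
Java of why an iterator over the result set matters.  It does not
use the JCR or eXist APIs and does not model the internals of
either product: LazyResultDemo, buildResult, eagerQuery and
lazyQuery are hypothetical names made up for this illustration,
and buildResult merely stands in for whatever costly per-hit work
(parsing XML, loading a node) a real client would do.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only: none of these names come from JCR or eXist.
class LazyResultDemo {

    /** Counts how many result objects have been built so far. */
    static int built = 0;

    /** Stand-in for a costly per-hit step (parsing XML, loading a node). */
    static String buildResult(int id) {
        built++;
        return "doc-" + id;
    }

    /** Eager style: every hit is built before the caller sees any of them. */
    static List<String> eagerQuery(int hits) {
        List<String> all = new ArrayList<>();
        for (int i = 0; i < hits; i++) {
            all.add(buildResult(i));
        }
        return all;
    }

    /** Lazy style: a hit is built only when next() asks for it. */
    static Iterator<String> lazyQuery(final int hits) {
        return new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < hits;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return buildResult(next++);
            }
        };
    }

    /** Runs an eager query, reads the first n hits, returns how many were built. */
    static int builtAfterEagerRead(int hits, int n) {
        built = 0;
        List<String> all = eagerQuery(hits);
        for (int i = 0; i < n && i < all.size(); i++) {
            all.get(i);
        }
        return built;
    }

    /** Runs a lazy query, reads the first n hits, returns how many were built. */
    static int builtAfterLazyRead(int hits, int n) {
        built = 0;
        Iterator<String> it = lazyQuery(hits);
        for (int i = 0; i < n && it.hasNext(); i++) {
            it.next();
        }
        return built;
    }

    public static void main(String[] args) {
        System.out.println("eager built: " + builtAfterEagerRead(10000, 10)); // 10000
        System.out.println("lazy built:  " + builtAfterLazyRead(10000, 10));  // 10
    }
}
```

With 10000 hits of which only the first 10 are read, the eager
variant builds all 10000 results while the lazy one builds 10.
That is the difference a caller sees when a query matches far
more documents than it actually consumes.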

     Jean-Baptiste Quenot
aka  John Banana   Qwerty
