jackrabbit-dev mailing list archives

From "Wolfgang Gehner" <wgeh...@infonoia.com>
Subject Re: Multirow update/insert/delete issue
Date Fri, 12 Nov 2004 06:29:05 GMT
Maybe we're talking about the same thing in different ways?

So you do

dbtransaction.begin()
insert ... (one row)
dbtransaction.commit()
dbtransaction.begin()
insert ... (one row)
dbtransaction.commit()

a thousand times?

We want to do
dbtransaction.begin()
insert .. (one row)
insert .. (one row)
insert .. (one row)
etc..
dbtransaction.commit()
... which I hope you will concede would be more efficient, and
would let us do a thousand in no time at all, pretty much no matter what the
underlying database is. BTW, what's your configuration?

Of course a user might also *want* to ensure that either all operations
succeed or none do.
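
For concreteness, here's a minimal sketch (plain JDBC) of the batched
pattern we have in mind. The "node" table, its "uuid" column and the class
name are placeholders of ours, not Jackrabbit code:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchInsertSketch {

    /**
     * Inserts all rows inside a single db transaction: either every row
     * is committed, or on any failure the whole batch is rolled back.
     */
    public static void insertNodes(Connection con, String[] uuids)
            throws SQLException {
        con.setAutoCommit(false);        // one transaction, not one per row
        PreparedStatement stmt = null;
        try {
            stmt = con.prepareStatement("insert into node (uuid) values (?)");
            for (int i = 0; i < uuids.length; i++) {
                stmt.setString(1, uuids[i]);
                stmt.addBatch();         // queue the row instead of executing it
            }
            stmt.executeBatch();         // send all queued rows at once
            con.commit();                // single commit for the whole batch
        } catch (SQLException e) {
            con.rollback();              // all-or-nothing semantics
            throw e;
        } finally {
            if (stmt != null) {
                stmt.close();
            }
        }
    }
}

The single commit at the end is what buys both the speed and the
all-or-nothing guarantee.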

...and we wonder how we can realize this while observing the current
PersistenceMgr API, and thought you might have an idea. A
persistenceMgr.store(nodesToUpdate, nodesToInsert, nodesToDelete) would be
useful for us, but we were also thinking of consuming a save() event so we
know when to commit.
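
To illustrate, here is roughly the kind of interface we are imagining; the
name BatchPersistenceMgr, the raw Collection parameters and the throws
clause are our own sketch, not the existing Jackrabbit api:

import java.util.Collection;

/**
 * Sketch of a batch-oriented store method. A JDBC implementation could
 * open one db transaction, apply all three collections, and commit
 * (or roll back) exactly once.
 */
public interface BatchPersistenceMgr {

    /**
     * @param nodesToUpdate node states whose rows already exist
     * @param nodesToInsert node states that need new rows
     * @param nodesToDelete node states whose rows should be removed
     */
    void store(Collection nodesToUpdate,
               Collection nodesToInsert,
               Collection nodesToDelete) throws Exception;
}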


Wolfgang

----- Original Message ----- 
From: "Stefan Guggisberg" <stefan.guggisberg@gmail.com>
To: <jackrabbit-dev@incubator.apache.org>
Sent: Thursday, November 11, 2004 6:36 PM
Subject: Re: Multirow update/insert/delete issue


> On Thu, 11 Nov 2004 12:32:27 +0100, Wolfgang Gehner
> <wgehner@infonoia.com> wrote:
> > We're fully aware of the good benchmarks when not using LocalFileSystem.
> > "3. Object with LocalFileSystem, not surprisingly either, showed the
worst
> >    performance: ca. 30 sec./1000 nodes"
> >
> > So there is no criticism implied or intended whatsoever.
> > I've just taken the analogy that writing to a db is like writing a
thousand
> > files *when it's done one by one*.
>
> sorry, i still don't buy this. the jdbc based persistence manager i hacked
> together is just doing that: if 1000 nodes are added and saved in one call,
> it is inserting 1000 node records plus 1000 property records *one by one*.
> i ran the test and it averaged at 3 - 3.5 sec./1000 nodes. in fact it came
> close to the best results that i got with the b-tree based persistence
> managers.
>
>
> >
> > We are new to the Jackrabbit api and wonder how we can wrap multiple node
> > writes/inserts/deletes in one db transaction with the current
> > PersistenceMgr API. When we can do that, performance will be no issue. We
> > might have the PersistenceMgr listen to an event emitted by node.save(), and
> > persist only then? What do you think?
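
(To make the save-event idea above concrete: a rough sketch of what such a
listener might look like. The SaveListener name and both callbacks are
purely hypothetical; nothing of the sort exists in the current api.)

import java.util.ArrayList;
import java.util.List;

public class SaveListener {

    // node ids collected while the save is being assembled
    private final List pendingIds = new ArrayList();

    // called once per changed node before the save completes
    public void nodeChanged(String nodeId) {
        pendingIds.add(nodeId);
    }

    // called once when node.save() has finished its traversal; this is
    // where a JDBC persistence manager would begin the db transaction,
    // flush all pending rows and commit.
    public void onSave() {
        // persistenceMgr.store(pendingIds); -- hypothetical call
        pendingIds.clear();
    }
}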
>
> the bad performance you are experiencing is imo caused by the data
> model of your underlying persistence layer, not by the current implementation
> of jackrabbit. if you send me the schema that you are using for
> persisting nodes and properties in a rdbms, i will have a look at it.
>
> >
> > Would you like to look at our code as is?
>
> sure.
>
> regards
> stefan
>
> >
> > Stefan, we look forward to your recommendation.
> >
> > Best regards,
> >
> > Wolfgang
> >
> >
> >
> > ----- Original Message -----
> > From: "Stefan Guggisberg" <stefan.guggisberg@gmail.com>
> > To: <jackrabbit-dev@incubator.apache.org>
> > Sent: Wednesday, November 10, 2004 6:36 PM
> > Subject: Re: Multirow update/insert/delete issue
> >
> > > a few comments/clarifications inline...
> > >
> > > On Wed, 10 Nov 2004 17:41:46 +0100, Wolfgang Gehner
> > > <wgehner@infonoia.com> wrote:
> > > >
> > > > As discussed with David offline, when 1000 nodes are inserted, in the
> > > > current implementation the PersistenceMgr.store() method
> > > > is called 1000 times. So the XMLPersistenceMgr takes 30 seconds to do
> > > > those 1000 write operations.
> > >
> > > not quite correct: i said that the XML/ObjectPersistenceManager in
> > > combination with a CQFileSystem takes ca. 5 sec. for adding and saving
> > > 1000 nodes (that's
> > > 2000 write operations, 1000 nodes + 1000 properties).
> > >
> > > > A JDBC implementation of the current PersistenceMgr API is "condemned"
> > > > to do the same thing. We'd really like a way to bundle those 1000 writes
> > > > into one "transaction", so we can take 2-3 seconds on a relational database
> > > > rather than 30.
> > >
> > > again, a jdbc implementation is *not* condemned to take 30 sec.!
> > > i hacked a quick&dirty implementation of a jdbc persistence manager (with
> > > a very *primitive* schema) that took less than 5 sec. for adding and saving
> > > 1000 nodes.
> > >
> > > >
> > > > So we'd like to throw into the discussion the following thoughts:
> > > > - how about maintaining an instance of PersistenceMgr (pm) not on
> > > >   (Persistent)NodeState but on NodeImpl
> > > > - the implementation of node.save() could collect info on which nodes
> > > >   (incl. children) to save and call a persistenceMgr.store(
> > > >   nodesToUpdate, nodesToInsert, nodesToDelete) just once. That way the pm
> > > >   could bundle operations in line with the repository requirements.
> > > >
> > > > This would make Jackrabbit's persistence model follow the DAO (data
> > > > access object) pattern as we understand it.
> > > >
> > > > We would be pleased to elaborate and discuss, and to share our JDBC
> > > > PersistenceMgr prototype with anyone interested (it passes the current api
> > > > unit test, but has a very non-optimized ER design and is afflicted with the
> > > > issue discussed in this message).
> > > >
> > > > Best regards,
> > > >
> > > > Infonoia S.A.
> > > > rue de Berne 7
> > > > 1201 Geneva
> > > > Tel: +41 22 9000 009
> > > > Fax: +41 22 9000 018
> > > > wgehner@infonoia.com
> > > > http://www.infonoia.com
> > > >
> >
> >

