jackrabbit-dev mailing list archives

From "Thomas Mueller" <thomas.tom.muel...@gmail.com>
Subject Re: [jira] Resolved: (JCR-926) Global data store for binaries
Date Tue, 25 Sep 2007 10:31:00 GMT

A database-backed data store would be great!

> methods throw IOException
> DataStoreException

What about RepositoryException?

> GC does a p.getStream().close() to update the last modified time of the record

You are right, this will be changed.

> store.updateRecordLastModifiedTime(id, System.currentTimeMillis());

What about updating the time when getLength() is called?
There is already DataStore.updateModifiedDateOnRead(long before);
a separate method is not required in my view.

> The issue of the delay when calling getStream()/getRecord() means that
> the information provided by the record has to be stored in the record,
> instead of relying on the backing store (like it's done in the
> FileDataRecord class).

Sorry, I don't understand this part.

> a stream that closes its DB resources when it's closed is needed.

That should be simple to implement; I suggest doing this as part of the
database data store (I can help if required).
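
To illustrate the idea of a stream that closes its DB resources, here is a
minimal sketch. The class name ResourceClosingInputStream is hypothetical and
not part of Jackrabbit; it assumes the database objects are passed in as
AutoCloseable (which JDBC's ResultSet, Statement and Connection are on
Java 7 and later).

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical helper (not Jackrabbit API): an InputStream that releases
 * the resources backing it, for example a JDBC ResultSet, Statement and
 * Connection, when the stream itself is closed.
 */
class ResourceClosingInputStream extends FilterInputStream {

    private final AutoCloseable[] resources;
    private boolean closed;

    ResourceClosingInputStream(InputStream in, AutoCloseable... resources) {
        super(in);
        this.resources = resources;
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // closing a second time is a no-op
        }
        closed = true;
        IOException failure = null;
        try {
            super.close();
        } catch (IOException e) {
            failure = e;
        }
        // Close every resource, in the order given, even if one of them fails.
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                if (failure == null) {
                    failure = new IOException("Failed to release resource", e);
                } else {
                    failure.addSuppressed(e);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

A database data store could then return something like
new ResourceClosingInputStream(blobStream, resultSet, statement, connection)
from getStream(), so the caller only has to close the stream.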

> getRecord() should not retrieve the stream
> and keep resources open unless explicitly asked

Sure. This is done already for the FileDataRecord. Is there a problem
with delaying the opening of the stream for the database data store?
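
A rough sketch of what delayed opening could look like: the record keeps only
the identifier and length, and opens the stream when getStream() is called.
LazyStreamRecord and StreamOpener are hypothetical names for illustration,
not the actual DataRecord or FileDataRecord code.

```java
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical record (not Jackrabbit API) that holds only metadata and
 * opens the underlying stream on demand.
 */
class LazyStreamRecord {

    /** Opens a fresh stream, e.g. by running a database query. */
    interface StreamOpener {
        InputStream open() throws IOException;
    }

    private final String identifier;
    private final long length;
    private final StreamOpener opener;

    LazyStreamRecord(String identifier, long length, StreamOpener opener) {
        this.identifier = identifier;
        this.length = length;
        this.opener = opener;
    }

    String getIdentifier() {
        return identifier;
    }

    /** Answered from the stored value; no backing resource is touched. */
    long getLength() {
        return length;
    }

    /** The only method that opens a resource; the caller must close it. */
    InputStream getStream() throws IOException {
        return opener.open();
    }
}
```

With this shape, creating the record and asking for its length keeps no
database resources open; they are acquired only while a caller reads the
stream.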

> We have code to run the GC only once on repository
> startup, and in a background thread

Both should work. I prefer the background thread.

> run-once option needs global lock during its run.
> no session can be started while a collection is in
> progress.
> The background thread instead needs some way to keep track of
> changes in the binary properties.

I don't think this is required. Let's say a large object is deleted
while the garbage collection runs. In this case it will not be
collected, which is OK in my view (it will be collected in the next GC
run). If a new object is inserted, it will not be collected because
the last modified date is newer.
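
The reasoning above can be shown with a small in-memory model. This is only
a sketch under assumptions: TimestampGcSketch and its methods are
hypothetical, times are passed in explicitly instead of being read from the
clock, and a real data store would persist the timestamps.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/**
 * Hypothetical in-memory model (not Jackrabbit API) of garbage collection
 * based on the last modified time of each record.
 */
class TimestampGcSketch {

    /** Record identifier -> last modified time in milliseconds. */
    private final Map<String, Long> lastModified = new HashMap<>();

    /** A new record is stored with the current time. */
    void add(String id, long now) {
        lastModified.put(id, now);
    }

    /** Mark phase: the scan touches every record it finds referenced. */
    void touch(String id, long now) {
        if (lastModified.containsKey(id)) {
            lastModified.put(id, now);
        }
    }

    /** Sweep phase: remove records not modified since the scan started. */
    int deleteAllOlderThan(long scanStart) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastModified.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() < scanStart) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean contains(String id) {
        return lastModified.containsKey(id);
    }
}
```

In this model, a record whose property is deleted after the scan has already
touched it survives the current sweep and is removed by the next run, while
a record added during the scan has a timestamp newer than the scan start and
is never swept.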

My plan is to scan in the persistence manager, using the new method
getAllNodeIds(), if a bundle persistence manager is used. This should
speed up the GC scan.

> We can contribute our code,

That would be great of course!

> but it's gonna take some time to extract it
> our code is more of a POC than production-ready for now.

No problem. We can work together on fixing the problems.

Thanks for your help!
