jackrabbit-dev mailing list archives

From "Thomas Mueller" <thomas.tom.muel...@gmail.com>
Subject Re: [jira] Resolved: (JCR-926) Global data store for binaries
Date Fri, 28 Sep 2007 07:52:23 GMT
Hi,

> > What about RepositoryException?
> Yes, that would work too. But we wanted to be able to identify the
> specific exception thrown from the DS. In a few places we wrapped DSE
> inside a RE.

What if DataStoreException extends RepositoryException? That way you
don't need to catch the exception and re-throw a different one, but
identifying it is still possible.
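
A minimal sketch of that idea, not the actual code. RepositoryException
below is a stand-in for javax.jcr.RepositoryException so that the
example compiles on its own; the constructors, addRecord(),
setProperty() and handle() are assumptions made up for illustration.

```java
public class ExceptionSketch {
    // Stand-in for javax.jcr.RepositoryException (assumption, keeps the sketch self-contained).
    static class RepositoryException extends Exception {
        RepositoryException(String message) { super(message); }
    }

    // The suggestion: the data store exception is itself a RepositoryException.
    static class DataStoreException extends RepositoryException {
        DataStoreException(String message) { super(message); }
    }

    // Hypothetical data store operation that always fails.
    static void addRecord() throws DataStoreException {
        throw new DataStoreException("disk full");
    }

    // A caller that only declares RepositoryException needs no catch and re-throw.
    static void setProperty() throws RepositoryException {
        addRecord();
    }

    // Returns a label showing which catch block handled the failure.
    static String handle() {
        try {
            setProperty();
            return "ok";
        } catch (DataStoreException e) {
            // The specific exception can still be identified here.
            return "data store: " + e.getMessage();
        } catch (RepositoryException e) {
            return "repository: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle());
    }
}
```

The more specific catch block has to come first; code that does not
care about the difference can keep catching RepositoryException only.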

> > What about updating the time when getLength() is called?
>
> Sorry, I don't understand this.

Currently the modified date/time is updated when modifiedDateOnRead is
set and getStream() is called. You said you want to avoid calling
getStream().close(). So I suggest updating the modified date/time
when modifiedDateOnRead is set and getLength() is called.

> > There is already DataStore.updateModifiedDateOnRead(long
> > before); a separate method is not required in my view.
> This didn't work in our testing.

Sorry, what does not work, and why?

> The FileDataRecord always queries the file object for its properties.

There is only getLength and getStream. Caching the length is possible.
The length could be part of the identifier. For example, you could do
that for the DatabaseDataStore (where accessing this information would
be slow). For the FileDataStore, I think it is not required as new
File(..).length() is quite fast; but I may be wrong. So basically the
DataIdentifier for DatabaseDataStore could be the digest plus the
length.
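
A rough sketch of what "digest plus length" could look like. The
"<digest>-<length>" format and the method names are assumptions for
illustration only, not the DataIdentifier API.

```java
public class LengthInIdentifier {
    // Builds an identifier of the form "<hex digest>-<length>" (hypothetical format).
    static String createIdentifier(String hexDigest, long length) {
        return hexDigest + "-" + length;
    }

    // Reads the length back from the identifier without accessing the storage.
    static long getLength(String identifier) {
        int separator = identifier.lastIndexOf('-');
        if (separator < 0) {
            throw new IllegalArgumentException("No length in identifier: " + identifier);
        }
        return Long.parseLong(identifier.substring(separator + 1));
    }

    public static void main(String[] args) {
        // SHA-1 digest of the empty content, so the length is 0.
        String id = createIdentifier("da39a3ee5e6b4b0d3255bfef95601890afd80709", 0);
        System.out.println(id + " has length " + getLength(id));
    }
}
```

With something like this, getLength() would not need a database
round trip, at the cost of a slightly longer identifier.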

> Well, we found that it is necessary, since the GC runs on a different
> session than the users (we're using system sessions for this).

For the data store, it doesn't matter what session created or accessed
the data record. Also it doesn't matter when save() was called - the
data record is stored before that. As long as the GC has access to all
nodes, it will touch all 'not recently used' data records. New data
records (that were created after the GC was started) will have a more
recent modified date / time and therefore the GC will not delete them.
There is probably a misunderstanding somewhere. It would help if you
could post your current source code, maybe there is a bug. Did you
update the modified date / time in addRecord() when the object already
exists, as done in the file data store?
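
To make the reasoning concrete, here is a simplified in-memory sketch
of the timestamp logic. It is not the data store code; the class, the
method names (addRecord, touch, sweep) and the explicit "now"
parameters are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GcSketch {
    // identifier -> modified time in milliseconds
    private final Map<String, Long> records = new HashMap<String, Long>();

    // Adds a record. If the record already exists, its modified time is updated.
    void addRecord(String id, long now) {
        records.put(id, Long.valueOf(now));
    }

    // Mark phase: the GC touches each record it reaches while reading the nodes.
    void touch(String id, long now) {
        if (records.containsKey(id)) {
            records.put(id, Long.valueOf(now));
        }
    }

    // Sweep phase: delete the records that were not modified since the GC started.
    List<String> sweep(long gcStart) {
        List<String> deleted = new ArrayList<String>();
        // Iterate over a copy so that removing entries is safe.
        for (Map.Entry<String, Long> e : new HashMap<String, Long>(records).entrySet()) {
            if (e.getValue().longValue() < gcStart) {
                records.remove(e.getKey());
                deleted.add(e.getKey());
            }
        }
        return deleted;
    }

    boolean contains(String id) {
        return records.containsKey(id);
    }

    public static void main(String[] args) {
        GcSketch store = new GcSketch();
        store.addRecord("old-used", 100);
        store.addRecord("old-unused", 100);
        long gcStart = 1000;
        // Added after the GC started, session not yet saved, never seen by the GC.
        store.addRecord("new-unsaved", 1500);
        store.touch("old-used", 2000);
        System.out.println("deleted: " + store.sweep(gcStart));
    }
}
```

In this sketch the unsaved record survives only because addRecord()
sets a modified time later than the GC start, which is why the
question about addRecord() matters.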

> So, the user adds a binary property in one session, after the file is
> uploaded but before the user save()s the session, the GC on the system
> session starts reading all nodes from the workspace. Since the changes
> are not yet written to persistent storage, the file is assumed to be a
> deleted property, and is in fact deleted.

Is this using the database data store or the file data store? If this
is the case, it would be a bug.

> We needed to make RepositoryImpl.getWorkspaceNames() public for this to
> work.

Yes, that would be good. Did you have a look at
PersistenceManagerIteratorTest.testGetAllNodeIds()? There,
getWorkspaceNames is called using reflection: the Method is obtained
with RepositoryImpl.class.getDeclaredMethod("getWorkspaceNames", new
Class[0]) and made accessible with setAccessible(true).
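
For reference, the general pattern looks roughly like this. Repo is a
stand-in class with a non-public getWorkspaceNames() so that the
sketch runs on its own; it is not RepositoryImpl, and the workspace
names are made up.

```java
import java.lang.reflect.Method;

public class ReflectionSketch {
    // Stand-in for a class with a non-public getWorkspaceNames() (assumption).
    static class Repo {
        private String[] getWorkspaceNames() {
            return new String[] { "default", "security" };
        }
    }

    // Calls the non-public method using reflection.
    static String[] workspaceNames(Repo repo) throws Exception {
        Method m = Repo.class.getDeclaredMethod("getWorkspaceNames", new Class[0]);
        // setAccessible returns void, so the Method has to be kept in a variable.
        m.setAccessible(true);
        return (String[]) m.invoke(repo, new Object[0]);
    }

    public static void main(String[] args) throws Exception {
        for (String name : workspaceNames(new Repo())) {
            System.out.println(name);
        }
    }
}
```

Making the method public would of course avoid the reflection
workaround.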

> Will this change have any effect on the issue I just mentioned?

I don't know - first I need to understand the issue...

> Ok then, our current implementation is against a patched-up 1.3.1. What
> do you think is the best way to isolate the code?

What we need is the implementations of the
org.apache.jackrabbit.core.data interfaces.

Thanks,
Thomas
