jackrabbit-dev mailing list archives

From "Ard Schrijvers (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-1214) DocId.UUIDDocId should not have a string attr uuid, but two long's lsb and msb
Date Wed, 14 Nov 2007 09:26:43 GMT

    [ https://issues.apache.org/jira/browse/JCR-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542368 ]

Ard Schrijvers commented on JCR-1214:

>no, I'm afraid there isn't, but it's definitively a good idea 

:-) Perhaps I can make one (though I am terrible at drawing pictures), because I would like to
have it here at the office as well, for a common understanding of the Jackrabbit indexing.

>> ... if it is ever possible that a parent of a node can be found in the parent index.

>that's not possible. the parent index contains the nodes under /jcr:system (including
>the /jcr:system node). the opposite is possible, though just for one node, the mentioned
>jcr:system node. this one will have a UUIDDocId, which references the root node of the workspace.

That is good news, and it makes things a little easier. I am thinking about a two-step check,
in which a reference to the entire MultiIndexReader is checked first:

IF: the cached reference to the entire MultiIndexReader instance matches, return the cached result.
ELSE IF: the index reader segment instance that contained the parent docNumber is still present,
recompute the docNumber with respect to the new offsets in the MultiIndexReader and return the
(almost) cached result.
ELSE: recompute the docNumber by searching in the MultiIndexReader (the uncached case).
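
The three branches above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: SegmentSketch, MultiReaderSketch and ParentDocIdSketch are hypothetical stand-ins, not Jackrabbit or Lucene classes, a segment is modelled as a plain list of uuids, and deleted documents are ignored.

```java
import java.util.List;

// Hypothetical stand-in for one index reader segment: the document number
// inside the segment is simply the position of the uuid in the list.
final class SegmentSketch {
    final List<String> uuids;
    SegmentSketch(List<String> uuids) { this.uuids = uuids; }
}

// Hypothetical stand-in for the MultiIndexReader: an ordered list of segments.
final class MultiReaderSketch {
    final List<SegmentSketch> segments;
    MultiReaderSketch(List<SegmentSketch> segments) { this.segments = segments; }

    // Starting document number of the given segment instance, or -1 if absent.
    int offsetOf(SegmentSketch target) {
        int offset = 0;
        for (SegmentSketch s : segments) {
            if (s == target) return offset;
            offset += s.uuids.size();
        }
        return -1;
    }
}

// Hypothetical cached parent id following the IF / ELSE IF / ELSE check.
final class ParentDocIdSketch {
    enum Path { FULL_HIT, SEGMENT_HIT, SEARCH }

    private final String uuid;
    private MultiReaderSketch cachedReader;
    private SegmentSketch cachedSegment;
    private int docInSegment;
    private int docNumber = -1;
    Path lastPath; // records which branch answered the last call (illustration only)

    ParentDocIdSketch(String uuid) { this.uuid = uuid; }

    int getDocumentNumber(MultiReaderSketch reader) {
        // IF: same reader instance, so the cached number is still valid.
        if (reader == cachedReader) {
            lastPath = Path.FULL_HIT;
            return docNumber;
        }
        // ELSE IF: the segment instance survived, only its offset may have moved.
        if (cachedSegment != null) {
            int offset = reader.offsetOf(cachedSegment);
            if (offset >= 0) {
                lastPath = Path.SEGMENT_HIT;
                return remember(reader, cachedSegment, docInSegment, offset);
            }
        }
        // ELSE: uncached case, search every segment for the uuid.
        lastPath = Path.SEARCH;
        int offset = 0;
        for (SegmentSketch s : reader.segments) {
            int local = s.uuids.indexOf(uuid);
            if (local >= 0) return remember(reader, s, local, offset);
            offset += s.uuids.size();
        }
        cachedReader = null;
        cachedSegment = null;
        docNumber = -1;
        return -1;
    }

    private int remember(MultiReaderSketch reader, SegmentSketch segment, int local, int offset) {
        cachedReader = reader;
        cachedSegment = segment;
        docInSegment = local;
        docNumber = offset + local;
        return docNumber;
    }
}
```

Both cache checks compare instances by identity, so a segment that is merged or reopened as a new object falls through to the search branch.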

I will try to implement it over the weekend, because I am fully occupied for the next few days.
I hope to share my findings and tests (and any performance issues) on Sunday.

> DocId.UUIDDocId should not have a string attr uuid, but two long's lsb and msb 
> -------------------------------------------------------------------------------
>                 Key: JCR-1214
>                 URL: https://issues.apache.org/jira/browse/JCR-1214
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 1.3.3
>            Reporter: Ard Schrijvers
>             Fix For: 1.4
> Once JCR-1213 is solved, many DocId.UUIDDocId instances can stay cached instead of being
> cleaned after every gc(). The number of cached UUIDDocId instances can grow very large,
> depending on the size of the repository. Therefore, instead of storing the private String uuid,
> we can make it more memory efficient by storing two longs, the lsb and msb of the uuid.
> For 1,000,000 parent UUIDDocId instances, this might make a difference of about 100 MB of memory.
> I even tested removing the uuid string entirely, without using msb or lsb, because
> when everything works properly (with references to index reader segments, see JCR-1213)
> the uuid is never needed again: in
> UUIDDocId getDocumentNumber(IndexReader reader) throws IOException {
> we could set uuid = null just before the return. It works perfectly well, because when
> an index reader is recreated, the CachingIndexReader is recreated as well, and hence the
> DocId[] parents array is recreated.
> So, IMO, we might be able to remove the uuid entirely once the docNumber is found
> in DocId.UUIDDocId (obviously after JCR-1213).
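
The two-longs idea from the description could look roughly like this. It is a minimal sketch, assuming the uuid arrives in the standard 36-character textual form; UuidBitsSketch is a hypothetical name, not the actual DocId.UUIDDocId class, and only java.util.UUID from the JDK is used for parsing and formatting.

```java
import java.util.UUID;

// Hypothetical holder that keeps a uuid as two longs instead of a String.
final class UuidBitsSketch {
    private final long msb; // most significant 64 bits
    private final long lsb; // least significant 64 bits

    UuidBitsSketch(String uuid) {
        UUID parsed = UUID.fromString(uuid);
        this.msb = parsed.getMostSignificantBits();
        this.lsb = parsed.getLeastSignificantBits();
    }

    long msb() { return msb; }
    long lsb() { return lsb; }

    // Rebuild the textual form only when it is actually needed, e.g. for a search.
    String uuidString() {
        return new UUID(msb, lsb).toString();
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof UuidBitsSketch)) return false;
        UuidBitsSketch that = (UuidBitsSketch) other;
        return msb == that.msb && lsb == that.lsb;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(msb) + Long.hashCode(lsb);
    }
}
```

The two long fields take 16 bytes per instance, whereas a 36-character String carries its character data plus object headers. The exact saving depends on the JVM, so the figure quoted in the description is best read as a rough estimate.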

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
