jackrabbit-dev mailing list archives

From "Ard Schrijvers (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-1214) DocId.UUIDDocId should not have a string attr uuid, but two long's lsb and msb
Date Tue, 13 Nov 2007 09:39:50 GMT

    [ https://issues.apache.org/jira/browse/JCR-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542084
] 

Ard Schrijvers commented on JCR-1214:
-------------------------------------

"For simplicity I would rather use an instance of UUID instead of two longs. "

Yes, that is easier

"hope that makes sense... ;)"

Yes, it totally does (good explanation, too), and I actually saw this exact behavior already
occurring in my test setups. I had misunderstood the fact that, logically, parents can be
found in a different index (segment), and that this index might be recreated while the index
holding the UUIDDocId to this parent is not. While working on JCR-1213 I also noticed that it
only worked when none of the indexes changed, and that it went wrong when indexes were changing
or merging. Exactly the way you describe. I am about to get everything sorted out regarding
the combined index reader, which contains caching multi index readers, which in turn contain
Lucene indexes.

Might be an idea to picture this a little for documentation, or is there already something
like it? 

There is only one thing I am still not sure about, which you might know off the top of
your head. It concerns JCR-1213 and this issue:

My understanding is that the CombinedIndexReader, when doing a search in a workspace, normally
contains two index readers: the one for the workspace and a parent index (containing repository
index info like node types, doesn't it?), right? Now I need to find out whether, when looking
up a parent, it is ever possible that the parent of a node is found in the parent index (I hope
it is clear what I mean). I think it is not, and this might be quite important for me when
implementing JCR-1213 by caching with references to an index segment.
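
To illustrate the idea of caching with a reference to an index segment, here is a minimal sketch. It is not the Jackrabbit code: `UuidLookup` and `CachedParentRef` are hypothetical names, and `UuidLookup` merely stands in for a Lucene index reader. The cached document number is only trusted while the same reader instance is passed in again; a recreated reader triggers a fresh lookup.

```java
/** Hypothetical minimal reader interface; stands in for a Lucene IndexReader. */
interface UuidLookup {
    /** Returns the document number for the given UUID, or -1 if it is absent. */
    int docNumberFor(String uuid);
}

/** Hypothetical cached parent reference; not the actual DocId.UUIDDocId. */
final class CachedParentRef {
    private final String uuid;
    private UuidLookup resolvedAgainst; // reader the cached number belongs to
    private int docNumber = -1;

    CachedParentRef(String uuid) {
        this.uuid = uuid;
    }

    /** Resolves the document number, reusing the cache only for the same reader instance. */
    int getDocumentNumber(UuidLookup reader) {
        if (resolvedAgainst != reader) {
            // The reader was recreated (or this is the first call): look up again.
            docNumber = reader.docNumberFor(uuid);
            resolvedAgainst = reader;
        }
        return docNumber;
    }
}
```

Note that this sketch keeps a strong reference to the reader, which a real implementation would probably want to avoid or release when the segment is closed.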


> DocId.UUIDDocId should not have a string attr uuid, but two long's lsb and msb 
> -------------------------------------------------------------------------------
>
>                 Key: JCR-1214
>                 URL: https://issues.apache.org/jira/browse/JCR-1214
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 1.3.3
>            Reporter: Ard Schrijvers
>             Fix For: 1.4
>
>
> Once JCR-1213 is solved, many DocId.UUIDDocId instances can be cached instead of being
> cleaned up after every gc(). The number of cached UUIDDocId instances can grow very large,
> depending on the size of the repository. Therefore, instead of storing the private String
> uuid; we can make it more memory efficient by storing two longs, the lsb and msb of the
> uuid. For 1,000,000 parent UUIDDocId instances, this might make a difference of about
> 100 MB of memory.
> I even tested removing the uuid string entirely, without using msb or lsb, because when
> everything works properly (with references to index reader segments, see JCR-1213), the
> uuid is never needed again: in
> UUIDDocId getDocumentNumber(IndexReader reader) throws IOException {
> we could set uuid = null just before the return. It works perfectly well, because when
> an index reader is recreated, the CachingIndexReader is recreated as well, and hence the
> DocId[] parents are recreated.
> So, IMO, we might be able to remove the uuid entirely once the docNumber is found
> in DocId.UUIDDocId (obviously after JCR-1213).
> WDYT?
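
As a rough illustration of the two-longs proposal in the quoted description, the following sketch keeps only the most and least significant bits of the UUID and rebuilds the string form on demand. It assumes `java.util.UUID` for parsing (Jackrabbit may use its own UUID class), and `CompactUuidRef` is a hypothetical name, not the real DocId.UUIDDocId.

```java
import java.util.UUID;

/** Hypothetical stand-in for DocId.UUIDDocId; stores the UUID as two longs. */
final class CompactUuidRef {
    private final long msb; // most significant 64 bits of the UUID
    private final long lsb; // least significant 64 bits of the UUID

    CompactUuidRef(String uuid) {
        // Assumption: the UUID arrives in the canonical 36-character string form.
        UUID parsed = UUID.fromString(uuid);
        this.msb = parsed.getMostSignificantBits();
        this.lsb = parsed.getLeastSignificantBits();
    }

    /** Rebuilds the string form only when it is actually needed. */
    String uuidString() {
        return new UUID(msb, lsb).toString();
    }
}
```

The two fields take 16 bytes per instance, whereas a String additionally carries its own object header and character array. The actual saving depends on the JVM.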

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

