jackrabbit-dev mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-169) Make Jackrabbit clusterable
Date Fri, 01 Sep 2006 07:42:27 GMT
    [ http://issues.apache.org/jira/browse/JCR-169?page=comments#action_12432083 ] 
            
Marcel Reutegger commented on JCR-169:
--------------------------------------

Ian, thanks a lot for your comments.

Here are my current thoughts on clustering the search index in jackrabbit:

I think the preferred approach is to put the index into the repository itself. See http://article.gmane.org/gmane.comp.apache.jackrabbit.devel/8530
and the following messages.
This would also allow us to distribute index updates to cluster nodes using the repository's
internal observation mechanism, e.g. the update of a deleted-documents file or new index segments.

> I found the best indexing strategy was to have local copies of segments, stored centrally
> as masters.

I agree. In particular, the design of Lucene, where index files are only created but never modified,
supports this approach very nicely.
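
To illustrate why write-once files make this easy, here is a minimal sketch of how a cluster node might refresh its local copy of the segments from the central master copy. It assumes both copies are plain directories and that a given file name always refers to the same immutable content, so comparing file names is enough. The class SegmentSync and its methods are hypothetical and not part of Jackrabbit or Lucene.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not a Jackrabbit or Lucene API. */
public class SegmentSync {

    /** Returns the names of the regular files directly inside dir. */
    static Set<String> fileNames(Path dir) throws IOException {
        Set<String> names = new HashSet<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile)
                 .forEach(p -> names.add(p.getFileName().toString()));
        }
        return names;
    }

    /**
     * Brings localDir in line with masterDir and returns the number of files copied.
     * Assumption: files are never modified after creation, so a file that already
     * exists locally under the same name does not need to be copied again.
     */
    public static int sync(Path masterDir, Path localDir) throws IOException {
        Files.createDirectories(localDir);
        Set<String> master = fileNames(masterDir);
        Set<String> local = fileNames(localDir);

        int copied = 0;
        // Fetch files that exist in the master copy but not yet locally.
        for (String name : master) {
            if (!local.contains(name)) {
                Files.copy(masterDir.resolve(name), localDir.resolve(name));
                copied++;
            }
        }
        // Drop local files that the master copy no longer contains (e.g. merged segments).
        for (String name : local) {
            if (!master.contains(name)) {
                Files.delete(localDir.resolve(name));
            }
        }
        return copied;
    }
}
```

Because nothing is rewritten in place, a second call with an unchanged master copies nothing. A real implementation would also have to make sure that a searcher never opens the local index while a file is only partially copied.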

> In the search application, speed of update of segments is not that critical,
> you probably have a different requirement in JCR. 

JCR is more restrictive in that respect, at least if we want to be compliant with the specification.
As soon as a node is created in the workspace, it must be searchable using a query. For most
real-life systems this is not a hard requirement though. E.g. when a document is added to
a repository, it usually doesn't matter if it becomes retrievable by query only after a couple
of seconds and not right away.


> Make Jackrabbit clusterable
> ---------------------------
>
>                 Key: JCR-169
>                 URL: http://issues.apache.org/jira/browse/JCR-169
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Marcel Reutegger
>            Priority: Minor
>
> This jira issue discusses the technical implications on the current design of Jackrabbit
> to introduce clustering.
> Particularly the following areas require thorough investigation:
> - SharedItemStateManager and its cache
>     - cache integrity
>     - cache design: look aside, write through?
>     - hook for distributed cache, interface?
>     - isolation level
>     - transaction integrity within Jackrabbit, interaction with transient layer
> - VirtualItemStateProvider
>     - same strategy as SharedItemStateManager?
> - Search index
>     - single or per cluster node index?
> - Observation
> Please state more areas if needed.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
