jackrabbit-oak-dev mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (OAK-181) Observation / indexing: don't create events for index updates
Date Mon, 16 Jul 2012 09:11:35 GMT

    [ https://issues.apache.org/jira/browse/OAK-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414944#comment-13414944 ]

Thomas Mueller edited comment on OAK-181 at 7/16/12 9:09 AM:
-------------------------------------------------------------

Revision 1361938: the index content, as well as the internal index data, is now stored in
a child node. The name of that child node is currently ":data", but that can be changed later
if required. There is one such node per index, and one for the internal index data and temporary
storage (used for move operations). The internal index data is currently just the revision
id of the latest indexed revision.
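
To illustrate how this layout could help with the issue itself (no events for index updates),
here is a minimal sketch, not the code from the revision. It assumes changes are available as
a list of absolute paths and that anything below a node named ":data" is index data; the class
name, the method names and the example paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexEventFilter {

    // Assumption: index data lives in a child node with this name
    // (the name used in revision 1361938; it may change later).
    static final String INDEX_DATA_NAME = ":data";

    /** Returns true if any segment of the path is the index data node. */
    public static boolean isIndexPath(String path) {
        for (String segment : path.split("/")) {
            if (segment.equals(INDEX_DATA_NAME)) {
                return true;
            }
        }
        return false;
    }

    /** Keeps only the paths that are content changes, dropping index updates. */
    public static List<String> contentChangesOnly(List<String> changedPaths) {
        List<String> result = new ArrayList<>();
        for (String path : changedPaths) {
            if (!isIndexPath(path)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only.
        List<String> changed = Arrays.asList(
                "/content/page1",
                "/jcr:system/oak:indexes/example/:data/x",
                "/content/page2/title");
        // Prints: [/content/page1, /content/page2/title]
        System.out.println(contentChangesOnly(changed));
    }
}
```

Matching on the node name keeps the check independent of where the indexes are mounted, at
the cost of having to adjust it if the name of the child node changes.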

> the index should be part of the repository (e.g. as binary nt:files), 
> so you can easily back them up and copy over using the JCR API 
> (and package systems on top of it)
> IIUC, that is one of the major reasons to put indexes into the repository.

How visible the index data should be is a good question. I think we should leave it somewhat
open currently, and decide once we have more experience. I think the main reasons to put the
index data in the repository are:

- to simplify backup / storage / maintenance
- scalability (so the index can scale in the same way the repository can scale)
- reduce complexity associated with separate storage for indexes

But making the index accessible over the JCR API wasn't a goal so far (as far as I'm aware).
What you describe are use cases I hadn't thought about so far. Within relational databases,
I have never heard of a use case to copy index data from one database to another. You generally
just copy the data, and then let the database reindex it. If you want to copy the index data,
then you do a full database backup.

                
      was (Author: tmueller):
    Revision 1361938: the index content, as well as the internal index data, is now stored
in a child node. The name of that child node is currently ":data", but that can be changed
later if required. There is one such node per index, and one for the internal index data
and temporary storage (used for move operations). The internal index data is currently just
the revision id of the latest indexed revision.

> the index should be part of the repository (e.g. as binary nt:files), 
> so you can easily back them up and copy over using the JCR API 
> (and package systems on top of it)
> IIUC, that is one of the major reasons to put indexes into the repository.

How visible the index data should be is a good question. I don't think we should leave it
somewhat open currently, and decide once we have more experience. I think the main reasons
to put the index data in the repository are:

- to simplify backup / storage / maintenance
- scalability (so the index can scale in the same way the repository can scale)
- reduce complexity associated with separate storage for indexes

But making the index accessible over the JCR API wasn't a goal so far (as far as I'm aware).
What you describe are use cases I hadn't thought about so far. Within relational databases,
I have never heard of a use case to copy index data from one database to another. You generally
just copy the data, and then let the database reindex it. If you want to copy the index data,
then you do a full database backup.

                  
> Observation / indexing: don't create events for index updates
> -------------------------------------------------------------
>
>                 Key: OAK-181
>                 URL: https://issues.apache.org/jira/browse/OAK-181
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>            Reporter: Thomas Mueller
>
> If index data is stored in the repository (for example under jcr:system/oak:indexes),
> then each change in the content might result in one or multiple changes in the affected indexes.
> Observation events should only be created for content changes, not for index changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
