jackrabbit-dev mailing list archives

From Frédéric Esnault <f...@legisway.com>
Subject RE: atomic vs group node creation/storage
Date Wed, 20 Jun 2007 08:32:24 GMT
Of course, here is the repository config :

//////////////////////////////////////////////////
// START REPOSITORY.XML//
//////////////////////////////////////////////////

<?xml version="1.0"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 1.2//EN"
				"http://jackrabbit.apache.org/dtd/repository-1.2.dtd">

<Repository>
	<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
		<param name="path" value="${rep.home}/repository"/>
	</FileSystem>

	<Security appName="Jackrabbit">

		<!--
			access manager:
			class: FQN of class implementing the AccessManager interface
		-->
		<AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager">
			<!-- <param name="config" value="${rep.home}/access.xml"/> -->
		</AccessManager>

		<LoginModule class="org.apache.jackrabbit.core.security.SimpleLoginModule">
			<!-- anonymous user name ('anonymous' is the default value) -->
			<param name="anonymousId" value="anonymous"/>
			<!--
				default user name to be used instead of the anonymous user
				when no login credentials are provided (unset by default)
			-->
			<!-- <param name="defaultUserId" value="superuser"/> -->
		</LoginModule>

	</Security>

	<!--
		location of workspaces root directory and name of default workspace
	-->
	<Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default"/>

	<!--
		workspace configuration template:
		used to create the initial workspace if there's no workspace yet
	-->
	<Workspace name="${wsp.name}">

		<!--
			virtual file system of the workspace:
			class: FQN of class implementing the FileSystem interface
		-->
		<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
			<param name="path" value="${wsp.home}"/>
		</FileSystem>

		<!--
			persistence manager of the workspace:
			class: FQN of class implementing the PersistenceManager interface
		-->
		<PersistenceManager class="org.apache.jackrabbit.core.persistence.db.SimpleDbPersistenceManager">
			<param name="driver" value="com.mysql.jdbc.Driver"/>
			<param name="url" value="jdbc:mysql:///testJack?autoReconnect=true"/>
			<param name="schema" value="mysql"/>
			<param name="schemaObjectPrefix" value="${wsp.name}_"/>
			<param name="externalBLOBs" value="false"/>
			<param name="user" value="root"/>
			<param name="password" value="password"/>
		</PersistenceManager>

		<!--
			Search index and the file system it uses.
			class: FQN of class implementing the QueryHandler interface
		-->
		<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
			<param name="path" value="${wsp.home}/index"/>
		</SearchIndex>

	</Workspace>

	<!--
		Configures the versioning
	-->
	<Versioning rootPath="${rep.home}/version">

		<!--
			Configures the filesystem to use for versioning for the respective
			persistence manager
		-->
		<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
			<param name="path" value="${rep.home}/version"/>
		</FileSystem>

		<!--
			Configures the persistence manager to be used for persisting version state.
			Please note that the current versioning implementation is based on
			a 'normal' persistence manager, but this could change in future
			implementations.
		-->

		<PersistenceManager class="org.apache.jackrabbit.core.persistence.db.SimpleDbPersistenceManager">
			<param name="driver" value="com.mysql.jdbc.Driver"/>
			<param name="url" value="jdbc:mysql:///testJackVer?autoReconnect=true"/>
			<param name="schema" value="mysql"/>
			<param name="schemaObjectPrefix" value="version_"/>
			<param name="externalBLOBs" value="false"/>
			<param name="user" value="root"/>
			<param name="password" value="password"/>
		</PersistenceManager>
	</Versioning>

	<!--
		Search index for content that is shared repository wide
		(/jcr:system tree, contains mainly versions)
	-->
	<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
		<param name="path" value="${rep.home}/repository/index"/>
	</SearchIndex>

</Repository>

///////////////////////////////////////////////
// END  REPOSITORY.XML//
//////////////////////////////////////////////
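
With this configuration, schemaObjectPrefix is ${wsp.name}_, so the default workspace stores its data in tables prefixed with default_ (the default_node and default_prop tables mentioned in this thread). To compare the two algorithms, the row counts can be read over JDBC after each run. This is only a minimal sketch: it assumes the MySQL driver is on the classpath and that the table names are as quoted in the thread. The class name TableRowCounts and its methods are made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TableRowCounts {

    // Builds the COUNT query for one table. Names are concatenated as-is,
    // so only pass trusted values (here: the prefix from repository.xml).
    static String countQuery(String schemaObjectPrefix, String table) {
        return "SELECT COUNT(*) FROM " + schemaObjectPrefix + table;
    }

    // Runs the COUNT query on an open connection and returns the row count.
    static long countRows(Connection con, String schemaObjectPrefix, String table)
            throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(countQuery(schemaObjectPrefix, table))) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        String prefix = "default_";            // ${wsp.name}_ for the "default" workspace
        String[] tables = {"node", "prop"};    // table names as quoted in the thread

        if (args.length < 3) {
            // No connection details given: just show the queries that would run.
            for (String table : tables) {
                System.out.println(countQuery(prefix, table));
            }
            return;
        }

        // args: JDBC URL, user, password (for example the values from repository.xml)
        try (Connection con = DriverManager.getConnection(args[0], args[1], args[2])) {
            for (String table : tables) {
                System.out.println(prefix + table + ": " + countRows(con, prefix, table) + " rows");
            }
        }
    }
}
```

Running it once after each algorithm, against a fresh repository each time, gives comparable numbers for the two approaches.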

And here is the code doing the creation; I give you both algorithm implementations:


/////////////////////////////////////////////////////////////////
// FIRST ALGORITHM : all nodes, then a single save //
////////////////////////////////////////////////////////////////

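// Parent node under which all contractors are created
Node contractors = (Node) session.getItem("/lgw:root/lgw:contractors");
int count = number_of_nodes; // placeholder: the number of nodes to create
for (int i = 0; i < count; i++) {
	Node contractor = contractors.addNode("lgw:contractor");
	initializeContractor(session, contractor); // fills properties/child nodes
	created++; // counter assumed to be declared outside this snippet
}
session.save(); // single save, once all nodes have been added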

////////////////////////////////////////////////
// END FIRST ALGORITHM //
////////////////////////////////////////////////

/////////////////////////////////////////////////////////////////////
// SECOND ALGORITHM : node by node, one save per node //
/////////////////////////////////////////////////////////////////////

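// Parent node under which all contractors are created
Node contractors = (Node) session.getItem("/lgw:root/lgw:contractors");
int count = number_of_nodes; // placeholder: the number of nodes to create
for (int i = 0; i < count; i++) {
	Node contractor = contractors.addNode("lgw:contractor");
	initializeContractor(session, contractor); // fills properties/child nodes
	created++; // counter assumed to be declared outside this snippet
	session.save(); // save after every single node
}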

/////////////////////////////////////////////////////
// END SECOND ALGORITHM //
////////////////////////////////////////////////////
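
The two loops differ only in where session.save() is called, so they can be seen as one loop with a save interval: an interval of 1 is the node-by-node variant, and an interval equal to the node count is the single-save variant. The sketch below makes that explicit so that intermediate intervals can be tried as well. It is an illustration only: NodeStore is a hypothetical stand-in for the JCR calls above (so the code runs without a repository), and CountingStore merely counts calls.

```java
public class SaveIntervalSketch {

    /** Hypothetical stand-in for the JCR calls used in the snippets above. */
    interface NodeStore {
        void addNode(); // stands in for contractors.addNode(...) + initializeContractor(...)
        void save();    // stands in for session.save()
    }

    /**
     * Adds count nodes and saves after every saveEvery nodes, with a final
     * save for any remainder. Returns the number of save() calls made.
     */
    static int createNodes(NodeStore store, int count, int saveEvery) {
        if (saveEvery < 1) {
            throw new IllegalArgumentException("saveEvery must be >= 1");
        }
        int saves = 0;
        int pending = 0; // nodes added since the last save
        for (int i = 0; i < count; i++) {
            store.addNode();
            pending++;
            if (pending == saveEvery) {
                store.save();
                saves++;
                pending = 0;
            }
        }
        if (pending > 0) { // save whatever is left over
            store.save();
            saves++;
        }
        return saves;
    }

    /** In-memory stub that only counts calls; it stores nothing. */
    static final class CountingStore implements NodeStore {
        int added;
        int saved;
        public void addNode() { added++; }
        public void save() { saved++; }
    }

    public static void main(String[] args) {
        int count = 1000;
        for (int saveEvery : new int[] {1, 100, count}) {
            CountingStore store = new CountingStore();
            int saves = createNodes(store, count, saveEvery);
            System.out.println("saveEvery=" + saveEvery + " -> " + saves + " save() calls");
        }
    }
}
```

With a real repository, addNode() and save() would delegate to the Node and Session calls shown above, and the table sizes could be compared for each interval.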



Frédéric Esnault - R&D Engineer


-----Original Message-----
From: Thomas Mueller [mailto:thomas.tom.mueller@gmail.com]
Sent: Wednesday, 20 June 2007 09:51
To: dev@jackrabbit.apache.org
Subject: Re: atomic vs group node creation/storage

Hi,

Could you send the configuration (repository.xml file), and the code
if possible (so I don't have to write it again)? Just recently I
thought I saw a similar problem, but I am not sure if it's related.

Thanks,
Thomas


On 6/20/07, Frédéric Esnault <fesn@legisway.com> wrote:
> Hello there !
>
>
>
> It seems to me that there is a storage problem, when you create a lot of nodes, one by one, using this algorithm :
>
> 1.      for each node to create
>
>         a.      create node
>         b.      fill node properties/child nodes
>         c.      save session
>
> 2.      end for
>
>
>
> The number of rows (and the size) of the default_node and default_prop tables increases very fast, to an unacceptable degree.
>
> I had 35 million rows in the default_node table after inserting 27 000 nodes into a repository this way.
>
>
>
> Then I used the other algorithm :
>
> 1.      for each node to create
>
>         a.      create node
>         b.      fill node properties/child nodes
>
> 2.      end for
> 3.      save session
>
>
>
> And this gives a much better situation (currently I have a content repository of 36 000 items, and my tables look correct: 60 000 rows for the node table, 576 000 rows for properties).
>
>
>
> The problem here is that in a production environment, users are going to create their nodes one by one, day after day, never in full blocks.
>
> So is there a storage problem ?
>
>
>
> Frederic Esnault
>
>
