Mailing-List: contact users-help@jackrabbit.apache.org; run by ezmlm
Precedence: bulk
Reply-To: users@jackrabbit.apache.org
Received-SPF: pass (nike.apache.org: domain of lists@nabble.com designates
 216.139.236.158 as permitted sender)
Message-ID: <15224920.post@talk.nabble.com>
Date: Fri, 1 Feb 2008 03:49:52 -0800 (PST)
From: zevon <sateeshl@expedia.com>
To: users@jackrabbit.apache.org
Subject: Re: Performance as compared to simple sql db query is quite bad
In-Reply-To: <15218031.post@talk.nabble.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
References: <15218031.post@talk.nabble.com>


Thanks guys. I do understand that it's not a fair comparision, but the order
of magnitude is what was concerning and given my data would eaily fit into a
sql table, I gave it a shot to see how it fares.

Alessandro, to clear a couple of things, the query in code is correct and
structure I mentioned is incorrect. It should read
/data/componentIds/type*/* instead.

Per the suggestion I tried '//*' and it did dramatically well. For 8000
items, first run took ~4000ms and second run ~80ms (probably everything is
in cache). 

I am afraid I might not be able to use '//*' everytime, the given example
was the simple one, but for each of these type nodes, I would have nodes
which carry other information and one referring the other. Doing a  scan
everytime and filtering out unwanted would be expensive too.

Can I specify the schema for the tables, with the current tables with just 2
columns it seems like it won't scale well(storing hierarchy in 2
columns/flat table)

Ard your documentation will definitely help. Thanks.

Are there any benchmarks that I can look at?

-Sateesh.


zevon wrote:
> 
> The project I am looking into needs to store friendly ids to a file, and
> there could be multiple types of file. I want to show the result which
> lists all the id's. Below is the JR structure and SQL table and their
> performance. As per configuration, I am using MSSqlPersistenceManager and
> using DataStore for files. Otherwise rest of the config is defaults.
> 
> JR structure:
> 
> /content/data/Ids/type1/*
> /content/data/Ids/type2/*
> /content/data/Ids/type3/*
> /content/data/Ids/type4/*
> /content/data/Ids/type5/*
>  
> All the ids are equally distributed under the specific type.
> 
> SQL structure:
> 
> A table with columns: Name, Type
> 
> Performance numbers in ms (items are spread equally among types), when
> using the below query. As the numbers show, it's pretty bad. Is this
> expected? any way to better this?
> 
> Items:   JR        SQL     
> 150       551       15      
> 1000     2969     78      
> 2000     6470     94      
> 4000     16816   94     
> 8000     58966   125  
> 
>                         Workspace workSpace = session.getWorkspace();
> 			QueryManager queryManager = workSpace.getQueryManager();
> 
> 			 StringBuffer queryStr = new
> 			 StringBuffer("//data/componentIds/*/*");
> 			 Query query = queryManager.createQuery(queryStr.toString(),
> 			 Query.XPATH);
> 
> 			Query query = queryManager.createQuery(queryStr.toString(),
> 					Query.XPATH);
> 
> 			long begin = System.currentTimeMillis();
> 			QueryResult queryResult = query.execute();
> 			int iSize = 0;
> 			NodeIterator queryResultNodeIterator = queryResult.getNodes();
> 			while (queryResultNodeIterator.hasNext()) {
> 
> 				Node componentIdNode = queryResultNodeIterator.nextNode();
> 				iSize++;
> 				// System.out.println(componentIdNode.getName());
> 			}
> 			long end = System.currentTimeMillis();
> 			System.out.println("**** time for: " + iSize + " : "
> 					+ (end - begin));
> 
> For SQL, it's a simple JDBC call:
>                 long begin = System.currentTimeMillis();
> 
> 		Statement stmt = con.createStatement();
> 		ResultSet rs = stmt.executeQuery("SELECT * FROM ArtifactFriendlyName");
> 		int iSize = 0;
> 		while (rs.next()) {
> 			iSize++;
> 			String s = rs.getString("FriendlyName");
> 		}
> 
> 		long end = System.currentTimeMillis();
> 
> 		System.out.println("Time taken for: " + iSize + " : " + (end - begin));
> 
> 
> Thanks,
> Sateesh.
> 
> 

-- 
View this message in context: http://www.nabble.com/Performance-as-compared-to-simple-sql-db-query-is-quite-bad-tp15218031p15224920.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.