jackrabbit-users mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: Performance as compared to simple sql db query is quite bad
Date Fri, 01 Feb 2008 09:11:24 GMT

Hello Sateesh,

First of all, I agree with Alessandro that you cannot compare the two the
way you do. Furthermore, you implemented a query involving hierarchical
structures, which are expensive (on the first execution; the second is much
faster). Obviously 58 seconds is not an option, which is why I am still
planning to write an FAQ / best practice on querying. Your SQL query,
'SELECT * FROM ArtifactFriendlyName', took 125 ms against the database; if
your XPath looked like '//*', it would be really fast as well. Iterating
over the result set of 8,000 nodes will still be slow, though, if
'respectDocumentOrder' is set to true (it is the default). Test your query
as '//* order by @someprop-youhave', so that the result does not have to be
returned in document order.

Anyway, read my mail [1] and you'll see exactly the two reasons why your
query plus iteration was really slow. Do realize that this slowness is *not*
necessary: it goes away when the configuration is correct and you are not
measuring the first execution of a hierarchical query (by the way, try that
in an SQL database as well, with hierarchical data in one table; it won't be
really fast either, I think).
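
To illustrate the point about not measuring the first execution, here is a
minimal sketch of a timing harness that reports the first (cold) run
separately from the later (warm) runs. The class and method names are
hypothetical, and the task in main is only a stand-in for "execute the query
and iterate the result"; it is not Jackrabbit code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class ColdWarmTimer {

    /** Runs the task `runs` times and returns the elapsed milliseconds of each run, in order. */
    public static List<Long> timeRuns(Supplier<Integer> task, int runs) {
        List<Long> elapsed = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long begin = System.nanoTime();
            task.get();
            elapsed.add((System.nanoTime() - begin) / 1_000_000);
        }
        return elapsed;
    }

    /** Average over all runs except the first (cold) one. */
    public static double warmAverage(List<Long> elapsed) {
        if (elapsed.size() < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        long sum = 0;
        for (int i = 1; i < elapsed.size(); i++) {
            sum += elapsed.get(i);
        }
        return (double) sum / (elapsed.size() - 1);
    }

    public static void main(String[] args) {
        // Assumption: a stand-in task. In a real test this would run
        // query.execute() and iterate over the returned nodes.
        Supplier<Integer> task = () -> {
            int count = 0;
            for (int i = 0; i < 8000; i++) {
                count++;
            }
            return count;
        };

        List<Long> elapsed = timeRuns(task, 5);
        System.out.println("first (cold) run: " + elapsed.get(0) + " ms");
        System.out.println("warm average:     " + warmAverage(elapsed) + " ms");
    }
}
```

Reporting the two numbers side by side makes it easier to see how much of a
measured time is one-off warm-up cost and how much is the steady-state cost
of the query.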

Anyway, a wiki page will follow; for now, read [1].

[1]
http://mail-archives.apache.org/mod_mbox/jackrabbit-users/200801.mbox/%3cF8E386B54CE3E6408F3A32ABB9A7908A7E69CC@hai02.hippointern.lan%3e
 
> 
> The project I am looking into needs to store friendly ids for
> files, and there could be multiple types of file. I want to
> show a result that lists all the ids. Below are the JR
> structure, the SQL table, and their performance.
> As for configuration, I am using MSSqlPersistenceManager and
> a DataStore for files. The rest of the config is left at the defaults.
> 
> JR structure:
> 
> /content/data/Ids/type1/*
> /content/data/Ids/type2/*
> /content/data/Ids/type3/*
> /content/data/Ids/type4/*
> /content/data/Ids/type5/*
>  
> All the ids are equally distributed under the specific type.
> 
> SQL structure:
> 
> A table with columns: Name, Type
> 
> Performance numbers in ms (items are spread equally among 
> types), when using the below query. As the numbers show, it's 
> pretty bad. Is this expected? any way to better this?
> 
> Items    JR (ms)    SQL (ms)
> 150      551        15
> 1000     2969       78
> 2000     6470       94
> 4000     16816      94
> 8000     58966      125
> 
>     Workspace workSpace = session.getWorkspace();
>     QueryManager queryManager = workSpace.getQueryManager();
> 
>     StringBuffer queryStr = new StringBuffer("//data/componentIds/*/*");
>     Query query = queryManager.createQuery(queryStr.toString(), Query.XPATH);
> 
>     // Time query execution plus iteration over all result nodes.
>     long begin = System.currentTimeMillis();
>     QueryResult queryResult = query.execute();
>     int iSize = 0;
>     NodeIterator queryResultNodeIterator = queryResult.getNodes();
>     while (queryResultNodeIterator.hasNext()) {
>         Node componentIdNode = queryResultNodeIterator.nextNode();
>         iSize++;
>         // System.out.println(componentIdNode.getName());
>     }
>     long end = System.currentTimeMillis();
>     System.out.println("**** time for: " + iSize + " : " + (end - begin));
> 
> For SQL, it's a simple JDBC call:
>     long begin = System.currentTimeMillis();
> 
>     Statement stmt = con.createStatement();
>     ResultSet rs = stmt.executeQuery("SELECT * FROM ArtifactFriendlyName");
>     int iSize = 0;
>     while (rs.next()) {
>         iSize++;
>         String s = rs.getString("FriendlyName");
>     }
> 
>     long end = System.currentTimeMillis();
> 
>     System.out.println("Time taken for: " + iSize + " : " + (end - begin));
> 
> 
> Thanks,
> Sateesh.
> 
