Return-Path: Delivered-To: apmail-jackrabbit-users-archive@locus.apache.org Received: (qmail 81533 invoked from network); 1 Feb 2008 11:50:24 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 1 Feb 2008 11:50:24 -0000 Received: (qmail 88070 invoked by uid 500); 1 Feb 2008 11:50:14 -0000 Delivered-To: apmail-jackrabbit-users-archive@jackrabbit.apache.org Received: (qmail 88050 invoked by uid 500); 1 Feb 2008 11:50:14 -0000 Mailing-List: contact users-help@jackrabbit.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: users@jackrabbit.apache.org Delivered-To: mailing list users@jackrabbit.apache.org Received: (qmail 88041 invoked by uid 99); 1 Feb 2008 11:50:14 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 Feb 2008 03:50:14 -0800 X-ASF-Spam-Status: No, hits=2.6 required=10.0 tests=DNS_FROM_OPENWHOIS,SPF_HELO_PASS,SPF_PASS,WHOIS_MYPRIVREG X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of lists@nabble.com designates 216.139.236.158 as permitted sender) Received: from [216.139.236.158] (HELO kuber.nabble.com) (216.139.236.158) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 Feb 2008 11:49:59 +0000 Received: from isper.nabble.com ([192.168.236.156]) by kuber.nabble.com with esmtp (Exim 4.63) (envelope-from ) id 1JKuP2-0005xC-63 for users@jackrabbit.apache.org; Fri, 01 Feb 2008 03:49:52 -0800 Message-ID: <15224920.post@talk.nabble.com> Date: Fri, 1 Feb 2008 03:49:52 -0800 (PST) From: zevon To: users@jackrabbit.apache.org Subject: Re: Performance as compared to simple sql db query is quite bad In-Reply-To: <15218031.post@talk.nabble.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Nabble-From: sateeshl@expedia.com References: <15218031.post@talk.nabble.com> X-Virus-Checked: Checked by ClamAV on apache.org Thanks guys. I do understand that it's not a fair comparision, but the order of magnitude is what was concerning and given my data would eaily fit into a sql table, I gave it a shot to see how it fares. Alessandro, to clear a couple of things, the query in code is correct and structure I mentioned is incorrect. It should read /data/componentIds/type*/* instead. Per the suggestion I tried '//*' and it did dramatically well. For 8000 items, first run took ~4000ms and second run ~80ms (probably everything is in cache). I am afraid I might not be able to use '//*' everytime, the given example was the simple one, but for each of these type nodes, I would have nodes which carry other information and one referring the other. Doing a scan everytime and filtering out unwanted would be expensive too. Can I specify the schema for the tables, with the current tables with just 2 columns it seems like it won't scale well(storing hierarchy in 2 columns/flat table) Ard your documentation will definitely help. Thanks. Are there any benchmarks that I can look at? -Sateesh. zevon wrote: > > The project I am looking into needs to store friendly ids to a file, and > there could be multiple types of file. I want to show the result which > lists all the id's. Below is the JR structure and SQL table and their > performance. As per configuration, I am using MSSqlPersistenceManager and > using DataStore for files. Otherwise rest of the config is defaults. > > JR structure: > > /content/data/Ids/type1/* > /content/data/Ids/type2/* > /content/data/Ids/type3/* > /content/data/Ids/type4/* > /content/data/Ids/type5/* > > All the ids are equally distributed under the specific type. > > SQL structure: > > A table with columns: Name, Type > > Performance numbers in ms (items are spread equally among types), when > using the below query. As the numbers show, it's pretty bad. Is this > expected? any way to better this? > > Items: JR SQL > 150 551 15 > 1000 2969 78 > 2000 6470 94 > 4000 16816 94 > 8000 58966 125 > > Workspace workSpace = session.getWorkspace(); > QueryManager queryManager = workSpace.getQueryManager(); > > StringBuffer queryStr = new > StringBuffer("//data/componentIds/*/*"); > Query query = queryManager.createQuery(queryStr.toString(), > Query.XPATH); > > Query query = queryManager.createQuery(queryStr.toString(), > Query.XPATH); > > long begin = System.currentTimeMillis(); > QueryResult queryResult = query.execute(); > int iSize = 0; > NodeIterator queryResultNodeIterator = queryResult.getNodes(); > while (queryResultNodeIterator.hasNext()) { > > Node componentIdNode = queryResultNodeIterator.nextNode(); > iSize++; > // System.out.println(componentIdNode.getName()); > } > long end = System.currentTimeMillis(); > System.out.println("**** time for: " + iSize + " : " > + (end - begin)); > > For SQL, it's a simple JDBC call: > long begin = System.currentTimeMillis(); > > Statement stmt = con.createStatement(); > ResultSet rs = stmt.executeQuery("SELECT * FROM ArtifactFriendlyName"); > int iSize = 0; > while (rs.next()) { > iSize++; > String s = rs.getString("FriendlyName"); > } > > long end = System.currentTimeMillis(); > > System.out.println("Time taken for: " + iSize + " : " + (end - begin)); > > > Thanks, > Sateesh. > > -- View this message in context: http://www.nabble.com/Performance-as-compared-to-simple-sql-db-query-is-quite-bad-tp15218031p15224920.html Sent from the Jackrabbit - Users mailing list archive at Nabble.com.