Subject: RE: Help with performace again
Date: Fri, 3 Apr 2009 11:07:16 +0100
Message-ID: <466EDF15B3EE964CB2E59849400E40E10FB1EA4A@LNGWOKEXCP01VA.legal.regn.net>
In-Reply-To: <22865321.post@talk.nabble.com>
From: "Connor, Brett (LNG-TWY)"

What Jackrabbit version are you using? One possibility is the
"respectDocumentOrder" SearchIndex parameter. It currently defaults to
false, but in earlier Jackrabbit versions it defaulted to true. Setting
it to false in the SearchIndex section of your configuration might help.

> -----Original Message-----
> From: daveg0 [mailto:bagel10002000@googlemail.com]
> Sent: 03 April 2009 10:57
> To: users@jackrabbit.apache.org
> Subject: Help with performace again
>
> Hi,
>
> I am getting some really poor search performance out of our
> repository and need some help to try to understand what I am doing wrong.
>
> I will try to give as much detail as possible:
>
> 1) We are trying to implement an Atom repository (see the CND file):
> http://www.nabble.com/file/p22865321/atom_node_types.cnd
>
> 2) Search performance is VERY slow using the test below. It
> appears that the problem is not the searches themselves but
> the loading of the nodes in the returned NodeIterator, which is
> extremely slow. Does this not load lazily?
>
> @Test
> public final void testSearchPerformance() throws Exception {
>     // Test jcr:like
>     SpringJCRNodeDAO nodeQuery =
>         (SpringJCRNodeDAO) daoFactory.getDAO(Node.class);
>     StopWatch watch = new StopWatch();
>     String pathPrefix =
>         "//portal/portal/test-collection//element(*,atom:Entry)";
>     String searchPrefix =
>         "//element(*,atom:Service)[@atom:title='portal']"
>         + "/element(*,atom:Workspace)[@atom:title='portal']"
>         + "/element(*,atom:Collection)[atom:titletext='test-collection']"
>         + "//element(*,atom:Entry)";
>
>     String rootPrefix = "//element(*,atom:Entry)";
>
>     Limits limits = new Limits();
>     limits.setMaxRows(10000);
>     String query = null;
>     QueryResultWrapper<List<Node>> results = null;
>
>     query = pathPrefix +
>         "[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]";
>     watch.start();
>     results = nodeQuery.get(query, limits.getMaxRows(),
>         limits.getOffset());
>     displayTime(query, watch);
>     displayResults(query, results);
> }
>
> results:
>
> 09:43:06,257 [main] INFO SearchPerformanceTest : Query:
> //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> time: 4598
> 09:43:06,257 [main] INFO SearchPerformanceTest : Results for:
> //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> size: 1110
> 09:43:07,972 [main] INFO SearchPerformanceTest : Query:
> //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]
> time: 1715
> 09:43:07,972 [main] INFO SearchPerformanceTest : Results for:
> //portal/portal/test-collection//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]
> size: 1110
> 09:43:09,639 [main] INFO SearchPerformanceTest : Query:
> //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> time: 1667
> 09:43:09,639 [main] INFO SearchPerformanceTest : Results for:
> //portal/portal/test-collection//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> size: 1110
> 09:43:11,260 [main] INFO SearchPerformanceTest : Query:
> //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]
> time: 1605
> 09:43:11,260 [main] INFO SearchPerformanceTest : Results for:
> //element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]
> size: 1110
> 09:43:12,881 [main] INFO SearchPerformanceTest : Query:
> //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> time: 1621
> 09:43:12,881 [main] INFO SearchPerformanceTest : Results for:
> //element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> size: 1110
> 09:43:14,518 [main] INFO SearchPerformanceTest : Query:
> //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]
> time: 1637
> 09:43:14,518 [main] INFO SearchPerformanceTest : Results for:
> //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:contains(@atom:titletext,'title_1*')]
> size: 1110
> 09:43:16,186 [main] INFO SearchPerformanceTest : Query:
> //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> time: 1668
> 09:43:16,186 [main] INFO SearchPerformanceTest : Results for:
> //element(*,atom:Service)[@atom:title='portal']/element(*,atom:Workspace)[@atom:title='portal']/element(*,atom:Collection)[atom:titletext='test-collection']//element(*,atom:Entry)[jcr:like(fn:lower-case(@atom:titletext),'title_1%')]
> size: 1110
>
> The node query
> does something like:
>
> public QueryResultWrapper<List<Node>> get(String queryString,
>         long limit, long offset, String userId) throws DAOException {
>
>     // check user privs code etc removed
>     try {
>         QueryManager queryManager =
>             session.getWorkspace().getQueryManager();
>
>         // This code is tied to Jackrabbit as it allows limits
>         // and offsets on the queries. The commented-out line is JCR
>         // implementation agnostic
>         Query query = queryManager.createQuery(queryString,
>             Query.XPATH);
>         // QueryResult queryresult = query.execute();
>         // ========= Jackrabbit-specific code
>         QueryImpl jackRabbitQuery = (QueryImpl) query;
>         jackRabbitQuery.setLimit(limit);
>         jackRabbitQuery.setOffset(offset);
>         jackrabbitQueryResult = (QueryResultImpl)
>             jackRabbitQuery.execute();
>         // QueryResult queryresult = jackRabbitQuery.execute();
>         // ===== End of Jackrabbit-specific code
>
>         // NodeIterator nodes = queryresult.getNodes();
>         NodeIterator nodes = jackrabbitQueryResult.getNodes();
>         while (nodes.hasNext()) {
>             returnList.add(nodes.nextNode());
>         }
>     } catch (Exception e) {
>         LOG.error(e.getMessage(), e);
>         throw new DAOException(e.getMessage());
>     }
> }
>
> The equivalent Lucene search through all of the index
> subdirectories in workspaces/default/index is:
>
> @Test
> public void testTitle() throws Exception {
>     File[] indexes = new File(indexDir).listFiles(new FileFilter() {
>         @Override
>         public boolean accept(File f) {
>             return f.isDirectory();
>         }
>     });
>
>     StopWatch watch = new StopWatch();
>     for (File file : indexes) {
>         Directory directory = FSDirectory.getDirectory(file);
>         IndexSearcher searcher = new IndexSearcher(directory);
>         try {
>             System.out.println("Directory: " + directory);
>             Term t = new Term("12:FULL:titletext", "*");
>             Query query = new WildcardQuery(t);
>             watch.start();
>             Hits hits = searcher.search(query);
>             //showDocs(hits);
>             watch.stop();
>             System.out.println("Hits: " + hits.length() +
>                 " query: " + query + " time: " + watch.getTime());
>             watch.reset();
>         } finally {
>             searcher.close();
>             directory.close();
>         }
>     }
> }
>
> private void showDocs(Hits hits) throws
>         CorruptIndexException, IOException {
>     Document doc;
>     for (int i = 0; i < hits.length(); i++) {
>         doc = hits.doc(i);
>         System.out.println("doc: " +
>             doc.getField("_:UUID").stringValue());
>     }
> }
>
> This returns very quickly:
>
> Directory:
> org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_5
> Hits: 601 query: 12:FULL:titletext:* time: 47
> Directory:
> org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_b
> Hits: 699 query: 12:FULL:titletext:* time: 16
> Directory:
> org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_h
> Hits: 811 query: 12:FULL:titletext:* time: 47
> Directory:
> org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_y
> Hits: 199 query: 12:FULL:titletext:* time: 15
> Directory:
> org.apache.lucene.store.FSDirectory@C:\repository\workspaces\default\index\_z
> Hits: 1000 query: 12:FULL:titletext:* time: 31
>
> I have inserted some debug statements into the code and it
> appears that the NodeIterator is the problem: the first call
> to NodeIterator.hasNext() seems to take seconds with only
> 1000 nodes. Is there something that can be done about this, as
> the searches are quick but the loading of the nodes is VERY slow?
>
> regards,
>
> Dave
>
> --
> View this message in context:
> http://www.nabble.com/Help-with-performace-again-tp22865321p22865321.html
> Sent from the Jackrabbit - Users mailing list archive at Nabble.com.

LexisNexis is a trading name of REED ELSEVIER (UK) LIMITED - Registered office - 1-3 STRAND, LONDON WC2N 5JR
Registered in England - Company No. 02746621
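
P.S. For reference, a rough sketch of where the respectDocumentOrder
setting mentioned at the top of this reply would go. It assumes the stock
Lucene-based SearchIndex class and the usual index path; both values are
assumptions here, so keep whatever your existing configuration already
uses and only add or change the respectDocumentOrder line.

```xml
<!-- Sketch only: the class and path values are assumed defaults, not taken
     from the original poster's configuration. -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <!-- Skip sorting results into document order when no "order by" is given -->
  <param name="respectDocumentOrder" value="false"/>
</SearchIndex>
```

The element normally lives in each existing workspace's workspace.xml;
the Workspace template in repository.xml only affects workspaces created
afterwards. A repository restart is likely needed for the change to be
picked up.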