jackrabbit-users mailing list archives

From msl...@email.cz
Subject Re: Re: XPath query performance question
Date Fri, 03 Feb 2012 15:24:33 GMT

No, the property 'calais' is not used anywhere else. So if I run the query without the path
restriction, it returns the same result.
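
For reference, the two query forms being compared can be sketched as below. This is only an illustration: the class and method names are made up, and whether the path-free form is actually faster in Jackrabbit 2.2.9 would have to be measured. The only Jackrabbit-related assumption is that an apostrophe inside a JCR XPath string literal is escaped by doubling it.

```java
// Sketch only: CalaisQuery and its methods are hypothetical helper names,
// not part of Jackrabbit or the JCR API.
final class CalaisQuery {

    // Wrap a value as an XPath string literal, doubling any apostrophes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // The form from the original question: predicate plus path restriction.
    static String withPath(String calais) {
        return "//companies/company[@calais=" + quote(calais) + "]";
    }

    // Predicate only. Assumed to return the same node here because
    // 'calais' is not set on any node outside the companies tree.
    static String withoutPath(String calais) {
        return "//*[@calais=" + quote(calais) + "]";
    }

    public static void main(String[] args) {
        String id = "http://d.opencalais.com/er/company/ralg-tr1r/"
                + "2c970a55-e08d-3af8-ad1d-3c46f341e749";
        // Either statement would then be handed to
        // QueryManager.createQuery(statement, Query.XPATH).
        System.out.println(withPath(id));
        System.out.println(withoutPath(id));
    }
}
```

Timing both statements with the same single NodeIterator.next call would show whether the path filter contributes to the 100-200 ms.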

Marek

> ------------ Original message ------------
> From: Alessandro <alessandro.bologna@gmail.com>
> Subject: Re: XPath query performance question
> Date: 03.2.2012 16:12:15
> ----------------------------------------
> If you were running the query without path restrictions, would it return more
> than one node? In other words, outside the /companies tree, are there other
> company nodes with the same calais attribute value?
> Results are generated from the predicate, and then filtered by the path.
> 
> Alessandro 
> 
> On Feb 3, 2012, at 7:13 AM, mslama@email.cz wrote:
> 
> > Hi,
> > 
> > I have the following use case:
> > 
> > I have about 2000 company nodes under the node companies:
> > /companies/company[1]
> > /companies/company[2]
> > ....
> > /companies/company[N]
> > 
> > I query for one company by property value (exact match, no wildcards), and the
> > result should contain just one node. For example, I use the query:
> > 
> > //companies/company[@calais='http://d.opencalais.com/er/company/ralg-tr1r/2c970a55-e08d-3af8-ad1d-3c46f341e749']
> > 
> > and then one call of NodeIterator.next to get the unique result (or the first one, as
> > there is no uniqueness constraint). So there is no big result set.
> > 
> > The property 'calais' is of type string and, when set, it is unique; a small number of
> > company nodes may have this property either empty or missing. The property value can
> > be up to 100 characters long, in case that makes any difference for the index.
> > 
> > When only one thread is running, a query takes 100-200 ms. When 4 threads are running,
> > it takes about 500 ms on average. I used a profiler with sampling to get some
> > profiling data. This seems too slow, given that the number of nodes is not that high
> > and the query uses the Lucene index. The calls to query.execute and
> > nodeIterator.next each take about the same time.
> > When I checked thread dumps, the query was using the Lucene index, so it does not look
> > like it scans all nodes.
> > 
> > Question: Is there any way to speed up this kind of lookup? The only way I have
> > found so far is to incorporate the property most often used for lookup into the node
> > path, as session.getNode(path) is much faster.
> > 
> > I use Jackrabbit 2.2.9, with Postgres 9.1 storing all data except the Lucene index.
> > It runs on JBoss 7.
> > 
> > I searched for Jackrabbit XPath performance but found no match for my use case:
> > a) exact property match without like/wildcards
> > b) small result set - just one result item
> > 
> > Thanks
> > 
> > Marek
> 
> 
> 

Marek Slama
mslama@email.cz
