jackrabbit-users mailing list archives

From Gadbury <gadb...@googlemail.com>
Subject XPath Query Performance with Path Constraints - Index Aggregate?
Date Thu, 09 Sep 2010 14:04:01 GMT

Hi all,

I've noticed (and read) that XPath queries which supply a path constraint
are slow due to the extra lookup required to account for the hierarchical
structure.

I have Order nodes which represent customer orders.
I have OrderElement nodes which represent each item (product) and quantity
that makes up the Order.  

Here is an excerpt from my custom node definition file:

    [jpg:order] > nt:hierarchyNode, mix:referenceable
    - jpg:transactionUUID (STRING)                    
    - jpg:customerAccountUUID (STRING) mandatory    
    - jpg:orderStatus (LONG) mandatory                
    - jpg:cost (DOUBLE) mandatory
    + orderElements (jpg:default)                  
    = jpg:orderElement primary copy

    [jpg:orderElement] > jpg:default
    - jpg:productUUID (STRING) mandatory    
    - jpg:quantity (LONG) mandatory
    - jpg:cost (DOUBLE) mandatory
    - jpg:orderUUID (STRING) mandatory        
    - jpg:isRefunded (BOOLEAN) = 'false'
    - jpg:refundedDate (DATE) 

The image below shows the structure:

http://jackrabbit.510166.n4.nabble.com/file/n2532925/shop_orders_structure.png 

I need to get the Orders for a particular customer where the Order contains
an OrderElement for a particular product.  Here is my XPath query:

    //element(*, jpg:order)[@jpg:customerAccountUUID='aaa1bbab-ee77-48a5-8e83-214689f73deb']/orderElements/element(*, jpg:orderElement)[@jpg:productUUID = '23b5b2d6-1b63-42c9-8398-8cac8c296267']

For customers with many orders, the query is really slow (a few seconds).  I
have 5000 Order nodes, each of which has 1 to 6 OrderElement nodes.  So I
could have approximately 20,000 customer-order-related nodes.

Is there any way at all to improve the performance of a query that supplies a path?

I cannot see another way to do this unless I include the customerAccountUUID
property in OrderElement nodes, then I could use the query:

    //element(*, jpg:orderElement)[@jpg:customerAccountUUID='aaa1bbab-ee77-48a5-8e83-214689f73deb' and @jpg:productUUID = '23b5b2d6-1b63-42c9-8398-8cac8c296267']

... which I assume would be faster as I am not supplying path constraints.
It seems strange to me that queries with path constraints are slow when one
of the main points of JCR is to offer a hierarchical structure.
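
For comparison, the two query shapes discussed above can be built as plain strings. This is only a minimal sketch: the class and method names (`OrderQueries`, `byPath`, `flat`, `literal`) are hypothetical, and it assumes an apostrophe inside an XPath string literal is escaped by doubling it. The flat form also assumes jpg:customerAccountUUID has been copied onto every jpg:orderElement, which the node type definition above does not yet include.

```java
// Hypothetical helper, not part of Jackrabbit or the JCR API.
final class OrderQueries {
    private OrderQueries() {}

    // Quote a value as an XPath string literal.
    // Assumption: an apostrophe inside the literal is escaped by doubling it.
    static String literal(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Path-constrained form: match the customer's orders, then step down
    // through the orderElements child to the matching order elements.
    static String byPath(String customerUuid, String productUuid) {
        return "//element(*, jpg:order)[@jpg:customerAccountUUID = "
                + literal(customerUuid) + "]"
                + "/orderElements/element(*, jpg:orderElement)[@jpg:productUUID = "
                + literal(productUuid) + "]";
    }

    // Flat form: both constraints sit on the order element itself.
    // Assumption: jpg:customerAccountUUID is duplicated on jpg:orderElement.
    static String flat(String customerUuid, String productUuid) {
        return "//element(*, jpg:orderElement)[@jpg:customerAccountUUID = "
                + literal(customerUuid)
                + " and @jpg:productUUID = " + literal(productUuid) + "]";
    }

    public static void main(String[] args) {
        String customer = "aaa1bbab-ee77-48a5-8e83-214689f73deb";
        String product = "23b5b2d6-1b63-42c9-8398-8cac8c296267";
        System.out.println(byPath(customer, product));
        System.out.println(flat(customer, product));
    }
}
```

Either string would then be handed to the JCR `QueryManager.createQuery(statement, Query.XPATH)` call, which is left out here so the sketch runs without a repository. Note that the flat form returns OrderElement nodes, so the parent Order still has to be resolved afterwards, for example through jpg:orderUUID.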

I read about Index Aggregates - would these help?  It sounds like they might
- the description: "Sometimes it is useful to include the contents of
descendant nodes into a single node to easier search on content that is
scattered across multiple nodes."

-- 
View this message in context: http://jackrabbit.510166.n4.nabble.com/XPath-Query-Performance-with-Path-Constraints-Index-Aggregate-tp2532925p2532925.html