jackrabbit-users mailing list archives

From Alexander Klimetschek <aklim...@day.com>
Subject Re: Faceted Search Implementation
Date Tue, 24 Aug 2010 13:07:01 GMT
On Mon, Aug 23, 2010 at 13:08, Gadbury <gadbury@googlemail.com> wrote:
>
> Hi all,
>
> I am trying to work out a good way to implement faceted search for products
> in an ecommerce solution.  Please consider the following diagram which shows
> the structure of my categories and products:
>
> http://jackrabbit.510166.n4.nabble.com/file/n2334944/category-product_structure.png
>
> I have the following custom node types which implement mix:referenceable so
> they each have a unique UUID:
>
> - Category (e.g. hardware)
> - FacetType (e.g. manufacturer, warranty)
> - FacetValue (e.g. amd, intel, samsung, 1 year, 2 years, 3 years)
>
> A product has a number of properties but of importance here are the
> following properties which are weak references (a String representing the
> UUID(s) ) to the nodes Category and FacetValue:
>
> - categoryUUIDs
> - facetvalueUUIDs
>
> Currently I am tracking the facet type values the user has selected and
> adding them to a query, which retrieves the relevant products.  This works
> although it may be slow with many products!  Here is an example of the XPath
> query:
>
> //element(*,
> jpg:product)[@jpg:categoryUUIDs='d93681a3-8b4e-4c2a-9dcb-a219848f8f3a' and
> ((@jpg:facetvalueUUIDs='70588aa9-6cb1-4ee1-af95-a21f78968e74') and
> (@jpg:facetvalueUUIDs='bec141e8-f4c5-41c5-9cef-560dab296750'))] order by
> @jpg:cost
>
> Once the query is executed, I am iterating over all products, and getting:
>
> each unique facet type UUID and name
> each unique facet value UUID and name
> a count of each occurrence of a facetValueUUID
>
> This data is presented back to the user to offer them a selection of facets
> to filter by.  For example:
>
> Manufacturer:
> amd [2]
> intel [3]
> samsung [5]
>
> Warranty
> 1 year [3]
> 3 years [7]
>
> I know this works but I am sure there must be a more efficient / refined way
> to do this... perhaps I am completely misunderstanding Jackrabbit and how to
> get the most out of Lucene.  Is there another way that I should consider
> doing this?  I would really appreciate any suggestions / improvements.
>
> Thanks for reading and kind regards,

I would not use UUIDs, but rather use the paths of the facets. See
also David's Model, rule #7 [1]. Paths are already unique if you
avoid same-name siblings (SNS), and they work well as long as you
don't expect frequent move or merge operations on the facets (because
you then have to update all the content nodes, which might be ok).

Finally, you can leverage the hierarchy on facets to avoid the
distinction between facet categories and values. For the manufacturer
you'd have these facets:

/facets/manufacturer/amd
/facets/manufacturer/intel
/facets/manufacturer/samsung
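
With paths like these, the facet category is simply the parent path and
the facet value is the last path segment, so the per-facet counts from
the original question can be derived from the paths alone. A minimal
sketch, assuming the facet paths of each product in the result are
already available as a list of strings; the class name FacetCounts and
the method count are hypothetical, not part of Jackrabbit:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not a Jackrabbit API. */
final class FacetCounts {

    /**
     * Counts facet values grouped by their parent path (the facet category).
     * Each inner list stands for the facet paths stored on one product.
     */
    static Map<String, Map<String, Integer>> count(List<List<String>> productFacets) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (List<String> facets : productFacets) {
            for (String path : facets) {
                int slash = path.lastIndexOf('/');
                if (slash <= 0) {
                    continue; // no parent segment, so not a category/value path
                }
                String category = path.substring(0, slash); // e.g. /facets/manufacturer
                String value = path.substring(slash + 1);   // e.g. intel
                counts.computeIfAbsent(category, k -> new TreeMap<>())
                      .merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

This still iterates over every product in the result, so it only removes
the extra lookups of FacetType and FacetValue nodes, not the iteration
itself.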

And on the content (products), you'd only have a multi-value string
property "facets", containing the paths of all facets. You can search
for facet values directly (@facets='/facets/manufacturer/intel') but
using jcr:like you can also search for facet categories:
jcr:like(@facets, '/facets/manufacturer/%').
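
The two kinds of query above could be assembled roughly as follows. This
is only a sketch: the class FacetQueries and its method names are
hypothetical, the node type jpg:product is taken from the original
question, and the property name "facets" is the one suggested above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds XPath query strings for path-based facets. */
final class FacetQueries {

    /** Wraps a value in single quotes, doubling any embedded single quote. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Query for nodes that carry every one of the given facet paths. */
    static String byFacetValues(String nodeType, List<String> facetPaths) {
        if (facetPaths.isEmpty()) {
            throw new IllegalArgumentException("at least one facet path is required");
        }
        List<String> tests = new ArrayList<>();
        for (String path : facetPaths) {
            tests.add("@facets=" + quote(path));
        }
        return "//element(*, " + nodeType + ")[" + String.join(" and ", tests) + "]";
    }

    /**
     * Query for nodes that carry any facet below the given category path.
     * Note: % and _ are wildcards in jcr:like; escaping them inside the
     * category path is not handled in this sketch.
     */
    static String byFacetCategory(String nodeType, String categoryPath) {
        return "//element(*, " + nodeType + ")[jcr:like(@facets, "
                + quote(categoryPath + "/%") + ")]";
    }
}
```

For example, FacetQueries.byFacetCategory("jpg:product",
"/facets/manufacturer") yields
//element(*, jpg:product)[jcr:like(@facets, '/facets/manufacturer/%')].
The resulting string can then be passed to
QueryManager.createQuery(statement, Query.XPATH) as with any other
XPath query.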

[1] http://wiki.apache.org/jackrabbit/DavidsModel

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com
