jackrabbit-users mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: Faceted Search Implementation
Date Wed, 25 Aug 2010 06:55:24 GMT
Hello James et al,

I'll start with a disclaimer: sorry for the blunt sales pitch that
follows (albeit for the Apache-licensed Hippo Repository, which sits on
top of Jackrabbit), but:

*You'd like to expose your node structure over Faceted Navigation*

This is one of the virtual layers Hippo Repository provides: We expose
faceted navigation over JCR. You can configure any node of type
'hippofacnav:facetnavigation' and configure some properties [1] on it,
that characterise the faceted navigation (like which facets to
expose).

You can check out the Hippo demo here [2], showing the CMS, repository
and website, where a lot is done with faceted navigation. There is an
online version, and a download if you want (for Linux we still need to
add an installer, but you can also check out the source and build it).

Also note that the faceted navigation is exposed with an authorization
filter applied: the facet counts we expose are correct for what the
current user is authorized to read, and it is all blisteringly fast
because it is all done in Lucene.
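
As a rough illustration of what "authorized correct counts" means, here is a minimal in-memory sketch. It is not how Hippo Repository implements it (that happens inside Lucene, as described above), and the class name, method name and data shapes are all hypothetical: documents are reduced to a map from document id to a single facet value, and authorization to a set of readable ids.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not Hippo Repository or Jackrabbit API.
class AuthorizedFacetCounts {

    // facetByDocId: document id -> facet value (e.g. the manufacturer).
    // readableIds:  ids of the documents the current user may read.
    // Returns the facet counts over the readable documents only.
    static Map<String, Integer> count(Map<String, String> facetByDocId,
                                      Set<String> readableIds) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, String> entry : facetByDocId.entrySet()) {
            // Skip documents the user is not authorized to see, so the
            // count shown next to a facet never leaks hidden documents.
            if (readableIds.contains(entry.getKey())) {
                counts.merge(entry.getValue(), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

The point of the sketch is only that filtering happens before counting: a user who cannot read a document should not see it reflected in a facet count either.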

Currently, we are finalizing a feature where, next to the 'fixed'
faceted navigation configuration, you can inject xpath/sql/sql2 queries
to 'narrow' down the faceted navigation.

After this, I will probably start before too long on configuration
options to include 'lookups': in other words, the facet of a document
is a reference path to some other node containing, say, the name of the
author, and we want to show the name of the author as the facet. This
will be added by 'lookup' methods.
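
To make the 'lookup' idea concrete, here is a small sketch under stated assumptions. It is not the planned Hippo feature or any existing API: the class, the method and the path-to-name table are hypothetical stand-ins for resolving a reference path to the referenced node's name.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Hippo Repository or Jackrabbit API.
class LookupFacetCounts {

    // facetPaths:   one entry per document, the reference path stored as its facet.
    // displayNames: assumed lookup table from reference path to the name held
    //               by the referenced node (e.g. the author's name).
    // Returns counts keyed by display name instead of by raw path.
    static Map<String, Integer> countByDisplayName(List<String> facetPaths,
                                                   Map<String, String> displayNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String path : facetPaths) {
            // Fall back to the raw path when the referenced node is unknown.
            String label = displayNames.getOrDefault(path, path);
            counts.merge(label, 1, Integer::sum);
        }
        return counts;
    }
}
```

Two documents pointing at the same author node then end up under one facet entry labelled with the author's name, rather than under an opaque path.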

Finally, to stop bragging too much about Hippo Repository: I know Jahia
implemented a completely different way of exposing faceted navigation
on top of Jackrabbit. They do not expose it over JCR, but do it more on
the client side, using a SolrSearchIndex wrapper kind of thing in
Jackrabbit.

So, yes, Solr can do faceted navigation, but Solr is more about flat
structures and does not care too much about authorisation. Also, it is
yet another separate dependency. It is like saying Jackrabbit does not
need internal caches because caching can be done perfectly well by some
cache framework like JCS, whirlycache, ehcache, infinispan, etc. It does
not always make sense: I think Solr has a very different purpose, but
people tend to always think about Solr when faceted navigation is
brought up.

Also, we at Hippo should make time to see what we can move into
Jackrabbit, and what people would like to see moved.

Regards Ard

[1] https://wiki.onehippo.com/display/CMS7/Faceted+Navigation+Configuration
[2] http://www.onehippo.com/en/products/cms/try

On Tue, Aug 24, 2010 at 7:26 PM, Korbinian Bachl - privat
<korbinian.bachl@whiskyworld.de> wrote:
> Hi James,
>
> IMHO this construct is not what Jackrabbit should do or can do well. You can
> save the data that way or others, however to query it later (usually for
> display on a website etc.) you might want to have a look at SOLR
> (http://lucene.apache.org/solr/). An example of faceting is explained at
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr
>
> Keep in mind that faceting isn't only the type/product path - it's the n
> relevant feature/type paths on a possible subset of the original set.
> Usually faceting is therefore unique per page visit, meaning you will have
> many concurrent facets from different sources on different (sub)sets. This
> could be done in Jackrabbit, but IMHO won't scale nearly as well as a
> specialized search solution like SOLR, where it won't matter if you have 10
> products or 10 million products.
>
> Best,
>
> Korbinian
>
> PS: SOLR is Lucene, too :)
>
> On 23.08.10 13:08, Gadbury wrote:
>>
>> Hi all,
>>
>> I am trying to work out a good way to implement faceted search for products
>> in an ecommerce solution.  Please consider the following diagram, which shows
>> the structure of my categories and products:
>>
>>
>> http://jackrabbit.510166.n4.nabble.com/file/n2334944/category-product_structure.png
>>
>> I have the following custom node types which implement mix:referenceable so
>> they each have a unique UUID:
>>
>> - Category (e.g. hardware)
>> - FacetType (e.g. manufacturer, warranty)
>> - FacetValue (e.g. amd, intel, samsung, 1 year, 2 years, 3 years)
>>
>> A product has a number of properties but of importance here are the
>> following properties which are weak references (a String representing the
>> UUID(s) ) to the nodes Category and FacetValue:
>>
>> - categoryUUIDs
>> - facetvalueUUIDs
>>
>> Currently I am tracking the facet type values the user has selected and
>> adding them to a query, which retrieves the relevant products.  This works,
>> although it may be slow with many products!  Here is an example of the XPath
>> query:
>>
>> //element(*,
>> jpg:product)[@jpg:categoryUUIDs='d93681a3-8b4e-4c2a-9dcb-a219848f8f3a' and
>> ((@jpg:facetvalueUUIDs='70588aa9-6cb1-4ee1-af95-a21f78968e74') and
>> (@jpg:facetvalueUUIDs='bec141e8-f4c5-41c5-9cef-560dab296750'))] order by
>> @jpg:cost
>>
>> Once the query is executed, I am iterating over all products, and getting:
>>
>> each unique facet type UUID and name
>> each unique facet value UUID and name
>> a count of each occurrence of a facetValueUUID
>>
>> This data is presented back to the user to offer them a selection of facets
>> to filter by.  For example:
>>
>> Manufacturer:
>> amd [2]
>> intel [3]
>> samsung [5]
>>
>> Warranty
>> 1 year [3]
>> 3 years [7]
>>
>> I know this works, but I am sure there must be a more efficient / refined way
>> to do this... perhaps I am completely misunderstanding Jackrabbit and how to
>> get the most out of Lucene.  Is there another way that I should consider
>> doing this?  I would really appreciate any suggestions / improvements.
>>
>> Thanks for reading and kind regards,
>>
>> James
>
>
