lucene-lucene-net-user mailing list archives

From André Maldonado <andre.maldon...@gmail.com>
Subject Re: Category count.
Date Mon, 09 Nov 2009 14:55:29 GMT
Solr has an XML API, correct? So it can be used with .NET.

Or am I wrong?

Thanks

"Then those who were in the boat came near and worshiped him, saying: Truly
you are the Son of God." (Matthew 14:33)
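
On the XML API question: Solr answers plain HTTP requests, so any client that can issue a GET and parse XML, including a .NET one, can ask it for facet counts. The following is only a minimal sketch of building such a request URL. The base URL (8983 is Solr's default port), the search term and the `category` field name are assumptions for illustration; `q`, `facet`, `facet.field`, `rows` and `wt` are standard Solr request parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds a Solr select URL that asks for per-value counts on one field. */
final class SolrFacetUrl {

    /** baseUrl, query and facetField are caller-supplied; nothing is sent over the network here. */
    static String build(String baseUrl, String query, String facetField) {
        return baseUrl + "/select"
                + "?q=" + enc(query)               // the user's search term
                + "&facet=true"                    // switch faceting on
                + "&facet.field=" + enc(facetField) // field to count values of
                + "&rows=0"                        // counts only, no documents
                + "&wt=xml";                       // ask for an XML response
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // Assumed local Solr instance and hypothetical "category" field.
        System.out.println(build("http://localhost:8983/solr", "ipod nano", "category"));
    }
}
```

A single request of this shape would stand in for the 20-30 per-category searches: the counts come back under the `facet_counts` section of the XML response.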


On Mon, Nov 9, 2009 at 12:14, Erik Hatcher <erik.hatcher@gmail.com> wrote:

> Note that Solr has faceting built in, and uses Lucene's goodness too.  And
> it scales quite well.
>
>        Erik
>
>
>
> On Nov 9, 2009, at 8:12 AM, Moray McConnachie wrote:
>
>> This is basically using Lucene for faceted search, I think?
>>
>> Most approaches I have seen to this involve caching results and/or
>> duplicating the facet information in an alternate data store.
>>
>> The best resource I have seen uses cached results. It lets you drill down
>> into multiple facets and update the number of documents per facet easily,
>> without going back to the Lucene engine with multiple queries.
>>
>>
>> http://www.devatwork.nl/index.php/articles/lucenenet/faceted-search-and-drill-down-lucenenet/
>>
>> 1) at initialisation (and/or at set points), step through all the potential
>> facet values and store the matching results in some kind of cached
>> dictionary of bit arrays
>> 2) the user drills down into whatever facets they choose
>> 3) you AND together the bit arrays representing each facet the user has
>> selected
>> 4) you count the number of set bits in the resulting bit array to get
>> the number of articles matched.
>>
>> At 3) you could clearly AND this together with any other Lucene result set
>> to get accurate counts when you are integrating facets and non-faceted
>> search results.
>>
>> The approach works best the higher the ratio of queries to updates - it
>> will work poorly for applications with any or all of
>>
>> a) very frequent updating
>> b) the need for facets to be 100% accurate in real time
>> c) a large number of potential facet values (initialisation could be very
>> slow)
>>
>> With a little extra work on the indexing end you could conquer a) and b)
>> and hopefully get round the need to reinitialise from scratch.
>>
>> I'm not sure how well it would work with very large datasets either,
>> particularly where the number of matches in some facet is very large - I've
>> never had to work with bit arrays of millions of bits!
>>
>> I like this approach because it is a 100% lucene solution and it is
>> (relatively) fast compared to your approach so far and other similar
>> approaches.
>>
>> Faceting is such a common meme for search that I can foresee someone porting
>> faceting functionality into the back end, if indeed it is not already
>> happening.
>>
>> Yours,
>> Moray
>>
>>
>> -------------------------------------
>> Moray McConnachie
>> Director of IT    +44 1865 261 600
>> Oxford Analytica  http://www.oxan.com
>>
>> -----Original Message-----
>> From: André Maldonado [mailto:andre.maldonado@gmail.com]
>> Sent: 09 November 2009 12:44
>> To: lucene-net-user@incubator.apache.org
>> Subject: Category count.
>>
>> Hi all. I have a problem exactly like this one (written by
>> another developer):
>>
>> "I am trying to use Lucene Java 2.3.2 to implement search on a catalog of
>> products. Apart from the regular fields for a product, there is a field called
>> 'Category'. A product can fall in multiple categories. Currently, I use
>> FilteredQuery to search for the same search term with every Category to get
>> the number of results per category.
>>
>> This results in 20-30 internal search calls per query to display the
>> results. This is slowing down the search considerably. Is there a faster way
>> of achieving the same result using Lucene?"
>> But in the thread where I found this question, I didn't find any good
>> solution.
>>
>> Can you help me?
>>
>> Thanks
>>
>> "Then those who were in the boat came near and worshiped him, saying: Truly
>> you are the Son of God." (Matthew 14:33)
>>
>>
>
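
The cached bit-array approach in steps 1) to 4) of the quoted message can be sketched roughly as follows. This is not the code from the linked article and uses no Lucene API: it uses plain `java.util.BitSet`, treats document IDs as small integers, and assumes the bit sets for each facet value and for the query hits have already been filled in from the index. The class and method names are hypothetical.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of facet counting with cached bit arrays, one bit per document ID. */
final class FacetBitSets {
    // facet value -> bit set with a 1 for every document that carries that value
    private final Map<String, BitSet> cache = new LinkedHashMap<>();

    /** Step 1: at initialisation, record which documents match a facet value. */
    void add(String facetValue, int... docIds) {
        BitSet bits = cache.computeIfAbsent(facetValue, k -> new BitSet());
        for (int id : docIds) {
            bits.set(id);
        }
    }

    /** Steps 2-4: AND the selected facets (and optional query hits), then count set bits. */
    int count(BitSet queryHits, String... selectedFacets) {
        BitSet result = (queryHits == null) ? null : (BitSet) queryHits.clone();
        for (String facet : selectedFacets) {
            BitSet bits = cache.getOrDefault(facet, new BitSet()); // unknown facet matches nothing
            if (result == null) {
                result = (BitSet) bits.clone(); // clone so the cache is never modified
            } else {
                result.and(bits);
            }
        }
        return result == null ? 0 : result.cardinality();
    }

    /** Per-facet counts for one result set, without running one search per facet. */
    Map<String, Integer> countsPerFacet(BitSet hits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : cache.entrySet()) {
            BitSet overlap = (BitSet) e.getValue().clone();
            overlap.and(hits);
            counts.put(e.getKey(), overlap.cardinality());
        }
        return counts;
    }

    public static void main(String[] args) {
        FacetBitSets facets = new FacetBitSets();
        facets.add("books", 0, 1, 2, 3);
        facets.add("sale", 2, 3, 4);

        BitSet hits = new BitSet(); // stand-in for the documents matched by a search
        hits.set(1); hits.set(2); hits.set(3); hits.set(5);

        System.out.println(facets.countsPerFacet(hits));         // {books=3, sale=2}
        System.out.println(facets.count(hits, "books", "sale")); // 2
    }
}
```

On the worry about very large bit arrays: a bit set over N documents takes roughly N/8 bytes, so one million documents comes to about 125 KB per cached facet value. The cost grows with the number of distinct facet values rather than with the number of matches in any one facet.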
