lucenenet-user mailing list archives

From Ben West <bwsithspaw...@yahoo.com>
Subject Re: [Lucene.Net] Faceted Search in Lucene.NET without using the Bits-method
Date Wed, 25 May 2011 15:38:52 GMT
Thanks for sharing your code!

Two suggestions about performance:
1. It doesn't look to me like you're caching the filters. You might want to wrap each one
in a CachingWrapperFilter.
2. Also, SpanTermQueries are slower than TermQueries, and SpanQueryFilters are slower than
QueryWrapperFilters. Unless your custom code has some reason to need a span query, it's
probably quicker to just use a TermQuery.

Something I'm not sure about: if your categoryBitSet only holds MaxSearchResultSize hits, then
the intersection of search and category might contain far fewer than MaxSearchResultSize, right?
So you might want to collect all hits for the categoryBitSet and only limit the query itself.
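
For what it's worth, here is a rough, untested sketch of how those suggestions might fit
together. CategoryFilterCache, Get and CountFacet are names I made up for illustration, not
part of Lucene.Net. It assumes the Lucene.Net 2.9.x API, where IndexReader.MaxDoc() is a
method (later versions expose it as a property). As far as I can tell, the second
OpenBitSetDISI argument bounds the doc ids that are kept rather than the number of hits,
which is why the sketch passes MaxDoc() there; please double-check that against your version.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Util;

// Hypothetical helper (not part of Lucene.Net): keeps one cached filter per
// category/value pair so repeated facet counts can reuse the same doc id set.
public class CategoryFilterCache
{
    private readonly Dictionary<string, Filter> filters = new Dictionary<string, Filter>();
    private readonly object sync = new object();

    public Filter Get(string category, string categoryValue)
    {
        // The '\0' separator keeps ("ab", "c") distinct from ("a", "bc").
        string key = category + "\0" + categoryValue;
        lock (sync)
        {
            Filter filter;
            if (!filters.TryGetValue(key, out filter))
            {
                // Plain TermQuery instead of SpanTermQuery; the caching wrapper
                // only helps if the same filter instance is reused across calls.
                var query = new TermQuery(new Term(category, categoryValue));
                filter = new CachingWrapperFilter(new QueryWrapperFilter(query));
                filters[key] = filter;
            }
            return filter;
        }
    }

    public KeyValuePair<string, int> CountFacet(OpenBitSet searchBitSet, IndexReader indexReader,
                                                string category, string categoryValue)
    {
        var empty = new KeyValuePair<string, int>(categoryValue, 0);

        DocIdSet docIdSet = Get(category, categoryValue).GetDocIdSet(indexReader);
        if (docIdSet == null) return empty;      // defensive: no matching documents

        DocIdSetIterator iterator = docIdSet.Iterator();
        if (iterator == null) return empty;      // defensive: no matching documents

        // Copy into a fresh bit set sized to the whole index, so no category hit
        // is dropped and the cached set is not modified by And().
        var categoryBitSet = new OpenBitSetDISI(iterator, indexReader.MaxDoc());
        categoryBitSet.And(searchBitSet);
        return new KeyValuePair<string, int>(categoryValue, (int)categoryBitSet.Cardinality());
    }
}
```

The cache would have to live as long as the IndexReader does (a field on the search service,
say); building a new CachingWrapperFilter on every call would throw the cached doc id sets away.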

hth
-ben


----- Original Message -----
From: lars aslin <larsaaslin@gmail.com>
To: lucene-net-user@lucene.apache.org
Cc: 
Sent: Wednesday, May 25, 2011 3:00 AM
Subject: Re: [Lucene.Net] Faceted Search in Lucene.NET without using the Bits-method

This is the generic part of my implementation

        public IList<KeyValuePair<string, int>> GetFacets(OpenBitSet searchBitSet, IndexReader indexReader, string category, IEnumerable<string> categoryValues)
        {
            // Count the hits for every category value and keep only the non-empty facets.
            return categoryValues
                .Select(categoryValue => FacetedSearch(searchBitSet, indexReader, category, categoryValue))
                .Where(facet => facet.Value > 0)
                .ToList();
        }

        public KeyValuePair<string, int> FacetedSearch(OpenBitSet searchBitSet, IndexReader indexReader, string category, string categoryValue)
        {
            // Build a filter that matches every document carrying this category value.
            var categoryQuery = new SpanTermQuery(new Term(category, categoryValue));
            var categoryQueryFilter = new SpanQueryFilter(categoryQuery);
            var docSetIterator = categoryQueryFilter.GetDocIdSet(indexReader).Iterator();
            var categoryBitSet = new OpenBitSetDISI(docSetIterator, FullTextSearchSettings.GetMaxSearchResultSize);

            // Intersect with the search result; the remaining bits are the facet count.
            categoryBitSet.And(searchBitSet);
            return new KeyValuePair<string, int>(categoryValue, (int)categoryBitSet.Cardinality());
        }

Not sure this is the best way of doing it. My understanding is that the
solution's performance relies on the caching functionality of the filters, so
the first search will be a lot slower than the following ones. Is this a
correct assumption?


On Mon, May 23, 2011 at 12:16 PM, Marco Dissel <marco.dissel@gmail.com> wrote:

> Can you share your implementation (if it's generic)?
>
> Thanks
> On 23 May 2011 11:47, "lars aslin" <larsaaslin@gmail.com> wrote:
> > Thanks Diggy, I have now built a solution for faceted searches based on
> > OpenBitSet and it seems to run really smoothly. I haven't been able to run
> > any actual tests yet, but the overall experience is good.
> >
> >
> > On Fri, May 20, 2011 at 2:57 PM, digy digy <digydigy@gmail.com> wrote:
> >
> >> This is a good example of thread hijacking.
> >>
> >> These kinds of problems are mostly related to the analyzer used. Try
> >> different analyzers such as KeywordAnalyzer or WhitespaceAnalyzer (or a
> >> custom one). Also, don't forget to use the same analyzer while indexing
> >> and searching.
> >>
> >> If your field is "not analyzed", you can also try to use TermQuery while
> >> searching.
> >>
> >> DIGY
> >>
> >> On Fri, May 20, 2011 at 3:36 PM, K a r n a v <karunakerreddyv@gmail.com> wrote:
> >>
> >> > How can we handle special characters like .;: / \?"\<>~`*!@#$%^&-_+={}[]|(
> >> > When I search for hyderabad-india, the '-' char is removed internally and
> >> > the query is constructed as hyderabad india.
> >> > (I've also tried an escape char, but that failed.)
> >> > Could you please give me a small example of handling special chars while
> >> > searching (and while constructing the query)?
> >> > Please help me in this regard.
> >> >
> >> >
> >> >
> >> > On Fri, May 20, 2011 at 5:56 PM, K a r n a v <karunakerreddyv@gmail.com> wrote:
> >> >
> >> > > wow...thx DIGY.. let me chk this ...if it can improve the performance...i
> >> > > will replace my code with OpenBitSetDISI
> >> > >
> >> > >
> >> > > On Fri, May 20, 2011 at 4:31 PM, digy digy <digydigy@gmail.com> wrote:
> >> > >
> >> > >> I prepared a sample. Maybe, this can help
> >> > >>
> >> > >> http://people.apache.org/~digy/FacetedSearch.cs
> >> > >>
> >> > >> DIGY
> >> > >>
> >> > >>
> >> > >> On Fri, May 20, 2011 at 10:43 AM, K a r n a v <karunakerreddyv@gmail.com> wrote:
> >> > >>
> >> > >> > I've tried OpenBitSetDISI... it doesn't perform well for me...
> >> > >> > but I'm able to generate results with BitArray fast enough...
> >> > >> > ... check the demo app
> >> > >> >
> >> > >> >
> >> > >> > http://demo.wisestepp.com/jobs/JobSearchResults.aspx?am9ibmFtZSxqYXZh-yCfmbUTwmeY%3d
> >> > >> >
> >> > >> > If you have an example or sample code that works with OpenBitSetDISI,
> >> > >> > please fwd it to me...
> >> > >> >
> >> > >> > Thank you in advance.
> >> > >> >
> >> > >> > Regards,
> >> > >> > Karunaker Reddy V
> >> > >> >
> >> > >> >
> >> > >> > On Fri, May 20, 2011 at 1:05 PM, digy digy <digydigy@gmail.com> wrote:
> >> > >> >
> >> > >> > > Take a look at OpenBitSetDISI in Lucene.Net.Util
> >> > >> > >
> >> > >> > > DIGY
> >> > >> > >
> >> > >> > > On Fri, May 20, 2011 at 10:20 AM, lars aslin <larsaaslin@gmail.com> wrote:
> >> > >> > >
> >> > >> > > > Hi
> >> > >> > > > I'm building a search function with Lucene.NET where I want faceted
> >> > >> > > > searches to be supported. To accomplish this in an efficient way, I
> >> > >> > > > thought a good idea would be to have a filter for each category and
> >> > >> > > > then use the AND-operation between each category filter's BitArray
> >> > >> > > > and the BitArray for the search result, thus getting the search
> >> > >> > > > result grouped by category by just using the AND-operation. Here's a
> >> > >> > > > blog post describing it better:
> >> > >> > > >
> >> > >> > > > http://www.devatwork.nl/articles/lucenenet/faceted-search-and-drill-down-lucenenet/
> >> > >> > > >
> >> > >> > > > Now, I see that the Bits-method is tagged as "obsolete" in the API, so
> >> > >> > > > I guess it will be removed sometime in the future. So is there
> >> > >> > > > another way of accomplishing faceted searches in Lucene in a similar
> >> > >> > > > way, or should I stick to the Bits-method to receive a BitArray anyway?
> >> > >> > > >
> >> > >> > >
> >> > >> >
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > *Thanks & Regards*,
> >> > > *Karunaker Reddy V
> >> > >
> >> > > *http://www.flickr.com/photos/karnav/
> >> > >
> >> > > *Ooh!!*, and one more thing: *no matter who you are, you were built to be
> >> > > brilliant and designed to make a difference in this world*.* PLEASE DOT IT
> >> > > *!
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > *Thanks & Regards*,
> >> > *Karunaker Reddy V
> >> >
> >> > *http://www.flickr.com/photos/karnav/
> >> >
> >> > *Ooh!!*, and one more thing: *no matter who you are, you were built to be
> >> > brilliant and designed to make a difference in this world*.* PLEASE DOT
> >> > IT*!
> >> >
> >>
> >
>
